Re: Cassandra on RocksDB experiment result

2017-04-19 Thread Jon Haddad
I have no clue what it would take to accomplish a pluggable storage engine, but 
I love this idea.  

Obviously the devil is in the details, & a simple K/V is very different from 
supporting partitions, collections, etc, but this is very cool & seems crazy 
not to explore further.  Will you be open sourcing this work?

Jon


> On Apr 19, 2017, at 9:21 AM, Dikang Gu  wrote:
> 
> Hi Cassandra developers,
> 
> This is Dikang from Instagram. I'd like to share with you some results from
> an experiment we did recently, using RocksDB as Cassandra's storage engine. In
> the experiment, I built a prototype that integrates Cassandra 3.0.12 with
> RocksDB for a single-column (key-value) use case, shadowed one of our
> production use cases, and saw about a 4-6X P99 read latency drop during peak
> time compared to stock 3.0.12. The P99 latency also became more predictable.
> 
> Here is a detailed note with more metrics:
> 
> https://docs.google.com/document/d/1Ztqcu8Jzh4USKoWBgDJQw82DBurQmsV-PmfiJYvu_Dc/edit?usp=sharing
> 
> Please take a look and let me know your thoughts. I think the biggest
> latency win comes from getting rid of most of the Java garbage created by the
> current read/write path and compactions, which reduces JVM overhead and makes
> the latency more predictable.
> 
> We are very excited about the potential performance gain. As the next step,
> I propose making the Cassandra storage engine pluggable (like MySQL
> and MongoDB), and we are very interested in working with the community to
> provide RocksDB as one storage option with more predictable performance.
> 
> Thanks.
> 
> -- 
> Dikang
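
A storage engine boundary along the lines Dikang proposes would need to abstract at least writes, reads, and flushes behind a common interface. The Java sketch below is purely illustrative, with hypothetical names rather than actual Cassandra APIs; as Jon notes above, the real surface area (partitions, collections, compaction, streaming) is far larger.

// Hypothetical sketch of a minimal pluggable storage engine boundary.
// None of these names are actual Cassandra interfaces; a real API would
// also need to cover partitions, clustering, collections, repair, etc.
public interface StorageEngine
{
    // Apply a single key/value mutation with a write timestamp.
    void apply(byte[] partitionKey, byte[] column, byte[] value, long timestamp);

    // Read the current value of a column within a partition.
    byte[] read(byte[] partitionKey, byte[] column);

    // Persist any in-memory state (the memtable equivalent) to disk.
    void flush();
}

A RocksDB-backed implementation would delegate these calls to RocksDB's native key/value operations, which is where the reported JVM garbage reduction would come from: most of the read/write path state lives off-heap in native code.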



Re: CASSANDRA-9472 Reintroduce off heap memtables - patch to 3.0

2017-07-31 Thread Jon Haddad
+1.  IMO there’s very little reason to use 3.0 at this point.  If someone wants 
to back port and make a 3.0 patch publicly available, cool, but merging it into 
3.0 after 2 years doesn’t make much sense to me.

> On Jul 31, 2017, at 9:26 AM, Jeremiah D Jordan  
> wrote:
> 
> 
>> On Jul 31, 2017, at 12:17 PM, Jeff Jirsa  wrote:
>> On 2017-07-29 10:02 (-0700), Jay Zhuang  
>> wrote: 
>>> Should we consider back-porting it to 3.0 for the community? I think
>>> this fixes a performance regression rather than adding a new feature, and we
>>> had the feature in 2.1 and 2.2.
>>> 
>> 
>> Personally / individually, I'd much rather see 3.0 stabilize.
> 
> +1.  The feature is there in 3.11.x if you are running one of the use cases 
> where this helps, and for most existing things 3.0 and 3.11 are about the 
> same stability, so you can go to 3.11.x if you want to keep using the off 
> heap stuff.
> 
> -Jeremiah
> 





Re: [VOTE] Release Apache Cassandra 3.11.1

2017-10-02 Thread Jon Haddad
Same boat as Jeff, +1 since it’s not a regression.

> On Oct 2, 2017, at 2:04 PM, Steinmaurer, Thomas 
> <thomas.steinmau...@dynatrace.com> wrote:
> 
> Jeff of course, not Jon. Sorry :-)
> 
> 
> -Original Message-
> From: Steinmaurer, Thomas [mailto:thomas.steinmau...@dynatrace.com]
> Sent: Monday, October 2, 2017 22:58
> To: dev@cassandra.apache.org
> Subject: RE: [VOTE] Release Apache Cassandra 3.11.1
> 
> Jon,
> 
> yes I also did see it with 3.11.0.
> 
> Thomas
> 
> 
> -Original Message-
> From: Jeff Jirsa [mailto:jji...@gmail.com]
> Sent: Monday, October 2, 2017 22:47
> To: Cassandra DEV <dev@cassandra.apache.org>
> Subject: Re: [VOTE] Release Apache Cassandra 3.11.1
> 
> Thomas, did you see this on 3.11.0 as well, or have you not tried 3.11.0? (I 
> know you probably want fixes from 3.11.1, but let's clarify whether this is 
> a regression.)
> 
> If it's not a regression, we should ship this and then hopefully we'll spin a 
> 3.11.2 as soon as this is fixed.
> 
> If it is a regression, I'll flip my vote to -1.
> 
> 
> 
> On Mon, Oct 2, 2017 at 1:29 PM, Steinmaurer, Thomas < 
> thomas.steinmau...@dynatrace.com> wrote:
> 
>> Jon,
>> 
>> please see my latest comment + attached screen from our monitoring here:
>> https://issues.apache.org/jira/browse/CASSANDRA-13754?focusedCommentId=16188758&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16188758
>> 
>> Thanks,
>> Thomas
>> 
>> -Original Message-
>> From: Jon Haddad [mailto:jonathan.had...@gmail.com] On Behalf Of Jon
>> Haddad
>> Sent: Monday, October 2, 2017 22:09
>> To: dev@cassandra.apache.org
>> Subject: Re: [VOTE] Release Apache Cassandra 3.11.1
>> 
>> You’re saying the same memory leak happens under 3.11?
>> 
>>> On Oct 2, 2017, at 1:04 PM, Aleksey Yeshchenko <alek...@apple.com>
>> wrote:
>>> 
>>> Thomas,
>>> 
>>> I would maybe agree with waiting for a while because of it, if we
>>> had a
>> proper fix at least under review - or in progress by someone.
>>> 
>>> But this is not a regression, and there’s been a lot of fixes
>> accumulated and not released yet. Arguably worse to hold them back :\
>>> 
>>> —
>>> AY
>>> 
>>> On 2 October 2017 at 20:54:38, Steinmaurer, Thomas (
>> thomas.steinmau...@dynatrace.com) wrote:
>>> 
>>> Jeff,
>>> 
>>> even if it is not a strict regression, this currently forces us to
>>> do a
>> rolling restart every ~ 72hrs to be on the safe-side with -Xmx8G.
>> Luckily this is just a loadtest environment. We don't have 3.11 in 
>> production yet.
>>> 
>>> I can offer to immediately deploy a snapshot build into our loadtest
>> environment, in case this issue gets attention and a fix needs
>> verification at constant load.
>>> 
>>> Thanks,
>>> Thomas
>>> 
>>> -Original Message-
>>> From: Jeff Jirsa [mailto:jji...@gmail.com]
>>> Sent: Monday, October 2, 2017 20:04
>>> To: Cassandra DEV <dev@cassandra.apache.org>
>>> Subject: Re: [VOTE] Release Apache Cassandra 3.11.1
>>> 
>>> +1
>>> 
>>> ( Somewhat concerned that
>>> https://issues.apache.org/jira/browse/CASSANDRA-13754 may not be
>>> fixed,
>> but it's not a strict regression ? )
>>> 
>>> 
>>> 
>>> On Mon, Oct 2, 2017 at 10:58 AM, Michael Shuler
>>> <mich...@pbandjelly.org>
>>> wrote:
>>> 
>>>> I propose the following artifacts for release as 3.11.1.
>>>> 
>>>> sha1: 983c72a84ab6628e09a78ead9e20a0c323a005af
>>>> Git:
>>>> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/3.11.1-tentative
>>>> Artifacts:
>>>> https://repository.apache.org/content/repositories/orgapachecassandra-1151/org/apache/cassandra/apache-cassandra/3.11.1/
>>>> Staging repository:
>>>> https://repository.apache.org/content/repositories/orgapachecassandra-1151/
>>>> 
>>>> The Debian packages are available here:
>>>> http://people.apache.org/~mshuler
>>>> 
>>>> The vote will be open for 72 hours (longer if needed).
>>>> 
>>>> [1]: (CHANGES.txt) https://goo.gl/dZCRk8
>>>> [2]: (NEWS.txt) https://goo.gl/rh24MX

Re: Proposal to retroactively mark materialized views experimental

2017-10-02 Thread Jon Haddad
Developers are also not the only people that are able to make decisions.  
Keeping it in the YAML means an operator can disable it vs a developer *maybe* 
seeing the warning.  Keep in mind not everyone creates tables through CQLSH.

> On Oct 2, 2017, at 2:05 PM, Blake Eggleston <beggles...@apple.com> wrote:
> 
> The message isn't materially different, but it will reach fewer people, 
> later. People typically aren't as attentive to logs as they should be. 
> Developers finding out about new warnings in the logs later than they could 
> have, sometimes even after it's been deployed, is not uncommon. It's happened 
> to me. Requiring a flag will reach everyone trying to use MVs as soon as they 
> start developing against MVs. Logging a warning will reach a subset of users 
> at some point, hopefully. The only downside I can think of for the flag is 
> that it's not as polite.
> 
> On October 2, 2017 at 1:16:10 PM, Josh McKenzie (jmcken...@apache.org) wrote:
> 
> "Nobody is talking about removing MVs."  
> Not precisely true for this email thread:  
> 
> "but should there be some point in the  
> future where we consider removing them from the code base unless they have  
> gotten significant improvement as well?"  
> 
> IMO a .yaml change requirement isn't materially different than barfing a  
> warning on someone's screen during the dev process when they use the DDL  
> for MV's. At the end of the day, it's just a question of how forceful you  
> want that messaging to be. If the cqlsh client prints 'THIS FEATURE IS NOT  
> READY' in big bold letters, that's not going to miscommunicate to a user  
> that 'feature X is ready' when it's not.  
> 
> Much like w/SASI, this is something that's in the code-base that for  
> certain use-cases apparently works just fine. Might be worth considering  
> the approach of making boundaries around those use-cases more rigid instead  
> of throwing the baby out with the bathwater.  
> 
> On Mon, Oct 2, 2017 at 3:32 PM, DuyHai Doan <doanduy...@gmail.com> wrote:  
> 
>> Ok so IF there is a flag to enable MV (à-la UDA/UDF in cassandra.yaml) then  
>> I'm fine with it. I initially understood that we wanted to disable it  
>> definitively. Maybe we should then add an explicit error message when MV is  
>> disabled and someone tries to use it, something like:  
>> 
>> "MV has been disabled, to enable it, turn on the flag  in  
>> cassandra.yaml" so users don't spend 3h searching around  
>> 
>> 
>> On Mon, Oct 2, 2017 at 9:07 PM, Jon Haddad <j...@jonhaddad.com> wrote:  
>> 
>>> There’s a big difference between removal of a protocol that every single  
>>> C* user had to use and disabling a feature which is objectively broken  
>> and  
>>> almost nobody is using. Nobody is talking about removing MVs. If you  
>> want  
>>> to use them you can enable them very trivially, but it should be an  
>>> explicit option because they really aren’t ready for general use.  
>>> 
>>> Claiming disabling by default == removal is not helpful to the  
>>> conversation and is very misleading.  
>>> 
>>> Let’s be practical here. The people that are most likely to put MVs in  
>>> production right now are people new to Cassandra that don’t know any  
>>> better. The people that *should* be using MVs are the contributors to  
>> the  
>>> project. People that actually wrote Cassandra code that can do a patch  
>> and  
>>> push it into prod, and get it submitted upstream when they fix something.  
>>> Yes, a lot of this stuff requires production usage to shake out the bugs,  
>>> that’s fine, but we shouldn’t lie to people and say “feature X is ready”  
>>> when it’s not. That’s a great way to get a reputation as “unstable” or  
>>> “not fit for production."  
>>> 
>>> Jon  
>>> 
>>> 
>>>> On Oct 2, 2017, at 11:54 AM, DuyHai Doan <doanduy...@gmail.com> wrote:  
>>>> 
>>>> "I would (in a patch release) disable MV CREATE statements, and emit  
>>>> warnings for ALTER statements and on schema load if they’re not  
>>> explicitly  
>>>> enabled"  
>>>> 
>>>> --> I find this pretty extreme. Now we have an existing feature sitting  
>>>> there in the base code but forbidden from version xxx onward.  
>>>> 
>>>> Since when do we start removing features in a patch release? (Forbidding
>>>> the creation of new MVs == removing the feature, de facto)
>>>> 
>>>> Even the Thrift protocol has gone through a long process of deprecation
>>>> and will be removed in 4.0

Re: [VOTE] Release Apache Cassandra 3.11.1

2017-10-02 Thread Jon Haddad
You’re saying the same memory leak happens under 3.11?  

> On Oct 2, 2017, at 1:04 PM, Aleksey Yeshchenko  wrote:
> 
> Thomas,
> 
> I would maybe agree with waiting for a while because of it, if we had a 
> proper fix at least under review - or in progress by someone.
> 
> But this is not a regression, and there’s been a lot of fixes accumulated and 
> not released yet. Arguably worse to hold them back :\
> 
> —
> AY
> 
> On 2 October 2017 at 20:54:38, Steinmaurer, Thomas 
> (thomas.steinmau...@dynatrace.com) wrote:
> 
> Jeff,  
> 
> even if it is not a strict regression, this currently forces us to do a 
> rolling restart every ~ 72hrs to be on the safe-side with -Xmx8G. Luckily 
> this is just a loadtest environment. We don't have 3.11 in production yet.  
> 
> I can offer to immediately deploy a snapshot build into our loadtest 
> environment, in case this issue gets attention and a fix needs verification 
> at constant load.  
> 
> Thanks,  
> Thomas  
> 
> -Original Message-  
> From: Jeff Jirsa [mailto:jji...@gmail.com]  
> Sent: Monday, October 2, 2017 20:04
> To: Cassandra DEV   
> Subject: Re: [VOTE] Release Apache Cassandra 3.11.1  
> 
> +1  
> 
> ( Somewhat concerned that  
> https://issues.apache.org/jira/browse/CASSANDRA-13754 may not be fixed, but 
> it's not a strict regression ? )  
> 
> 
> 
> On Mon, Oct 2, 2017 at 10:58 AM, Michael Shuler   
> wrote:  
> 
>> I propose the following artifacts for release as 3.11.1.  
>> 
>> sha1: 983c72a84ab6628e09a78ead9e20a0c323a005af  
>> Git:  
>> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/3.11.1-tentative  
>> Artifacts:  
>> https://repository.apache.org/content/repositories/orgapachecassandra-1151/org/apache/cassandra/apache-cassandra/3.11.1/  
>> Staging repository:  
>> https://repository.apache.org/content/repositories/orgapachecassandra-1151/  
>> 
>> The Debian packages are available here:  
>> http://people.apache.org/~mshuler  
>> 
>> The vote will be open for 72 hours (longer if needed).  
>> 
>> [1]: (CHANGES.txt) https://goo.gl/dZCRk8  
>> [2]: (NEWS.txt) https://goo.gl/rh24MX  
>> 
>> 
>> 
> 
> 





Re: Weekly Cassandra Wrap-Up: Oct 16 Edition

2017-10-16 Thread Jon Haddad
Regarding the stress tests, if you’re willing to share, I’m starting a repo 
where we can keep a bunch of different stress profiles.  I’d like to start 
running them on releases before we agree to push them out.  If anyone has a 
stress test they are willing to share, please get in touch with me!



> On Oct 16, 2017, at 8:37 AM, Jeff Jirsa  wrote:
> 
> I got some feedback last week that I should try this on Monday morning, so
> let's see if we can nudge a few people into action this week.
> 
> 3.0.15 and 3.11.1 are released. This is a dev list, so that shouldn't be a
> surprise to anyone here - you should have seen the votes and release
> notifications. The people working directly ON Cassandra every day are
> probably very aware of the number and nature of fixes in those versions -
> if you're not aware, the Change lists are HUGE, and some of the fixes are
> VERY IMPORTANT. So this week's wrap-up is really a reflection on the size
> of those two release changelogs.
> 
> One of the advantages of the Cassandra project is the size of the user base
> - I don't know if we have accurate counts (and some of the "surveys" are
> laughable), but we know it's on the order of thousands (probably tens of
> thousands) of companies, and some huge number of instances (not willing to
> speculate here, we know it's at least in the hundreds of thousands, may be
> well into the millions). Historically, the best stabilizer of a release was
> people upgrading their unusual use cases, finding bugs that the developers
> hadn't anticipated (and therefore tests didn't exist for those edge cases),
> reporting them, and the next release would be slightly better than the one
> before it. The chicken/egg problem here is pretty obvious, and while a lot
> of us are spending a lot of time making things better, I want to use this
> email to ask a favor (in 3 parts):
> 
> 1) If you haven't tried 3.0 or 3.11 yet, please spin it up on a test
> cluster. 3.11 would be better, 3.0 is ok too. It doesn't need to be a
> thousand node cluster, most of the weird stuff we've seen in the post-3.0
> world deals with data, not cluster size. Grab some of your prod data if you
> can, throw it into a test cluster, add a node/remove a node, tell us if it
> doesn't work.
> 2) Please run a stress workload against that test cluster, even if it's
> 5-10 minutes. Purpose here is two-fold: like #1, it'll help us find some
> edge cases we haven't seen before, but it'll also help us identify holes in
> stress coverage. We have some tickets to add UDTs to stress (
> https://issues.apache.org/jira/browse/CASSANDRA-13260 ) and LWT (
> https://issues.apache.org/jira/browse/CASSANDRA-7960 ). Ideally your stress
> profile should be more than "80% reads 20% writes" - try to actually model
> your schema and query behavior. Do you use static columns? Do you use
> collections?  If you're unable to model your use case because of a
> deficiency in stress, open a JIRA. If things break, open a JIRA. If it
> works perfectly, I'm interested in seeing your stress yaml and results
> (please send it to me privately, don't spam the list).
> 3) If you're somehow not able to run stress because you don't have hardware
> for a spare cluster, profiling your live cluster is also incredibly useful.
> TLP has some notes on how to generate flame graphs -
> https://github.com/thelastpickle/lightweight-java-profiler - I saw one
> example from a cluster that really surprised me. There are versions and use
> cases that we know have been heavily profiled, but there are probably
> versions and use cases where nobody's ever run much in the way of
> profiling. If you're running openjdk in prod, and you're able to SAFELY
> attach a profiler to generate some flame graphs, please send those to me
> (again, privately please, I don't think the whole list needs a copy).
> 
> My hope in all of this is to build up a corpus of real world use cases (and
> real current state via profiling) that we can leverage to make testing and
> performance better going forward. If I get much in the way of response to
> any of these, I'll try to send out a summary in next week's email.
> 
> - Jeff





Re: [VOTE] Release Apache Cassandra 3.11.1

2017-10-02 Thread Jon Haddad
The comment at the end of CASSANDRA-13754 
(https://issues.apache.org/jira/browse/CASSANDRA-13754) is a bit concerning, as 
it was from yesterday and the user is seeing heap issues.  It would be 
unfortunate to have to pull the release if it’s suffering from a major memory 
leak.

> On Oct 2, 2017, at 11:01 AM, Aleksey Yeshchenko  wrote:
> 
> +1
> 
> —
> AY
> 
> On 2 October 2017 at 18:58:57, Michael Shuler (mich...@pbandjelly.org) wrote:
> 
> I propose the following artifacts for release as 3.11.1.  
> 
> sha1: 983c72a84ab6628e09a78ead9e20a0c323a005af  
> Git:  
> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/3.11.1-tentative
>   
> Artifacts:  
> https://repository.apache.org/content/repositories/orgapachecassandra-1151/org/apache/cassandra/apache-cassandra/3.11.1/
>   
> Staging repository:  
> https://repository.apache.org/content/repositories/orgapachecassandra-1151/  
> 
> The Debian packages are available here: http://people.apache.org/~mshuler  
> 
> The vote will be open for 72 hours (longer if needed).  
> 
> [1]: (CHANGES.txt) https://goo.gl/dZCRk8  
> [2]: (NEWS.txt) https://goo.gl/rh24MX  
> 
> 



Re: [VOTE] Release Apache Cassandra 3.0.15

2017-10-02 Thread Jon Haddad
+1

> On Oct 2, 2017, at 11:00 AM, Brandon Williams  wrote:
> 
> +1
> 
> On Oct 2, 2017 12:58 PM, "Aleksey Yeshchenko"  wrote:
> 
>> +1
>> 
>> —
>> AY
>> 
>> On 2 October 2017 at 18:18:26, Michael Shuler (mich...@pbandjelly.org)
>> wrote:
>> 
>> I propose the following artifacts for release as 3.0.15.
>> 
>> sha1: b32a9e6452c78e6ad08e371314bf1ab7492d0773
>> Git:
>> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/3.0.15-tentative
>> Artifacts:
>> https://repository.apache.org/content/repositories/orgapachecassandra-1150/org/apache/cassandra/apache-cassandra/3.0.15/
>> Staging repository:
>> https://repository.apache.org/content/repositories/orgapachecassandra-1150/
>> 
>> The Debian packages are available here: http://people.apache.org/~mshuler
>> 
>> The vote will be open for 72 hours (longer if needed).
>> 
>> [1]: (CHANGES.txt) https://goo.gl/RyuPpw
>> [2]: (NEWS.txt) https://goo.gl/qxwUti
>> 
>> 
>> 





Re: Proposal to retroactively mark materialized views experimental

2017-10-02 Thread Jon Haddad
Having now helped a few folks that have put MVs into prod without realizing 
what they got themselves into, I’m +1 on a flag disabling the feature by 
default.  A WARN message would not have helped them.
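
As a concrete illustration of the guard being discussed, here is a minimal standalone Java sketch, assuming a boolean read from cassandra.yaml at startup; the flag, class, and message are placeholders, not actual Cassandra identifiers.

// Hypothetical sketch of an opt-in guard for CREATE MATERIALIZED VIEW.
// All names are illustrative; in the real system the flag would be
// loaded from cassandra.yaml, not hard-coded.
final class ExperimentalFeatureGuard
{
    static volatile boolean materializedViewsEnabled = false;

    static void checkMaterializedViewsAllowed()
    {
        if (!materializedViewsEnabled)
            throw new IllegalStateException(
                "Materialized views are experimental and disabled by default; " +
                "enable them explicitly in cassandra.yaml before creating one.");
    }
}

Invoked from the CREATE statement path, a check like this blocks new views while leaving existing ones untouched, which matches the constraint raised later in the thread that a patch release must not break MVs already running in clusters.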


> On Oct 2, 2017, at 10:56 AM, Blake Eggleston  wrote:
> 
> Yeah I’m not sure that just emitting a warning is enough. The point is to be 
> super explicit that bad things will happen if you use MVs. I would (in a 
> patch release) disable MV CREATE statements, and emit warnings for ALTER 
> statements and on schema load if they’re not explicitly enabled. Only 
> emitting a warning really reduces visibility where we need it: in the 
> development process.
> 
> By only emitting warning, we're just protecting users that don't run even 
> rudimentary tests before upgrading their clusters. If an operator is going to 
> blindly deploy a database update to prod without testing, they’re going to 
> poke their eye out on something anyway. Whether it’s an MV flag or something 
> else. If we make this change clear in NEWS.txt, and the user@ list, I think 
> that’s the best thing to do.
> 
> 
> On October 2, 2017 at 10:18:52 AM, Jeremiah D Jordan 
> (jeremiah.jor...@gmail.com) wrote:
> 
> Hindsight is 20/20. For 8099 this is the reason we cut the 2.2 release before 
> 8099 got merged.  
> 
> But moving forward with where we are now, if we are going to start adding 
> some experimental flags to things, then I would definitely put SASI on this 
> list as well.  
> 
> For both SASI and MV I don’t know that adding flags in the cassandra.yaml 
> which prevent their use is the right way to go. I would propose that we emit 
> WARN from the native protocol mechanism when a user issues an ALTER/CREATE 
> that tries to use an experimental feature, and probably in the system.log 
> as well.  So someone who is starting new development using them will get a 
> warning showing up in cqlsh “hey the thing you just used is experimental, 
> proceed with caution” and also in their logs.  
> 
> These things are live on clusters right now, and I would not want someone to 
> upgrade their cluster to a new *patch* release and suddenly something that 
> may have been working for them now does not function. Anyway, we need to be 
> careful about how this gets put into practice if we are going to do it 
> retroactively.  
> 
> -Jeremiah  
> 
> 
>> On Oct 1, 2017, at 5:36 PM, Josh McKenzie  wrote:  
>> 
>>> 
>>> I think committing 8099, or at the very least, parts of it, behind an  
>>> experimental flag would have been the right thing to do.  
>> 
>> With a major refactor like that, it's a staggering amount of extra work to  
>> have a parallel re-write of core components of a storage engine accessible  
>> in parallel to the major based on an experimental flag in the same branch.  
>> I think the complexity in the code-base of having two such channels in  
>> parallel would be an altogether different kind of burden along with making  
>> the work take considerably longer. The argument of modularizing a change  
>> like that, however, is something I can get behind as a matter of general  
>> principle. As we discussed at NGCC, the amount of static state in the C*  
>> code-base makes this an aspirational goal rather than a reality all too  
>> often, unfortunately.  
>> 
>> Not looking to get into the discussion of the appropriateness of 8099 and  
>> other major refactors like it (nio MessagingService for instance) - but  
>> there's a difference between building out new features and shielding the  
>> code-base and users from their complexity and reliability and refactoring  
>> core components of the code-base to keep it relevant.  
>> 
>> On Sun, Oct 1, 2017 at 5:01 PM, Dave Brosius  wrote:  
>> 
>>> triggers  
>>> 
>>> 
>>> On 10/01/2017 11:25 AM, Jeff Jirsa wrote:  
>>> 
 Historical examples are anything that you wouldn’t bet your job on for  
 the first release:  
 
 Udf/uda in 2.2  
 Incremental repair - would have yanked the flag following 9143  
 SASI - probably still experimental  
 Counters - all sorts of correctness issues originally, no longer true  
 since the rewrite in 2.1  
 Vnodes - or at least shuffle  
 CDC - is the API going to change or is it good as-is?  
 CQL - we’re on v3, what’s that say about v1?  
 
 Basically anything where we can’t definitively say “this feature is going  
 to work for you, build your product on it” because companies around the  
 world are trying to make that determination on their own, and they don’t  
 have the same insight that the active committers have.  
 
 The transition out we could define as a fixed number of releases or a dev@ 
 vote; I don’t think you’ll find something that applies to all experimental 
 features, so being flexible is probably the best bet there  
 
 
 
>>> 
>>> 

Re: Proposal to retroactively mark materialized views experimental

2017-10-02 Thread Jon Haddad
There’s a big difference between removal of a protocol that every single C* 
user had to use and disabling a feature which is objectively broken and almost 
nobody is using.  Nobody is talking about removing MVs.  If you want to use 
them you can enable them very trivially, but it should be an explicit option 
because they really aren’t ready for general use.

Claiming disabling by default == removal is not helpful to the conversation and 
is very misleading.  

Let’s be practical here.  The people that are most likely to put MVs in 
production right now are people new to Cassandra that don’t know any better.  
The people that *should* be using MVs are the contributors to the project.  
People that actually wrote Cassandra code that can do a patch and push it into 
prod, and get it submitted upstream when they fix something.  Yes, a lot of 
this stuff requires production usage to shake out the bugs, that’s fine, but we 
shouldn’t lie to people and say “feature X is ready” when it’s not.  That’s a 
great way to get a reputation as “unstable” or “not fit for production."

Jon


> On Oct 2, 2017, at 11:54 AM, DuyHai Doan  wrote:
> 
> "I would (in a patch release) disable MV CREATE statements, and emit
> warnings for ALTER statements and on schema load if they’re not explicitly
> enabled"
> 
> --> I find this pretty extreme. Now we have an existing feature sitting
> there in the base code but forbidden from version xxx onward.
> 
> Since when do we start removing features in a patch release? (Forbidding the
> creation of new MVs == removing the feature, de facto)
> 
> Even the Thrift protocol has gone through a long process of deprecation and
> will be removed in 4.0
> 
> And if we start opening Pandora's box like this, what's next? Forbidding the
> creation of SASI indexes too? Removing vnodes?
> 
> 
> 
> 
> On Mon, Oct 2, 2017 at 8:16 PM, Jeremiah D Jordan wrote:
> 
>>> Only emitting a warning really reduces visibility where we need it: in
>> the development process.
>> 
>> How does emitting a native protocol warning reduce visibility during the
>> development process?  If you run CREATE MV and cqlsh then prints out a
>> giant warning statement about how it is an experimental feature I think
>> that is pretty visible during development?
>> 
>> I guess I can see just blocking new ones without a flag set, but we need
>> to be careful here.  We need to make sure we don’t cause a problem for
>> someone that is using them currently, even with all the edge cases issues
>> they have now.
>> 
>> -Jeremiah
>> 
>> 
>>> On Oct 2, 2017, at 2:01 PM, Blake Eggleston 
>> wrote:
>>> 
>>> Yeah, I'm not proposing that we disable MVs in existing clusters.
>>> 
>>> 
>>> On October 2, 2017 at 10:58:11 AM, Aleksey Yeshchenko (alek...@apple.com)
>> wrote:
>>> 
>>> The idea is to check the flag in CreateViewStatement, so creation of new
>> MVs doesn’t succeed without that flag flipped.
>>> 
>>> Obviously, just disabling existing MVs working in a minor would be silly.
>>> 
>>> As for the warning - yes, that should also be emitted. Unconditionally.
>>> 
>>> —
>>> AY
>>> 
>>> On 2 October 2017 at 18:18:52, Jeremiah D Jordan (
>> jeremiah.jor...@gmail.com) wrote:
>>> 
>>> These things are live on clusters right now, and I would not want
>> someone to upgrade their cluster to a new *patch* release and suddenly
>> something that may have been working for them now does not function.
>> Anyway, we need to be careful about how this gets put into practice if we
>> are going to do it retroactively.
>> 
>> 
>> 
>> 





Re: Proposal to retroactively mark materialized views experimental

2017-09-29 Thread Jon Haddad
I’m very much +1 on this, and to new features in general.  

I think having a clear line at which we classify something as production ready 
would be nice.  It would be great if committers were using the feature in prod 
and could vouch for its stability.

> On Sep 29, 2017, at 1:09 PM, Blake Eggleston  wrote:
> 
> Hi dev@,
> 
> I’d like to propose that we retroactively classify materialized views as an 
> experimental feature, disable them by default, and require users to enable 
> them through a config setting before using.
> 
> Materialized views have several issues that make them (effectively) unusable 
> in production. Some of the issues aren’t just implementation problems, but 
> problems with the design that aren’t easily fixed. It’s unfair of us to make 
> features available to users in this state without providing a clear warning 
> that bad or unexpected things are likely to happen if they use it.
> 
> Obviously, this isn’t great news for users that have already adopted MVs, and 
> I don’t have a great answer for that. I think that’s sort of a sunk cost at 
> this point. If they have any MV related problems, they’ll have them whether 
> they’re marked experimental or not. I would expect this to reduce the number 
> of users adopting MVs in the future though, and if they do, it would be 
> opt-in.
> 
> Once MVs reach a point where they’re usable in production, we can remove the 
> flag. Specifics of how the experimental flag would work can be hammered out 
> in a forthcoming JIRA, but I’d imagine it would just prevent users from 
> creating new MVs, and maybe log warnings on startup for existing MVs if the 
> flag isn’t enabled.
> 
> Let me know what you think.
> 
> Thanks,
> 
> Blake





Re: Proposal to retroactively mark materialized views experimental

2017-10-04 Thread Jon Haddad
The default part I was referring to was incremental repair.

SASI still has a pretty fatal issue where nodes OOM: 
https://issues.apache.org/jira/browse/CASSANDRA-12662



> On Oct 4, 2017, at 12:21 PM, Pavel Yaskevich <pove...@gmail.com> wrote:
> 
> On Wed, Oct 4, 2017 at 12:09 PM, Jon Haddad <j...@jonhaddad.com> wrote:
> 
>> MVs work fine for *some use cases*, not the general use case.  That’s why
>> there should be a flag.  To opt into the feature when the behavior is only
>> known to be correct under a certain set of circumstances.  Nobody is saying
>> the flag should be “enable_terrible_feature_nobody_tested_and_we_all_hate”,
>> or something ridiculous like that.  It’s not an attack against the work
>> done by anyone, the level of effort put in, or minimizing the complexity of
>> the problem.  “enable_materialized_views” would be just fine.
>> 
>> We should be honest to people about what they’re getting into.  You may
>> not be aware of this, but a lot of people still believe Cassandra isn’t a
>> DB that you should put in prod.  It’s because features like SASI, MVs,  or
>> incremental repair get merged in prematurely (or even made the default),
>> without having been thoroughly tested, understood and vetted by trusted
>> community members.  New users hit the snags because they deploy the
>> bleeding edge code and hit the bugs.
>> 
> 
> I beg to differ in the case of SASI: it has been tested and vetted and ported
> to different versions. I'm pretty sure it still has better test coverage
> than most of the project does. It's not a "default" and you actually have
> to opt-in to it by creating a custom index, how is that premature or
> misleading to users?
> 
> 
>> 
>> That’s not how the process should work.
>> 
>> Ideally, we’d follow a process that looks a lot more like this:
>> 
>> 1. New feature is built with an opt in flag.  Unknowns are documented, the
>> risk of using the feature is known to the end user.
>> 2. People test and use the feature that know what they’re doing.  They are
>> able to read the code, submit patches, and help flush out the issues.  They
>> do so in low risk environments.  In the case of MVs, they can afford to
>> drop and rebuild the view over a week, or rebuild the cluster altogether.
>> We may not even need to worry as much about backwards compatibility.
>> 3. The feature matures.  More tests are written.  More people become aware
>> of how to contribute to the feature’s stability.
>> 4. After a while, we vote on removing the feature flag and declare it
>> stable for general usage.
>> 
>> If nobody actually cares about a feature (why was it written in the
>> first place?), then it would never get to 2, 3, 4.  It would take a while
>> for big features like MVs to be marked stable, and that’s fine, because it
>> takes a long time to actually stabilize them.  I think we can all agree
>> they are really, really hard problems to solve, and maybe it takes a while.
>> 
>> Jon
>> 
>> 
>> 
>>> On Oct 4, 2017, at 11:44 AM, Josh McKenzie <jmcken...@apache.org> wrote:
>>> 
>>>> 
>>>> So you’d rather continue to lie to users about the stability of the
>>>> feature rather than admitting it was merged in prematurely?
>>> 
>>> 
>>> Much like w/SASI, this is something that's in the code-base that for
>>>> certain use-cases apparently works just fine.
>>> 
>>> I don't know of any outstanding issues with the feature,
>>> 
>>> There appear to be varying levels of understanding of the implementation
>>>> details of MV's (that seem to directly correlate with faith in the
>>>> feature's correctness for the use-cases recommended)
>>> 
>>> We have users in the wild relying on MV's with apparent success (same
>> holds
>>>> true of all the other punching bags that have come up in this thread)
>>> 
>>> You're right, Jon. That's clearly exactly what I'm saying.
>>> 
>>> 
>>> On Wed, Oct 4, 2017 at 2:39 PM, Jon Haddad <j...@jonhaddad.com> wrote:
>>> 
>>>> So you’d rather continue to lie to users about the stability of the
>>>> feature rather than admitting it was merged in prematurely?  I’d rather
>>>> come clean and avoid future problems, and give people the opportunity to
>>>> stop using MVs rather than let them keep taking risks they’re unaware
>> of.
>>>> This is incredibly irresponsible in my opinion.

Re: Proposal to retroactively mark materialized views experimental

2017-10-04 Thread Jon Haddad
So you’d rather continue to lie to users about the stability of the feature 
rather than admitting it was merged in prematurely?  I’d rather come clean and 
avoid future problems, and give people the opportunity to stop using MVs rather 
than let them keep taking risks they’re unaware of.  This is incredibly 
irresponsible in my opinion.  

> On Oct 4, 2017, at 11:26 AM, Josh McKenzie  wrote:
> 
>> 
>> Oh, come on. You're being disingenuous.
> 
> Not my intent. MV's (and SASI, for example) are fairly well isolated; we
> have a history of other changes that are much more broadly and higher
> impact risk-wise across the code-base.
> 
> If I were an operator and built a critical part of my business on a
> released feature that developers then decided to default-disable as
> 'experimental' post-hoc, I'd think long and hard about using any new
> features in that project in the future (and revisit my confidence in all
> other features I relied on, and the software as a whole). We have users in
> the wild relying on MV's with apparent success (same holds true of all the
> other punching bags that have come up in this thread) and I'd hate to see
> us alienate them by being over-aggressive in the way we handle this.
> 
> I'd much rather we continue to aggressively improve and continue to analyze
> MV's stability before a 4.0 release and then use the experimental flag in
> the future, if at all possible.
> 
> On Wed, Oct 4, 2017 at 2:01 PM, Benedict Elliott Smith 
> <_...@belliottsmith.com>
> wrote:
> 
>> Can't we promote these behavioural flags to keyspace properties (with
>> suitable permissions to edit necessary)?
>> 
>> I agree that enabling/disabling features shouldn't require a rolling
>> restart, and nor should switching their consistency safety level.
>> 
>> I think this would be the most suitable equivalent to ALLOW FILTERING for
>> MVs.
>> 
>> 
>> 
>>> On 4 Oct 2017, at 12:31, Jeremy Hanna 
>> wrote:
>>> 
>>> Not to detract from the discussion about whether or not to classify X or
>> Y as experimental but https://issues.apache.org/jira/browse/CASSANDRA-8303
>>  was originally
>> about operators preventing users from abusing features (e.g. allow
>> filtering).  Could that concept be extended to features like MVs or SASI or
>> anything else?  On the one hand it is nice to be able to set those things
>> dynamically without a rolling restart as well as by user.  On the other
>> it’s less clear about defaults.  There could be a property file or just in
>> the yaml, the operator could specify the default features that are enabled
>> for users and then it could be overridden within that framework.
>>> 
 On Oct 4, 2017, at 10:24 AM, Aleksey Yeshchenko 
>> wrote:
 
 We already have those for UDFs and CDC.
 
 We should have more: for triggers, SASI, and MVs, at least. Operators
>> need a way to disable features they haven’t validated.
 
 We already have sufficient consensus to introduce the flags, and we
>> should. There also seems to be sufficient consensus on emitting warnings.
 
 The debate is now on their defaults for MVs in 3.0, 3.11, and 4.0. I
>> agree with Sylvain that flipping the default in a minor would be invasive.
>> We shouldn’t do that.
 
 For trunk, though, I think we should default to off. When it comes to
>> releasing 4.0 we can collectively decide if there is sufficient trust in
>> MVs at the time to warrant flipping the default to true. Ultimately we can
>> decide this in a PMC vote. If I misread the consensus regarding the default
>> for 4.0, then we might as well vote on that. What I see is sufficient
>> distrust coming from core committers, including the author of the v1
>> design, to warrant opt-in for MVs.
 
 If we don’t trust in them as developers, we shouldn’t be cavalier with
>> the users, either. Not until that trust is gained/regained.
 
 —
 AY
 
 On 4 October 2017 at 13:26:10, Stefan Podkowinski (s...@apache.org)
>> wrote:
 
 Introducing feature flags for enabling or disabling different code paths
 is not sustainable in the long run. It's hard enough to keep up with
 integration testing with the couple of Jenkins jobs that we have.
 Running jobs for all permutations of flags that we keep around, would
 turn out impractical. But if we don't, I'm pretty sure something will
 fall off the radar and it won't take long until someone reports that
 enabling feature X after the latest upgrade will simply not work
>> anymore.
 
 There may also be some more subtle assumptions and cross dependencies
 between features that may cause side effects by disabling a feature (or
 parts of it), even if it's just e.g. a metric value that suddenly won't
 get updated anymore, but is used somewhere else. We'll also have to
 consider migration paths for turning a 

Re: Proposal to retroactively mark materialized views experimental

2017-10-04 Thread Jon Haddad
MVs work fine for *some use cases*, not the general use case.  That’s why there 
should be a flag.  To opt into the feature when the behavior is only known to 
be correct under a certain set of circumstances.  Nobody is saying the flag 
should be “enable_terrible_feature_nobody_tested_and_we_all_hate”, or something 
ridiculous like that.  It’s not an attack against the work done by anyone, the 
level of effort put in, or minimizing the complexity of the problem.  
“enable_materialized_views” would be just fine.

We should be honest to people about what they’re getting into.  You may not be 
aware of this, but a lot of people still believe Cassandra isn’t a DB that you 
should put in prod.  It’s because features like SASI, MVs,  or incremental 
repair get merged in prematurely (or even made the default), without having 
been thoroughly tested, understood and vetted by trusted community members.  
New users hit the snags because they deploy the bleeding edge code and hit the 
bugs. 

That’s not how the process should work.  

Ideally, we’d follow a process that looks a lot more like this:

1. New feature is built with an opt in flag.  Unknowns are documented, the risk 
of using the feature is known to the end user.  
2. People test and use the feature that know what they’re doing.  They are able 
to read the code, submit patches, and help flush out the issues.  They do so in 
low risk environments.  In the case of MVs, they can afford to drop and rebuild 
the view over a week, or rebuild the cluster altogether.  We may not even need 
to worry as much about backwards compatibility.
3. The feature matures.  More tests are written.  More people become aware of 
how to contribute to the feature’s stability.
4. After a while, we vote on removing the feature flag and declare it stable 
for general usage.

If nobody actually cares about a feature (why was it written in the first 
place?), then it would never get to 2, 3, 4.  It would take a while for big 
features like MVs to be marked stable, and that’s fine, because it takes a long 
time to actually stabilize them.  I think we can all agree they are really, 
really hard problems to solve, and maybe it takes a while.

Jon



> On Oct 4, 2017, at 11:44 AM, Josh McKenzie <jmcken...@apache.org> wrote:
> 
>> 
>> So you’d rather continue to lie to users about the stability of the
>> feature rather than admitting it was merged in prematurely?
> 
> 
> Much like w/SASI, this is something that's in the code-base that for
>> certain use-cases apparently works just fine.
> 
> I don't know of any outstanding issues with the feature,
> 
> There appear to be varying levels of understanding of the implementation
>> details of MV's (that seem to directly correlate with faith in the
>> feature's correctness for the use-cases recommended)
> 
> We have users in the wild relying on MV's with apparent success (same holds
>> true of all the other punching bags that have come up in this thread)
> 
> You're right, Jon. That's clearly exactly what I'm saying.
> 
> 
> On Wed, Oct 4, 2017 at 2:39 PM, Jon Haddad <j...@jonhaddad.com> wrote:
> 
>> So you’d rather continue to lie to users about the stability of the
>> feature rather than admitting it was merged in prematurely?  I’d rather
>> come clean and avoid future problems, and give people the opportunity to
>> stop using MVs rather than let them keep taking risks they’re unaware of.
>> This is incredibly irresponsible in my opinion.
>> 
>>> On Oct 4, 2017, at 11:26 AM, Josh McKenzie <jmcken...@apache.org> wrote:
>>> 
>>>> 
>>>> Oh, come on. You're being disingenuous.
>>> 
>>> Not my intent. MV's (and SASI, for example) are fairly well isolated; we
>>> have a history of other changes that are much more broadly and higher
>>> impact risk-wise across the code-base.
>>> 
>>> If I were an operator and built a critical part of my business on a
>>> released feature that developers then decided to default-disable as
>>> 'experimental' post-hoc, I'd think long and hard about using any new
>>> features in that project in the future (and revisit my confidence in all
>>> other features I relied on, and the software as a whole). We have users
>> in
>>> the wild relying on MV's with apparent success (same holds true of all
>> the
>>> other punching bags that have come up in this thread) and I'd hate to see
>>> us alienate them by being over-aggressive in the way we handle this.
>>> 
>>> I'd much rather we continue to aggressively improve and continue to
>> analyze
>>> MV's stability before a 4.0 release and then use the experimental flag in
>>> the future, if at all possible.
>>> 

Re: [PROPOSAL] Migrate to pytest from nosetests for dtests

2017-11-28 Thread Jon Haddad
+1

I stopped using nose a long time ago in favor of py.test.  It’s a significant 
improvement.

> On Nov 28, 2017, at 10:49 AM, Michael Kjellman  wrote:
> 
> I'd like to propose we move from nosetest to pytest for the dtests. It looks 
> like nosetests is basically abandoned, the python community doesn't like it, 
> it hasn't been updated since 2015, and pytest even has nosetests support 
> which would help us greatly during migration 
> (https://docs.pytest.org/en/latest/nose.html).
> 
> Thoughts?
> 
> best,
> kjellman





duration based config settings

2017-12-04 Thread Jon Haddad
A ways back I created CASSANDRA-13976 out of sheer annoyance, to change the 
hint time to be in minutes instead of ms.  Millisecond-based resolution is a 
bit absurd for things like hints.  I figured minutes would be better, but after 
some back and forth realized durations (3h, 30m, etc) would be a lot easier to 
work with, and would probably be appropriate across the board.

I’ve dealt with quite a few clusters in the last year, and I’ve seen a handful 
of fat fingered config files, or non-standard times that make me bust out a 
calculator to be sure I’ve got things sorted out right, hence the original 
issue.

Jeff Jirsa suggested migrating to duration types would result in migration 
pain, and I’m not disagreeing with him.  If we were to move to duration types, 
I think we’d want something like the following (a rough parsing sketch follows 
the list):

1. add a blah_blah for every blah_blah_ms setting which accepts a duration
2. convert every setting to use blah_blah 
3. if blah_blah_ms is present, use that value (in ms) for blah_blah
4. internally everything converts to ms
5. make all nodetool commands use durations 
6. for every setting that’s switched to blah_blah, leave a note that says which 
setting it’s replacing
7. put a warning when people use blah_blah_ms and suggest the conversion to the 
new config field 
8. *sometime* in the future remove _ms.  Maybe as far as a year or two down the 
line.
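
As a rough illustration of the parsing in step 1, here is a self-contained Java sketch, assuming units of h/m/s/ms and that a bare integer keeps the legacy millisecond meaning; this is only a sketch of the idea, not the eventual implementation.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: parse duration strings like "3h", "30m", or "3h30m" into ms,
// still accepting a bare number with the legacy _ms meaning.
final class DurationParser
{
    // "ms" must come before "m"/"s" so "100ms" isn't read as 100 minutes.
    private static final Pattern UNIT = Pattern.compile("(\\d+)(ms|h|m|s)");

    static long toMillis(String value)
    {
        if (value.matches("\\d+"))          // bare number: legacy ms semantics
            return Long.parseLong(value);

        Matcher m = UNIT.matcher(value);
        long millis = 0;
        int consumed = 0;
        while (m.find())
        {
            long n = Long.parseLong(m.group(1));
            switch (m.group(2))
            {
                case "h":  millis += n * 3_600_000L; break;
                case "m":  millis += n * 60_000L;    break;
                case "s":  millis += n * 1_000L;     break;
                case "ms": millis += n;              break;
            }
            consumed += m.group(0).length();
        }
        if (consumed != value.length())
            throw new IllegalArgumentException("Unparseable duration: " + value);
        return millis;
    }
}

For example, toMillis("3h30m") yields 12600000 and toMillis("10000") yields 10000, which is how the backward compatibility in step 3 would fall out naturally.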

This seems to me like a significant change and I wanted to get some more 
opinions on the topic before pushing forward.  Thoughts?

Jon



Re: duration based config settings

2017-12-04 Thread Jon Haddad
Sure, I’m fine w/ letting the _ms settings work indefinitely.  Can revisit 
retiring them if there’s ever a real need, am OK if that never happens.

I’ll create the JIRA.

> On Dec 4, 2017, at 5:19 PM, Nate McCall  wrote:
> 
>> I'd be in favour of never retiring the _ms format though - it's almost
>> free, there's no backward compatibility problems, and it's fairly intuitive
>> so long as we're consistent about it.
>> 
> 
> Agreed on this point. Overall, this will be excellent for usability.
> 
> 





Re: custom validation before replication

2017-11-16 Thread Jon Haddad
Looks like you’ve got this thread going on the user & dev ML.  This list is the 
dev one, and is meant for discussion of the Cassandra project.  Would everyone 
mind replying to the thread of the same name on the user list instead?

> On Nov 16, 2017, at 1:36 PM, Abdelkrim Fitouri  wrote:
> 
> ok please find bellow an example:
> 
> Let's suppose that I have a Cassandra cluster of 4 nodes / one DC /
> replication factor = 4, so in this architecture I have one full copy of the
> data on each node.
> 
> Imagine now that one node has been hacked, giving someone full access to a
> cqlsh session. If data is changed on that node, it will be changed on the
> three others, am I right?
> 
> Imagine now that I am able to know (on a cryptographic basis) whether a
> column was modified by my API (=> the normal way) or not (=> a suspicious
> way), and I want to execute this check function just before any replication
> of a keyspace, to avoid all the replicas being affected; otherwise a rollback
> would not be easy and the integrity of the whole system would be compromised.
> The check would, for example, kill the local Cassandra service ...
> 
> I hope my question is clearer now.
> 
> Many thanks for any help.
> 
> 2017-11-16 21:59 GMT+01:00 Nate McCall :
> 
>> On Fri, Nov 17, 2017 at 9:11 AM, Abdelkrim Fitouri 
>> wrote:
>>> Trigger does not resolve my problem because it is not a format validation
>>> issue but an integrity constraint ...
>>> 
>>> My purpose is to check data integrity before replication, by returning an
>>> error and killing the service, so i am killing the node that is supposed
>> to
>>> replicate data after a write action ...
>> 
>> I'm a little confused. Can you provide some specific examples of your
>> requirements?
>> 
>> 
>> 
> 
> 
> -- 
> 
> Cordialement / Best Regards.
> 
> *Abdelkarim FITOURI*
> 
> LPIC/CEH/ITIL
> 
> System And Security Engineer





Re: Apache Cassandra Wiki access

2017-12-07 Thread Jon Haddad
The wiki is effectively dead.

Please contribute to the in tree docs section: 
https://github.com/apache/cassandra/tree/trunk/doc 


I recently merged in an improvement that uses Docker to generate the docs.  The 
short version:

cd ./doc

# build the Docker image
docker-compose build build-docs

# build the documentation
docker-compose run build-docs

Jon

> On Dec 7, 2017, at 11:25 AM, Russell Bateman  wrote:
> 
> It appears that deeper access to the wiki is available for the asking? 
> https://wiki.apache.org/cassandra/FrontPage states that, "most of the 
> information on this Wiki is being deprecated." Is this already done? Please 
> advise.
> 
> If so, please grant this to me. I don't know that I have a "wiki username". 
> If I need one, and need to give it to you, please choose from:
> 
>   my e-mail address
>   russell.bateman
>   windofkeltia
> 
> 
> Note: I'm specifically looking to write a custom/secondary index plug-in, 
> similar to Stratio's Lucene index.
> 
> Thanks,
> 
> Russ



Re: CASSANDRA-8527

2018-01-05 Thread Jon Haddad
I think it’s reasonable to count the number of range tombstones towards the 
total tombstone count / threshold.  

I agree the number of rows shadowed by the tombstones should be tracked 
separately, and we should probably think a bit more about how to add 
configuration / limits around this without burdening the user with a bunch of 
new flags they have to think about.  I would prefer to avoid any more 
configuration settings as complex as back_pressure_strategy going forward.
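
A tiny sketch of the separate-counts idea Aleksey describes below (range, row, and single-cell tombstones tracked independently, along with how much each type shadows); the class and enum are illustrative, not actual Cassandra types.

import java.util.EnumMap;

// Illustrative only: independent counters per tombstone kind, so a read
// could report a detailed rundown instead of a single mixed number.
final class TombstoneCounts
{
    enum Kind { RANGE, ROW, CELL }

    private final EnumMap<Kind, Long> encountered = new EnumMap<>(Kind.class);
    private final EnumMap<Kind, Long> shadowed = new EnumMap<>(Kind.class);

    void record(Kind kind, long rowsOrCellsObsoleted)
    {
        encountered.merge(kind, 1L, Long::sum);
        shadowed.merge(kind, rowsOrCellsObsoleted, Long::sum);
    }

    @Override
    public String toString()
    {
        return "encountered=" + encountered + ", shadowed=" + shadowed;
    }
}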

> On Jan 5, 2018, at 9:36 AM, Alexander Dejanovski <a...@thelastpickle.com> 
> wrote:
> 
> Hi Aleksey,
> 
> ok we'll split the work and only deal with row level tombstones in
> CASSANDRA-8527 <https://issues.apache.org/jira/browse/CASSANDRA-8527> and
> create a follow up ticket to work on the separate counts you're suggesting.
> My understanding of what you say is that you would not include range
> tombstones in the warn/fail thresholds, but row-level tombstones have an
> impact somewhat similar to cell tombstones. They will be
> retained in memory and will be sent to replicas.
> If we don't count them in the thresholds (at least for warnings), people
> will miss the fact that they may be reading a lot of tombstones.
> Are you ok with including those row tombstones as part of the thresholds ?
> This was the original intent for creating this ticket, which was a
> follow-up to CASSANDRA-8477
> <https://issues.apache.org/jira/browse/CASSANDRA-8477>.
> 
> For the follow up ticket, we may want to move the discussion in JIRA once
> we've create the ticket, but a merge listener seems like the right way to
> detect rows shadowed by range tombstones. That would force to change the
> UnfilteredRowIterator interface to include the tombstones/rows/cells stats
> as this is what is returned from the lower levels of the read path.
> Is there any easier way to achieve this that you can think of, as that
> interface is used in many parts of the code ?
> 
> On Wed, Dec 27, 2017 at 1:35 PM Aleksey Yeshchenko <alek...@apple.com>
> wrote:
> 
>> As Jeff says, the number of actual tombstones is no less relevant than the
>> number of
>> cells and rows shadowed by them. And the way row and cell tombstones
>> affect a read
>> query can be very different from the way a big range tombstone might: you
>> potentially
>> have to retain every (regular) tombstone in memory and have replicas ship
>> them to
>> a coordinator, while you can discard everything shadowed by a big RT and
>> only serialize
>> a tiny bit of the data between the replicas and the coordinator.
>> 
>> So a mixed metric that just mixes up rows and cells shadowed by all three
>> kinds of tombstones
>> without any differentiation, while better than not having that visibility
>> at all, is worse than having
>> a detailed rundown. If possible, I’d love to see proper separate counts
>> tracked: range, row, and single-cell tombstones encountered, and # of rows
>> or cells obsoleted by each type.
>> 
>> I know that this is a non-trivial change, however, but hey, it doesn’t
>> have to be a trivial patch if it’s going into 4.0.
>> 
>> In the meantime I think it’d be helpful to report that single count. But I
>> don’t like the idea of redefining what
>> tombstone_warn_threshold and tombstone_failure_threshold mean, even in a
>> major release, as RTs are
>> qualitatively different from other tombstones, and have a much lower
>> impact per dead row.
>> 
>> —
>> AY
>> 
>> On 22 December 2017 at 03:53:47, kurt greaves (k...@instaclustr.com)
>> wrote:
>> 
>> I think that's a good idea for 4.x, but not so for current branches. I
>> think as long as we document it well for 4.0 upgrades it's not so much of a
>> problem. Obviously there will be cases of queries failing that were
>> previously succeeding but we can already use
>> tombstone_failure|warn_threshold to tune around that already. I don't think
>> we need another yaml setting to enable/disable counting deleted rows for
>> these thresholds, especially because it's going into 4.0. It *might* be a
>> good idea to bump the tombstone failure threshold default as a safety net
>> though (as well as put something in the NEWS.txt).
>> 
>> On 21 December 2017 at 20:11, Jon Haddad <j...@jonhaddad.com> wrote:
>> 
>>> I had suggested to Alex we kick this discussion over to the ML because
>> the
>>> change will have a significant impact on the behavior of Cassandra when
>>> doing reads with range tombstones that cover a lot of rows. The behavior
>>> now is a little weird, a single tombstone could shadow hundreds of

Re: CASSANDRA-8527

2017-12-21 Thread Jon Haddad
The question isn’t so much about reporting them (we should), it’s about the 
behavior of tombstone_warn_threshold and tombstone_failure_threshold.  The 
patch changes the behavior to include the number of rows that are passed over 
due to the range tombstones.  We’re interested in feedback on if it makes sense 
to change the current behavior.  I’m a +.5 on the change, it makes sense to me, 
but am wondering if there’s a case we haven’t considered.  At the very least 
we’d need a NEWS entry since it is a behavior change.
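
For reference, the two knobs in question and their stock cassandra.yaml
defaults (values shown for illustration):

    tombstone_warn_threshold: 1000
    tombstone_failure_threshold: 100000

Under the proposed change, rows passed over due to a range tombstone would
count toward these same limits, which is exactly why queries that pass today
could start warning or failing.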


> On Dec 21, 2017, at 12:33 PM, DuyHai Doan <doanduy...@gmail.com> wrote:
> 
> +1 to report range tombstones. This one is quite tricky indeed to track
> 
> +1 to Mockito too, with the caveat that it should be used wisely
> 
> On Thu, Dec 21, 2017 at 9:11 PM, Jon Haddad <j...@jonhaddad.com> wrote:
> 
>> I had suggested to Alex we kick this discussion over to the ML because the
>> change will have a significant impact on the behavior of Cassandra when
>> doing reads with range tombstones that cover a lot of rows.  The behavior
>> now is a little weird, a single tombstone could shadow hundreds of
>> thousands or even millions of rows, and the query would probably just time
>> out.  Personally, I’m in favor of the change in behavior of this patch but
>> I wanted to get some more feedback before committing to it.  Are there any
>> objections to what Alex described?
>> 
>> Regarding Mockito, I’ve been meaning to bring this up for a while, and I’m
>> a solid +1 on introducing it to help with testing.  In an ideal world we’d
>> have no singletons and could test everything in isolation, but
>> realistically that’s a multi year process and we just aren’t there.
>> 
>> 
>>> On Dec 19, 2017, at 11:07 PM, Alexander Dejanovski <
>> a...@thelastpickle.com> wrote:
>>> 
>>> Hi folks,
>>> 
>>> I'm working on CASSANDRA-8527
>>> <https://issues.apache.org/jira/browse/CASSANDRA-8527> and would need to
>>> discuss a few things.
>>> 
>>> The ticket makes it visible in tracing and metrics that rows shadowed by
>>> range tombstones were scanned during reads.
>>> Currently, scanned range tombstones aren't reported anywhere which hides
>>> the cause of performance issues during reads when the users perform
>>> primary key deletes.
>>> As such, they do not count in the warn and failure thresholds.
>>> 
>>> While the number of live rows and tombstone cells is counted in the
>>> ReadCommand class, it is currently not possible to count the number of
>>> range tombstones there as they are merged with the rows they shadow
>>> before reaching the class.
>>> Instead, we can count the number of deleted rows that were read, which
>>> already improves diagnosis and shows that range tombstones were scanned:
>>> 
>>> if (row.hasLiveData(ReadCommand.this.nowInSec(), enforceStrictLiveness))
>>>     ++liveRows;
>>> else if (!row.primaryKeyLivenessInfo().isLive(ReadCommand.this.nowInSec()))
>>> {
>>>     // We want to detect primary key deletions only.
>>>     // If all cells have expired they will count as tombstones.
>>>     ++deletedRows;
>>> }
>>> 
>>> Deleted rows would be part of the warning threshold so that we can spot
>>> the range tombstone scans in the logs, and tracing would look like this:
>>> 
>>> WARN  [ReadStage-2] 2017-12-18 18:22:31,352 ReadCommand.java:491 -
>>> Read 2086 live rows, 2440 deleted rows and 0 tombstone cells for
>>> query..
>>> 
>>> 
>>> Are there any caveats to that approach?
>>> Should we include the number of deleted rows in the failure threshold or
>>> make it optional, knowing that it could make some queries fail while they
>>> were passing before?
>>> 
>>> On a side note, is it OK to bring in Mockito in order to make writing
>>> tests easier? I would like to use a Spy in order to write some tests
>>> without starting the whole stack.
>>> 
>>> Thanks,
>>> 
>>> 
>>> --
>>> -
>>> Alexander Dejanovski
>>> France
>>> @alexanderdeja
>>> 
>>> Consultant
>>> Apache Cassandra Consulting
>>> http://www.thelastpickle.com
>> 
>> 
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>> 
>> 


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [VOTE] Release Apache Cassandra 2.1.20

2018-02-12 Thread Jon Haddad
+1

> On Feb 12, 2018, at 12:30 PM, Michael Shuler  wrote:
> 
> I propose the following artifacts for release as 2.1.20.
> 
> sha1: b2949439ec62077128103540e42570238520f4ee
> Git:
> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/2.1.20-tentative
> Artifacts:
> https://repository.apache.org/content/repositories/orgapachecassandra-1152/org/apache/cassandra/apache-cassandra/2.1.20/
> Staging repository:
> https://repository.apache.org/content/repositories/orgapachecassandra-1152/
> 
> Debian and RPM packages are available here:
> http://people.apache.org/~mshuler
> 
> *** This release addresses an important fix for CASSANDRA-14092 ***
>"Max ttl of 20 years will overflow localDeletionTime"
>https://issues.apache.org/jira/browse/CASSANDRA-14092
> 
> The vote will be open for 72 hours (longer if needed).
> 
> [1]: (CHANGES.txt) https://goo.gl/5i2nw9
> [2]: (NEWS.txt) https://goo.gl/i9Fg2u
> 


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [VOTE] Release Apache Cassandra 3.11.2

2018-02-12 Thread Jon Haddad
I’m a +1 on getting this into 3.11.2.  

> On Feb 12, 2018, at 1:11 PM, mck  wrote:
> 
>> I propose the following artifacts for release as 3.11.2.
>> …
>> The vote will be open for 72 hours (longer if needed).
> 
> 
> I just pushed the back-port of CASSANDRA-13080 (under CASSANDRA-14212).
> This is the improvement "Use new token allocation for non bootstrap case as 
> well".
> We've seen that this effects 3.11.1 users and that it would be positive to 
> see it in 3.11.2.
> 
> Any chance we could recut 3.11.2 ?
> 
> regards,
> Mick
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
> 


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [VOTE] Release Apache Cassandra 2.2.12

2018-02-13 Thread Jon Haddad
+1


> On Feb 13, 2018, at 7:21 AM, Marcus Eriksson  wrote:
> 
> +1
> 
> On Tue, Feb 13, 2018 at 4:18 PM, Gary Dusbabek  wrote:
> 
>> +1
>> 
>> On Mon, Feb 12, 2018 at 2:30 PM, Michael Shuler 
>> wrote:
>> 
>>> I propose the following artifacts for release as 2.2.12.
>>> 
>>> sha1: 1602e606348959aead18531cb8027afb15f276e7
>>> Git:
>>> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=
>>> shortlog;h=refs/tags/2.2.12-tentative
>>> Artifacts:
>>> https://repository.apache.org/content/repositories/
>>> orgapachecassandra-1153/org/apache/cassandra/apache-cassandra/2.2.12/
>>> Staging repository:
>>> https://repository.apache.org/content/repositories/
>>> orgapachecassandra-1153/
>>> 
>>> Debian and RPM packages are available here:
>>> http://people.apache.org/~mshuler
>>> 
>>> *** This release addresses an important fix for CASSANDRA-14092 ***
>>>"Max ttl of 20 years will overflow localDeletionTime"
>>>https://issues.apache.org/jira/browse/CASSANDRA-14092
>>> 
>>> The vote will be open for 72 hours (longer if needed).
>>> 
>>> [1]: (CHANGES.txt) https://goo.gl/QkJeXH
>>> [2]: (NEWS.txt) https://goo.gl/A4iKFb
>>> 
>>> 
>> 


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [VOTE] Release Apache Cassandra 3.0.16

2018-02-13 Thread Jon Haddad
+1

> On Feb 13, 2018, at 10:52 AM, Josh McKenzie  wrote:
> 
> +1
> 
> On Feb 13, 2018 9:20 AM, "Marcus Eriksson"  wrote:
> 
>> +1
>> 
>> On Tue, Feb 13, 2018 at 1:29 PM, Aleksey Yeshchenko 
>> wrote:
>> 
>>> +1
>>> 
>>> —
>>> AY
>>> 
>>> On 12 February 2018 at 20:31:23, Michael Shuler (mich...@pbandjelly.org)
>>> wrote:
>>> 
>>> I propose the following artifacts for release as 3.0.16.
>>> 
>>> sha1: 91e83c72de109521074b14a8eeae1309c3b1f215
>>> Git:
>>> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=
>>> shortlog;h=refs/tags/3.0.16-tentative
>>> Artifacts:
>>> https://repository.apache.org/content/repositories/
>>> orgapachecassandra-1154/org/apache/cassandra/apache-cassandra/3.0.16/
>>> Staging repository:
>>> https://repository.apache.org/content/repositories/
>>> orgapachecassandra-1154/
>>> 
>>> Debian and RPM packages are available here:
>>> http://people.apache.org/~mshuler
>>> 
>>> *** This release addresses an important fix for CASSANDRA-14092 ***
>>> "Max ttl of 20 years will overflow localDeletionTime"
>>> https://issues.apache.org/jira/browse/CASSANDRA-14092
>>> 
>>> The vote will be open for 72 hours (longer if needed).
>>> 
>>> [1]: (CHANGES.txt) https://goo.gl/rLj59Z
>>> [2]: (NEWS.txt) https://goo.gl/EkrT4G
>>> 
>>> 
>> 


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Why isn't there a separate JVM per table?

2018-02-22 Thread Jon Haddad
Yeah, I’m in the compaction-in-its-own-JVM camp, in an ideal world where we’re 
isolating crazy GC churning parts of the DB.  It would mean reworking how tasks 
are created and removal of all shared state in favor of messaging + a smarter 
manager, which imo would be a good idea regardless. 

It might be a better use of time (especially for 4.0) to do some GC performance 
profiling and cut down on the allocations, since that doesn’t involve a massive 
effort.  

I’ve been meaning to do a little benchmarking and profiling for a while now, 
and it seems like a few others have the same inclination as well, maybe now is 
a good time to coordinate that.  A nice perf bump for 4.0 would be very 
rewarding.

Jon

> On Feb 22, 2018, at 2:00 PM, Nate McCall  wrote:
> 
> I've heard a couple of folks pontificate on compaction in its own
> process as well, given it has such a high impact on GC. Not sure about
> the value of individual tables. Interesting idea though.
> 
> On Fri, Feb 23, 2018 at 10:45 AM, Gary Dusbabek  wrote:
>> I've given it some thought in the past. In the end, I usually talk myself
>> out of it because I think it increases the surface area for failure. That
>> is, managing N processes is more difficult that managing one process. But
>> if the additional failure modes are addressed, there are some interesting
>> possibilities.
>> 
>> For example, having gossip in its own process would decrease the odds that
>> a node is marked dead because STW GC is happening in the storage JVM. On
>> the flipside, you'd need checks to make sure that the gossip process can
>> recognize when the storage process has died vs just running a long GC.
>> 
>> I don't know that I'd go so far as to have separate processes for
>> keyspaces, etc.
>> 
>> There is probably some interesting work that could be done to support the
>> orgs who run multiple cassandra instances on the same node (multiple
>> gossipers in that case is at least a little wasteful).
>> 
>> I've also played around with using domain sockets for IPC inside of
>> cassandra. I never ran a proper benchmark, but there were some throughput
>> advantages to this approach.
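
For anyone curious what that looks like today: JDK 16+ supports Unix domain
sockets natively via JEP 380 (at the time of this thread it required Netty's
native epoll transport or JNI). A minimal round trip, with the socket path as
a placeholder:

    import java.net.StandardProtocolFamily;
    import java.net.UnixDomainSocketAddress;
    import java.nio.ByteBuffer;
    import java.nio.channels.ServerSocketChannel;
    import java.nio.channels.SocketChannel;
    import java.nio.file.Files;
    import java.nio.file.Path;

    public class UdsDemo
    {
        public static void main(String[] args) throws Exception
        {
            Path path = Path.of("/tmp/ipc.sock");
            Files.deleteIfExists(path);
            var addr = UnixDomainSocketAddress.of(path);
            try (var server = ServerSocketChannel.open(StandardProtocolFamily.UNIX))
            {
                server.bind(addr);
                try (var client = SocketChannel.open(addr);
                     var peer = server.accept())
                {
                    client.write(ByteBuffer.wrap("ping".getBytes()));
                    ByteBuffer buf = ByteBuffer.allocate(4);
                    peer.read(buf); // local IPC, no TCP stack involved
                }
            }
        }
    }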
>> 
>> Cheers,
>> 
>> Gary.
>> 
>> 
>> On Thu, Feb 22, 2018 at 8:39 PM, Carl Mueller 
>> wrote:
>> 
>>> GC pauses may have been improved in newer releases, since we are in 2.1.x,
>>> but I was wondering why cassandra uses one jvm for all tables and
>>> keyspaces, intermingling the heap for on-JVM objects.
>>> 
>>> ... so why doesn't Cassandra spin off a JVM per table, so each JVM and its
>>> GC can be tuned per table and GC impacts don't affect other tables? It
>>> would probably increase the number of endpoints if we avoid having an
>>> overarching query router.
>>> 
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
> 


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread Jon Haddad
Ken,

Maybe it’s not clear how open source projects work, so let me try to explain.  
There’s a bunch of us who either get paid by someone or volunteer in our free 
time.  The folks that get paid (yay!) usually take direction on what the 
priorities are, and work on projects that directly affect our jobs.  That means 
that someone needs to care enough about the features you want in order to work on them, 
if you’re not going to do it yourself. 

Now, as others have said already, please put your list of demands in JIRA; if 
someone is interested, they will work on it.  You may need to contribute a 
little more than you’ve done already, be prepared to get involved if you 
actually want to to see something get done.  Perhaps learning a little more 
about Cassandra’s internals and the people involved will reveal some of the 
design decisions and priorities of the project.  

Third, you seem to be a little obsessed with market share.  While market share 
is fun to talk about, *most* of us that are working on and contributing to 
Cassandra do so because it does actually solve a problem we have, and solves it 
reasonably well.  If some magic open source DB appears out of no where and does 
everything you want Cassandra to, and is bug free, keeps your data consistent, 
automatically does backups, comes with really nice cert management, ad hoc 
querying, amazing materialized views that are perfect, no caveats to secondary 
indexes, and somehow still gives you linear scalability without any mental 
overhead whatsoever then sure, people might start using it.  And that’s 
actually OK, because if that happens we’ll all be incredibly pumped out of our 
minds because we won’t have to work as hard.  If on the slim chance that 
doesn’t manifest, those of us that use Cassandra and are part of the community 
will keep working on the things we care about, iterating, and improving things. 
 Maybe someone will even take a look at your JIRA issues.  

Further filling the mailing list with your grievances will likely not help you 
progress towards your goal of a Cassandra that’s easier to use, so I encourage 
you to try to be a little more productive and try to help rather than just 
complain, which is not constructive.  I did a quick search for your name on the 
mailing list, and I’ve seen very little from you, so to everyone who’s been 
around for a while and trying to help you it looks like you’re just some random 
dude asking for people to work for free on the things you’re asking for, 
without offering anything back in return.

Jon


> On Feb 21, 2018, at 11:56 AM, Kenneth Brotman  
> wrote:
> 
> Josh, 
> 
> To say nothing is indifference.  If you care about your community, sometimes 
> don't you have to bring up a subject even though you know it's also 
> temporarily adding some discomfort?  
> 
> As to opening a JIRA, I've got a very specific topic to try in mind now.  An 
> easy one I'll work on and then announce.  Someone else will have to do the 
> coding.  A year from now I would probably just knock it out to make sure it's 
> as easy as I expect it to be but to be honest, as I've been saying, I'm not 
> set up to do that right now.  I've barely looked at any Cassandra code, for 
> one; everyone on this list probably codes more than I do, for another; and 
> lastly, it's a good one for someone who wants an easy one to start with: 
> vNodes.  I've already seen too many people seeking assistance with the vNode 
> setting.
> 
> And you can expect as others have been mentioning that there should be 
> similar ones on compaction, repair and backup. 
> 
> Microsoft knows poor usability gives them an easy market to take over. And 
> they make it easy to switch.
> 
> Beginning at 4:17 in the video, it says the following:
> 
>   "You don't need to worry about replica sets, quorum or read repair.  
> You can focus on writing correct application logic."
> 
> At 4:42, it says:
>   "Hopefully this gives you a quick idea of how seamlessly you can bring 
> your existing Cassandra applications to Azure Cosmos DB.  No code changes are 
> required.  It works with your favorite Cassandra tools and drivers including 
> for example native Cassandra driver for Spark. And it takes seconds to get 
> going, and it's elastically and globally scalable."
> 
> More to come,
> 
> Kenneth Brotman
> 
> -Original Message-
> From: Josh McKenzie [mailto:jmcken...@apache.org] 
> Sent: Wednesday, February 21, 2018 8:28 AM
> To: dev@cassandra.apache.org
> Cc: User
> Subject: Re: Cassandra Needs to Grow Up by Version Five!
> 
> There's a disheartening amount of "here's where Cassandra is bad, and here's 
> what it needs to do for me for free" happening in this thread.
> 
> This is open-source software. Everyone is *strongly encouraged* to submit a 
> patch to move the needle on *any* of these things being complained about in 
> this thread.
> 
> For the Apache Way to work, 

Re: CASSANDRA-8527

2017-12-21 Thread Jon Haddad
I had suggested to Alex we kick this discussion over to the ML because the 
change will have a significant impact on the behavior of Cassandra when doing 
reads with range tombstones that cover a lot of rows.  The behavior now is a 
little weird, a single tombstone could shadow hundreds of thousands or even 
millions of rows, and the query would probably just time out.  Personally, I’m 
in favor of the change in behavior of this patch but I wanted to get some more 
feedback before committing to it.  Are there any objections to what Alex 
described?  

Regarding Mockito, I’ve been meaning to bring this up for a while, and I’m a 
solid +1 on introducing it to help with testing.  In an ideal world we’d have 
no singletons and could test everything in isolation, but realistically that’s 
a multi year process and we just aren’t there.  


> On Dec 19, 2017, at 11:07 PM, Alexander Dejanovski  
> wrote:
> 
> Hi folks,
> 
> I'm working on CASSANDRA-8527
>  and would need to
> discuss a few things.
> 
> The ticket makes it visible in tracing and metrics that rows shadowed by
> range tombstones were scanned during reads.
> Currently, scanned range tombstones aren't reported anywhere which hides
> the cause of performance issues during reads when the users perform primary
> key deletes.
> As such, they do not count in the warn and failure thresholds.
> 
> While the number of live rows and tombstone cells is counted in the
> ReadCommand class, it is currently not possible to count the number of
> range tombstones there as they are merged with the rows they shadow before
> reaching the class.
> Instead, we can count the number of deleted rows that were read, which
> already improves diagnosis and shows that range tombstones were scanned:
> 
> if (row.hasLiveData(ReadCommand.this.nowInSec(), enforceStrictLiveness))
>     ++liveRows;
> else if (!row.primaryKeyLivenessInfo().isLive(ReadCommand.this.nowInSec()))
> {
>     // We want to detect primary key deletions only.
>     // If all cells have expired they will count as tombstones.
>     ++deletedRows;
> }
> 
> Deleted rows would be part of the warning threshold so that we can spot the
> range tombstone scans in the logs, and tracing would look like this:
> 
> WARN  [ReadStage-2] 2017-12-18 18:22:31,352 ReadCommand.java:491 -
> Read 2086 live rows, 2440 deleted rows and 0 tombstone cells for
> query..
> 
> 
> Are there any caveats to that approach?
> Should we include the number of deleted rows in the failure threshold or
> make it optional, knowing that it could make some queries fail while they
> were passing before?
> 
> On a side note, is it OK to bring in Mockito in order to make writing
> tests easier? I would like to use a Spy in order to write some tests
> without starting the whole stack.
> 
> Thanks,
> 
> 
> -- 
> -
> Alexander Dejanovski
> France
> @alexanderdeja
> 
> Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
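
For anyone unfamiliar with the pattern: a Mockito spy wraps a real object and
stubs only what you tell it to, so tests can exercise real logic without
standing up the whole stack. A minimal, generic illustration (plain Java and
Mockito, not Cassandra code):

    import static org.junit.Assert.assertEquals;
    import static org.mockito.Mockito.*;

    import java.util.ArrayList;
    import java.util.List;
    import org.junit.Test;

    public class SpyExampleTest
    {
        @Test
        public void spyPartiallyStubsARealObject()
        {
            List<String> real = new ArrayList<>();
            List<String> spied = spy(real);

            // Stub one method; doReturn() is the safe form for spies.
            doReturn(42).when(spied).size();

            spied.add("row");                  // real behavior still runs
            assertEquals(42, spied.size());    // stubbed
            assertEquals("row", spied.get(0)); // real
        }
    }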


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: A JIRA proposing a seperate repository for the online documentation

2018-03-15 Thread Jon Haddad
Murukesh is correct on a very usable, pretty standard process of 
multi-versioned docs.

I’ll put my thoughts in a JIRA epic tonight.  It’ll be a multi-phase process.  
Also correct in that I’d like us to move to Hugo for the site; I’d like us to 
have a unified system between the site & the docs, and Hugo has been excellent. 
We run the Reaper site & docs off Hugo, and it works well.  We just don’t do 
multi-versions (because we don’t support multiple): 
https://github.com/thelastpickle/cassandra-reaper/tree/master/src/docs

Jon

> On Mar 15, 2018, at 8:57 AM, Murukesh Mohanan  
> wrote:
> 
> On Fri, Mar 16, 2018 at 0:19 Kenneth Brotman wrote:
> 
>> Help me out here.  I could have had a website with support for more than
>> one version done several different ways by now.
>> 
>> A website with several versions of documentation is going to have
>> sub-directories for each version of documentation obviously.  I've offered
>> to create those sub-directories under the "doc" folder of the current
>> repository; and I've offered to move the online documentation to a separate
>> repository and have the sub-directories there.  Both were shot down.  Is
>> there a third way?  If so please just spill the beans.
>> 
> 
> There is. Note that the website is an independent repository. So to host
> docs for multiple versions, only the website's repository (or rather, the
> final built contents) needs multiple directories. You can just check out
> each branch or tag, generate the docs, make a directory for that branch or
> tag in the website repo, and copy the generated docs there with appropriate
> modifications.
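>
The whole loop can be sketched in a few lines of shell (branch names, the
directory layout, and the doc build target are illustrative; adjust them to
the real repos):

    for branch in cassandra-3.0 cassandra-3.11 trunk; do
        git -C cassandra checkout "$branch"
        (cd cassandra/doc && make html)          # in-tree Sphinx build
        mkdir -p site/doc/"$branch"
        cp -r cassandra/doc/build/html/. site/doc/"$branch"/
    done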
> 
> I do this on a smaller scale using GitHub Pages (repo:
> https://github.com/murukeshm/cassandra site:
> https://murukeshm.github.io/cassandra/). The method is a bit
> hacky as I noted in CASSANDRA-13907. A daily cronjob updates the repo if
> docs are updated. 3.9+ versions are available.
> 
> 
> 
> 
>> Also, no offense to anyone at Sphinx but for a project our size it's not
>> suitable.  We need to move off it now.  It's a problem.
>> 
>> Can we go past this and on to the documenting!  Please help resolve this.
>> 
>> How are we going to:
>> Make the submission of code changes include required changes to
>> documentation including the online documentation?
>> Allow, encourage the online documentation to publish multiple versions of
>> documentation concurrently including all officially supported versions?
> 
> 
> Only on this point: we'll need to start by improving the website build
> process. Michael's comment on 13907 (
> https://issues.apache.org/jira/browse/CASSANDRA-13907?focusedCommentId=16211365=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16211365
> )
> shows it's a painful, fiddly process. That seems to be the main blocker. I
> think Jon has shown interest in moving to Hugo from the current Jekyll
> setup.
> 
> 
> 
>> Move our project onto a more suitable program than Sphinx for our needs?
>> 
>> Kenneth Brotman
>> 
>> -Original Message-
>> From: Eric Evans [mailto:john.eric.ev...@gmail.com]
>> Sent: Thursday, March 15, 2018 7:50 AM
>> To: dev@cassandra.apache.org
>> Subject: Re: A JIRA proposing a seperate repository for the online
>> documentation
>> 
>> On Thu, Mar 15, 2018 at 4:58 AM, Rahul Singh 
>> wrote:
>>> 
>>> I don’t understand why it’s so complicated. In-tree docs are as good as
>> any. All the old docs are there in the version control system.
>>> 
>>> All we need to do is a) generate docs for old versions, b) improve user
>> experience on the site by having it clearly laid out what is latest vs. old
>> docs and c) have some semblance of a search maybe using something like
>> Algolia or whatever.
>> 
>> This.
>> 
>> Keeping the docs in-tree is a huge win, because they can move in lock-step
>> with changes occurring in that branch/version.  I don't think we've been
>> enforcing this, but code-changes that alter documented behavior should be
>> accompanied by corresponding changes to the documentation, or be rejected.
>> Code-changes that correspond with undocumented behavior are an opportunity
>> to include some docs (not grounds to reject a code-review IMO, but
>> certainly an opportunity to politely ask/suggest).
>> 
>> Publishing more than one version (as generated from the respective
>> branches/tags), is a solvable problem.
>> 
 On Thu, Mar 15, 2018 at 1:22 Kenneth Brotman
 

Re: Roadmap for 4.0

2018-04-04 Thread Jon Haddad
+1, well said Scott.

> On Apr 4, 2018, at 5:13 PM, Jonathan Ellis  wrote:
> 
> On Wed, Apr 4, 2018, 7:06 PM Nate McCall  wrote:
> 
>> Top-posting as I think this summary is on point - thanks, Scott! (And
>> great to have you back, btw).
>> 
>> It feels to me like we are coalescing on two points:
>> 1. June 1 as a freeze for alpha
>> 2. "Stable" is the new "Exciting" (and the testing and dogfooding
>> implied by such before a GA)
>> 
>> How do folks feel about the above points?
>> 
> 
> +1
> 
>> 


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Roadmap for 4.0

2018-04-04 Thread Jon Haddad
Agreed with Josh.  There’s nothing set in stone after we release 4.0, trying to 
extrapolate what we do here for the rest of humanity’s timeline isn’t going to 
be a useful exercise.

Regarding building a big list - it’s of no value.  In fact, if we’re already 
talking about releasing 4.0 we should really only be merging in small features 
that enhance user experience like improving nodetool output or reasonable 
optimizations.  Merging in big features at the very end of the merge window is 
a really great idea to have dozens of follow up bug fix releases that nobody 
considers stable, where the Coli conjecture always wins.  IMO, it would be 
better / more responsible to merge them into trunk *after* we branch for 4.0. 
Yes, that makes the next release less exciting, but I really don’t think 
“exciting” is what we’re shooting for.  I’m more in favor of stable.

Regarding supporting 3.0 / 3.11, since we’re talking about feature freezing 4.0 
2 months from now, and releasing it *sometime* after that, then add 6 months, 
we’re talking about close to an extra year of 3.0 support.  People are, of 
course, free to continue patching 3.0, back porting fixes, etc, but I am 
completely OK with saying there’s only 9 more months of support starting today.

I’m also in the go straight to 3.11 camp.  I see no reason to upgrade to only 
3.0 if you’re on 2.x.  

Jon

> On Apr 4, 2018, at 6:29 AM, Josh McKenzie  wrote:
> 
>> 
>> This discussion was always about the release strategy. There is no
>> separation between the release strategy for 4.0 and the release strategy
>> for the project, they are the same thing and what is intended to be
>> discussed here.
> 
> Not trying to be pedantic here, but the email thread is titled "Roadmap for
> 4.0" and has been concerned with how we get 4.0 out the door. I don't think
> it's implicit that whatever strategy we settle on for 4.0 is intended to
> apply to subsequent releases, since the 3.0.X to 3.X to 4.0
> relationship/delta is different than a 4.0 to 5.0 can be expected to be.
> 
> 
>> sidenote: 3.10 was released in January 2017, and while the changes list for
>> 4.0 is getting quite large there's not much there that's going to win over
>> users. It's mostly refactorings and improvements that affect developers
>> more so than users.
> 
> If you assume most 3.x users are on 3.10, this argument makes sense. I
> believe a majority are on 3.0.X or 2.1/2.2, which leaves a minority looking
> at the small delta from 3.10 to 4.0 in the current form.
> 
> 
> 
> On Wed, Apr 4, 2018 at 8:25 AM, kurt greaves  wrote:
> 
>>> 
>>> I'm also a bit sad that we seem to be getting back to our old demons of
>>> trying
>>> to shove as much as we possibly can in the next major as if having a
>>> feature
>>> miss it means it will never happen.
>> 
>> That wasn't the intention of this thread, but that's the response I got.
>> Thought I made it pretty clear that this was about compiling a list of
>> things that people are currently working on and can commit to getting
>> finished soon (which should be a relatively small list considering the
>> limited number of full time contributors).
>> 
>> Of course, we should probably (re-(re-(re-)))start a discussion on release
>>> "strategy" in parallel because it doesn't seem we have one right now, but
>>> that's imo a discussion we should keep separate.
>> 
>> This discussion was always about the release strategy. There is no
>> separation between the release strategy for 4.0 and the release strategy
>> for the project, they are the same thing and what is intended to be
>> discussed here. I don't think it's possible to have a separate discussion
>> on these two things as the release strategy has a pretty big influence on
>> how 4.0 is released.
>> 
>> I'm all for a feature freeze and KISS, but I feel that this really needs a
>> bit more thought before we just jump in and set another precedent for
>> future releases. IMO the Cassandra project has had a seriously bad track
>> record of releasing major versions in the past, and we should probably work
>> at resolving that properly, rather than just continuing the current "let's
>> just try something new every time without really thinking about it".
>> 
>> Some points:
>> 
>>   1.  This strategy means that we don't care about what improvements
>>   actually make it into any given major version. This means that we will
>> have
>>   major releases with nothing/very little desirable for users, and thus
>>   little reason to upgrade other than to stay on a supported version (from
>>   experience this isn't terribly important to users of a database). I
>> think
>>   this inevitably leads to supporting more versions than necessary, and in
>>   general a pretty poor experience for users as we spend more time
>> fighting
>>   bugs in production rather than before we do a release (purely because of
>>   increased frequency of releases).
>>   2. We'll always be driven by feature 

Re: Repair scheduling tools

2018-04-04 Thread Jon Haddad
Implementation details aside, I’m firmly in the “it would be nice if C* could 
take care of it” camp.  Reaper is pretty damn easy to use and people *still* 
don’t put it in prod.  


> On Apr 4, 2018, at 4:16 AM, Rahul Singh  wrote:
> 
> I understand the merits of both approaches. In working with other DBs in the 
> “old country” of SQL, we often had to write indexing sequences manually for 
> important tables. It was “built into the product” but in order to leverage 
> the maximum benefits of indices we had to have different indices other than 
> the clustered (physical index). The process still sucked. It’s never perfect.
> 
> The JVM is already fraught with GC issues, and putting another managed 
> process in the same heap space is what I’m worried about. Technically the 
> process could be in the same binary but started as a sidecar or in the same 
> main process.
> 
> Consider a process called “cassandra-agent” that’s sitting around with a 
> scheduler based on config or a Cassandra table. Distributed in the same 
> release. Shell / service scripts would start it. The end user knows it only 
> by examining the .sh files. This opens possibilities of including a GUI 
> hosted in the same process without cluttering the core coolness of Cassandra.
> 
> Best,
> 
> --
> Rahul Singh
> rahul.si...@anant.us
> 
> Anant Corporation
> 
> On Apr 4, 2018, 2:50 AM -0400, Dor Laor , wrote:
>> We at Scylla implemented repair in a similar way to the Cassandra Reaper.
>> We do
>> that using an external application, written in Go, that manages repair for
>> multiple clusters
>> and saves the data in an external Scylla cluster. The logic resembles the
>> reaper one with
>> some specific internal sharding optimizations and uses the Scylla rest api.
>> 
>> However, I have doubts it's the ideal way. After playing a bit with
>> CockroachDB, I realized
>> it's super nice to have a single binary that repairs itself, provides a GUI
>> and is the core DB.
>> 
>> Even while distributed, you can elect a leader node to manage the repair in
>> a consistent
>> way so the complexity can be reduced to a minimum. Repair can write its
>> status to the
>> system tables and provide an API for progress, rate control, etc.
>> 
>> The big advantage for repair to embedded in the core is that there is no
>> need to expose
>> internal state to the repair logic. So an external program doesn't need to
>> deal with different
>> version of Cassandra, different repair capabilities of the core (such as
>> incremental on/off)
>> and so forth. A good database should schedule its own repair: it knows
>> whether the threshold
>> of hinted handoff was crossed or not, it knows whether nodes were replaced,
>> etc.
>> 
>> My 2 cents. Dor
>> 
>> On Tue, Apr 3, 2018 at 11:13 PM, Dinesh Joshi <
>> dinesh.jo...@yahoo.com.invalid> wrote:
>> 
>>> Simon,
>>> You could still do load aware repair outside of the main process by
>>> reading Cassandra's metrics.
>>> In general, I don't think the maintenance tasks necessarily need to live
>>> in the main process. They could negatively impact the read / write path.
>>> Unless strictly required by the serving path, it could live in a sidecar
>>> process. There are multiple benefits including isolation, faster iteration,
>>> loose coupling. For example - this would mean that the maintenance tasks
>>> can have a different gc profile than the main process and it would be ok.
>>> Today that is not the case.
>>> The only issue I see is that the project does not provide an official
>>> sidecar. Perhaps there should be one. We probably would've not had to have
>>> this discussion ;)
>>> Dinesh
>>> 
>>> On Tuesday, April 3, 2018, 10:12:56 PM PDT, Qingcun Zhou <
>>> zhouqing...@gmail.com> wrote:
>>> 
>>> Repair has been a problem for us at Uber. In general I'm in favor of
>>> including the scheduling logic in Cassandra daemon. It has the benefit of
>>> introducing something like load-aware repair, eg, only schedule repair
>>> while no ongoing compaction or traffic is low, etc. As proposed by others,
>>> we can expose keyspace/table-level configurations so that users can opt-in.
>>> Regarding the risk, yes there will be problems at the beginning but in the
>>> long run, users will appreciate that repair works out of the box, just like
>>> compaction. We have large Cassandra deployments and can work with Netflix
>>> folks for intensive testing to boost user confidence.
>>> 
>>> On the other hand, have we looked into how other NoSQL databases do repair?
>>> Is there a side car process?
>>> 
>>> 
>>> On Tue, Apr 3, 2018 at 9:21 PM, sankalp kohli wrote:
>>> 
 Repair is critical for running C* and I agree with Roopa that it needs to
 be part of the offering. I think we should make it easy for new users to
 run C*.
 
 Can we have a side car process which we can add to Apache Cassandra
>  offering and we can put this repair there? I am 

Re: Roadmap for 4.0

2018-04-12 Thread Jon Haddad
Sept works for me too.  I’ll be involved in the validation process before the 
cutoff date.  


> On Apr 12, 2018, at 3:17 PM, Carlos Rolo  wrote:
> 
> I will commit time to test (not a full validation, but at least go through
> operations) regardless of the date. Both seem fine to me.
> 
> Regards,
> 
> Carlos Juzarte Rolo
> Cassandra Consultant / Datastax Certified Architect / Cassandra MVP
> 
> Pythian - Love your data
> 
> rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin:
> *linkedin.com/in/carlosjuzarterolo
> *
> Mobile: +351 918 918 100
> www.pythian.com
> 
> On Thu, Apr 12, 2018 at 11:00 PM, Joseph Lynch 
> wrote:
> 
>> The Netflix team prefers September as well. We don't have time before that
>> to do a full certification (e2e and performance testing), but can probably
>> work it into end of Q3 / start of Q4.
>> 
>> I personally hope that the extra time gives us as a community a chance to
>> come up with a compelling user story for why users would want to upgrade. I
>> don't feel we have one right now.
>> 
>> -Joey
>> 
>> 
>> On Thu, Apr 12, 2018 at 2:51 PM, Ariel Weisberg  wrote:
>> 
>>> Hi,
>>> 
>>> +1 to September 1st. I know I will have much better availability then.
>>> 
>>> Ariel
>>> On Thu, Apr 12, 2018, at 5:15 PM, Sankalp Kohli wrote:
 +1 with Sept 1st as I am seeing willingness for people to test it after
>>> it
 
> On Apr 12, 2018, at 13:59, Ben Bromhead  wrote:
> 
> While I would prefer earlier, if Sept 1 gets better buy-in and we can
>>> have
> broader commitment to testing. I'm super happy with that. As Nate
>> said,
> having a solid line to work towards is going to help massively.
> 
> On Thu, Apr 12, 2018 at 4:07 PM Nate McCall 
>>> wrote:
> 
>>> If we push it to Sept 1 freeze, I'll personally spend a lot of time
>> testing.
>>> 
>>> What can I do to help convince the Jun1 folks that Sept1 is
>>> acceptable?
>> 
>> I can come around to that. At this point, I really just want us to
>> have a date we can start talking to/planning around.
>> 
>> 
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>> 
>> --
> Ben Bromhead
> CTO | Instaclustr 
> +1 650 284 9692
> Reliability at Scale
> Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
 For additional commands, e-mail: dev-h...@cassandra.apache.org
 
>>> 
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>>> 
>>> 
>> 
> 


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Roadmap for 4.0

2018-04-03 Thread Jon Haddad
I’d prefer to time box it as well.  I like Sylvain’s suggestion, although I’d 
also be comfortable with setting a more aggressive cutoff date for features 
(maybe a month), given all the stuff that’s already in.

If we plan on a follow up (4.1/5.0) in 6 months I *hope* there will be less of 
a desire to do a bunch of last minute feature merges, maybe I’m too optimistic.

Jon

 

> On Apr 3, 2018, at 9:48 AM, Ben Bromhead  wrote:
> 
> +1
> 
> Even though I suggested clearing blockers, I'm equally happy with a
> time-boxed event to draw the line in the sand. As long as its something
> clear to work towards with appropriate commitment from folk.
> 
> On Tue, Apr 3, 2018 at 8:10 AM Sylvain Lebresne wrote:
> 
>> For what it's worth (and based on the project experience), I think the
>> strategy
>> of "let's agree on a list of tickets everyone would love to get in before
>> we
>> freeze 4.0" doesn't work very well (it's largely useless, except for making
>> us
>> feel good about not releasing anything). Those lists always end up being
>> too
>> big especially given we have no control on people's ability to contribute
>> (some stuffs will always lag for a very long time, even when they sound
>> really cool on paper).
>> 
>> I'm also a bit sad that we seem to be getting back to our old demons of
>> trying
>> to shove as much as we possibly can in the next major as if having a
>> feature
>> miss it means it will never happen. The 4.0 changelog is big already and we
>> haven't made a release with new features in almost a year now, so I
>> personally
>> think we should start being a bit more aggressive with it and learn to get
>> comfortable letting feature slip if they are not ready.
>> 
>> My concrete proposal would be to declare a feature freeze for 4.0 in 2
>> months,
>> so say June 1st. That leaves some time for finishing features that are in
>> progress, but not too much to get derailed. And let's be strict on that
>> freeze.
>> After that, we'll see how quickly we can get things to stabilize, but I'd
>> suggest aiming for an alpha 3-4 weeks after that.
>> 
>> Of course, we should probably (re-(re-(re-)))start a discussion on release
>> "strategy" in parallel because it doesn't seem we have one right now, but
>> that's imo a discussion we should keep separate.
>> 
>> --
>> Sylvain
>> 
>> 
>> On Mon, Apr 2, 2018 at 4:54 PM DuyHai Doan  wrote:
>> 
>>> My wish list:
>>> 
>>> * Add support for arithmetic operators (CASSANDRA-11935)
>>> * Allow IN restrictions on column families with collections
>>> (CASSANDRA-12654)
>>> * Add support for + and - operations on dates (CASSANDRA-11936)
>>> * Add the currentTimestamp, currentDate, currentTime and currentTimeUUID
>>> functions (CASSANDRA-13132)
>>> * Allow selecting Map values and Set elements (CASSANDRA-7396)
>>> 
>>> Those are mostly useful for timeseries data models and I guess has no
>>> significant impact on the internals and operations so the risk of
>>> regression is low
>>> 
>>> On Mon, Apr 2, 2018 at 4:33 PM, Jeff Jirsa  wrote:
>>> 
 9608 (java9)
 
 --
 Jeff Jirsa
 
 
> On Apr 2, 2018, at 3:45 AM, Jason Brown 
>> wrote:
> 
> The only additional tickets I'd like to mention are:
> 
> https://issues.apache.org/jira/browse/CASSANDRA-13971 - Automatic
> certificate management using Vault
> - Stefan's Vault integration work. A sub-ticket, CASSANDRA-14102,
 addresses
> encryption at-rest, subsumes CASSANDRA-9633 (SSTable encryption) -
>>> which
 I
> doubt I would be able to get to any time this year. It would
>> definitely
 be
> nice to have a clarified encryption/security story for 4.0.
> 
> https://issues.apache.org/jira/browse/CASSANDRA-11990 - Address rows
 rather
> than partitions in SASI
> - a nice update for SASI, but not critical.
> 
> -Jason
> 
>> On Sat, Mar 31, 2018 at 6:53 PM, Ben Bromhead 
 wrote:
>> 
>> Apologies all, I didn't realize I was responding to this discussion
 only on
>> the @user list. One of the perils of responding to a thread that is
>> on
 both
>> user and dev...
>> 
>> For context, I have included my response to Kurt's previous
>> discussion
 on
>> this topic as it only ended up on the user list.
>> 
>> *After some further discussions with folks offline, I'd like to
>> revive
 this
>> discussion. *
>> 
>> *As Kurt mentioned, to keep it simple I if we can simply build
>>> consensus
>> around what is in for 4.0 and what is out. We can then start the
 process of
>> working off a 4.0 branch towards betas and release candidates. Again
>>> as
>> Kurt mentioned, assigning a timeline to it right now is difficult,
>> but
>> having a firm line in the sand around what features/patches are in,

Re: Paying off tech debt and correctly naming things

2018-03-22 Thread Jon Haddad
Cool.  I think there’s general agreement that doing this in as small bites as 
possible is going to be the best approach.  I have no interest in mega patches. 
 

>  The combined approach takes a
> change that's already non-trivially dealing with complex subsystem
> changes and injects a bunch of trivial renaming noise across unrelated
> subsystems into the signal of an actual logic refactor.

I agree.  This is why I like the idea of proactively working to improve the 
readability of the codebase as a specific goal, rather than being wrapped into 
some other unrelated patch.  Keeping the scope in check is the challenge.  
Simple class and method renames, as several have pointed out, is easy enough 
with IDEA.  

I’ll start with class renames, as individual patches for each of them.  I’ll be 
sure to call it out on the ML.  First one will be ColumnFamilyStore -> 
TableStore.  

Jon

> On Mar 22, 2018, at 7:13 AM, Jason Brown <jasedbr...@gmail.com> wrote:
> 
> Jon,
> 
> Thanks for bringing up this topic. I'll admit that I've been around this
> code base for long enough, and have enough accumulated history, that I
> probably can't fully appreciate the impact for a newcomer wrt naming.
> However, as Josh points out, this situation probably happens to "every
> non-trivially aged code-base ever".
> 
> One thing I'd like to add is that with these types of large refactoring
> changes, the review effort is non-trivial. This is because the review still
> has to ensure that correctness is preserved and it's easy to overlook a
> seemingly innocuous change.
> 
> That being said, I am supportive of this effort. However, I believe it's
> going to be best, for contributor and reviewer, to break it up into
> smaller, more digestible pieces. I'd also like to request that we not go
> whole hog and try to do everything in a compressed time frame; reviewer
> availability is already stretched thin and I'm afraid of deepening the
> review queue, especially mine :)
> 
> Thanks,
> 
> -Jason
> 
> 
> 
> 
> On Thu, Mar 22, 2018 at 6:41 AM, Josh McKenzie <jmcken...@apache.org> wrote:
> 
>>> Some of us have big patches in flight, things that actually
>>> pay off some technical debt, and dealing with such renames is rebase
>> hell :\
>> For sure, but with a code-base this old / organically grown, I expect
>> this will always be the case. If we're talking something as simple as
>> an intellij rename refactor, while menial, couldn't someone with a
>> giant patch just do the same thing on their side and spend half an
>> hour of their life clicking next? ;)
>> 
>>> That said, there is good time for such renames - it’s during
>>> those major refactors and rewrites. When you are
>>> changing a subsystem, might as well do the appropriate renames.
>> Does that hold true for a code-base with as much static state and
>> abstraction leaking / bad factoring as we have? (i.e. every
>> non-trivially aged code-base ever) The combined approach takes a
>> change that's already non-trivially dealing with complex subsystem
>> changes and injects a bunch of trivial renaming noise across unrelated
>> subsystems into the signal of an actual logic refactor.
>> 
>> On Thu, Mar 22, 2018 at 9:31 AM, Aleksey Yeshchenko <alek...@apple.com>
>> wrote:
>>> Poor and out-of-date naming of things is probably the least serious part
>>> of our technical debt. Bad factoring, and straight-up
>>> poorly written components, is where it’s really at.
>>>
>>> Doing a big rename for rename’s sake alone sometimes does more harm than
>>> good. Some of us have big patches
>>> in flight, things that actually pay off some technical debt, and dealing
>>> with such renames is rebase hell :\
>>>
>>> That said, there is a good time for such renames - it’s during those major
>>> refactors and rewrites. When you are
>>> changing a subsystem, might as well do the appropriate renames.
>>> 
>>> —
>>> AY
>>> 
>>> On 20 March 2018 at 22:04:48, Jon Haddad (j...@jonhaddad.com) wrote:
>>> 
>>> Whenever I hop around in the codebase, one thing that always manages to
>> slow me down is needing to understand the context of the variable names
>> that I’m looking at. We’ve now removed the Thrift transport, but the
>> variables, classes and comments still remain. Personally, I’d like to go in
>> and pay off as much technical debt as possible by refactoring the code to
>> be as close to CQL as possible. Rows should be rows, not partitions, I’d
>> love to see the term column family removed forever in favor of always using
>> tables. Tha

Re: Audit logging to tables.

2019-04-03 Thread Jon Haddad
> I strongly echo Josh’s sentiment. Imagine losing audit entries because C*
> is overloaded? It’s fine if you don’t care about losing audit entries.
>
> Dinesh
>
>> On Feb 28, 2019, at 6:41 AM, Joshua McKenzie <jmcken...@apache.org> wrote:
>>
>> One of the things we've run into historically, on a *lot* of axes, is that
>> "just put it in C*" for various functionality looks great from a user and
>> usability perspective, and proves to be something of a nightmare from an
>> admin / cluster behavior perspective.
>>
>> i.e. - cluster suffering so you're writing hints? Write them to C* tables
>> and watch the cluster suffer more! :)
>> Same thing probably holds true for audit logging - at a time frame when
>> things are getting hairy w/a cluster, if you're writing that audit logging
>> into C* proper (and dealing with ser/deser, compaction pressure, flushing
>> pressure, etc) from that, there's a compo

Re: Stabilising Internode Messaging in 4.0

2019-04-04 Thread Jon Haddad
Given the number of issues that are addressed, I definitely think it's
worth strongly considering merging this in.  I think it might be a
little unrealistic to cut the first alpha after the merge though.
Being realistic, any 20K+ LOC change is going to introduce its own
bugs, and we should be honest with ourselves about that.  It seems
likely the issues the patch addressed would have affected the 4.0
release in some form *anyways* so the question might be do we fix them
now or after someone's cluster burns down because there's no inbound /
outbound message load shedding.

Having given it a quick code review and gone through the JIRA comments
(well written, thanks guys), there seem to be some pretty important bug
fixes in here as well as paying off a bit of technical debt.

Jon

On Thu, Apr 4, 2019 at 1:37 PM Pavel Yaskevich  wrote:
>
> Great to see such a significant progress made in the area!
>
> On Thu, Apr 4, 2019 at 1:13 PM Aleksey Yeschenko  wrote:
>
> > I would like to propose CASSANDRA-15066 [1] - an important set of bug fixes
> > and stability improvements to internode messaging code that Benedict, I,
> > and others have been working on for the past couple of months.
> >
> > First, some context.   This work started off as a review of CASSANDRA-14503
> > (Internode connection management is race-prone [2]), CASSANDRA-13630
> > (Support large internode messages with netty) [3], and a pre-4.0
> > confirmatory review of such a major new feature.
> >
> > However, as we dug in, we realized this was insufficient. With more than 50
> > bugs uncovered [4] - dozens of them critical to correctness and/or
> > stability of the system - a substantial rework was necessary to guarantee a
> > solid internode messaging subsystem for the 4.0 release.
> >
> > In addition to addressing all of the uncovered bugs [4] that were unique to
> > trunk + 13630 [3] + 14503 [2], we used this opportunity to correct some
> > long-existing, pre-4.0 bugs and stability issues. For the complete list of
> > notable bug fixes, read the comments to CASSANDRA-15066 [1]. But I’d like
> > to highlight a few.
> >
> > # Lack of message integrity checks
> >
> > It’s known that TCP checksums are too weak [5] and Ethernet CRC cannot be
> > relied upon [6] for integrity. With sufficient scale or time, you will hit
> > bit flips. Sadly, most of the time these go undetected.  Cassandra’s
> > replication model makes this issue much more serious, as the faulty data
> > can infect the cluster.
> >
> > We recognised this problem, and recently introduced a fix for server-client
> > messages, implementing CRCs in CASSANDRA-13304 (Add checksumming to the
> > native protocol) [7].
> >
> > But until CASSANDRA-15066 [1] lands, this is also a critical flaw
> > internode. We have addressed it by ensuring that no matter what, whether
> > you use SSL or not, whether you use internode compression or not, a
> > protocol level CRC is always present, for every message frame. It’s our
> > deep and sincere belief that shipping a new implementation of the messaging
> > protocol without application-level data integrity checks would be
> > unacceptable for a modern database.
> >
>
> I'm all for introducing more correctness checks at all levels especially in
> communication.
> Having dealt with multiple data corruption bugs that could have been easily
> prevented by
> having a checksum, it's great to see that we are moving in this direction.
>
>
> > # Lack of back-pressure and memory usage constraints
> >
> > As it stands today, it’s far too easy for a single slow node to become
> > overwhelmed by messages from its peers.  Conversely, multiple coordinators
> > can be made unstable by the backlog of messages to deliver to just one
> > struggling node.
> >
> > To address this problem, we have introduced strict memory usage constraints
> > that apply TCP-level back-pressure, on both outbound and inbound.  It is
> > now impossible for a node to be swamped on inbound, and on outbound it is
> > made significantly harder to overcommit resources.  It’s a simple, reliable
> > mechanism that drastically improves cluster stability under load, and
> > especially overload.
> >
> > Cassandra is a mature system, and introducing an entirely new messaging
> > implementation without resolving this fundamental stability issue is
> > difficult to justify in our view.
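
For readers unfamiliar with how "TCP-level back-pressure" is applied from user
space: it amounts to refusing to read from a socket once a memory budget is
exhausted, so the kernel buffers fill and the sender stalls. Reduced to bare
Netty mechanics, the inbound half looks roughly like this (a sketch only; the
accounting in the actual patch is global and far more careful):

    import io.netty.buffer.ByteBuf;
    import io.netty.channel.ChannelHandlerContext;
    import io.netty.channel.ChannelInboundHandlerAdapter;

    class InboundLimiter extends ChannelInboundHandlerAdapter
    {
        private static final long LIMIT = 1 << 22; // 4MiB, illustrative
        private long queued;

        @Override
        public void channelRead(ChannelHandlerContext ctx, Object msg)
        {
            queued += ((ByteBuf) msg).readableBytes();
            if (queued > LIMIT)
                ctx.channel().config().setAutoRead(false); // stop reading
            ctx.fireChannelRead(msg);
        }

        // Called once a message has been processed and its memory released.
        void release(ChannelHandlerContext ctx, long bytes)
        {
            queued -= bytes;
            if (queued <= LIMIT)
                ctx.channel().config().setAutoRead(true);  // resume
        }
    }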
> >
>
> I'd say that this is required to be able to ship 4.0 as a release focused
> on stability.
> I personally have been waiting for this to happen for years. Significant
> step forward in our QoS story.
>
>
> >
> > # State of the patch, feature freeze and 4.0 timeline concerns
> >
> > The patch is essentially complete, with much improved unit tests all
> > passing, dtests green, and extensive fuzz testing underway - with initial
> > results all positive.  We intend to further improve in-code documentation
> > and test coverage in the next week or two, and do some minor additional
> > code review, but we believe it will be basically 

TLP tools for stress testing and building test clusters in AWS

2019-04-12 Thread Jon Haddad
I don't want to derail the discussion about Stabilizing Internode
Messaging, so I'm starting this as a separate thread.  There was a
comment that Josh made [1] about doing performance testing with real
clusters as well as a lot of microbenchmarks, and I'm 100% in support
of this.  We've been working on some tooling at TLP for the last
several months to make this a lot easier.  One of the goals has been
to help improve the 4.0 testing process.

The first tool we have is tlp-stress [2].  It's designed with a "get
started in 5 minutes" mindset.  My goal was to ship a stress tool that
ships with real workloads out of the box that can be easily tweaked,
similar to how fio allows you to design a disk workload and tweak it
with parameters.  Included are workloads that stress LWTs (two
different types), materialized views, counters, time series, and
key-value workloads.  Each workload can be modified easily to change
compaction strategies, concurrent operations, number of partitions.
We can run workloads for a set number of iterations or a custom
duration.  We've used this *extensively* at TLP to help our customers
and most of our blog posts that discuss performance use it as well.
It exports data in CSV format and automatically sets up Prometheus for
metrics collection / aggregation.  As an example, we were able to
determine that the compression chunk length set on the paxos tables imposes
a significant overhead when using the Locking LWT workload, which
simulates locking and unlocking of rows.  See CASSANDRA-15080 for
details.
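
To give a flavor of it, getting a workload going looks roughly like this
(a sketch from memory, so treat flag spellings as illustrative and check
the docs below for the exact names):

    # run the built-in key/value workload for an hour
    tlp-stress run KeyValue --duration 1h

    # same workload, tweaked: more partitions and a different
    # compaction strategy
    tlp-stress run KeyValue --duration 1h --partitions 10m \
        --compaction "{'class': 'LeveledCompactionStrategy'}"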

We have documentation [3] on the TLP website.

The second tool we've been working on is tlp-cluster [4].  This tool
is designed to help provision AWS instances for the purposes of
testing.  To be clear, I don't expect, or want, this tool to be used
for production environments.  It's designed to assist with the
Cassandra build process by generating deb packages or re-using the
ones that have already been uploaded.  Here's a short list of the
things you'll care about:

1. Create instances in AWS for Cassandra using any instance size and
number of nodes.  Also create tlp-stress instances and a box for
monitoring.
2. Use any available build of Cassandra, with a quick option to change
YAML config.  For example: tlp-cluster use 3.11.4 -c
concurrent_writes:256
3. Do custom builds just by pointing to a local Cassandra git repo.
They can be used the same way as #2.
4. tlp-stress is automatically installed on the stress box.
5. Everything's installed with pure bash.  I considered something more
complex, but since this is for development only, it turns out the
simplest tool possible works well and it means it's easily
configurable.  Just drop in your own bash script starting with a
number in an XX_script_name.sh format and it gets run.
6. The monitoring box is running Prometheus.  It auto scrapes
Cassandra using the Instaclustr metrics library.
7. Grafana is also installed automatically.  There's a couple sample
graphs there now.  We plan on having better default graphs soon.

For the moment it installs Java 8 only, but that should be easily
fixable to use Java 11 to test ZGC (it's on my radar).

Documentation for tlp-cluster is here [5].
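
The happy path, end to end, looks something like this (command names from
memory, so this is a sketch; the docs linked above are the source of truth):

    tlp-cluster init ...        # describe the cluster you want
    tlp-cluster up              # provision the AWS instances
    tlp-cluster use 3.11.4 -c concurrent_writes:256
    tlp-cluster install         # push Cassandra + config to the nodes
    tlp-cluster start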

There are still some things to work out in the tool, and we've been
working hard to smooth out the rough edges.  I still haven't announced
anything WRT tlp-cluster on the TLP blog, because I don't think it's
quite ready for public consumption, but I think the folks on this list
are smart enough to see the value in it even if it has a few warts
still.

I don't consider myself familiar enough with the networking patch to
give it a full review, but I am qualified to build tools to help test
it and go through the testing process myself.  From what I can tell
the patch is moving the codebase in a positive direction and I'd like
to help build confidence in it so we can get it merged in.

We'll continue to build out and improve the tooling with the goal of
making it easier for people to jump into the QA side of things.

Jon

[1] 
https://lists.apache.org/thread.html/742009c8a77999f4b62062509f087b670275f827d0c1895bf839eece@%3Cdev.cassandra.apache.org%3E
[2] https://github.com/thelastpickle/tlp-stress
[3] http://thelastpickle.com/tlp-stress/
[4] https://github.com/thelastpickle/tlp-cluster
[5] http://thelastpickle.com/tlp-cluster/




Re: TLP tools for stress testing and building test clusters in AWS

2019-04-15 Thread Jon Haddad
Hey all,

I've set up a Zoom call for 9AM Pacific time.  Everyone's welcome to join.

https://zoom.us/j/189920888

Looking forward to a good discussion on how we can all pitch in on
getting 4.0 out the door.

Jon

On Sat, Apr 13, 2019 at 9:14 AM Jonathan Koppenhofer
 wrote:
>
> Wednesday would work for me.
>
> We use and (slightly) contribute to tlp tools. We are platform testing and
> beginning 4.0 testing ourselves, so an in person overview would be great!
>
> On Sat, Apr 13, 2019, 8:48 AM Aleksey Yeshchenko 
> wrote:
>
> > Wednesday and Thursday, either, at 9 AM pacific WFM.
> >
> > > On 13 Apr 2019, at 13:31, Stefan Miklosovic <
> > stefan.mikloso...@instaclustr.com> wrote:
> > >
> > > Hi Jon,
> > >
> > > I would like to be on that call too, but I am off on Thursday.
> > >
> > > I am from Australia, so 5pm London time is 2am the next day for us; your
> > > Wednesday morning is my Thursday night. Early Wednesday morning here,
> > > which is your Tuesday morning and London's afternoon, would be best.
> > >
> > > Recording the thing would definitely be helpful too.
> > >
> > > On Sat, 13 Apr 2019 at 07:45, Jon Haddad  wrote:
> > >>
> > >> I'd be more than happy to hop on a call next week to give you both
> > >> (and anyone else interested) a tour of our dev tools.  Maybe something
> > >> early morning on my end, which should be your evening, could work?
> > >>
> > >> I can set up a Zoom conference to get everyone acquainted.  We can
> > >> record and post it for any who can't make it.
> > >>
> > >> I'm thinking Tuesday, Wednesday, or Thursday morning, 9AM Pacific (5pm
> > >> London)?  If anyone's interested please reply with what dates work.
> > >> I'll be sure to post the details back here with the zoom link in case
> > >> anyone wants to join that didn't get a chance to reply, as well as a
> > >> link to the recorded call.
> > >>
> > >> Jon
> > >>
> > >> On Fri, Apr 12, 2019 at 10:41 AM Benedict Elliott Smith
> > >>  wrote:
> > >>>
> > >>> +1
> > >>>
> > >>> I’m also just as excited to see some standardised workloads and test
> > bed.  At the moment we’re benefiting from some large contributors doing
> > their own proprietary performance testing, which is super valuable and
> > something we’ve lacked before.  But I’m also keen to see some more
> > representative workloads that are reproducible by anybody in the community
> > take shape.
> > >>>
> > >>>
> > >>>> On 12 Apr 2019, at 18:09, Aleksey Yeshchenko
> >  wrote:
> > >>>>
> > >>>> Hey Jon,
> > >>>>
> > >>>> This sounds exciting and pretty useful, thanks.
> > >>>>
> > >>>> Looking forward to using tlp-stress for validating 15066 performance.
> > >>>>
> > >>>> We should touch base some time next week to pick a comprehensive set
> > of workloads and versions, perhaps?
> > >>>>
> > >>>>
> > >>>>> On 12 Apr 2019, at 16:34, Jon Haddad  wrote:
> > >>>>>
> > >>>>> I don't want to derail the discussion about Stabilizing Internode
> > >>>>> Messaging, so I'm starting this as a separate thread.  There was a
> > >>>>> comment that Josh made [1] about doing performance testing with real
> > >>>>> clusters as well as a lot of microbenchmarks, and I'm 100% in support
> > >>>>> of this.  We've been working on some tooling at TLP for the last
> > >>>>> several months to make this a lot easier.  One of the goals has been
> > >>>>> to help improve the 4.0 testing process.
> > >>>>>
> > >>>>> The first tool we have is tlp-stress [2].  It's designed with a "get
> > >>>>> started in 5 minutes" mindset.  My goal was to ship a stress tool
> > that
> > >>>>> ships with real workloads out of the box that can be easily tweaked,
> > >>>>> similar to how fio allows you to design a disk workload and tweak it
> > >>>>> with parameters.  Included are stress workloads that stress LWTs (two
> > >>>>> different types), materialized views, counters, time series, and
> > >>>>> key-value workloads.  Each workload can be modified easily 

Re: TLP tools for stress testing and building test clusters in AWS

2019-04-16 Thread Jon Haddad
Yes, sorry about that. Wednesday morning 9am PT

On Tue, Apr 16, 2019 at 3:26 AM Benedict Elliott Smith 
wrote:

> Just to confirm, this is on Wednesday?
>
> > On 15 Apr 2019, at 22:38, Jon Haddad  wrote:
> >
> > Hey all,
> >
> > I've set up a Zoom call for 9AM Pacific time.  Everyone's welcome to
> join.
> >
> > https://zoom.us/j/189920888
> >
> > Looking forward to a good discussion on how we can all pitch in on
> > getting 4.0 out the door.
> >
> > Jon
> >
> > On Sat, Apr 13, 2019 at 9:14 AM Jonathan Koppenhofer
> >  wrote:
> >>
> >> Wednesday would work for me.
> >>
> >> We use and (slightly) contribute to tlp tools. We are platform testing
> and
> >> beginning 4.0 testing ourselves, so an in person overview would be
> great!
> >>
> >> On Sat, Apr 13, 2019, 8:48 AM Aleksey Yeshchenko
> 
> >> wrote:
> >>
> >>> Wednesday and Thursday, either, at 9 AM pacific WFM.
> >>>
> >>>> On 13 Apr 2019, at 13:31, Stefan Miklosovic <
> >>> stefan.mikloso...@instaclustr.com> wrote:
> >>>>
> >>>> Hi Jon,
> >>>>
> >>>> I would like to be on that call too, but I am off on Thursday.
> >>>>
> >>>> I am from Australia, so 5pm London time is 2am the next day for us; your
> >>>> Wednesday morning is my Thursday night. Early Wednesday morning here,
> >>>> which is your Tuesday morning and London's afternoon, would be best.
> >>>>
> >>>> Recording the thing would definitely be helpful too.
> >>>>
> >>>> On Sat, 13 Apr 2019 at 07:45, Jon Haddad  wrote:
> >>>>>
> >>>>> I'd be more than happy to hop on a call next week to give you both
> >>>>> (and anyone else interested) a tour of our dev tools.  Maybe
> something
> >>>>> early morning on my end, which should be your evening, could work?
> >>>>>
> >>>>> I can set up a Zoom conference to get everyone acquainted.  We can
> >>>>> record and post it for any who can't make it.
> >>>>>
> >>>>> I'm thinking Tuesday, Wednesday, or Thursday morning, 9AM Pacific
> (5pm
> >>>>> London)?  If anyone's interested please reply with what dates work.
> >>>>> I'll be sure to post the details back here with the zoom link in case
> >>>>> anyone wants to join that didn't get a chance to reply, as well as a
> >>>>> link to the recorded call.
> >>>>>
> >>>>> Jon
> >>>>>
> >>>>> On Fri, Apr 12, 2019 at 10:41 AM Benedict Elliott Smith
> >>>>>  wrote:
> >>>>>>
> >>>>>> +1
> >>>>>>
> >>>>>> I’m also just as excited to see some standardised workloads and test
> >>> bed.  At the moment we’re benefiting from some large contributors doing
> >>> their own proprietary performance testing, which is super valuable and
> >>> something we’ve lacked before.  But I’m also keen to see some more
> >>> representative workloads that are reproducible by anybody in the
> community
> >>> take shape.
> >>>>>>
> >>>>>>
> >>>>>>> On 12 Apr 2019, at 18:09, Aleksey Yeshchenko
> >>>  wrote:
> >>>>>>>
> >>>>>>> Hey Jon,
> >>>>>>>
> >>>>>>> This sounds exciting and pretty useful, thanks.
> >>>>>>>
> >>>>>>> Looking forward to using tlp-stress for validating 15066
> performance.
> >>>>>>>
> >>>>>>> We should touch base some time next week to pick a comprehensive
> set
> >>> of workloads and versions, perhaps?
> >>>>>>>
> >>>>>>>
> >>>>>>>> On 12 Apr 2019, at 16:34, Jon Haddad  wrote:
> >>>>>>>>
> >>>>>>>> I don't want to derail the discussion about Stabilizing Internode
> >>>>>>>> Messaging, so I'm starting this as a separate thread.  There was a
> >>>>>>>> comment that Josh made [1] about doing performance testing with
> real
> >>>>>>>> clusters as well as a lot of microbenchmarks, and I'm 100% in
> support
> >>>>>>>> of this.  We've be

Re: TLP tools for stress testing and building test clusters in AWS

2019-04-16 Thread Jon Haddad
The one I sent out is open, no separate invite required.

On Tue, Apr 16, 2019 at 3:47 PM Dinesh Joshi  wrote:
>
> I'm slightly confused. The zoom meeting mentioned in this thread is only open
> to those who have registered interest here? If so, can someone please add me?
>
> Dinesh
>
> > On Apr 16, 2019, at 3:29 PM, Anthony Grasso  
> > wrote:
> >
> > Hi Stefan,
> >
> > Thanks for sending the invite out!
> >
> > Just wondering what you think of the idea of having a Zoom meeting that
> > anyone can join? This way anyone else interested can join us as well. I can
> > set that up if you like?
> >
> > Cheers,
> > Anthony
> >
> > On Tue, 16 Apr 2019 at 21:24, Stefan Miklosovic <
> > stefan.mikloso...@instaclustr.com> wrote:
> >
> >> Hi Anthony,
> >>
> >> Sounds good. I've sent you a Hangouts meeting invitation privately.
> >>
> >> Regards
> >>
> >> On Tue, 16 Apr 2019 at 14:53, Anthony Grasso 
> >> wrote:
> >>>
> >>> Hi Stefan,
> >>>
> >>> I have been working with Jon on developing the tool set. I can do a Zoom
> >>> call tomorrow (Wednesday) at 11am AEST if that works for you? We can go
> >>> through all the same information that Jon is going to go through in his
> >>> call. Note that I am in the same timezone as you, so if tomorrow morning
> >> is
> >>> no good we can always do the afternoon.
> >>>
> >>> Cheers,
> >>> Anthony
> >>>
> >>>
> >>> On Sat, 13 Apr 2019 at 22:38, Stefan Miklosovic <
> >>> stefan.mikloso...@instaclustr.com> wrote:
> >>>
> >>>> Hi Jon,
> >>>>
> >>>> I would like to be on that call too, but I am off on Thursday.
> >>>>
> >>>> I am from Australia, so 5pm London time is 2am the next day for us; your
> >>>> Wednesday morning is my Thursday night. Early Wednesday morning here,
> >>>> which is your Tuesday morning and London's afternoon, would be best.
> >>>>
> >>>> Recording the thing would definitely be helpful too.
> >>>>
> >>>> On Sat, 13 Apr 2019 at 07:45, Jon Haddad  wrote:
> >>>>>
> >>>>> I'd be more than happy to hop on a call next week to give you both
> >>>>> (and anyone else interested) a tour of our dev tools.  Maybe
> >> something
> >>>>> early morning on my end, which should be your evening, could work?
> >>>>>
> >>>>> I can set up a Zoom conference to get everyone acquainted.  We can
> >>>>> record and post it for any who can't make it.
> >>>>>
> >>>>> I'm thinking Tuesday, Wednesday, or Thursday morning, 9AM Pacific
> >> (5pm
> >>>>> London)?  If anyone's interested please reply with what dates work.
> >>>>> I'll be sure to post the details back here with the zoom link in case
> >>>>> anyone wants to join that didn't get a chance to reply, as well as a
> >>>>> link to the recorded call.
> >>>>>
> >>>>> Jon
> >>>>>
> >>>>> On Fri, Apr 12, 2019 at 10:41 AM Benedict Elliott Smith
> >>>>>  wrote:
> >>>>>>
> >>>>>> +1
> >>>>>>
> >>>>>> I’m also just as excited to see some standardised workloads and
> >> test
> >>>> bed.  At the moment we’re benefiting from some large contributors doing
> >>>> their own proprietary performance testing, which is super valuable and
> >>>> something we’ve lacked before.  But I’m also keen to see some more
> >>>> representative workloads that are reproducible by anybody in the
> >> community
> >>>> take shape.
> >>>>>>
> >>>>>>
> >>>>>>> On 12 Apr 2019, at 18:09, Aleksey Yeshchenko
> >>>>  wrote:
> >>>>>>>
> >>>>>>> Hey Jon,
> >>>>>>>
> >>>>>>> This sounds exciting and pretty useful, thanks.
> >>>>>>>
> >>>>>>> Looking forward to using tlp-stress for validating 15066
> >> performance.
> >>>>>>>
> >>>>>>> We should touch base some time next week to pick a comprehensive
> >> set
> >>>> of workloads 

Re: TLP tools for stress testing and building test clusters in AWS

2019-04-12 Thread Jon Haddad
I'd be more than happy to hop on a call next week to give you both
(and anyone else interested) a tour of our dev tools.  Maybe something
early morning on my end, which should be your evening, could work?

I can set up a Zoom conference to get everyone acquainted.  We can
record and post it for any who can't make it.

I'm thinking Tuesday, Wednesday, or Thursday morning, 9AM Pacific (5pm
London)?  If anyone's interested please reply with what dates work.
I'll be sure to post the details back here with the zoom link in case
anyone wants to join that didn't get a chance to reply, as well as a
link to the recorded call.

Jon

On Fri, Apr 12, 2019 at 10:41 AM Benedict Elliott Smith
 wrote:
>
> +1
>
> I’m also just as excited to see some standardised workloads and test bed.  At 
> the moment we’re benefiting from some large contributors doing their own 
> proprietary performance testing, which is super valuable and something we’ve 
> lacked before.  But I’m also keen to see some more representative workloads 
> that are reproducible by anybody in the community take shape.
>
>
> > On 12 Apr 2019, at 18:09, Aleksey Yeshchenko  
> > wrote:
> >
> > Hey Jon,
> >
> > This sounds exciting and pretty useful, thanks.
> >
> > Looking forward to using tlp-stress for validating 15066 performance.
> >
> > We should touch base some time next week to pick a comprehensive set of 
> > workloads and versions, perhaps?
> >
> >
> >> On 12 Apr 2019, at 16:34, Jon Haddad  wrote:
> >>
> >> I don't want to derail the discussion about Stabilizing Internode
> >> Messaging, so I'm starting this as a separate thread.  There was a
> >> comment that Josh made [1] about doing performance testing with real
> >> clusters as well as a lot of microbenchmarks, and I'm 100% in support
> >> of this.  We've been working on some tooling at TLP for the last
> >> several months to make this a lot easier.  One of the goals has been
> >> to help improve the 4.0 testing process.
> >>
> >> The first tool we have is tlp-stress [2].  It's designed with a "get
> >> started in 5 minutes" mindset.  My goal was to ship a stress tool that
> >> ships with real workloads out of the box that can be easily tweaked,
> >> similar to how fio allows you to design a disk workload and tweak it
> >> with parameters.  Included are stress workloads that stress LWTs (two
> >> different types), materialized views, counters, time series, and
> >> key-value workloads.  Each workload can be modified easily to change
> >> compaction strategies, concurrent operations, number of partitions.
> >> We can run workloads for a set number of iterations or a custom
> >> duration.  We've used this *extensively* at TLP to help our customers
> >> and most of our blog posts that discuss performance use it as well.
> >> It exports data in CSV format and automatically sets up Prometheus for
> >> metrics collection / aggregation.  As an example, we were able to
> >> determine that the compression chunk length set on the paxos tables imposes
> >> a significant overhead when using the Locking LWT workload, which
> >> simulates locking and unlocking of rows.  See CASSANDRA-15080 for
> >> details.
> >>
> >> We have documentation [3] on the TLP website.
> >>
> >> The second tool we've been working on is tlp-cluster [4].  This tool
> >> is designed to help provision AWS instances for the purposes of
> >> testing.  To be clear, I don't expect, or want, this tool to be used
> >> for production environments.  It's designed to assist with the
> >> Cassandra build process by generating deb packages or re-using the
> >> ones that have already been uploaded.  Here's a short list of the
> >> things you'll care about:
> >>
> >> 1. Create instances in AWS for Cassandra using any instance size and
> >> number of nodes.  Also create tlp-stress instances and a box for
> >> monitoring
> >> 2. Use any available build of Cassandra, with a quick option to change
> >> YAML config.  For example: tlp-cluster use 3.11.4 -c
> >> concurrent_writes:256
> >> 3. Do custom builds just by pointing to a local Cassandra git repo.
> >> They can be used the same way as #2.
> >> 4. tlp-stress is automatically installed on the stress box.
> >> 5. Everything's installed with pure bash.  I considered something more
> >> complex, but since this is for development only, it turns out the
> >> simplest tool possible works well and it means it's easily
> >> configurable.  Just drop in 

Re: [VOTE] remove the old wiki

2019-06-04 Thread Jon Haddad
I think we could port that page over and clean it up before deleting the
wiki.

On Tue, Jun 4, 2019 at 12:30 PM Joshua McKenzie 
wrote:

> Before I vote, do we have something analogous to this:
> https://wiki.apache.org/cassandra/ArchitectureInternals
> In the new wiki / docs? Looks like it's a stub:
> https://cassandra.apache.org/doc/latest/architecture/overview.html
>
> Having an architectural overview landing page would be critical before
> sunsetting the old one IMO. And yes, that ArchitectureInternals article
> is... very old. But very old > nothing in terms of establishing a framework
> in which to think about something. Maybe.
>
> On Tue, Jun 4, 2019 at 2:47 PM Jon Haddad  wrote:
>
> > I assume everyone here knows the old wiki hasn't been maintained, and is
> > years out of date.  I propose we sunset it completely and delete it
> forever
> > from the world.
> >
> > I'm happy to file the INFRA ticket to delete it, I'd just like to give
> > everyone the opportunity to speak up in case there's something I'm not
> > aware of.
> >
> > In favor of removing the wiki?  That's a +1.
> > -1 if you think we're better off migrating the entire thing to cwiki.
> >
> > If you only need a couple of pages, feel free to move the content to the
> > documentation.  I'm sure we can also export the wiki in its entirety and
> > put it somewhere offline, if there's a concern about maybe needing some
> of
> > the content at some point in the future.
> >
> > I think 72 hours is enough time to leave a vote open on this topic.
> >
> > Jon
> >
>


[VOTE] remove the old wiki

2019-06-04 Thread Jon Haddad
I assume everyone here knows the old wiki hasn't been maintained, and is
years out of date.  I propose we sunset it completely and delete it forever
from the world.

I'm happy to file the INFRA ticket to delete it, I'd just like to give
everyone the opportunity to speak up in case there's something I'm not
aware of.

In favor of removing the wiki?  That's a +1.
-1 if you think we're better off migrating the entire thing to cwiki.

If you only need a couple of pages, feel free to move the content to the
documentation.  I'm sure we can also export the wiki in its entirety and
put it somewhere offline, if there's a concern about maybe needing some of
the content at some point in the future.

I think 72 hours is enough time to leave a vote open on this topic.

Jon


Re: [DISCUSS] Moving chats to ASF's Slack instance

2019-05-28 Thread Jon Haddad
+1

On Tue, May 28, 2019, 2:54 PM Joshua McKenzie  wrote:

> +1 to switching over. One less comms client + history + searchability is
> enough to get my vote easy.
>
> On Tue, May 28, 2019 at 5:52 PM Jonathan Ellis  wrote:
>
> > I agree.  This lowers the barrier to entry for new participants.  Slack
> is
> > probably two orders of magnitude more commonly used now than irc for sw
> > devs and three for everyone else.  And then you have the quality-of-life
> > features that you get out of the box with Slack and only with difficulty
> in
> > irc (history, search, file uploads...)
> >
> > On Tue, May 28, 2019 at 4:29 PM Nate McCall  wrote:
> >
> > > Hi Folks,
> > > While working on ApacheCon last week, I had to get setup on ASF's slack
> > > workspace. After poking around a bit, on a whim I created #cassandra
> and
> > > #cassandra-dev. I then invited a couple of people to come signup and
> test
> > > it out - primarily to make sure that the process was seamless for
> non-ASF
> > > account holders as well as committers, etc (it was).
> > >
> > > If you want to jump in, you can signup here:
> > > https://s.apache.org/slack-invite
> > >
> > > That said, I think it's time we transition from IRC to Slack. Now, I
> like
> > > CLI friendly, straight forward tools like IRC as much as anyone, but
> it's
> > > been more than once recently where a user I've talked to has said one
> of
> > > two things regarding our IRC channels: "What's IRC?" or "Yeah, I don't
> > > really do that anymore."
> > >
> > > In short, I think it's time to migrate. I think this will really just
> > > consist of some communications to our lists and updating the site
> > (anything
> > > I'm missing?). The archives of IRC should just kind of persist for
> > > posterity sake without any additional effort or maintenance. The
> > > ASF-requirements are all configured already on the Slack workspace, so
> I
> > > think we are good there.
> > >
> > > Thanks,
> > > -Nate
> > >
> >
> >
> > --
> > Jonathan Ellis
> > co-founder, http://www.datastax.com
> > @spyced
> >
>


Re: "4.0: TBD" -> "4.0: Est. Q4 2019"?

2019-05-28 Thread Jon Haddad
Sept is a pretty long ways off.  I think the ideal case is we can announce
4.0 release at the summit.  I'm not putting this as a "do or die" date, and
I don't think we need to announce it or make promises.  Sticking with "when
it's ready" is the right approach, but we need a target, and this is imo a
good one.

This date also gives us a pretty good runway.  We could cut our first
alphas in mid June / early July, betas in August and release in Sept.
 There's a ton of work going into testing 4.0 already.
Landing CASSANDRA-15066 will put us in a pretty good spot.  We've developed
tooling at TLP that will make it a lot easier to spin up dev clusters in
AWS as well as stress test them.  I've written about this a few times in
the past, and I'll have a few blog posts coming up that will help show this
in more detail.

There are some other quality-of-life things we should try to hammer out
before then.  Updating our default JVM settings would be nice, for
example.  Improving documentation (the data modeling section in
particular), fixing the dynamic snitch issues [1], and some improvements to
virtual tables like exposing the sstable metadata [2], and exposing table
statistics [3] come to mind.  The dynamic snitch improvement will help
performance in a big way, and the virtual tables will go a long way to
helping with quality of life.  I showed a few folks virtual tables at the
Accelerate conference last week and the missing table statistics were a big
shock.  If we can get them in, it'll be a big help to operators.

[1] https://issues.apache.org/jira/browse/CASSANDRA-14459
[2] https://issues.apache.org/jira/browse/CASSANDRA-14630
[3] https://issues.apache.org/jira/browse/CASSANDRA-14572
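
For anyone who hasn't tried the virtual tables yet, the ones already in
trunk are plain CQL reads from cqlsh, along these lines (table names as of
current trunk, so double-check against your build):

    -- compactions / sstable operations currently in flight
    SELECT * FROM system_views.sstable_tasks;

    -- thread pool stats without going through JMX
    SELECT * FROM system_views.thread_pools;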




On Mon, May 27, 2019 at 2:36 PM Nate McCall  wrote:

> Hi Sumanth,
> Thank you so much for taking the time to put this together.
>
> Cheers,
> -Nate
>
> On Tue, May 28, 2019 at 3:27 AM Sumanth Pasupuleti <
> sumanth.pasupuleti...@gmail.com> wrote:
>
> > I have taken an initial stab at documenting release types and exit
> criteria
> > in a google doc, to get us started, and to collaborate on.
> >
> >
> https://docs.google.com/document/d/1bS6sr-HSrHFjZb0welife6Qx7u3ZDgRiAoENMLYlfz8/edit?usp=sharing
> >
> > Thanks,
> > Sumanth
> >
> > On Thu, May 23, 2019 at 12:04 PM Dinesh Joshi  wrote:
> >
> > > Sankalp,
> > >
> > > Great point. This is the page created for testing.
> > >
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/CASSANDRA/4.0+Quality%3A+Components+and+Test+Plans
> > >
> > > I think we need to define the various release types and the exit
> criteria
> > > for each type of release. Anybody want to take a stab at this or start
> a
> > > thread to discuss it?
> > >
> > > Thanks,
> > >
> > > Dinesh
> > >
> > >
> > > > On May 23, 2019, at 11:57 AM, sankalp kohli 
> > > wrote:
> > > >
> > > > Hi,
> > > >Is there a page where it is written what is expected from an
> alpha,
> > > > beta, rc and a 4.0 release?
> > > > Also how are we coming up with Q4 2019 timeline. Is this for alpha,
> > beta,
> > > > rc or 4.0 release?
> > > >
> > > > Thanks,
> > > > Sankalp
> > > >
> > > > On Thu, May 23, 2019 at 11:27 AM Attila Wind  >
> > > wrote:
> > > >
> > > >> +1+1+1 I read a blog post was talking about last sept(?) to freeze
> > > >> features and start extensive testing. Maybe its really time to hit
> it!
> > > :-)
> > > >>
> > > >> Attila Wind
> > > >>
> > > >> http://www.linkedin.com/in/attilaw
> > > >> Mobile: +36 31 7811355
> > > >>
> > > >>
> > > >> On 2019. 05. 23. 19:30, ajs6f wrote:
> > > >>> +1 in the fullest degree. A date that needs to be changed is still
> > > >> enormously more attractive than no date at all.
> > > >>>
> > > >>> Adam Soroka
> > > >>>
> > >  On May 23, 2019, at 12:01 PM, Sumanth Pasupuleti <
> > > >> spasupul...@netflix.com.INVALID> wrote:
> > > 
> > >  Having at least a ballpark target on the website will definitely
> > help.
> > > >> +1
> > >  on setting it to Q4 2019 for now.
> > > 
> > >  On Thu, May 23, 2019 at 8:52 AM Dinesh Joshi 
> > > wrote:
> > > 
> > > > +1 on setting a date.
> > > >
> > > > Dinesh
> > > >
> > > >> On May 23, 2019, at 11:07 AM, Michael Shuler <
> > > mich...@pbandjelly.org>
> > > > wrote:
> > > >> We've had 4.0 listed as TBD release date for a very long time.
> > > >>
> > > >> Yesterday, Alexander Dejanovski got a "when's 4.0 going to
> > release?"
> > > > question after his repair talk and he suggested possibly Q4 2019.
> > > This
> > > > morning Nate McCall hinted at possibly being close by ApacheCon
> Las
> > > >> Vegas
> > > > in September. These got me thinking..
> > > >> Think we can we shoot for having a 4.0 alpha/beta/rc ready to
> > > > announce/release at ApacheCon? At that time, we'll have been
> frozen
> > > >> for 1
> > > > year, and I think we can. We'll GA release when it's ready, but I
> > > >> think Q4
> > > > could be an realistic target.
> > > >> With 

Re: "4.0: TBD" -> "4.0: Est. Q4 2019"?

2019-05-28 Thread Jon Haddad
My thinking is I'd like to be able to recommend 4.0.0 as a production-ready
database for the business-critical use cases of TLP customers.  If it's not ready
for prod, there's no way I'd vote to release it.  The TLP tooling I've
mentioned was developed over the last 6 months with the specific goal of
being able to test custom builds for the 4.0 release, and I've run several
clusters using it already.  The stress tool we built just got a --ttl
option so I should be able to start some longer running clusters that TTL
data out, so we can see the impact of running a cluster under heavy load
for several weeks.



On Tue, May 28, 2019 at 9:57 AM sankalp kohli 
wrote:

> Hi Jon,
>    When you say 4.0 release, how do you match it with 3.0 minor
> releases? The unofficial rule is to not upgrade to prod till .10 is cut.
> Also, due to heavy investment in testing, I don't think it will take as long
> as 3.0, but I want to know what your thinking is on this.
>
> Thanks,
> Sankalp
>
> On Tue, May 28, 2019 at 9:40 AM Jon Haddad  wrote:
>
> > Sept is a pretty long ways off.  I think the ideal case is we can
> announce
> > 4.0 release at the summit.  I'm not putting this as a "do or die" date,
> and
> > I don't think we need to announce it or make promises.  Sticking with
> "when
> > it's ready" is the right approach, but we need a target, and this is imo
> a
> > good one.
> >
> > This date also gives us a pretty good runway.  We could cut our first
> > alphas in mid June / early July, betas in August and release in Sept.
> >  There's a ton of work going into testing 4.0 already.
> > Landing CASSANDRA-15066 will put us in a pretty good spot.  We've
> developed
> > tooling at TLP that will make it a lot easier to spin up dev clusters in
> > AWS as well as stress test them.  I've written about this a few times in
> > the past, and I'll have a few blog posts coming up that will help show
> this
> > in more detail.
> >
> > There are some other quality-of-life things we should try to hammer out
> > before then.  Updating our default JVM settings would be nice, for
> > example.  Improving documentation (the data modeling section in
> > particular), fixing the dynamic snitch issues [1], and some improvements
> to
> > virtual tables like exposing the sstable metadata [2], and exposing table
> > statistics [3] come to mind.  The dynamic snitch improvement will help
> > performance in a big way, and the virtual tables will go a long way to
> > helping with quality of life.  I showed a few folks virtual tables at the
> > Accelerate conference last week and the missing table statistics were a
> big
> > shock.  If we can get them in, it'll be a big help to operators.
> >
> > [1] https://issues.apache.org/jira/browse/CASSANDRA-14459
> > [2] https://issues.apache.org/jira/browse/CASSANDRA-14630
> > [3] https://issues.apache.org/jira/browse/CASSANDRA-14572
> >
> >
> >
> >
> > On Mon, May 27, 2019 at 2:36 PM Nate McCall  wrote:
> >
> > > Hi Sumanth,
> > > Thank you so much for taking the time to put this together.
> > >
> > > Cheers,
> > > -Nate
> > >
> > > On Tue, May 28, 2019 at 3:27 AM Sumanth Pasupuleti <
> > > sumanth.pasupuleti...@gmail.com> wrote:
> > >
> > > > I have taken an initial stab at documenting release types and exit
> > > criteria
> > > > in a google doc, to get us started, and to collaborate on.
> > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1bS6sr-HSrHFjZb0welife6Qx7u3ZDgRiAoENMLYlfz8/edit?usp=sharing
> > > >
> > > > Thanks,
> > > > Sumanth
> > > >
> > > > On Thu, May 23, 2019 at 12:04 PM Dinesh Joshi 
> > wrote:
> > > >
> > > > > Sankalp,
> > > > >
> > > > > Great point. This is the page created for testing.
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/CASSANDRA/4.0+Quality%3A+Components+and+Test+Plans
> > > > >
> > > > > I think we need to define the various release types and the exit
> > > criteria
> > > > > for each type of release. Anybody want to take a stab at this or
> > start
> > > a
> > > > > thread to discuss it?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Dinesh
> > > > >
> > > > >
> > > > > > On May 23, 2019, at

Re: Jira Suggestion

2019-05-14 Thread Jon Haddad
Great idea. +1

On Tue, May 14, 2019, 12:10 PM Benedict Elliott Smith 
wrote:

> It will be possible to insert n/a.  It will simply be a text field - Jira
> doesn’t know anything about the concept of a SHA, and I don’t intend to
> introduce validation logic.  It’s just a logical and consistent place for
> it to live, and a strong reminder to include it.  My intention is for it to
> be a text field supporting Jira markup, like Test and Doc Plan, so that we
> can insert cleanly formatted links to GitHub just like we do now in
> comments.
>
>
>
> > On 14 May 2019, at 20:04, Dinesh Joshi  wrote:
> >
> > I am +0.5 on this. I think it is a good idea. I want to ensure that we
> capture use-cases such as Tasks that may not have a git commit associated
> with them. There might be tickets that may have multiple git commits across
> repos. SVN commits may also need to be handled.
> >
> > Dinesh
> >
> >> On May 14, 2019, at 11:34 AM, Jeff Jirsa  wrote:
> >>
> >> Please
> >>
> >> --
> >> Jeff Jirsa
> >>
> >>
> >>> On May 14, 2019, at 7:53 AM, Benedict Elliott Smith <
> bened...@apache.org> wrote:
> >>>
> >>> How would people feel about introducing a field for the (git) commit
> SHA, to be required on (Jira) commit?
> >>>
> >>> The norm is that we comment the SHA, but given this is the norm
> perhaps we should codify it instead, while we have the chance?  It would
> also make it easier to find.
> >>> -
> >>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >>> For additional commands, e-mail: dev-h...@cassandra.apache.org
> >>>
> >>
> >> -
> >> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >> For additional commands, e-mail: dev-h...@cassandra.apache.org
> >>
> >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: TLP tools for stress testing and building test clusters in AWS

2019-04-17 Thread Jon Haddad
Hey folks.  I've opened the 9am zoom session.

You can join here: https://zoom.us/j/189920888


On Tue, Apr 16, 2019 at 10:49 PM Stefan Miklosovic
 wrote:
>
> Thanks Anthony for going that proverbial extra mile to cover people in
> different time zones too.
>
> I believe other people will find your talk as helpful as we did.
>
> Regards
>
> On Wed, 17 Apr 2019 at 10:08, Anthony Grasso  wrote:
> >
> > Hi Stefan and devs,
> >
> > I have set up a zoom link for the TLP tool set intro that will be on in an
> > hour's time (17 April 2019 @ 11:00AM AEST): https://zoom.us/j/272648772
> >
> > This link is open so if anyone else wishes to join they are welcome to do
> > so. I will be covering the same topics Jon is covering in his meeting
> > tomorrow.
> >
> > Regards,
> > Anthony
> >
> >
> > On Wed, 17 Apr 2019 at 08:29, Anthony Grasso 
> > wrote:
> >
> > > Hi Stefan,
> > >
> > > Thanks for sending the invite out!
> > >
> > > Just wondering what you think of the idea of having a Zoom meeting that
> > > anyone can join? This way anyone else interested can join us as well. I 
> > > can
> > > set that up if you like?
> > >
> > > Cheers,
> > > Anthony
> > >
> > > On Tue, 16 Apr 2019 at 21:24, Stefan Miklosovic <
> > > stefan.mikloso...@instaclustr.com> wrote:
> > >
> > >> Hi Anthony,
> > >>
> > >> Sounds good. I've sent you a Hangouts meeting invitation privately.
> > >>
> > >> Regards
> > >>
> > >> On Tue, 16 Apr 2019 at 14:53, Anthony Grasso 
> > >> wrote:
> > >> >
> > >> > Hi Stefan,
> > >> >
> > >> > I have been working with Jon on developing the tool set. I can do a 
> > >> > Zoom
> > >> > call tomorrow (Wednesday) at 11am AEST if that works for you? We can go
> > >> > through all the same information that Jon is going to go through in his
> > >> > call. Note that I am in the same timezone as you, so if tomorrow
> > >> morning is
> > >> > no good we can always do the afternoon.
> > >> >
> > >> > Cheers,
> > >> > Anthony
> > >> >
> > >> >
> > >> > On Sat, 13 Apr 2019 at 22:38, Stefan Miklosovic <
> > >> > stefan.mikloso...@instaclustr.com> wrote:
> > >> >
> > >> > > Hi Jon,
> > >> > >
> > >> > > I would like to be on that call too, but I am off on Thursday.
> > >> > >
> > >> > > I am from Australia, so 5pm London time is 2am the next day for us; your
> > >> > > Wednesday morning is my Thursday night. Early Wednesday morning here,
> > >> > > which is your Tuesday morning and London's afternoon, would be best.
> > >> > >
> > >> > > Recording the thing would definitely be helpful too.
> > >> > >
> > >> > > On Sat, 13 Apr 2019 at 07:45, Jon Haddad  wrote:
> > >> > > >
> > >> > > > I'd be more than happy to hop on a call next week to give you both
> > >> > > > (and anyone else interested) a tour of our dev tools.  Maybe
> > >> something
> > >> > > > early morning on my end, which should be your evening, could work?
> > >> > > >
> > >> > > > I can set up a Zoom conference to get everyone acquainted.  We can
> > >> > > > record and post it for any who can't make it.
> > >> > > >
> > >> > > > I'm thinking Tuesday, Wednesday, or Thursday morning, 9AM Pacific
> > >> (5pm
> > >> > > > London)?  If anyone's interested please reply with what dates work.
> > >> > > > I'll be sure to post the details back here with the zoom link in
> > >> case
> > >> > > > anyone wants to join that didn't get a chance to reply, as well as 
> > >> > > > a
> > >> > > > link to the recorded call.
> > >> > > >
> > >> > > > Jon
> > >> > > >
> > >> > > > On Fri, Apr 12, 2019 at 10:41 AM Benedict Elliott Smith
> > >> > > >  wrote:
> > >> > > > >
> > >> > > > > +1
> > >> > > > >
> > >> > > > > I’m also just as excited to see 

Re: Stability of MaterializedView in 3.11.x | 4.0

2019-08-30 Thread Jon Haddad
If you don't have any intention of running across multiple nodes, Cassandra is
probably the wrong DB for you.

Postgres will give you a better feature set for a single node.

On Fri, Aug 30, 2019 at 5:23 AM Pankaj Gajjar 
wrote:

> Understood. How about Cassandra running on a single node? We don't
> have a cluster setup (i.e. 3+ nodes).
>
> Do MVs perform well on a single-node machine?
>
> Note: I know about HA, so let's set that aside for now; it's only possible
> when we have a cluster setup.
>
> On 29/08/19, 06:21, "Dor Laor"  wrote:
>
> On Wed, Aug 28, 2019 at 5:43 PM Jon Haddad  wrote:
>
> > >  Arguably, the other alternative to server-side denormalization is
> to do
> > the denormalization client-side which comes with the same axes of
> costs and
> > complexity, just with more of each.
> >
> > That's not completely true.  You can write to any number of tables
> without
> > doing a read, and the cost of reading data off disk is significantly
> > greater than an insert alone.  You can crush a cluster with a write
> heavy
> > workload and MVs that would otherwise be completely fine to do all
> writes.
> >
> > The other issue with MVs is that you still need to understand
> fundamentals
> > of data modeling, that don't magically solve the problem of enormous
> > partitions.  One of the reasons I've had to un-MV a lot of clusters
> is
> > because people have put an MV on a table with a low-cardinality
> field and
> > found themselves with a 10GB partition nightmare, so they need to go
> back
> > and remodel the view as something more complex anyways.  In this
> case, the
> > MV was extremely high cost since now they've not only pushed out a
> poor
> > implementation to begin with but now have the cost of a migration as
> well
> > as a rewrite.
> >
>
> +1
>
> Moreover, the hard part is that an update to the base table means that
> the original data needs to be read, and the database (or the poor
> developer who implements the denormalized model) needs to delete the
> old data in the view and then write the new data. All of this must, of
> course, be resilient to all types of errors and failures. Had it been
> simple, there would be no need for a database MV.
>
>
> >
> >
> >
> > On Wed, Aug 28, 2019 at 9:58 AM Joshua McKenzie <
> jmcken...@apache.org>
> > wrote:
> >
> > > >
> > > > so we need to start migrating from MVs to manual query base
> > > tables?
> > >
> > >  Arguably, the other alternative to server-side denormalization is
> to do
> > > the denormalization client-side which comes with the same axes of
> costs
> > and
>     > > complexity, just with more of each.
> > >
> > > Jeff's spot on when he discusses the risk appetite vs. mitigation
> aspect
> > of
> > > it. There's a reason banks do end-of-day close-out validation
> analysis
> > and
> > > have redundant systems for things like this.
> > >
> > > On Wed, Aug 28, 2019 at 11:49 AM Jon Haddad 
> wrote:
> > >
> > > > I've helped a lot of teams (a dozen to two dozen maybe) migrate
> away
> > from
> > > > MVs due to inconsistencies, issues with streaming (have you
> added or
> > > > removed nodes yet?), and massive performance issues to the point
> of
> > > cluster
> > > > failure under (what I consider) trivial load.  I haven't gone
> too deep
> > > into
> > > > analyzing their issues, folks are usually fine with "move off
> them", vs
> > > > having me do a ton of analysis.
> > > >
> > > > tlp-stress has a materialized view workload built in, and you
> can add
> > > > arbitrary CQL via the --cql flag to add a MV to any existing
> workload
> > > such
> > > > as KeyValue or BasicTimeSeries.
> > > >
> > > > On Wed, Aug 28, 2019 at 8:11 AM Jeff Jirsa 
> wrote:
> > > >
> > > > > There have been people who have had operational issues related
> to MVs
> > > > (many
> > > > > of them around running repair), but the biggest concern is
> > correctness.
> > > > >
> > > > > It probably ultimately depends on what type of database you're
>

4.0 alpha before apachecon?

2019-08-28 Thread Jon Haddad
Hey folks,

I think it's time we cut a 4.0 alpha release.  Before I put up a vote
thread, is there a reason not to have a 4.0 alpha before ApacheCon /
Cassandra Summit?

There's a handful of small issues that should be done for 4.0 (client
list in virtual tables, dynamic snitch improvements, fixing token counts).
I'm not trying to suggest we don't include them, but they're small enough I
think it's OK to merge them in following the first alpha.

Jon


Re: Stability of MaterializedView in 3.11.x | 4.0

2019-08-28 Thread Jon Haddad
I've helped a lot of teams (a dozen to two dozen maybe) migrate away from
MVs due to inconsistencies, issues with streaming (have you added or
removed nodes yet?), and massive performance issues to the point of cluster
failure under (what I consider) trivial load.  I haven't gone too deep into
analyzing their issues; folks are usually fine with "move off them", vs
having me do a ton of analysis.

tlp-stress has a materialized view workload built in, and you can add
arbitrary CQL via the --cql flag to add a MV to any existing workload such
as KeyValue or BasicTimeSeries.
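
Concretely, that looks something like this (workload and flag names from
memory; keyspace/table names in the view are illustrative):

    # the built-in MV workload
    tlp-stress run MaterializedViews --duration 1h

    # or bolt a view onto the key/value workload
    tlp-stress run KeyValue --duration 1h \
        --cql "CREATE MATERIALIZED VIEW IF NOT EXISTS tlp_stress.kv_by_value AS
               SELECT * FROM tlp_stress.keyvalue
               WHERE value IS NOT NULL AND key IS NOT NULL
               PRIMARY KEY (value, key)"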

On Wed, Aug 28, 2019 at 8:11 AM Jeff Jirsa  wrote:

> There have been people who have had operational issues related to MVs (many
> of them around running repair), but the biggest concern is correctness.
>
> It probably ultimately depends on what type of database you're running. If
> you're running some sort of IOT / analytics workload and you just want
> another way to SELECT the data, but you won't notice one of a billion
> records going missing, using MVs may be fine. If you're a bank, and one of
> a billion records going missing means you lose someone's bank account, I
> would avoid using MVs.
>
> It's all just risk management.
>
> On Wed, Aug 28, 2019 at 7:18 AM Pankaj Gajjar <
> pankaj.gaj...@contentserv.com>
> wrote:
>
> > Hi Michael,
> >
> > Thanks for stating it very clearly: "Users of MVs *must* determine
> > for themselves, through thorough testing and understanding, if they
> > wish to use them." And this concludes that if any issue occurs in the
> > future, the only solution is to rebuild the MVs, since Cassandra is not
> > able to keep them consistently in sync.
> >
> > Also, we are practically using 10+ MVs and as of now we have not faced
> > any issues. So my question to all community members: does anyone face
> > any critical issues? Do we need to start migrating from MVs to manual
> > query-based tables?
> >
> > Also, I understand now that it's experimental and not ready for
> > production, so if possible we should just avoid it, right?
> >
> > Thanks
> > Pankaj
> >
> > On 27/08/19, 19:03, "Michael Shuler"  > of mich...@pbandjelly.org> wrote:
> >
> > It appears that you found the first message of the chain. I suggest
> > reading the linked JIRA and the complete dev@ thread that arrived at
> > this conclusion; there are loads of well formed opinions and
> > information. Users of MVs *must* determine for themselves, through
> > thorough testing and understanding, if they wish to use them.
> >
> > Linkage:
> > https://issues.apache.org/jira/browse/CASSANDRA-13959
> >   (sub-linkage..)
> >   https://issues.apache.org/jira/browse/CASSANDRA-13595
> >   https://issues.apache.org/jira/browse/CASSANDRA-13911
> >   https://issues.apache.org/jira/browse/CASSANDRA-13880
> >   https://issues.apache.org/jira/browse/CASSANDRA-12872
> >   https://issues.apache.org/jira/browse/CASSANDRA-13747
> >
> > Very much worth reading the complete thread:
> > part1:
> >
> >
> https://lists.apache.org/thread.html/d81a61da48e1b872d7599df4edfa8e244d34cbd591a18539f724796f@
> > 
> > part2:
> >
> >
> https://lists.apache.org/thread.html/19b7fcfd3b47f1526d6e993b3bb97f6c43e5ce204bc976ec0701cdd3@
> > 
> >
> > Quick JQL for open tickets with "mv":
> >
> >
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20CASSANDRA%20AND%20text%20~%20mv%20AND%20status%20!%3D%20Resolved
> >
> > --
> > Michael
> >
> > On 8/27/19 5:01 AM, pankaj gajjar wrote:
> > > Hello,
> > >
> > >
> > >
> > > concern about Materialized Views (MVs) in Cassandra. Unfortunately
> > starting
> > > with version 3.11, MVs are officially considered experimental and
> > not ready
> > > for production use, as you can read here:
> > >
> > >
> > >
> > >
> >
> http://mail-archives.apache.org/mod_mbox/cassandra-user/201710.mbox/%3cetpan.59f24f38.438f4e99.7...@apple.com%3E
> > >
> > >
> > >
> > > Can someone please give some productive feedback on this? It
> > would
> > > help us with further implementation around MVs in Cassandra.
> > >
> > >
> > >
> > > Does anyone face any critical issue, data loss, or
> > synchronization
> > > issue?
> > >
> > >
> > >
> > > Regards
> > >
> > > Pankaj.
> > >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
> >
> >
> >
>


Re: Stability of MaterializedView in 3.11.x | 4.0

2019-08-28 Thread Jon Haddad
>  Arguably, the other alternative to server-side denormalization is to do
the denormalization client-side which comes with the same axes of costs and
complexity, just with more of each.

That's not completely true.  You can write to any number of tables without
doing a read, and the cost of reading data off disk is significantly
greater than an insert alone.  You can crush a cluster with a write-heavy
workload plus MVs when it would otherwise be completely fine doing all the writes.

The other issue with MVs is that you still need to understand the fundamentals
of data modeling; MVs don't magically solve the problem of enormous
partitions.  One of the reasons I've had to un-MV a lot of clusters is
because people have put an MV on a table with a low-cardinality field and
found themselves with a 10GB partition nightmare, so they need to go back
and remodel the view as something more complex anyways.  In this case, the
MV was extremely high cost since now they've not only pushed out a poor
implementation to begin with but now have the cost of a migration as well
as a rewrite.
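
To make that failure mode concrete, it's this shape of thing (schema
invented purely for illustration):

    CREATE TABLE users (
        id uuid PRIMARY KEY,
        country text,
        email text
    );

    -- Partitioning the view by a low-cardinality column funnels every
    -- user in a country into a single view partition; with millions of
    -- users and a handful of countries, that's the 10GB partition
    -- nightmare described above.
    CREATE MATERIALIZED VIEW users_by_country AS
        SELECT * FROM users
        WHERE country IS NOT NULL AND id IS NOT NULL
        PRIMARY KEY (country, id);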



On Wed, Aug 28, 2019 at 9:58 AM Joshua McKenzie 
wrote:

> >
> > so we need to start migrating from MVs to manual query base tables?
>
>  Arguably, the other alternative to server-side denormalization is to do
> the denormalization client-side which comes with the same axes of costs and
> complexity, just with more of each.
>
> Jeff's spot on when he discusses the risk appetite vs. mitigation aspect of
> it. There's a reason banks do end-of-day close-out validation analysis and
> have redundant systems for things like this.
>
> On Wed, Aug 28, 2019 at 11:49 AM Jon Haddad  wrote:
>
> > I've helped a lot of teams (a dozen to two dozen maybe) migrate away from
> > MVs due to inconsistencies, issues with streaming (have you added or
> > removed nodes yet?), and massive performance issues to the point of
> cluster
> > failure under (what I consider) trivial load.  I haven't gone too deep
> into
> > analyzing their issues, folks are usually fine with "move off them", vs
> > having me do a ton of analysis.
> >
> > tlp-stress has a materialized view workload built in, and you can add
> > arbitrary CQL via the --cql flag to add a MV to any existing workload
> such
> > as KeyValue or BasicTimeSeries.
> >
> > On Wed, Aug 28, 2019 at 8:11 AM Jeff Jirsa  wrote:
> >
> > > There have been people who have had operational issues related to MVs
> > (many
> > > of them around running repair), but the biggest concern is correctness.
> > >
> > > It probably ultimately depends on what type of database you're running.
> > If
> > > you're running some sort of IOT / analytics workload and you just want
> > > another way to SELECT the data, but you won't notice one of a billion
> > > records going missing, using MVs may be fine. If you're a bank, and one
> > of
> > > a billion records going missing means you lose someone's bank account,
> I
> > > would avoid using MVs.
> > >
> > > It's all just risk management.
> > >
> > > On Wed, Aug 28, 2019 at 7:18 AM Pankaj Gajjar <
> > > pankaj.gaj...@contentserv.com>
> > > wrote:
> > >
> > > > Hi Michael,
> > > >
> > > > Thanks for stating it very clearly: "Users of MVs *must* determine
> > > > for themselves, through thorough testing and understanding, if they
> > > > wish to use them." And this concludes that if any issue occurs in the
> > > > future, the only solution is to rebuild the MVs, since Cassandra is
> > > > not able to keep them consistently in sync.
> > > >
> > > > Also, we are practically using 10+ MVs and as of now we have not
> > > > faced any issues. So my question to all community members: does
> > > > anyone face any critical issues? Do we need to start migrating from
> > > > MVs to manual query-based tables?
> > > >
> > > > Also, I understand now that it's experimental and not ready for
> > > > production, so if possible we should just avoid it, right?
> > > >
> > > > Thanks
> > > > Pankaj
> > > >
> > > > On 27/08/19, 19:03, "Michael Shuler"  > behalf
> > > > of mich...@pbandjelly.org> wrote:
> > > >
> > > > It appears that you found the first message of the chain. I
> suggest
> > > > reading the linked JIRA and the complete dev@ thread that
> arrived
> > at
> > > > this conclusion

Re: 4.0 alpha before apachecon?

2019-08-28 Thread Jon Haddad
Regarding the dynamic snitch improvements, it's gone through several rounds
of review already and there's been significant testing of it.  Regarding
the token change, switching a number from 256 -> 16 isn't so invasive that
we shouldn't do it.  There's a little extra work that needs to be done
there, ideally, to ensure safety, but it's again small enough that it
shouldn't be too big of a problem imo.  Both current implementations (256
tokens + our insanely memory-over-allocating dynamic snitch) limit the
ability of people to run large clusters, harming both availability and
performance.  It's been extremely harmful for Cassandra's reputation and
I'd really like it if we could ship something where I don't have to
constantly apologize to people I'm trying to help for the land mine
defaults we put out there.
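
For reference, the token side of that is just a yaml change on new
clusters, something like the following (values illustrative, not a settled
default, and the allocation option name varies by version):

    # cassandra.yaml
    num_tokens: 16
    # pair with the token allocation algorithm so ownership stays
    # balanced at low token counts:
    # allocate_tokens_for_keyspace: my_keyspace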

To your point, I agree as a community we're lacking in an open, well
documented and up to date plan, and it needs to be addressed.  I think the
virtual meetings idea, held at a regular cadence, might help a bit with that;
I intend on participating there.


On Wed, Aug 28, 2019 at 9:52 AM Joshua McKenzie 
wrote:

> >
> > dynamic snitch improvements, fixing token counts
>
>
>
> > they're small enough
>
>
> By what axis of measurement out of curiosity? Risk to re-test and validate
> a final artifact? Do we have a more clear understanding of what testing has
> taken place across the community?
>
> The last I saw, our documented test plan
> <
> https://cwiki.apache.org/confluence/display/CASSANDRA/4.0+Quality%3A+Components+and+Test+Plans
> >
> hasn't
> been maintained or kept up to date
> <
> https://issues.apache.org/jira/browse/CASSANDRA-14862?jql=project%20%3D%20CASSANDRA%20AND%20%20labels%20%3D%204.0-QA
> >.
> Is there another artifact reflecting what testing people have in flight to
> better reflect what risk of needing to re-test we have from these (and
> other) post-freeze changes?
>
>
>
> On Wed, Aug 28, 2019 at 11:52 AM Jon Haddad  wrote:
>
> > Hey folks,
> >
> > I think it's time we cut a 4.0 alpha release.  Before I put up a vote
> > thread, is there a reason not to have a 4.0 alpha before ApacheCon /
> > Cassandra Summit?
> >
> > There's a handful of small issues that should be done for 4.0 (client
> > list in virtual tables, dynamic snitch improvements, fixing token
> counts),
> > I'm not trying to suggest we don't include them, but they're small
> enough I
> > think it's OK to merge them in following the first alpha.
> >
> > Jon
> >
>


Re: 4.0 alpha before apachecon?

2019-08-28 Thread Jon Haddad
Yes we do.  It's one of the reasons I've spent a lot of (thousands?)
hours working on tlp-stress and tlp-cluster in the last 2 years.  I shared
some progress on this a little ways back.  I'll send out a separate email
soon with updates, since we just merged in a *lot* of features that will
help with testing.

On Wed, Aug 28, 2019 at 10:52 AM Dinesh Joshi  wrote:

> +1 on cutting an alpha and having a clear, documented test plan[1] for
> alpha. We need volunteers to drive the test plan, though. :)
>
> Thanks,
>
> Dinesh
>
> [1]
> https://cwiki.apache.org/confluence/display/CASSANDRA/4.0+Quality%3A+Components+and+Test+Plans
>
> > On Aug 28, 2019, at 10:27 AM, Jon Haddad  wrote:
> >
> > Regarding the dynamic snitch improvements, it's gone through several
> rounds
> > of review already and there's been significant testing of it.  Regarding
> > the token change, switching a number from 256 -> 16 isn't so invasive
> that
> > we shouldn't do it.  There's a little extra work that needs to be done
> > there ideally to ensure safety, but it's again small enough where it
> > shouldn't be too big of a problem imo.  Both current implementations (256
> > tokens + our insanely over memory allocating dynamic snitch) limit the
> > ability of people to run large clusters, harming both availability and
> > performance.  It's been extremely harmful for Cassandra's reputation and
> > I'd really like it if we could ship something where I don't have to
> > constantly apologize to people I'm trying to help for the land mine
> > defaults we put out there.
> >
> > To your point, I agree as a community we're lacking in an open, well
> > documented and up to date plan, and it needs to be addressed.  I think
> the
> > virtual meetings idea, held at a regular cadence, might help a bit with
> > that; I intend on participating there.
> >
> >
> > On Wed, Aug 28, 2019 at 9:52 AM Joshua McKenzie 
> > wrote:
> >
> >>>
> >>> dynamic snitch improvements, fixing token counts
> >>
> >>
> >>
> >>> they're small enough
> >>
> >>
> >> By what axis of measurement out of curiosity? Risk to re-test and
> validate
> >> a final artifact? Do we have a more clear understanding of what testing
> has
> >> taken place across the community?
> >>
> >> The last I saw, our documented test plan
> >> <
> >>
> https://cwiki.apache.org/confluence/display/CASSANDRA/4.0+Quality%3A+Components+and+Test+Plans
> >>>
> >> hasn't
> >> been maintained or kept up to date
> >> <
> >>
> https://issues.apache.org/jira/browse/CASSANDRA-14862?jql=project%20%3D%20CASSANDRA%20AND%20%20labels%20%3D%204.0-QA
> >>> .
> >> Is there another artifact reflecting what testing people have in flight
> to
> >> better reflect what risk of needing to re-test we have from these (and
> >> other) post-freeze changes?
> >>
> >>
> >>
> >> On Wed, Aug 28, 2019 at 11:52 AM Jon Haddad  wrote:
> >>
> >>> Hey folks,
> >>>
> >>> I think it's time we cut a 4.0 alpha release.  Before I put up a vote
> >>> thread, is there a reason not to have a 4.0 alpha before ApacheCon /
> >>> Cassandra Summit?
> >>>
> >>> There's a handful of small issues that should be done for 4.0 (client
> >>> list in virtual tables, dynamic snitch improvements, fixing token
> >> counts),
> >>> I'm not trying to suggest we don't include them, but they're small
> >> enough I
> >>> think it's OK to merge them in following the first alpha.
> >>>
> >>> Jon
> >>>
> >>
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: 4.0 alpha before apachecon?

2019-08-29 Thread Jon Haddad
Agreed. There's no point in a branch if we aren't committing new features
to trunk, and I don't think we should yet.

On Thu, Aug 29, 2019 at 3:50 PM Dinesh Joshi  wrote:

> We should not branch trunk at least until the RC is out.
>
> Dinesh
>
> > On Aug 29, 2019, at 3:32 PM, Sankalp Kohli 
> wrote:
> >
> > I do not think we should branch and is -1 on it. The reason we have
> trunk frozen was for our focus to be on 4.0. I think we still need that
> focus till a few more releases like these.
> >
> >> On Aug 30, 2019, at 12:24 AM, Nate McCall  wrote:
> >>
> >> On Fri, Aug 30, 2019 at 10:11 AM Benedict Elliott Smith <
> bened...@apache.org>
> >> wrote:
> >>
> >>>
> >>>   Seems to make sense to branch, right?
> >>>
> >>> Feels like a good line in the sand. +1
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: [VOTE] Release Apache Cassandra 4.0-alpha1 (24 hour vote)

2019-09-05 Thread Jon Haddad
+1

On Thu, Sep 5, 2019 at 3:44 PM Michael Shuler 
wrote:

> I propose the following artifacts for release as 4.0-alpha1.
>
> sha1: fc4381ca89ab39a82c9018e5171975285cc3bfe7
> Git:
>
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.0-alpha1-tentative
> Artifacts:
>
> https://repository.apache.org/content/repositories/orgapachecassandra-1177/org/apache/cassandra/apache-cassandra/4.0-alpha1/
> Staging repository:
> https://repository.apache.org/content/repositories/orgapachecassandra-1177/
>
> The Debian and RPM packages are available here:
> http://people.apache.org/~mshuler
>
> The vote will be open for 24 hours (longer if needed).
>
> [1]: CHANGES.txt:
>
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.0-alpha1-tentative
> [2]: NEWS.txt:
>
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.0-alpha1-tentative
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: Cassandra image for Kubernetes

2019-09-19 Thread Jon Haddad
I recently had a side conversation about including a Prometheus endpoint in
the sidecar project, which would query virtual tables exclusively.  Is it
reasonable to require running the sidecar in addition to Cassandra when
deploying on K8?  My container knowledge is pretty limited.

If that's not a problem, I think it shouldn't be too hard to add, it's just
a matter of someone finding the time.
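
To make that concrete, the endpoint would essentially run reads like the one
below against the virtual tables and re-expose the rows as Prometheus metrics
(a sketch only; the table is the client list virtual table proposed for 4.0,
and its schema may still shift):

    cqlsh -e "SELECT * FROM system_views.clients;"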

On Thu, Sep 19, 2019 at 1:39 PM Dimo Velev  wrote:

> Hi,
>
> A docker image for Cassandra would be the first step in the right
> direction. What one would really want is a kubernetes Operator that can
> deploy, upgrade, etc. a Cassandra cluster (
> https://kubernetes.io/docs/concepts/extend-kubernetes/operator).
>
> A base image with placeholders would be neat for people to quickly start
> using Cassandra. We have built our own image (actually a few of them over
> the time) and also use it e.g. for cqlsh (docker run -it --rm --entrypoint
> cqlsh  --help). What one would probably do in the end is build
> their own image based on the official one, adding a layer with the
> configuration file, certificates, etc (or mounting them from a configmap in
> kubernetes).
>
> Some things to consider for the image:
> • Special care must be taken to make the image usable in OpenShift as it
> does not run the processes in the containers as root.
> • Logging should be changed to stdout so that it is easier to use and
> automatically picked up by log indexers of kubernetes clusters.
> • As for metrics, exposing Prometheus endpoint with them would be great as
> this makes scraping them just a matter of meta configuration of the pods
> (telling Prometheus the actual endpoint to scrape)
> • I had issues with Cassandra binding on the wrong IP in a container
>
> Would be happy to help with the image part / testing on OpenShift.
>
> Cheers,
> Dimo
>
> > On 19. Sep 2019, at 18:32, Nate McCall  wrote:
> >
> > Hi Cyril,
> > Thanks for bringing this topic up. I think it would be a good idea for us
> > to have an "official" docker file in the source tree.
> >
> > There are, however, some caveats:
> >
> https://issues.apache.org/jira/browse/LEGAL-270?focusedCommentId=15524446&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-15524446
> >
> > As long as we could adhere to those guidelines (which I don't see as
> hard)
> > we could do this.
> >
> > In looking through your specific image (thanks for posting this, btw), I
> > would personally prefer something with a lot fewer dependencies
> (basically
> > from just our source tree) and a lot more replacement properties
> available
> > for config files.
> >
> > Curious what other folks think?
> >
> > Cheers,
> > -Nate
> >
> >> On Wed, Sep 18, 2019 at 6:43 AM Cyril Scetbon 
> wrote:
> >>
> >> Hey guys,
> >>
> >> I heard that at the last summit there were discussions about providing
> an
> >> official docker image to run Cassandra on Kubernetes. Is it something
> that
> >> you’ve started to work on? We have our own at
> >> https://github.com/Orange-OpenSource/cassandra-image but I think
> >> providing an official image makes sense. As long as we can easily do
> >> everything we do today. We could also collaborate.
> >>
> >> Thank you
> >> —
> >> Cyril Scetbon
> >>
> >>
>


Re: moving the site from SVN to git

2019-10-02 Thread Jon Haddad
I created an INFRA ticket here:
https://issues.apache.org/jira/browse/INFRA-19218.

On Wed, Sep 25, 2019 at 6:04 AM Michael Shuler 
wrote:

> I see no good reason to trash history. There are tools to make moving
> from svn to git (hopefully) painless. We used git-svn for the main c*
> source to retain history of both, which this tool uses to do migrations
> - https://github.com/nirvdrum/svn2git
>
> Michael
>
> On 9/25/19 12:57 AM, Mick Semb Wever wrote:
> >
> >> Personally, no, I don't.  What I need to know is if someone who actually
> >> works on the site needs the history in *git*.
> >
> >
> > Yes. I need the history in *git*.
> >
> >
> > And I believe that INFRA can do the migration for you.
> > (For example, INFRA-12055 and spark-website)
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


fixing paging state for 4.0

2019-09-24 Thread Jon Haddad
I'm working with a team who just ran into CASSANDRA-14683 [1], which I
didn't realize was an issue till now.

Anyone have an interest in fixing full table pagination?  I'm not sure of
the full implications of changing the int to a long in the paging state.

https://issues.apache.org/jira/browse/CASSANDRA-14683


Re: [DISCUSS] C* track for ApacheCon 2020 (was: ApacheCon North America 2020, project participation)

2019-10-01 Thread Jon Haddad
Completely agree, we should absolutely do it.

On Tue, Oct 1, 2019 at 12:27 PM Nate McCall  wrote:

> [Moved to C* only thread]
>
> We got a lot of feedback from folks after this year's track. I'm keen to do
> this again. At this point I think we are just looking at saying "yes we
> would like to participate" and all details about tracks, etc. will be TBD.
>
> Thoughts?
>
> -Nate
>
> -- Forwarded message -
> From: Rich Bowen 
> Date: Wed, Oct 2, 2019 at 5:36 AM
> Subject: ApacheCon North America 2020, project participation
> To: plann...@apachecon.com 
>
>
> Hi, folks,
>
> (Note: You're receiving this email because you're on the dev@ list for
> one or more Apache Software Foundation projects.)
>
> For ApacheCon North America 2019, we asked projects to participate in
> the creation of project/topic specific tracks. This was very successful,
> with about 15 projects stepping up to curate the content for their
> track/summit/event.
>
> We need to know if you're going to do the same for 2020. This informs
> how large a venue we book for the event, how long the event runs, and
> many other considerations.
>
> If you intend to participate again in 2020, we need to hear from you on
> the plann...@apachecon.com mailing list. This is not a firm commitment,
> but we need to know if you're, say, 75% confident that you'll be
> participating.
>
> And, no, we do not have any details at all, but assume that it will be
> in roughly the same calendar space as this year's event, ie, somewhere
> in the August-October timeframe.
>
> Thanks.
>
> --
> Rich Bowen
> VP Conferences
> The Apache Software Foundation
> @apachecon
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>


putting the alphas on the website downloads section

2019-11-04 Thread Jon Haddad
I noticed we don't currently list the alpha in the downloads section.
Anyone object if I add the relevant information after the "Older supported
releases" section in the downloads page?  I'd make it clear that this is
an alpha, non-production release, and we're soliciting feedback.

http://cassandra.apache.org/download/

Jon


Re: putting the alphas on the website downloads section

2019-11-06 Thread Jon Haddad
Seems reasonable.  I can set up that wiki page and update the website at
the end of the week, unless someone else gets to it first.

Maybe I should know this already - is there a nightly build that's already
created we could also point people to?

Jon



On Mon, Nov 4, 2019 at 2:59 PM Michael Shuler 
wrote:

> wiki != project website
>
> I think this sounds completely reasonable for a wiki page, and anyone
> can edit easily. Good suggestion.
>
> Michael
>
> On 11/4/19 3:18 PM, Joshua McKenzie wrote:
> > Is there an opportunity to consider a separate "upcoming release testing"
> > type page with downloads of alpha releases? Sounds like, as per the letter
> > of the law, we couldn't put that on the official project page, but getting
> > something going where we can have project-wide "test out this alpha" or
> > where individual devs could post builds of a feature they're working on
> at
> > similar milestones (alpha, beta, etc) might be helpful in terms of
> getting
> > a healthier dev <-> user feedback cycle going on some things. Maybe a
> wiki
> > page with this type of information?
> >
> > Ultimately I'd like to see a way for us to reduce friction to users
> getting
> > involved in the testing of C* if possible without crossing that line into
> > risking people running alpha code by accident in a production
> environment.
> >
> > On Mon, Nov 4, 2019 at 3:10 PM Michael Shuler 
> > wrote:
> >
> >> I will also add that I did send the user@ list 4.0-alpha release notes,
> >> along with dev@, and also added to the @cassandra tweet last week. I
> >> thought those were acceptable to get a little wider audience, but didn't
> >> want to link from downloads page, since this is explicit.
> >>
> >> Michael
> >>
> >> On 11/4/19 2:06 PM, Michael Shuler wrote:
> >>> -1 (I looked into this when we released 4.0-alpha1)
> >>>
> >>> "During the process of developing software and preparing a release,
> >>> various packages are made available to the developer community for
> >>> testing purposes. Do not include any links on the project website that
> >>> might encourage non-developers to download and use nightly builds,
> >>> snapshots, release candidates, or any other similar package. The only
> >>> people who are supposed to know about such packages are the people
> >>> following the dev list (or searching its archives) and thus aware of
> the
> >>> conditions placed on the package. If you find that the general public
> >>> are downloading such test packages, then remove them."
> >>>
> >>> http://www.apache.org/legal/release-policy.html#what
> >>>
> >>> Michael
> >>>
> >>> On 11/4/19 1:11 PM, Dinesh Joshi wrote:
> >>>> I think this is a good idea. I am +1 on making this more discoverable
> >>>> on our website. Please add instructions to report bugs and give us
> >>>> feedback around it.
> >>>>
> >>>> Dinesh
> >>>>
> >>>>> On Nov 4, 2019, at 10:53 AM, Jon Haddad  wrote:
> >>>>>
> >>>>> I noticed we don't currently list the alpha in the downloads section.
> >>>>> Anyone object if I add the relevant information after the "Older
> >>>>> supported
> >>>>> releases" section in the downloads page?  I'd make it clear that this
> >> is
> >>>>> an alpha, non-production release, and we're soliciting feedback.
> >>>>>
> >>>>> http://cassandra.apache.org/download/
> >>>>>
> >>>>> Jon
> >>>>
> >>>>
> >>>> -
> >>>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >>>> For additional commands, e-mail: dev-h...@cassandra.apache.org
> >>>>
> >>
> >> -
> >> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >> For additional commands, e-mail: dev-h...@cassandra.apache.org
> >>
> >>
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: moving the site from SVN to git

2019-10-31 Thread Jon Haddad
Thanks for the help Michael and Nate.

It took a couple whacks with a hammer, but I've gotten the website to
regenerate, pushing up the newest documentation and finally fixing the
community page to link to the Slack channel as a priority over IRC.

Jon

On Wed, Oct 30, 2019 at 1:52 PM Jon Haddad  wrote:

> Thanks Michael, that's exactly what I needed.
>
> On Wed, Oct 30, 2019 at 1:44 PM Michael Shuler 
> wrote:
>
>> Looks like the source markdown was added in the next commit.
>> http://svn.apache.org/viewvc?view=revision&revision=1857226
>>
>> (which I see in git as commit 157df5cdfb83cb2edd0051002736316f5f470ad9)
>>
>> Michael
>>
>> On 10/30/19 3:29 PM, Nate McCall wrote:
>> > Unfortunately my svn foo is about as atrophied as yours. I followed the
>> > usual steps when publishing as I've done with the other posts, so no
>> idea
>> > what happened? I dont have anything uncommitted locally either.
>> >
>> > Whatever we can do to get it showing until we move off SVN (or just
>> move)
>> > is fine with me.
>> >
>> > On Thu, Oct 31, 2019 at 9:17 AM Jon Haddad  wrote:
>> >
>> >> I figured out what was wrong with the site generation, I've pushed up a
>> >> fix.
>> >>
>> >> I regenerated the site and noticed the blog post for streaming was
>> marked
>> >> as deleted, which was odd.  I dug back through the history and found
>> it an
>> >> HTML file committed at
>> >> http://svn.apache.org/viewvc?view=revision&revision=1857225, but I'm
>> not
>> >> sure where the original content is.  (My SVN foo is terrible now)
>> >>
>> >> Looking at the Git history I see this:
>> >>
>> >> commit f00523b1eefc90b5e4515db0d8c3ab207656684b
>> >> Author: zznate 
>> >> Date:   Wed Apr 10 01:16:07 2019 +
>> >>
>> >>  CASSANDRA-14765 - Streaming performance post by Sumanth Pasupuleti
>> >>
>> >>  git-svn-id:
>> http://svn.apache.org/repos/asf/cassandra/site@1857225
>> >> 13f79535-47bb-0310-9956-ffa450edef68
>> >> <
>> http://svn.apache.org/repos/asf/cassandra/site@1857225 13f79535-47bb-0310-9956-ffa450edef68
>> >
>> >>
>> >> Was this just published directly as HTML?
>> >>
>> >> Nate, do you remember how this was handled?  Am I missing something
>> >> obvious?
>> >>
>> >> Jon
>> >>
>> >> On Tue, Oct 29, 2019 at 1:28 PM Jon Haddad  wrote:
>> >>
>> >>> I'll take a look at the website generation.  Thanks for fixing
>> manually
>> >>> for now.
>> >>>
>> >>> On Tue, Oct 29, 2019 at 1:16 PM Michael Shuler <
>> mich...@pbandjelly.org>
>> >>> wrote:
>> >>>
>> >>>> I have updated the new releases in:
>> >>>> src/_data/releases.yaml
>> >>>>
>> >>>> I ran through the docker build/run, yet the main index and download
>> >>>> pages of the site were not modified with the new release versions and
>> >>>> dates. I'm going to reset --hard and hand edit those pages. #justFYI
>> >>>>
>> >>>> Michael
>> >>>>
>> >>>> On 10/17/19 9:07 AM, Jon Haddad wrote:
>> >>>>> The migration is finished.
>> >>>>>
>> >>>>> I had to fix a few things along the way.  The docker containers
>> didn't
>> >>>>> build correctly (based on debian:latest rather than a fixed tag),
>> and
>> >>>> the
>> >>>>> site had to be served out of the content directory rather than the
>> >>>> publish
>> >>>>> one we were using.
>> >>>>>
>> >>>>> I'm going to address a couple things as follow ups.  We still point
>> >>>> people
>> >>>>> to IRC, I'll update that to slack.  Longer term I'll migrate it to
>> >> Hugo,
>> >>>>> which will make the entire process a lot easier.
>> >>>>>
>> >>>>> Jon
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> On Wed, Oct 9, 2019 at 10:26 PM Jon Haddad 
>> wrote:
>> >>>>>
>> >>>>>> OK, I checked with INFRA on some details and will finish the
>> >> migration

Re: moving the site from SVN to git

2019-10-30 Thread Jon Haddad
I think I figured out what happened... it's regenerating into a different
directory than it did originally.  I'll dig into why.

The site's already moved and publishing via git, I'm just trying to figure
out the last couple wrinkles that I didn't spot in the process.

On Wed, Oct 30, 2019 at 1:30 PM Nate McCall  wrote:

> Unfortunately my svn foo is about as atrophied as yours. I followed the
> usual steps when publishing as I've done with the other posts, so no idea
> what happened? I dont have anything uncommitted locally either.
>
> Whatever we can do to get it showing until we move off SVN (or just move)
> is fine with me.
>
> On Thu, Oct 31, 2019 at 9:17 AM Jon Haddad  wrote:
>
> > I figured out what was wrong with the site generation, I've pushed up a
> > fix.
> >
> > I regenerated the site and noticed the blog post for streaming was marked
> > as deleted, which was odd.  I dug back through the history and found it
> an
> > HTML file committed at
> > http://svn.apache.org/viewvc?view=revision&revision=1857225, but I'm not
> > sure where the original content is.  (My SVN foo is terrible now)
> >
> > Looking at the Git history I see this:
> >
> > commit f00523b1eefc90b5e4515db0d8c3ab207656684b
> > Author: zznate 
> > Date:   Wed Apr 10 01:16:07 2019 +
> >
> > CASSANDRA-14765 - Streaming performance post by Sumanth Pasupuleti
> >
> > git-svn-id: http://svn.apache.org/repos/asf/cassandra/site@1857225
> > 13f79535-47bb-0310-9956-ffa450edef68
> > <
> http://svn.apache.org/repos/asf/cassandra/site@1857225 13f79535-47bb-0310-9956-ffa450edef68
> >
> >
> > Was this just published directly as HTML?
> >
> > Nate, do you remember how this was handled?  Am I missing something
> > obvious?
> >
> > Jon
> >
> > On Tue, Oct 29, 2019 at 1:28 PM Jon Haddad  wrote:
> >
> > > I'll take a look at the website generation.  Thanks for fixing manually
> > > for now.
> > >
> > > On Tue, Oct 29, 2019 at 1:16 PM Michael Shuler  >
> > > wrote:
> > >
> > >> I have updated the new releases in:
> > >>src/_data/releases.yaml
> > >>
> > >> I ran through the docker build/run, yet the main index and download
> > >> pages of the site were not modified with the new release versions and
> > >> dates. I'm going to reset --hard and hand edit those pages. #justFYI
> > >>
> > >> Michael
> > >>
> > >> On 10/17/19 9:07 AM, Jon Haddad wrote:
> > >> > The migration is finished.
> > >> >
> > >> > I had to fix a few things along the way.  The docker containers
> didn't
> > >> > build correctly (based on debian:latest rather than a fixed tag),
> and
> > >> the
> > >> > site had to be served out of the content directory rather than the
> > >> publish
> > >> > one we were using.
> > >> >
> > >> > I'm going to address a couple things as follow ups.  We still point
> > >> people
> > >> > to IRC, I'll update that to slack.  Longer term I'll migrate it to
> > Hugo,
> > >> > which will make the entire process a lot easier.
> > >> >
> > >> > Jon
> > >> >
> > >> >
> > >> >
> > >> > On Wed, Oct 9, 2019 at 10:26 PM Jon Haddad 
> wrote:
> > >> >
> > >> >> OK, I checked with INFRA on some details and will finish the
> > migration
> > >> >> tomorrow.
> > >> >>
> > >> >> On Thu, Oct 3, 2019 at 11:32 AM Jon Haddad 
> > wrote:
> > >> >>
> > >> >>> Awesome, thanks Michael.
> > >> >>>
> > >> >>> We need to do a little bit of additional configuration to have it
> > >> switch
> > >> >>> over.  Specifically, we need to set up the .asf.yaml config to
> tell
> > >> the
> > >> >>> servers how the site should be published.  I can take care of it
> > >> >>> tomorrow.
> > >> >>>
> > >> >>> Reference:
> > >> >>>
> > >>
> >
> https://cwiki.apache.org/confluence/display/INFRA/.asf.yaml+features+for+git+repositories#id-.asf.yamlfeaturesforgitrepositories-Publishingabranchtoyourprojectwebsite
> > >> >>>
> > >> >>> On Thu, Oct 3, 2019 at 10:57 AM Michael Shuler <
> > >> mich...@pbandjelly.org>
> 

Re: moving the site from SVN to git

2019-10-30 Thread Jon Haddad
I figured out what was wrong with the site generation, I've pushed up a fix.

I regenerated the site and noticed the blog post for streaming was marked
as deleted, which was odd.  I dug back through the history and found it an
HTML file committed at
http://svn.apache.org/viewvc?view=revision&revision=1857225, but I'm not
sure where the original content is.  (My SVN foo is terrible now)

Looking at the Git history I see this:

commit f00523b1eefc90b5e4515db0d8c3ab207656684b
Author: zznate 
Date:   Wed Apr 10 01:16:07 2019 +

CASSANDRA-14765 - Streaming performance post by Sumanth Pasupuleti

git-svn-id: http://svn.apache.org/repos/asf/cassandra/site@1857225
13f79535-47bb-0310-9956-ffa450edef68

Was this just published directly as HTML?

Nate, do you remember how this was handled?  Am I missing something obvious?

Jon

On Tue, Oct 29, 2019 at 1:28 PM Jon Haddad  wrote:

> I'll take a look at the website generation.  Thanks for fixing manually
> for now.
>
> On Tue, Oct 29, 2019 at 1:16 PM Michael Shuler 
> wrote:
>
>> I have updated the new releases in:
>>src/_data/releases.yaml
>>
>> I ran through the docker build/run, yet the main index and download
>> pages of the site were not modified with the new release versions and
>> dates. I'm going to reset --hard and hand edit those pages. #justFYI
>>
>> Michael
>>
>> On 10/17/19 9:07 AM, Jon Haddad wrote:
>> > The migration is finished.
>> >
>> > I had to fix a few things along the way.  The docker containers didn't
>> > build correctly (based on debian:latest rather than a fixed tag), and
>> the
>> > site had to be served out of the content directory rather than the
>> publish
>> > one we were using.
>> >
>> > I'm going to address a couple things as follow ups.  We still point
>> people
>> > to IRC, I'll update that to slack.  Longer term I'll migrate it to Hugo,
>> > which will make the entire process a lot easier.
>> >
>> > Jon
>> >
>> >
>> >
>> > On Wed, Oct 9, 2019 at 10:26 PM Jon Haddad  wrote:
>> >
>> >> OK, I checked with INFRA on some details and will finish the migration
>> >> tomorrow.
>> >>
>> >> On Thu, Oct 3, 2019 at 11:32 AM Jon Haddad  wrote:
>> >>
>> >>> Awesome, thanks Michael.
>> >>>
>> >>> We need to do a little bit of additional configuration to have it
>> switch
>> >>> over.  Specifically, we need to set up the .asf.yaml config to tell
>> the
>> >>> servers how the site should be published.  I can take care of it
>> >>> tomorrow.
>> >>>
>> >>> Reference:
>> >>>
>> https://cwiki.apache.org/confluence/display/INFRA/.asf.yaml+features+for+git+repositories#id-.asf.yamlfeaturesforgitrepositories-Publishingabranchtoyourprojectwebsite
>> >>>
>> >>> On Thu, Oct 3, 2019 at 10:57 AM Michael Shuler <
>> mich...@pbandjelly.org>
>> >>> wrote:
>> >>>
>> >>>> committed! :)
>> >>>>
>> >>>> https://gitbox.apache.org/repos/asf?p=cassandra-website.git
>> >>>>
>> >>>> Michael
>> >>>>
>> >>>> On 10/3/19 12:22 PM, Jon Haddad wrote:
>> >>>>> I think we can safely ignore them.  Thanks for figuring this out.
>> >>>>>
>> >>>>> On Thu, Oct 3, 2019 at 10:01 AM Michael Shuler <
>> mich...@pbandjelly.org
>> >>>>>
>> >>>>> wrote:
>> >>>>>
>> >>>>>> I'm making progress through many periodic timeouts to svn.a.o and
>> >>>>>> restarts, but it appears that svn2git is smart enough to pick up
>> where
>> >>>>>> it left off. The first commit captured at the svn path I'm
>> specifying
>> >>>> is
>> >>>>>> when Cassandra was moved to a top level project at r922689
>> >>>> (2010-03-13).
>> >>>>>> I don't know the old incubator path, and it's probably OK to ignore
>> >>>> the
>> >>>>>> few older incubator commits? I imagine it would mean starting over
>> the
>> >>>>>> entire import to pull in those older incubator svn commits, then
>> >>>>>> changing the url and somehow importing the newer path on top?
>> >>>>>>
>> >>>>>> I tried using a local path as the source to try to speed things up

Re: moving the site from SVN to git

2019-10-30 Thread Jon Haddad
Thanks Michael, that's exactly what I needed.

On Wed, Oct 30, 2019 at 1:44 PM Michael Shuler 
wrote:

> Looks like the source markdown was added in the next commit.
> http://svn.apache.org/viewvc?view=revision&revision=1857226
>
> (which I see in git as commit 157df5cdfb83cb2edd0051002736316f5f470ad9)
>
> Michael
>
> On 10/30/19 3:29 PM, Nate McCall wrote:
> > Unfortunately my svn foo is about as atrophied as yours. I followed the
> > usual steps when publishing as I've done with the other posts, so no idea
> > what happened? I dont have anything uncommitted locally either.
> >
> > Whatever we can do to get it showing until we move off SVN (or just move)
> > is fine with me.
> >
> > On Thu, Oct 31, 2019 at 9:17 AM Jon Haddad  wrote:
> >
> >> I figured out what was wrong with the site generation, I've pushed up a
> >> fix.
> >>
> >> I regenerated the site and noticed the blog post for streaming was
> marked
> >> as deleted, which was odd.  I dug back through the history and found it
> an
> >> HTML file committed at
> >> http://svn.apache.org/viewvc?view=revision&revision=1857225, but I'm
> not
> >> sure where the original content is.  (My SVN foo is terrible now)
> >>
> >> Looking at the Git history I see this:
> >>
> >> commit f00523b1eefc90b5e4515db0d8c3ab207656684b
> >> Author: zznate 
> >> Date:   Wed Apr 10 01:16:07 2019 +
> >>
> >>  CASSANDRA-14765 - Streaming performance post by Sumanth Pasupuleti
> >>
> >>  git-svn-id: http://svn.apache.org/repos/asf/cassandra/site@1857225
> >> 13f79535-47bb-0310-9956-ffa450edef68
> >> <
> http://svn.apache.org/repos/asf/cassandra/site@1857225 13f79535-47bb-0310-9956-ffa450edef68
> >
> >>
> >> Was this just published directly as HTML?
> >>
> >> Nate, do you remember how this was handled?  Am I missing something
> >> obvious?
> >>
> >> Jon
> >>
> >> On Tue, Oct 29, 2019 at 1:28 PM Jon Haddad  wrote:
> >>
> >>> I'll take a look at the website generation.  Thanks for fixing manually
> >>> for now.
> >>>
> >>> On Tue, Oct 29, 2019 at 1:16 PM Michael Shuler  >
> >>> wrote:
> >>>
> >>>> I have updated the new releases in:
> >>>> src/_data/releases.yaml
> >>>>
> >>>> I ran through the docker build/run, yet the main index and download
> >>>> pages of the site were not modified with the new release versions and
> >>>> dates. I'm going to reset --hard and hand edit those pages. #justFYI
> >>>>
> >>>> Michael
> >>>>
> >>>> On 10/17/19 9:07 AM, Jon Haddad wrote:
> >>>>> The migration is finished.
> >>>>>
> >>>>> I had to fix a few things along the way.  The docker containers
> didn't
> >>>>> build correctly (based on debian:latest rather than a fixed tag), and
> >>>> the
> >>>>> site had to be served out of the content directory rather than the
> >>>> publish
> >>>>> one we were using.
> >>>>>
> >>>>> I'm going to address a couple things as follow ups.  We still point
> >>>> people
> >>>>> to IRC, I'll update that to slack.  Longer term I'll migrate it to
> >> Hugo,
> >>>>> which will make the entire process a lot easier.
> >>>>>
> >>>>> Jon
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Wed, Oct 9, 2019 at 10:26 PM Jon Haddad 
> wrote:
> >>>>>
> >>>>>> OK, I checked with INFRA on some details and will finish the
> >> migration
> >>>>>> tomorrow.
> >>>>>>
> >>>>>> On Thu, Oct 3, 2019 at 11:32 AM Jon Haddad 
> >> wrote:
> >>>>>>
> >>>>>>> Awesome, thanks Michael.
> >>>>>>>
> >>>>>>> We need to do a little bit of additional configuration to have it
> >>>> switch
> >>>>>>> over.  Specifically, we need to set up the .asf.yaml config to tell
> >>>> the
> >>>>>>> servers how the site should be published.  I can take care of it
> >>>>>>> tomorrow.
> >>>>>>>
> >>>>>>> Reference:
> >>>>>>>
> 

Re: Improving our frequency of (patch) releases, and letting committers make releases

2019-10-18 Thread Jon Haddad
I agree with Mick.  Those tickets have been open for a while, and I think
we're unlikely to resolve them in the next few days.

Since we're updating it anyways, I'll put mine in there too. Might as well
triple the number of active community members doing releases.
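
For anyone else adding theirs: the header of the KEYS file itself documents
how to append a key, roughly like this (<your-key-id> is a placeholder):

    (gpg --list-sigs <your-key-id> && gpg --armor --export <your-key-id>) >> KEYS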

On Fri, Oct 18, 2019 at 1:30 AM Mick Semb Wever  wrote:

>
> > I believe some basic distribution changes would greatly help the entire
> > question here, including making release builds easier for other people
> > to perform, shortening the cycle times as desired. If there is some
> > interest in building releases, I would like some help solving the
> > problems that exist and have been in JIRA for quite some time.
>
>
> Or we can just say that committers can make releases, and the KEYS file
> can change.
> These tickets can be improvements later, rather than blockers now.
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: moving the site from SVN to git

2019-10-29 Thread Jon Haddad
I'll take a look at the website generation.  Thanks for fixing manually for
now.

On Tue, Oct 29, 2019 at 1:16 PM Michael Shuler 
wrote:

> I have updated the new releases in:
>src/_data/releases.yaml
>
> I ran through the docker build/run, yet the main index and download
> pages of the site were not modified with the new release versions and
> dates. I'm going to reset --hard and hand edit those pages. #justFYI
>
> Michael
>
> On 10/17/19 9:07 AM, Jon Haddad wrote:
> > The migration is finished.
> >
> > I had to fix a few things along the way.  The docker containers didn't
> > build correctly (based on debian:latest rather than a fixed tag), and the
> > site had to be served out of the content directory rather than the
> publish
> > one we were using.
> >
> > I'm going to address a couple things as follow ups.  We still point
> people
> > to IRC, I'll update that to slack.  Longer term I'll migrate it to Hugo,
> > which will make the entire process a lot easier.
> >
> > Jon
> >
> >
> >
> > On Wed, Oct 9, 2019 at 10:26 PM Jon Haddad  wrote:
> >
> >> OK, I checked with INFRA on some details and will finish the migration
> >> tomorrow.
> >>
> >> On Thu, Oct 3, 2019 at 11:32 AM Jon Haddad  wrote:
> >>
> >>> Awesome, thanks Michael.
> >>>
> >>> We need to do a little bit of additional configuration to have it
> switch
> >>> over.  Specifically, we need to set up the .asf.yaml config to tell the
> >>> servers how the site should be published.  I can take care of it
> >>> tomorrow.
> >>>
> >>> Reference:
> >>>
> https://cwiki.apache.org/confluence/display/INFRA/.asf.yaml+features+for+git+repositories#id-.asf.yamlfeaturesforgitrepositories-Publishingabranchtoyourprojectwebsite
> >>>
> >>> On Thu, Oct 3, 2019 at 10:57 AM Michael Shuler  >
> >>> wrote:
> >>>
> >>>> committed! :)
> >>>>
> >>>> https://gitbox.apache.org/repos/asf?p=cassandra-website.git
> >>>>
> >>>> Michael
> >>>>
> >>>> On 10/3/19 12:22 PM, Jon Haddad wrote:
> >>>>> I think we can safely ignore them.  Thanks for figuring this out.
> >>>>>
> >>>>> On Thu, Oct 3, 2019 at 10:01 AM Michael Shuler <
> mich...@pbandjelly.org
> >>>>>
> >>>>> wrote:
> >>>>>
> >>>>>> I'm making progress through many periodic timeouts to svn.a.o and
> >>>>>> restarts, but it appears that svn2git is smart enough to pick up
> where
> >>>>>> it left off. The first commit captured at the svn path I'm
> specifying
> >>>> is
> >>>>>> when Cassandra was moved to a top level project at r922689
> >>>> (2010-03-13).
> >>>>>> I don't know the old incubator path, and it's probably OK to ignore
> >>>> the
> >>>>>> few older incubator commits? I imagine it would mean starting over
> the
> >>>>>> entire import to pull in those older incubator svn commits, then
> >>>>>> changing the url and somehow importing the newer path on top?
> >>>>>>
> >>>>>> I tried using a local path as the source to try to speed things up,
> >>>>>> after I got my first few timeouts, but that fails.
> >>>>>>
> >>>>>> Curious if anyone really cares if we lose a few early commits - if
> >>>> so, I
> >>>>>> can try to figure out the old path and start again.
> >>>>>>
> >>>>>> Michael
> >>>>>>
> >>>>>> On 10/3/19 11:14 AM, Jon Haddad wrote:
> >>>>>>> Thanks for taking a look, Michael.  Hopefully you have better luck
> >>>> than
> >>>>>> me
> >>>>>>> :)
> >>>>>>>
> >>>>>>> On Thu, Oct 3, 2019 at 6:42 AM Michael Shuler <
> >>>> mich...@pbandjelly.org>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> I cloned the empty cassandra-website git repo, and I'm running:
> >>>>>>>>
> >>>>>>>> svn2git http://svn.apache.org/repos/asf/cassandra/site
> >>>> --rootistrunk
> >>>>>>>> --no-minimize-url
> >>>

Re: moving the site from SVN to git

2019-10-17 Thread Jon Haddad
The migration is finished.

I had to fix a few things along the way.  The docker containers didn't
build correctly (based on debian:latest rather than a fixed tag), and the
site had to be served out of the content directory rather than the publish
one we were using.
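
For anyone hitting the same build failure, the fix amounts to pinning the base
image to a fixed tag, along these lines (the tag itself is illustrative; the
repo carries the real one):

    # Dockerfile: pin the base image instead of tracking debian:latest
    FROM debian:stretch-slim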

I'm going to address a couple things as follow ups.  We still point people
to IRC, I'll update that to slack.  Longer term I'll migrate it to Hugo,
which will make the entire process a lot easier.

Jon



On Wed, Oct 9, 2019 at 10:26 PM Jon Haddad  wrote:

> OK, I checked with INFRA on some details and will finish the migration
> tomorrow.
>
> On Thu, Oct 3, 2019 at 11:32 AM Jon Haddad  wrote:
>
>> Awesome, thanks Michael.
>>
>> We need to do a little bit of additional configuration to have it switch
>> over.  Specifically, we need to set up the .asf.yaml config to tell the
>> servers how the site should be published.  I can take care of it
>> tomorrow.
>>
>> Reference:
>> https://cwiki.apache.org/confluence/display/INFRA/.asf.yaml+features+for+git+repositories#id-.asf.yamlfeaturesforgitrepositories-Publishingabranchtoyourprojectwebsite
>>
>> On Thu, Oct 3, 2019 at 10:57 AM Michael Shuler 
>> wrote:
>>
>>> committed! :)
>>>
>>> https://gitbox.apache.org/repos/asf?p=cassandra-website.git
>>>
>>> Michael
>>>
>>> On 10/3/19 12:22 PM, Jon Haddad wrote:
>>> > I think we can safely ignore them.  Thanks for figuring this out.
>>> >
>>> > On Thu, Oct 3, 2019 at 10:01 AM Michael Shuler >> >
>>> > wrote:
>>> >
>>> >> I'm making progress through many periodic timeouts to svn.a.o and
>>> >> restarts, but it appears that svn2git is smart enough to pick up where
>>> >> it left off. The first commit captured at the svn path I'm specifying
>>> is
>>> >> when Cassandra was moved to a top level project at r922689
>>> (2010-03-13).
>>> >> I don't know the old incubator path, and it's probably OK to ignore
>>> the
>>> >> few older incubator commits? I imagine it would mean starting over the
>>> >> entire import to pull in those older incubator svn commits, then
>>> >> changing the url and somehow importing the newer path on top?
>>> >>
>>> >> I tried using a local path as the source to try to speed things up,
>>> >> after I got my first few timeouts, but that fails.
>>> >>
>>> >> Curious if anyone really cares if we lose a few early commits - if
>>> so, I
>>> >> can try to figure out the old path and start again.
>>> >>
>>> >> Michael
>>> >>
>>> >> On 10/3/19 11:14 AM, Jon Haddad wrote:
>>> >>> Thanks for taking a look, Michael.  Hopefully you have better luck
>>> than
>>> >> me
>>> >>> :)
>>> >>>
>>> >>> On Thu, Oct 3, 2019 at 6:42 AM Michael Shuler <
>>> mich...@pbandjelly.org>
>>> >>> wrote:
>>> >>>
>>> >>>> I cloned the empty cassandra-website git repo, and I'm running:
>>> >>>>
>>> >>>> svn2git http://svn.apache.org/repos/asf/cassandra/site
>>> --rootistrunk
>>> >>>> --no-minimize-url
>>> >>>>
>>> >>>> ..to see what I get. An svn checkout of the above url says it's only
>>> >>>> 69M, so I suppose it's pulling all history of all time for all
>>> projects?
>>> >>>>
>>> >>>> I'll let this roll for a while I run an errand and report back!
>>> >>>>
>>> >>>> Michael
>>> >>>>
>>> >>>> On 10/2/19 9:30 PM, Jon Haddad wrote:
>>> >>>>> Daniel referred me to the GitBox self service system.
>>> >>>>>
>>> >>>>> I've attempted to port the site over using the tool Michael
>>> suggested,
>>> >>>> but
>>> >>>>> after a couple hours it died with this message:
>>> >>>>>
>>> >>>>> command failed: r922600
>>> >>>>> git checkout -f master
>>> >>>>>
>>> >>>>> If either of you (Mick or Michael) want to give svn2git a shot
>>> maybe
>>> >>>> you'll
>>> >>>>> get a different result.  I think it may have been due to the
>>>

Re: Improving our frequency of (patch) releases, and letting committers make releases

2019-10-17 Thread Jon Haddad
Mick's absolutely right.  Even if we had been doing more frequent releases,
it's a big risk for us to only have one person able to do it in the first
place.

I also agree with Benedict. I don't think we need to be crazy strict about
windows.  As long as we tell folks they may need to import keys, I think
we're much better off than we are now.
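
For reference, re-importing is a one-liner on the debian side, something like
this (KEYS URL as documented on the download page at the time; adjust if it
moves):

    curl https://www.apache.org/dist/cassandra/KEYS | sudo apt-key add -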

On Thu, Oct 17, 2019 at 5:08 AM Benedict Elliott Smith 
wrote:

> +1
>
> We need to do something about this, and I don't mind what.  It would be
> better if we cut fix releases immediately after a critical fix lands, much
> as we used to.  We've got fairly lazy about producing releases, perhaps
> because many of the full-time contributors now work for organisations that
> don't really need them.
>
> We should definitely add any willing volunteers to the KEYS file now.  I
> don't personally think we need any kind of a strict policy about modifying
> it in future, except that if our release cadence drops and we have willing
> volunteers we should do it again.
>
>
>  On 17/10/2019, 07:50, "Mick Semb Wever"  wrote:
>
>
> > We're still in the position where only four people in the project:
> > Michael, Sylvain, Jake, and Eric; can cut, stage, and propose a
> > release.
>
>
> Our current patch release frequency is lacking. It's been 8 months
> since 3.11.4.
> This is having an impact on users and their faith in the technology.
>
> There is currently only one person in the community that is actively
> making releases. This really doesn't inspire confidence. We really should
> be cutting a patch release every 2 to 6 weeks, if critical fixes apply,
> imho. And I for one would certainly like to be helping out with this
> situation.
>
> If we choose to address this issue, there are two facets to it that
> come to mind:
>   1) This misnomer that committers can't cut and publish releases.
>   2) That we can't make changes to the KEYS file (or that it's too
> painful to do so).
>
> Re (1). I'm not sure where this misunderstanding came from in our
> community. But the ASF policy does not prevent committers from being the
> release manager. By default the only thing in the process a committer can't
> do is publish the successfully voted upon release from stage to public.
> This is a one-line svn command and the last and very small action in the
> release process at large. A committer can coordinate, cut, stage, announce,
> and initiate the vote on a release, which is the bulk of the work. And the
> committer can also do the final publish command if the PMC has voted that
> this is ok in this community. This is all in
> http://www.apache.org/legal/release-policy.html
>
>
> > Is it time to add more people to our KEYS file?
> > This will have the consequence that debian users will have to re-add
> it.
>
>
> Re (2), the problem is that changes to the KEYS file mean that debian
> users have to re-import it before pulling new packages. But is that really
> worse than an 8 month (or longer) wait for an earlier patch version like ".5"?
>
>
> > But maybe we can accept a window, from now until the first 4.0 rc,
> > where all committers are free to open a PR to add themselves to the
> > KEYS file?
>
>
> I can think of a number of ways forward on this.
>   a) remove the constraint that we can't update the KEYS file, asking
> debian users to be prepared to have to re-import it.
>   b) open a one-off window where we get as many PMC and Committers as
> possible to add their gpg key to the KEYS file.
>   c) open a regular window each year, eg last quarter, where we
> collect new keys to add from new PMC and Committers, and merge them all in,
> in one go, at the end of that window.
>
> It would be great to be in a better place than the current situation
> where we have only four keys in the file :-(
>
> regards,
> Mick
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>
>
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: [VOTE] Release Apache Cassandra 4.0-alpha2

2019-10-25 Thread Jon Haddad
+1

On Fri, Oct 25, 2019 at 10:18 AM Sam Tunnicliffe  wrote:

> +1
>
> > On 24 Oct 2019, at 18:26, Michael Shuler  wrote:
> >
> > I propose the following artifacts for release as 4.0-alpha2.
> >
> > sha1: ca928a49c68186bdcd57dea8b10c30991c6a3c55
> > Git:
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/4.0-alpha2-tentative
> > Artifacts:
> https://repository.apache.org/content/repositories/orgapachecassandra-1185/org/apache/cassandra/apache-cassandra/4.0-alpha2/
> > Staging repository:
> https://repository.apache.org/content/repositories/orgapachecassandra-1185/
> >
> > The Debian and RPM packages are available here:
> http://people.apache.org/~mshuler
> >
> > The vote will be open for 72 hours (longer if needed).
> >
> > [1]: CHANGES.txt:
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/4.0-alpha2-tentative
> > [2]: NEWS.txt:
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/4.0-alpha2-tentative
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


moving the site from SVN to git

2019-09-24 Thread Jon Haddad
While at apachecon I spoke with the folks at the INFRA desk to find out
about moving the website from SVN to Git, and it's pretty straightforward.
Here's the process:

1. Create a new Git repo for the site
2. Copy the files over.  I don't think we care about history.  Unless
someone speaks up, I'll just copy over what's current.
3. File a ticket and point them to the new repo

Are there any dependencies currently pointing to the SVN repo that we'll
need to update that would block me from taking care of this?

Jon


Re: moving the site from SVN to git

2019-09-24 Thread Jon Haddad
Personally, no, I don't.  What I need to know is if someone who actually
works on the site needs the history in *git*.  It would still be archived
in SVN.

On Tue, Sep 24, 2019 at 2:12 PM Joshua Drake  wrote:

> On Tue, Sep 24, 2019 at 1:44 PM Jon Haddad  wrote:
>
> > While at apachecon I spoke with the folks at the INFRA desk to find out
> > about moving the website from SVN to Git, and it's pretty
> straightforward.
> > Here's the process:
> >
> > 1. Create a new Git repo for the site
> > 2. Copy the files over.  I don't think we care about history.  Unless
> > someone speaks up, I'll just copy over what's current.
> >
>
> You care. Use a migration tool.
>
> JD
>


Re: fixing paging state for 4.0

2019-09-24 Thread Jon Haddad
What's the pain point?  Is it because of mixed version clusters or is there
something else that makes it a problem?

On Tue, Sep 24, 2019 at 11:03 AM Blake Eggleston
 wrote:

> Changing paging state format is kind of a pain since the driver treats it
> as an opaque blob. I'd prefer we went with Sylvain's suggestion to just
> interpret Integer.MAX_VALUE as "no limit", which would be a lot simpler to
> implement.
>
> > On Sep 24, 2019, at 10:44 AM, Jon Haddad  wrote:
> >
> > I'm working with a team who just ran into CASSANDRA-14683 [1], which I
> > didn't realize was an issue till now.
> >
> > Anyone have an interest in fixing full table pagination?  I'm not sure of
> > the full implications of changing the int to a long in the paging state.
> >
> > https://issues.apache.org/jira/browse/CASSANDRA-14683
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: fixing paging state for 4.0

2019-09-24 Thread Jon Haddad
The problem is that the payload isn't versioned, because the individual
fields aren't really part of the protocol.  I think the long term fix
should be to add the fields of the paging state to the protocol itself
rather than have it just be some serialized blob.  Then we don't have to
deal with separately versioning the paging state.

I think recognizing max int as a special number that just means "a lot" is a
reasonable approach for now, till we have time to rework it.
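
Roughly the shape I have in mind, strictly as a sketch (the accessor and the
DataLimits wiring below are illustrative assumptions, not the actual read path):

    // Sketch: treat a client-requested limit of Integer.MAX_VALUE as "no limit",
    // so the remaining-rows counter carried in the paging state can't overflow.
    int requested = options.getLimit();               // hypothetical accessor
    DataLimits limits = requested == Integer.MAX_VALUE
                      ? DataLimits.NONE               // unlimited
                      : DataLimits.cqlLimits(requested);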

Jon

On Tue, Sep 24, 2019 at 6:52 PM J. D. Jordan 
wrote:

> Are there drivers that try to do mixed protocol version connections?  If
> so, that would be a mistake on the driver's part if it sent the new paging
> state to an old server.  Pretty easily protected against in said driver
> when it implements support for the new protocol version.  The payload is
> opaque, but that doesn’t mean a driver would send the new payload to an old
> server.
>
> Many of the drivers I have looked at don’t do mixed version connections.
> If they start at a higher version they will not connect to older nodes that
> don’t support it. Or they will connect to the newer nodes with the older
> protocol version. In either of those cases there is no problem.
>
> Protocol changes aside, I would suggest fixing the bug starting back on
> 3.x by changing the meaning of MAX. Whether or not the limit is switched to
> a var int in a bumped protocol version.
>
> -Jeremiah
>
>
> > On Sep 24, 2019, at 8:28 PM, Blake Eggleston
>  wrote:
> >
> > Right, that's the problem with changing the paging state format. It
> doesn't work in mixed mode.
> >
> >> On Sep 24, 2019, at 4:47 PM, Jeremiah Jordan 
> wrote:
> >>
> >> Clients do negotiate the protocol version they use when connecting. If
> the server bumped the protocol version then this larger paging state could
> be part of the new protocol version. But that doesn’t solve the problem for
> existing versions.
> >>
> >> The special treatment of Integer.MAX_VALUE can be done back to 3.x and
> fix the bug in all versions, letting users requests to receive all of their
> data.  Which realistically is probably what someone who sets the protocol
> level query limit to Integer.MAX_VALUE is trying to do.
> >>
> >> -Jeremiah
> >>
> >>>> On Sep 24, 2019, at 4:09 PM, Blake Eggleston
>  wrote:
> >>>
> >>> Right, mixed version clusters. The opaque blob isn't versioned, and
> there isn't an opportunity for min version negotiation that you have with
> the messaging service. The result is situations where a client begins a
> read on one node, and attempts to read the next page from a different node
> over a protocol version where the paging state serialization format has
> changed. This causes an exception deserializing the paging state and the
> read fails.
> >>>
> >>> There are ways around this, but they're not comprehensive (I think),
> and they're much more involved than just interpreting Integer.MAX_VALUE as
> unlimited. The "right" solution would be for the paging state to be
> deserialized/serialized on the client side, but that won't happen in 4.0.
> >>>
> >>>> On Sep 24, 2019, at 1:12 PM, Jon Haddad  wrote:
> >>>>
> >>>> What's the pain point?  Is it because of mixed version clusters or is
> there
> >>>> something else that makes it a problem?
> >>>>
> >>>>> On Tue, Sep 24, 2019 at 11:03 AM Blake Eggleston
> >>>>>  wrote:
> >>>>>
> >>>>> Changing paging state format is kind of a pain since the driver
> treats it
> >>>>> as an opaque blob. I'd prefer we went with Sylvain's suggestion to
> just
> >>>>> interpret Integer.MAX_VALUE as "no limit", which would be a lot
> simpler to
> >>>>> implement.
> >>>>>
> >>>>>> On Sep 24, 2019, at 10:44 AM, Jon Haddad  wrote:
> >>>>>>
> >>>>>> I'm working with a team who just ran into CASSANDRA-14683 [1],
> which I
> >>>>>> didn't realize was an issue till now.
> >>>>>>
> >>>>>> Anyone have an interest in fixing full table pagination?  I'm not
> sure of
> >>>>>> the full implications of changing the int to a long in the paging
> state.
> >>>>>>
> >>>>>>
> https://issues.apache.org/jira/browse/CASSANDRA-14683

Re: "4.0: TBD" -> "4.0: Est. Q4 2019"?

2019-09-25 Thread Jon Haddad
> > > > >>>>> completeness, deprecation, and backwards compatibility.
> > > Establishing
> > > > a
> > > > >>>>> higher standard for official project releases (even at the
> alpha
> > > and
> > > > >>> beta
> > > > >>>>> stage) will help us really polish the final build together.
> > > > >>>>>
> > > > >>>>> Ideally, I feel that contributors should have completed
> extensive
> > > > >>>>> testing/validation to ensure that no critical or severe bugs
> > exist
> > > > >>> prior
> > > > >>>> to
> > > > >>>>> the release of an alpha (e.g., data loss, consistency
> violations,
> > > > >>>> incorrect
> > > > >>>>> responses to queries, etc). Perhaps we can add a line to this
> > > effect.
> > > > >>>>>
> > > > >>>>> Ensuring that we've met that bar prior to alpha will help us
> > focus
> > > > the
> > > > >>>>> final stages of the release on gathering feedback from users +
> > > > >>> developers
> > > > >>>>> to validate tooling and automation; compatibility with less
> > > > >>> commonly-used
> > > > >>>>> client libraries, testing new features, evaluating performance
> > and
> > > > >>>>> stability under their workloads, etc.
> > > > >>>>>
> > > > >>>>> – Scott
> > > > >>>>>
> > > > >>>>> On 6/11/19, 6:45 AM, "Sumanth Pasupuleti" <
> > > > >>>>> sumanth.pasupuleti...@gmail.com> wrote:
> > > > >>>>>
> > > > >>>>>Thanks for the feedback on the product stages/ release life
> > > cycle
> > > > >>>>> document.
> > > > >>>>>I have incorporated the suggestions and looking for any
> > > additional
> > > > >>>>> feedback
> > > > >>>>>folks may have.
> > > > >>>>>
> > > > >>>>>
> > > > >>>>
> > > > >>>
> > > >
> > >
> >
> https://docs.google.com/document/d/1bS6sr-HSrHFjZb0welife6Qx7u3ZDgRiAoENMLYlfz8/edit#
> > > > >>>>>
> > > > >>>>>Thanks,
> > > > >>>>>Sumanth
> > > > >>>>>
> > > > >>>>>On Tue, May 28, 2019 at 10:43 PM Scott Andreas <
> > > > >>> sc...@paradoxica.net
> > > > >>>>>
> > > > >>>>> wrote:
> > > > >>>>>
> > > > >>>>>> Echoing Jon’s point here –
> > > > >>>>>>
> > > > >>>>>> JH: “My thinking is I'd like to be able to recommend 4.0.0 as
> a
> > > > >>>>> production
> > > > >>>>>> ready
> > > > >>>>>> database for business critical cases”
> > > > >>>>>>
> > > > >>>>>> I feel that this is a standard that is both appropriate and
> > > > >>>>> achievable,
> > > > >>>>>> and one I’m legitimately excited about.
> > > > >>>>>>
> > > > >>>>>> Re: the current state of the test plan wiki in Confluence, I
> owe
> > > > >>>>> another
> > > > >>>>>> pass through. There has been a lot of progress here, but I’ve
> > > > >>> let
> > > > >>>>> perfect
> > > > >>>>>> be the enemy of the good in getting updates out. I’ll complete
> > > > >>> that
> > > > >>>>> pass
> > > > >>>>>> later this week.
> > > > >>>>>>
> > > > >>>>>> Cheers,
> > > > >>>>>>
> > > > >>>>>> — Scott
> > > > >>>>>>
> > > > >>>>>>> On May 28, 2019, at 10:48 AM, Dinesh Joshi <
> djo...@apache.org
> > > > >>>>
> > > > >>>>>

Re: moving the site from SVN to git

2019-10-03 Thread Jon Haddad
Awesome, thanks Michael.

We need to do a little bit of additional configuration to have it switch
over.  Specifically, we need to set up the .asf.yaml config to tell the
servers how the site should be published.  I can take care of it tomorrow.

Reference:
https://cwiki.apache.org/confluence/display/INFRA/.asf.yaml+features+for+git+repositories#id-.asf.yamlfeaturesforgitrepositories-Publishingabranchtoyourprojectwebsite
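
The stanza itself is tiny; a sketch, with the branch and subdir names as
assumptions (the INFRA page above has the authoritative options):

    # .asf.yaml
    publish:
      whoami: trunk     # branch to publish the site from
      subdir: content   # serve the generated site out of ./content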

On Thu, Oct 3, 2019 at 10:57 AM Michael Shuler 
wrote:

> committed! :)
>
> https://gitbox.apache.org/repos/asf?p=cassandra-website.git
>
> Michael
>
> On 10/3/19 12:22 PM, Jon Haddad wrote:
> > I think we can safely ignore them.  Thanks for figuring this out.
> >
> > On Thu, Oct 3, 2019 at 10:01 AM Michael Shuler 
> > wrote:
> >
> >> I'm making progress through many periodic timeouts to svn.a.o and
> >> restarts, but it appears that svn2git is smart enough to pick up where
> >> it left off. The first commit captured at the svn path I'm specifying is
> >> when Cassandra was moved to a top level project at r922689 (2010-03-13).
> >> I don't know the old incubator path, and it's probably OK to ignore the
> >> few older incubator commits? I imagine it would mean starting over the
> >> entire import to pull in those older incubator svn commits, then
> >> changing the url and somehow importing the newer path on top?
> >>
> >> I tried using a local path as the source to try to speed things up,
> >> after I got my first few timeouts, but that fails.
> >>
> >> Curious if anyone really cares if we lose a few early commits - if so, I
> >> can try to figure out the old path and start again.
> >>
> >> Michael
> >>
> >> On 10/3/19 11:14 AM, Jon Haddad wrote:
> >>> Thanks for taking a look, Michael.  Hopefully you have better luck than
> >> me
> >>> :)
> >>>
> >>> On Thu, Oct 3, 2019 at 6:42 AM Michael Shuler 
> >>> wrote:
> >>>
> >>>> I cloned the empty cassandra-website git repo, and I'm running:
> >>>>
> >>>> svn2git http://svn.apache.org/repos/asf/cassandra/site --rootistrunk
> >>>> --no-minimize-url
> >>>>
> >>>> ..to see what I get. An svn checkout of the above url says it's only
> >>>> 69M, so I suppose it's pulling all history of all time for all
> projects?
> >>>>
> >>>> I'll let this roll for a while I run an errand and report back!
> >>>>
> >>>> Michael
> >>>>
> >>>> On 10/2/19 9:30 PM, Jon Haddad wrote:
> >>>>> Daniel referred me to the GitBox self service system.
> >>>>>
> >>>>> I've attempted to port the site over using the tool Michael
> suggested,
> >>>> but
> >>>>> after a couple hours it died with this message:
> >>>>>
> >>>>> command failed: r922600
> >>>>> git checkout -f master
> >>>>>
> >>>>> If either of you (Mick or Michael) want to give svn2git a shot maybe
> >>>> you'll
> >>>>> get a different result.  I think it may have been due to the large
> >> size
> >>>>> of the repo and the small drive on the VM I was using.  I can try it
> >>>> again
> >>>>> tomorrow with more storage to see if I get a better result.  Mick if
> >> you
> >>>>> want to give it a shot in the meantime that would be appreciated.
> >>>>>
> >>>>> Jon
> >>>>>
> >>>>> On Wed, Oct 2, 2019 at 3:18 PM Jon Haddad  wrote:
> >>>>>
> >>>>>> I created an INFRA ticket here:
> >>>>>> https://issues.apache.org/jira/browse/INFRA-19218.
> >>>>>>
> >>>>>> On Wed, Sep 25, 2019 at 6:04 AM Michael Shuler <
> >> mich...@pbandjelly.org>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> I see no good reason to trash history. There are tools to make
> moving
> >>>>>>> from svn to git (hopefully) painless. We used git-svn for the main
> c*
> >>>>>>> source to retain history of both, which this tool uses to do
> >> migrations
> >>>>>>> - https://github.com/nirvdrum/svn2git
> >>>>>>>
> >>>>>>> Mich

Can we kick off a release?

2019-10-07 Thread Jon Haddad
Michael,

Would you mind kicking off builds and starting a vote thread for the latest
2.2, 3.0 and 3.11 builds?

Much appreciated,
Jon


Re: moving the site from SVN to git

2019-10-09 Thread Jon Haddad
OK, I checked with INFRA on some details and will finish the migration
tomorrow.

On Thu, Oct 3, 2019 at 11:32 AM Jon Haddad  wrote:

> Awesome, thanks Michael.
>
> We need to do a little bit of additional configuration to have it switch
> over.  Specifically, we need to set up the .asf.yaml config to tell the
> servers how the site should be published.  I can take care of it
> tomorrow.
>
> Reference:
> https://cwiki.apache.org/confluence/display/INFRA/.asf.yaml+features+for+git+repositories#id-.asf.yamlfeaturesforgitrepositories-Publishingabranchtoyourprojectwebsite
>
> On Thu, Oct 3, 2019 at 10:57 AM Michael Shuler 
> wrote:
>
>> committed! :)
>>
>> https://gitbox.apache.org/repos/asf?p=cassandra-website.git
>>
>> Michael
>>
>> On 10/3/19 12:22 PM, Jon Haddad wrote:
>> > I think we can safely ignore them.  Thanks for figuring this out.
>> >
>> > On Thu, Oct 3, 2019 at 10:01 AM Michael Shuler 
>> > wrote:
>> >
>> >> I'm making progress through many periodic timeouts to svn.a.o and
>> >> restarts, but it appears that svn2git is smart enough to pick up where
>> >> it left off. The first commit captured at the svn path I'm specifying
>> is
>> >> when Cassandra was moved to a top level project at r922689
>> (2010-03-13).
>> >> I don't know the old incubator path, and it's probably OK to ignore the
>> >> few older incubator commits? I imagine it would mean starting over the
>> >> entire import to pull in those older incubator svn commits, then
>> >> changing the url and somehow importing the newer path on top?
>> >>
>> >> I tried using a local path as the source to try to speed things up,
>> >> after I got my first few timeouts, but that fails.
>> >>
>> >> Curious if anyone really cares if we lose a few early commits - if so,
>> I
>> >> can try to figure out the old path and start again.
>> >>
>> >> Michael
>> >>
>> >> On 10/3/19 11:14 AM, Jon Haddad wrote:
>> >>> Thanks for taking a look, Michael.  Hopefully you have better luck
>> than
>> >> me
>> >>> :)
>> >>>
>> >>> On Thu, Oct 3, 2019 at 6:42 AM Michael Shuler
>> >>> wrote:
>> >>>
>> >>>> I cloned the empty cassandra-website git repo, and I'm running:
>> >>>>
>> >>>> svn2git http://svn.apache.org/repos/asf/cassandra/site --rootistrunk
>> >>>> --no-minimize-url
>> >>>>
>> >>>> ..to see what I get. An svn checkout of the above url says it's only
>> >>>> 69M, so I suppose it's pulling all history of all time for all
>> projects?
>> >>>>
>> >>>> I'll let this roll for a while I run an errand and report back!
>> >>>>
>> >>>> Michael
>> >>>>
>> >>>> On 10/2/19 9:30 PM, Jon Haddad wrote:
>> >>>>> Daniel referred me to the GitBox self service system.
>> >>>>>
>> >>>>> I've attempted to port the site over using the tool Michael
>> suggested,
>> >>>> but
>> >>>>> after a couple hours it died with this message:
>> >>>>>
>> >>>>> command failed: r922600
>> >>>>> git checkout -f master
>> >>>>>
>> >>>>> If either of you (Mick or Michael) want to give svn2git a shot maybe
>> >>>> you'll
>> >>>>> get a different result. I think it may have been due to the large
>> >> size
>> >>>>> of the repo and the small drive on the VM I was using.  I can try it
>> >>>> again
>> >>>>> tomorrow with more storage to see if I get a better result.  Mick if
>> >> you
>> >>>>> want to give it a shot in the meantime that would be appreciated.
>> >>>>>
>> >>>>> Jon
>> >>>>>
>> >>>>> On Wed, Oct 2, 2019 at 3:18 PM Jon Haddad 
>> wrote:
>> >>>>>
>> >>>>>> I created an INFRA ticket here:
>> >>>>>> https://issues.apache.org/jira/browse/INFRA-19218.
>> >>>>>>
>> >>>>>> On Wed, Sep 25, 2019 at 6:04 AM Michael Shuler <
>> >> mich...@pbandjelly.org>
>> >>

Re: [VOTE-2] Apache Cassandra Release Lifecycle

2019-10-08 Thread Jon Haddad
This has definitely been a confusing topic in the past, I completely agree
Benedict.  Glad you brought this up.

I'm 100% on board with 5.0 after 4.0.

On Tue, Oct 8, 2019 at 2:27 PM Benedict Elliott Smith 
wrote:

> As a brief side-step on the topic only of versioning (which no doubt will
> cause enough consternation), I personally endorse streamlining it.  We have
> not had a consistently meaningful convention on this, at any point, and we
> made it even worse in the 3.x line.  There's no real difference between
> 1.2->2.0, 2.0->2.1, or 3.0->3.11 and 3.11->4.0; let's admit this and go
> straight to 5.0 for our next feature release, with 4.1 our first patch
> release of the 4.x line.
>
> 
> On 08/10/2019, 21:36, "Scott Andreas"  wrote:
>
> Re: "How can we decide that *all* new features are suppose to go into
> trunk only, if we don’t even have an idea about the upcoming release
> schedule?"
>
> This is a great question. My understanding of the intent of the
> document is that new features are generally expected to land in trunk, with
> an exception process defined for feature backports. I think that's a
> reasonable expectation to start with. But I also agree with you that it's
> important we evolve a way to discuss and agree up on release scope - this
> was the focus of my slides at NGCC. I would love to discuss this on a
> separate thread.
>
> Re: “Bug fix releases have associated new minor version.”
> "Patchlevel version" might be more in keeping with our current
> convention.
>
> Re: "We should give users a way to plan, by having EOL dates"
> Incorporating EOL dates into our release management / planning is a
> great idea.
>
> Would you be willing to rephrase your comments in the form of proposed
> edits to the document?
>
> – Scott
>
> 
> From: Stefan Podkowinski 
> Sent: Tuesday, October 8, 2019 1:22 PM
> To: dev@cassandra.apache.org
> Subject: Re: [VOTE-2] Apache Cassandra Release Lifecycle
>
>  From the document:
>
> General Availability (GA): “A new branch is created for the release
> with
> the new major version, limiting any new feature addition to the new
> release branch, with new feature development will continue to happen
> only on trunk.”
> Maintenance: “Missing features from newer generation releases are
> back-ported on per - PMC vote basis.”
>
> We had a feature freeze before 4.0, which showed that people have
> different views on what actually qualifies as a feature. It doesn’t
> work
> without defining “feature” in more detail. Although I’d rather avoid to
> have this in the document at all, since I don’t think this is getting
> us
> anywhere, without having a clearer picture on the bigger context in
> which release are going to happen in the future, starting with release
> cadence and support periods. How can we decide that *all* new features
> are suppose to go into trunk only, if we don’t even have an idea about
> the upcoming release schedule?
>
> “Bug fix releases have associated new minor version.”
>
> So the next bug fix version will be 4.1? There will be no minor feature
> releases like we did with 3.x.0/2.x.0?
>
> Deprecated:
> "Through a dev community voting process, EOL date is determined for
> this
> release.”
> “Users actively encouraged to move away from the offering.”
>
> We should give users a way to plan, by having EOL dates that may be
> months or years ahead in the future. We did this with 3.0 and 2.x,
> which
> would be all “deprecated” a long time ago with the new proposal.
>
> Deprecated: “Only security vulnerabilities and production-impacting
> bugs
> without workarounds are addressed.”
>
> Although devs will define their own definition of “production-impacting
> bugs without workarounds” in any way they need, I don’t think we should
> have this in the document. It’s okay to use EOLed releases and we
> should
> not prevent users from contributing smaller fixes, performance
> improvements and useful enhancements for minor feature releases.
>
> On 08.10.19 20:00, sankalp kohli wrote:
> > Hi,
> >  We have discussed in the email thread[1] about Apache Cassandra
> Release
> > Lifecycle. We came up with a doc[2] for it. We have finalized the doc
> > here[3] Please vote on it if you agree with the content of the doc
> [3].
> >
> > We did not proceed with the previous vote as we want to use
> confluence for
> > it. Here is the link for that[4]. It also mentions why we are doing
> this
> > vote.
> >
> > Vote will remain open for 72 hours.
> >
> > Thanks,
> > Sankalp
> >
> > [1]
> >
> https://lists.apache.org/thread.html/c610b23f9002978636b66d09f0e0481ed3de9b78895050da22c91c6f@%3Cdev.cassandra.apache.org%3E
> > [2]
> >
> 

Re: Can we kick off a release?

2019-10-08 Thread Jon Haddad
I forgot to mention, we should also release alpha2 of 4.0.


On Tue, Oct 8, 2019 at 1:04 PM Michael Shuler 
wrote:

> Thanks Sam, I'm following #15193 and should catch the status change there.
>
> Michael
>
> On Tue, Oct 8, 2019 at 6:17 AM Sam Tunnicliffe  wrote:
> >
> > CASSANDRA-15193 just got +1’d yesterday and would be good to include in
> the 3.0 and 3.11 releases. If you don’t mind holding off while I add a
> cqlsh test and merge it, that’d be good.
> >
> > Thanks,
> > Sam
> >
> > > On 7 Oct 2019, at 22:54, Michael Shuler 
> wrote:
> > >
> > > Will do! I probably won't get this done this evening, so will send out
> > > the emails tomorrow.
> > >
> > > Thanks,
> > > Michael
> > >
> > > On Mon, Oct 7, 2019 at 2:37 PM Jon Haddad  wrote:
> > >>
> > >> Michael,
> > >>
> > >> Would you mind kicking off builds and starting a vote thread for the
> latest
> > >> 2.2, 3.0 and 3.11 builds?
> > >>
> > >> Much appreciated,
> > >> Jon
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > > For additional commands, e-mail: dev-h...@cassandra.apache.org
> > >
> >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: Protocol-impacting (internode and client) changes for 4.0

2019-10-09 Thread Jon Haddad
Seems reasonable, especially since we're in alpha mode.

On Wed, Oct 9, 2019 at 10:28 AM Aleksey Yeshchenko
 wrote:

> +1; in particular since the protocol itself is still in beta
>
> > On 9 Oct 2019, at 17:26, Oleksandr Petrov 
> wrote:
> >
> > Hi,
> >
> > During NGCC/ACNA19 we've had quite a few conversations around the 4.0
> > release. Many (minor) features and changes suggested during that time are
> > possible to implement in 4.next without any problem. However, some
> changes
> > that seem to be very important for the community, which got mentioned in
> > several conversations, are not possible to implement without protocol
> > changes. By *protocol* changes here I mean both native and client
> protocol.
> >
> > Here's a shortlist of the issues in question:
> > https://issues.apache.org/jira/browse/CASSANDRA-15349 Add “Going away”
> > message to the client protocol
> > https://issues.apache.org/jira/browse/CASSANDRA-15350 Add CAS
> “uncertainty”
> > and “contention" messages that are currently propagated as a
> > WriteTimeoutException.
> > https://issues.apache.org/jira/browse/CASSANDRA-15351 Allow configuring
> > timeouts on the per-request basis
> > https://issues.apache.org/jira/browse/CASSANDRA-15352 Replica failure
> > propagation to coordinator and client
> > https://issues.apache.org/jira/browse/CASSANDRA-15299 Improve
> checksumming
> > and compression in protocol v5-beta
> >
> > And, less importantly - CASSANDRA-14683 (paging state issue).
> >
> > My suggestion would be to lift a freeze for all (or at least some) of
> these
> > issues, since they seem to be quite important for operators and each one
> of
> > them is extremely low risk, which means that any validation effort that
> has
> > already happened won't have to be re-done. All of the issues are fairly
> > easy to implement, which means they won't delay the release.
> >
> > To my best knowledge, there's no client that fully supports 4.0, so I
> > think doing this now actually makes sense, meaning that driver implementers
> won't
> > really have to redo anything.
> >
> > Your thoughts on this are welcome,
> > -- Alex
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: moving the site from SVN to git

2019-10-02 Thread Jon Haddad
Daniel referred me to the GitBox self service system.

I've attempted to port the site over using the tool Michael suggested, but
after a couple hours it died with this message:

command failed: r922600
git checkout -f master

If either of you (Mick or Michael) want to give svn2git a shot maybe you'll
get a different result. I think it may have been due to the large size
of the repo and the small drive on the VM I was using.  I can try it again
tomorrow with more storage to see if I get a better result.  Mick if you
want to give it a shot in the meantime that would be appreciated.

Jon

On Wed, Oct 2, 2019 at 3:18 PM Jon Haddad  wrote:

> I created an INFRA ticket here:
> https://issues.apache.org/jira/browse/INFRA-19218.
>
> On Wed, Sep 25, 2019 at 6:04 AM Michael Shuler 
> wrote:
>
>> I see no good reason to trash history. There are tools to make moving
>> from svn to git (hopefully) painless. We used git-svn for the main c*
>> source to retain history of both, which this tool uses to do migrations
>> - https://github.com/nirvdrum/svn2git
>>
>> Michael
>>
>> On 9/25/19 12:57 AM, Mick Semb Wever wrote:
>> >
>> >> Personally, no, I don't.  What I need to know is if someone who
>> actually
>> >> works on the site needs the history in *git*.
>> >
>> >
>> > Yes. I need the history in *git*.
>> >
>> >
>> > And I believe that INFRA can do the migration for you.
>> > (For example, INFRA-12055 and spark-website)
>> >
>> > -
>> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>> > For additional commands, e-mail: dev-h...@cassandra.apache.org
>> >
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>>
>>


time for a release?

2019-10-04 Thread Jon Haddad
It's been a while since we did a release and I think there's enough in here
to put one out.

2.2.15 changes:
https://issues.apache.org/jira/issues/?jql=project%20%3D%20CASSANDRA%20AND%20fixVersion%20%3D%202.2.15%20and%20status%20%3D%20Resolved%20

3.0.19 changes:
https://issues.apache.org/jira/issues/?jql=project%20%3D%20CASSANDRA%20AND%20fixVersion%20%3D%203.0.19%20and%20status%20%3D%20Resolved%20%20

3.11.5 changes:
https://issues.apache.org/jira/issues/?jql=project%20%3D%20CASSANDRA%20AND%20fixVersion%20%3D%203.11.5%20and%20status%20%3D%20Resolved%20

Any reason not to put this up for a vote?  If I don't hear anything by
Monday I'll start a vote thread.

Jon


Re: moving the site from SVN to git

2019-10-03 Thread Jon Haddad
I think we can safely ignore them.  Thanks for figuring this out.

On Thu, Oct 3, 2019 at 10:01 AM Michael Shuler 
wrote:

> I'm making progress through many periodic timeouts to svn.a.o and
> restarts, but it appears that svn2git is smart enough to pick up where
> it left off. The first commit captured at the svn path I'm specifying is
> when Cassandra was moved to a top level project at r922689 (2010-03-13).
> I don't know the old incubator path, and it's probably OK to ignore the
> few older incubator commits? I imagine it would mean starting over the
> entire import to pull in those older incubator svn commits, then
> changing the url and somehow importing the newer path on top?
>
> I tried using a local path as the source to try to speed things up,
> after I got my first few timeouts, but that fails.
>
> Curious if anyone really cares if we lose a few early commits - if so, I
> can try to figure out the old path and start again.
>
> Michael
>
> On 10/3/19 11:14 AM, Jon Haddad wrote:
> > Thanks for taking a look, Michael.  Hopefully you have better luck than
> me
> > :)
> >
> > On Thu, Oct 3, 2019 at 6:42 AM Michael Shuler 
> > wrote:
> >
> >> I cloned the empty cassandra-website git repo, and I'm running:
> >>
> >> svn2git http://svn.apache.org/repos/asf/cassandra/site --rootistrunk
> >> --no-minimize-url
> >>
> >> ..to see what I get. An svn checkout of the above url says it's only
> >> 69M, so I suppose it's pulling all history of all time for all projects?
> >>
> >> I'll let this roll for a while I run an errand and report back!
> >>
> >> Michael
> >>
> >> On 10/2/19 9:30 PM, Jon Haddad wrote:
> >>> Daniel referred me to the GitBox self service system.
> >>>
> >>> I've attempted to port the site over using the tool Michael suggested,
> >> but
> >>> after a couple hours it died with this message:
> >>>
> >>> command failed: r922600
> >>> git checkout -f master
> >>>
> >>> If either of you (Mick or Michael) want to give svn2git a shot maybe
> >> you'll
> >>> get a different result. I think it may have been due to the large
> size
> >>> of the repo and the small drive on the VM I was using.  I can try it
> >> again
> >>> tomorrow with more storage to see if I get a better result.  Mick if
> you
> >>> want to give it a shot in the meantime that would be appreciated.
> >>>
> >>> Jon
> >>>
> >>> On Wed, Oct 2, 2019 at 3:18 PM Jon Haddad  wrote:
> >>>
> >>>> I created an INFRA ticket here:
> >>>> https://issues.apache.org/jira/browse/INFRA-19218.
> >>>>
> >>>> On Wed, Sep 25, 2019 at 6:04 AM Michael Shuler <
> mich...@pbandjelly.org>
> >>>> wrote:
> >>>>
> >>>>> I see no good reason to trash history. There are tools to make moving
> >>>>> from svn to git (hopefully) painless. We used git-svn for the main c*
> >>>>> source to retain history of both, which this tool uses to do
> migrations
> >>>>> - https://github.com/nirvdrum/svn2git
> >>>>>
> >>>>> Michael
> >>>>>
> >>>>> On 9/25/19 12:57 AM, Mick Semb Wever wrote:
> >>>>>>
> >>>>>>> Personally, no, I don't.  What I need to know is if someone who
> >>>>> actually
> >>>>>>> works on the site needs the history in *git*.
> >>>>>>
> >>>>>>
> >>>>>> Yes. I need the history in *git*.
> >>>>>>
> >>>>>>
> >>>>>> And I believe that INFRA can do the migration for you.
> >>>>>> (For example, INFRA-12055 and spark-website)
> >>>>>>
> >>>>>>
> -
> >>>>>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >>>>>> For additional commands, e-mail: dev-h...@cassandra.apache.org
> >>>>>>
> >>>>>
> >>>>> -
> >>>>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >>>>> For additional commands, e-mail: dev-h...@cassandra.apache.org
> >>>>>
> >>>>>
> >>>
> >>
> >> -
> >> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >> For additional commands, e-mail: dev-h...@cassandra.apache.org
> >>
> >>
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: moving the site from SVN to git

2019-10-03 Thread Jon Haddad
Thanks for taking a look, Michael.  Hopefully you have better luck than me
:)

On Thu, Oct 3, 2019 at 6:42 AM Michael Shuler 
wrote:

> I cloned the empty cassandra-website git repo, and I'm running:
>
> svn2git http://svn.apache.org/repos/asf/cassandra/site --rootistrunk
> --no-minimize-url
>
> ..to see what I get. An svn checkout of the above url says it's only
> 69M, so I suppose it's pulling all history of all time for all projects?
>
> I'll let this roll for a while I run an errand and report back!
>
> Michael
>
> On 10/2/19 9:30 PM, Jon Haddad wrote:
> > Daniel referred me to the GitBox self service system.
> >
> > I've attempted to port the site over using the tool Michael suggested,
> but
> > after a couple hours it died with this message:
> >
> > command failed: r922600
> > git checkout -f master
> >
> > If either of you (Mick or Michael) want to give svn2git a shot maybe
> you'll
> > get a different result. I think it may have been due to the large size
> > of the repo and the small drive on the VM I was using.  I can try it
> again
> > tomorrow with more storage to see if I get a better result.  Mick if you
> > want to give it a shot in the meantime that would be appreciated.
> >
> > Jon
> >
> > On Wed, Oct 2, 2019 at 3:18 PM Jon Haddad  wrote:
> >
> >> I created an INFRA ticket here:
> >> https://issues.apache.org/jira/browse/INFRA-19218.
> >>
> >> On Wed, Sep 25, 2019 at 6:04 AM Michael Shuler 
> >> wrote:
> >>
> >>> I see no good reason to trash history. There are tools to make moving
> >>> from svn to git (hopefully) painless. We used git-svn for the main c*
> >>> source to retain history of both, which this tool uses to do migrations
> >>> - https://github.com/nirvdrum/svn2git
> >>>
> >>> Michael
> >>>
> >>> On 9/25/19 12:57 AM, Mick Semb Wever wrote:
> >>>>
> >>>>> Personally, no, I don't.  What I need to know is if someone who
> >>> actually
> >>>>> works on the site needs the history in *git*.
> >>>>
> >>>>
> >>>> Yes. I need the history in *git*.
> >>>>
> >>>>
> >>>> And I believe that INFRA can do the migration for you.
> >>>> (For example, INFRA-12055 and spark-website)
> >>>>
> >>>> -
> >>>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >>>> For additional commands, e-mail: dev-h...@cassandra.apache.org
> >>>>
> >>>
> >>> -
> >>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >>> For additional commands, e-mail: dev-h...@cassandra.apache.org
> >>>
> >>>
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


another alpha?

2020-03-02 Thread Jon Haddad
Looking at CHANGES.txt, we've got 30+ changes since the last alpha.  I
think it's a good time to cut another alpha release.  The biggest item here
is Python 3 support for cqlsh.  It would be good to get as much feedback as
possible on this since it's such a critical tool.

Here's what's changed:

Include finalized pending sstables in preview repair (CASSANDRA-15553)
Reverted to the original behavior of CLUSTERING ORDER on CREATE TABLE
(CASSANDRA-15271)
Correct inaccurate logging message (CASSANDRA-15549)
Add documentation of dynamo (CASSANDRA-15486)
Added documentation for Guarantees (CASSANDRA-15482)
Added documentation for audit logging (CASSANDRA-15474)
Unset GREP_OPTIONS (CASSANDRA-14487)
Added streaming documentation (CASSANDRA-15477)
Update to Python driver 3.21 for cqlsh (CASSANDRA-14872)
Added bulk loading documentation (CASSANDRA-15480)
Updated overview documentation (CASSANDRA-15483)
Added CDC and speculative retry documentation to DDL section
(CASSANDRA-15492)
Fix missing Keyspaces in cqlsh describe output (CASSANDRA-15576)
Fix multi DC nodetool status output (CASSANDRA-15305)
Added documentation covering new Netty based internode messaging
(CASSANDRA-15478)
Add documentation of hints (CASSANDRA-15491)
updateCoordinatorWriteLatencyTableMetric can produce misleading metrics
(CASSANDRA-15569)
Added documentation for read repair and an example of full repair
(CASSANDRA-15485)
Make cqlsh and cqlshlib Python 2 & 3 compatible (CASSANDRA-10190)
Added documentation for Full Query Logging (CASSANDRA-15475)
Added documentation for backups (CASSANDRA-15479)
Documentation gives the wrong instruction to activate remote jmx
(CASSANDRA-15535)
Improve the description of nodetool listsnapshots command (CASSANDRA-14587)
allow embedded cassandra launched from a one-jar or uno-jar
(CASSANDRA-15494)
Update hppc library to version 0.8.1 (CASSANDRA-12995)
Limit the dependencies used by UDFs/UDAs (CASSANDRA-14737)
Make native_transport_max_concurrent_requests_in_bytes updatable
(CASSANDRA-15519)
Cleanup and improvements to IndexInfo/ColumnIndex (CASSANDRA-15469)
Potential Overflow in DatabaseDescriptor Functions That Convert Between
KB/MB & Bytes (CASSANDRA-15470)
Merged from 3.0:
Run evictFromMembership in GossipStage (CASSANDRA-15592)
Merged from 2.2:
Allow EXTRA_CLASSPATH to work on tar/source installations (CASSANDRA-15567)

Thoughts?
Jon


Re: Update defaults for 4.0?

2020-01-23 Thread Jon Haddad
Yes, please do. We should also update our JVM defaults.

On Thu, Jan 23, 2020, 9:28 PM Jeremy Hanna 
wrote:

> To summarize this thread, I think people are generally okay with updating
> certain defaults for 4.0 provided we make sure it doesn't unpleasantly
> surprise cluster operators.  I think with the num_tokens and
> compaction_throughput_in_mb we could go with a release note for the reasons
> in my last email.  I also agree that we should consider bumping
> roles_validity_in_ms, permissions_validity_in_ms, and
> credentials_validity_in_ms along with the default snitch (going to GPFS
> as the default) as that gives people a DC aware default at least to start.
>
> Is everyone okay if I create tickets for each of these and link them with
> an epic so that we can discuss them separately?
>
> Thanks,
>
> Jeremy
>
> On Thu, Jan 23, 2020 at 5:34 AM Alex Ott  wrote:
>
> > In addition to these, maybe we could consider changing others as well,
> > like:
> >
> > 1. bump roles_validity_in_ms, permissions_validity_in_ms, and
> >    credentials_validity_in_ms as well - maybe at least to a minute or two.
> >    I have seen multiple times when authentication was failing under heavy
> >    load because queries to system tables were timing out - with these
> >    defaults people may still have the possibility to get updates to
> >    roles/credentials faster when specifying _update_interval_ variants of
> >    these configurations.
> > 2. change default snitch from SimpleSnitch to GossipingPropertyFileSnitch -
> >    we're anyway saying that SimpleSnitch is only appropriate for
> >    single-datacenter deployments, and for real production we need to use
> >    GossipingPropertyFileSnitch - why not to set it as default?
> >
> >
> > Jeremy Hanna  at "Wed, 22 Jan 2020 11:22:36 +1100" wrote:
> >  JH> I mentioned this in the contributor meeting as a topic to bring up on
> >  JH> the list - should we take the opportunity to update defaults for
> >  JH> Cassandra 4.0?
> >
> >  JH> The rationale is two-fold:
> >  JH> 1) There are best practices and tribal knowledge around certain
> >  JH> properties where people just know to update those properties
> >  JH> immediately as a starting point.  If it's pretty much a given that we
> >  JH> set something as a starting point different than the current defaults,
> >  JH> why not make that the new default?
> >  JH> 2) We should align the defaults with what we test with.  There may be
> >  JH> exceptions if we have one-off tests but on the whole, we should be
> >  JH> testing with defaults.
> >
> >  JH> As a starting point, compaction throughput and number of vnodes seem
> >  JH> like good candidates but it would be great to get feedback for any
> >  JH> others.
> >
> >  JH> For compaction throughput
> >  JH> (https://jira.apache.org/jira/browse/CASSANDRA-14902), I've made a
> >  JH> basic case on the ticket to default to 64 just as a starting point
> >  JH> because the decision for 16 was made when spinning disk was most
> >  JH> common.  Hence most people I know change that and I think without too
> >  JH> much bikeshedding, 64 is a reasonable starting point.  A case could be
> >  JH> made that empirically the compaction throughput throttle may have less
> >  JH> effect than many people think, but I still think an updated default
> >  JH> would make sense.
> >
> >  JH> For number of vnodes, Michael Shuler made the point in the discussion
> >  JH> that we already test with 32, which is a far better number than the
> >  JH> 256 default.  I know many new users that just leave the 256 default
> >  JH> and then discover later that it's better to go lower.  I think 32 is a
> >  JH> good balance.  One could go lower with the new algorithm but I think
> >  JH> 32 is much better than 256 without being too skewed, and it's what we
> >  JH> currently test.
> >
> >  JH> Jeff brought up a good point that we want to be careful with defaults
> >  JH> since changing them could come as an unpleasant surprise to people who
> >  JH> don't explicitly set them.  As a general rule, we should always update
> >  JH> release notes to clearly state that a default has changed.  For these
> >  JH> two defaults in particular, I think it's safe.  For compaction
> >  JH> throughput I think a release note is sufficient in case they want to
> >  JH> modify it.  For number of vnodes, it won't affect existing deployments
> >  JH> with data - it would be for new clusters, which would honestly benefit
> >  JH> from this anyway.
> >
> >  JH> The other point is whether it's too late to go into 4.0.  For these
> >  JH> two changes, I think significant testing can still be done with these
> >  JH> new defaults before release and I think testing more explicitly with
> >  JH> 32 vnodes in particular will give people more confidence in the lower
> >  JH> number with a wider array of testing (where we don't already use 32
> >  JH> explicitly).
> >
> >  JH> In summary, are people okay with considering updating 
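
For concreteness, the defaults discussed in this thread map onto
cassandra.yaml entries roughly as follows. This is a sketch of the proposed
values, not the current defaults, and the validity numbers are one
illustrative reading of "at least a minute or two":

    num_tokens: 32
    compaction_throughput_mb_per_sec: 64
    endpoint_snitch: GossipingPropertyFileSnitch
    roles_validity_in_ms: 60000
    permissions_validity_in_ms: 60000
    credentials_validity_in_ms: 60000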

Re: [Discuss] num_tokens default in Cassandra 4.0

2020-01-29 Thread Jon Haddad
I've put a lot of my previous clients on 4 tokens, all of which have
resulted in a major improvement.

I wouldn't use any more than 4 except under some pretty unusual
circumstances.

Jon

On Wed, Jan 29, 2020, 11:18 AM Ben Bromhead  wrote:

> +1 to reducing the number of tokens as low as possible for availability
> issues. 4 lgtm
>
> On Wed, Jan 29, 2020 at 1:14 AM Dinesh Joshi  wrote:
>
> > Thanks for restarting this discussion Jeremy. I personally think 4 is a
> > good number as a default. I think whatever we pick, we should have enough
> > documentation for operators to make sense of the new defaults in 4.0.
> >
> > Dinesh
> >
> > > On Jan 28, 2020, at 9:25 PM, Jeremy Hanna 
> > wrote:
> > >
> > > I wanted to start a discussion about the default for num_tokens that
> > we'd like for people starting in Cassandra 4.0.  This is for ticket
> > CASSANDRA-13701 
> > (which has been duplicated a number of times, most recently by me).
> > >
> > > TLDR, based on availability concerns, skew concerns, operational
> > concerns, and based on the fact that the new allocation algorithm can be
> > configured fairly simply now, this is a proposal to go with 4 as the new
> > default and the allocate_tokens_for_local_replication_factor set to 3.
> > That gives a good experience out of the box for people and is the most
> > conservative.  It does assume that racks and DCs have been configured
> > correctly.  We would, of course, go into some detail in the NEWS.txt.
> > >
> > > Joey Lynch and Josh Snyder did an extensive analysis of availability
> > concerns with high num_tokens/virtual nodes in their paper <
> >
> http://mail-archives.apache.org/mod_mbox/cassandra-dev/201804.mbox/%3CCALShVHcz5PixXFO_4bZZZNnKcrpph-=5QmCyb0M=w-mhdyl...@mail.gmail.com%3E
> >.
> > This worsens as clusters grow larger.  I won't quote the paper here but
> in
> > order to have a conservative default and with the accompanying new
> > allocation algorithm, I think it makes sense as a default.
> > >
> > > The difficulties have always been that virtual nodes have been
> > beneficial for operations but that 256 is too high for the purposes of
> > repair and as Joey and Josh cover, for availability.  Going lower with
> the
> > original allocation algorithm has produced skew in allocation in its
> naive
> > distribution.  Enter CASSANDRA-7032 <
> > https://issues.apache.org/jira/browse/CASSANDRA-7032> and the new token
> > allocation algorithm.  CASSANDRA-15260 <
> > https://issues.apache.org/jira/browse/CASSANDRA-15260> makes the new
> > algorithm operationally simpler.
> > >
> > > One other item of note - since Joey and Josh's analysis, there have
> been
> > improvements in streaming and other considerations that can reduce the
> > probability of more than one node representing some token range being
> > unavailable, but it would still be good to be conservative.
> > >
> > > Please chime in with any concerns with having num_tokens=4 and
> > allocate_tokens_for_local_replication_factor=3 and the accompanying
> > rationale so we can improve the experience for all users.
> > >
> > > Other resources:
> > >
> >
> https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
> > >
> >
> https://docs.datastax.com/en/dse/6.7/dse-admin/datastax_enterprise/config/configVnodes.html
> > >
> >
> https://www.datastax.com/blog/2016/01/new-token-allocation-algorithm-cassandra-30
> > >
> >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
> >
>
> --
>
> Ben Bromhead
>
> Instaclustr | www.instaclustr.com | @instaclustr
>  | (650) 284 9692
>
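
For reference, the proposal above corresponds to cassandra.yaml settings
along these lines (4.0 option names; the values are the proposal, not the
current defaults):

    num_tokens: 4
    allocate_tokens_for_local_replication_factor: 3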


Re: [Discuss] num_tokens default in Cassandra 4.0

2020-01-30 Thread Jon Haddad
Yes, I'm against it. We should be using the default value that benefits the
most people, rather than an arbitrary compromise.

Most clusters don't shrink, they stay the same size or grow. I'd say 90% or
more fall in this category.  Let's do the right thing by default and
include good comments that help people make the right decision if they
think they'll be outside the usual case.

On Thu, Jan 30, 2020, 8:07 PM Joseph Lynch  wrote:

> Any objections to the compromise of 16 as proposed in Chris's original
> patch?
>
> -Joey
>
> On Thu, Jan 30, 2020, 3:47 PM Anthony Grasso 
> wrote:
>
> > I think lowering the number of tokens is a great idea! Similar to Jon,
> > when I have reduced the number of tokens for clients it has been an
> > improvement in repair performance.
> >
> > I am concerned that the proposed default value for num_tokens is too low.
> > If you set up a cluster using the proposed defaults, you will get a
> > balanced cluster. However, if you decommission nodes you will start to
> see
> > large imbalances especially for small clusters (< 20 nodes). This is
> > because the allocate_tokens_for_local_replication_factor setting is only
> > applied during the bootstrap process.
> >
> > I have recommended very low values for num_tokens to clients. This was
> > because it was very unlikely that they would reduce their cluster size
> and
> > I warned them of the caveats with using a small value for num_tokens.
> >
> > The proposed num_token default value is fine for devs and operators that
> > know what they are doing. However, the general Cassandra community will
> be
> > unaware of the potential issue with such a low value. We should consider
> > setting num_tokens to 16 - 32 as the default. This will at least help
> > reduce the severity of the imbalance when decommissioning a node whilst
> > still providing the benefits of having a low number of tokens. In
> addition,
> > we can add a comment to num_tokens that clusters over 100 nodes (per
> > datacenter) should consider reducing it down to 4.
> >
> > Cheers,
> > Anthony
> >
> > On Fri, 31 Jan 2020 at 01:58, Jon Haddad  wrote:
> >
> > > Larger clusters is where high token counts do the most damage. That's
> why
> > > it's such a problem. You start out with a small cluster using 256, as
> you
> > > grow into the hundreds it becomes more and more unstable.
> > >
> > >
> > > On Thu, Jan 30, 2020, 8:19 AM onmstester onmstester
> > >  wrote:
> > >
> > > > Shouldn't we consider the cluster size to configure num_tokens?
> > > >
> > > > For example is it OK to use num_tokens=4 for a cluster of more than
> 100
> > > of
> > > > nodes?
> > > >
> > > >
> > > >
> > > > Another question that is not so much relevant to this :
> > > >
> > > > When we use the token assignment algorithm (the new/non-random one)
> > for a
> > > > specific keyspace, why should we use initial token for all the seeds,
> > > isn't
> > > > one seed enough and then just set the keyspace for all other nodes?
> > > >
> > > >
> > > >
> > > > Also i do not understand why should we consider rack topology and
> > number
> > > > of racks for configuration of num_tokens?
> > > >
> > > >
> > > >
> > > > Sent using https://www.zoho.com/mail/
> > > >
> > > >
> > > >
> > > >
> > > >  On Thu, 30 Jan 2020 04:33:57 +0330 Jeremy Hanna <
> > > > jeremy.hanna1...@gmail.com> wrote 
> > > >
> > > >
> > > > The new default wouldn't be retroactively set for 3.x, but the same
> > > > principles apply.  The new algorithm is in 3.x as well as the
> > > > simplification of the configuration.  So no reason not to use the
> same
> > > > configuration on 3.x.
> > > >
> > > > > On Jan 30, 2020, at 4:34 AM, Chen-Becker, Derek <dchen...@amazon.com.INVALID> wrote:
> > > > >
> > > > > Does the same guidance apply to 3.x clusters? I read through the
> JIRA
> > > > ticket linked below, along with tickets that it links to, but it's
> not
> > > > clear that the new allocation algorithm is available in 3.x or if
> there
> > > are
> > > > other reasons that this would be problematic.
> > > > >
> > > > > Thanks,
> > > > >
>

Re: [Discuss] num_tokens default in Cassandra 4.0

2020-02-03 Thread Jon Haddad
I think it's a good idea to take a step back and get a high level view of
the problem we're trying to solve.

First, high token counts result in decreased availability as each node has
data overlap with more nodes in the cluster.  Specifically, a node can share
data with up to (RF-1) * 2 * num_tokens other nodes.  So a 256 token cluster
at RF=3 is going to almost always share data with every other node in the
cluster that isn't in the same rack, unless you're doing something wild like
using more than a thousand nodes in a cluster.  We advertise

With 16 tokens, that is vastly improved, but you still have up to 64 nodes
each node needs to query against, so you're again, hitting every node
unless you go above ~96 nodes in the cluster (assuming 3 racks / AZs).  I
wouldn't use 16 here, and I doubt any of you would either.  I've advocated
for 4 tokens because you'd have overlap with only 16 nodes, which works
well for small clusters as well as large.  Assuming I was creating a new
cluster for myself (in a hypothetical brand new application I'm building) I
would put this in production.  I have worked with several teams where I
helped them put 4 token clusters in prod and it has worked very well.  We
didn't see any wild imbalance issues.

As Mick's pointed out, our current method of using random token assignment
for the default number of problematic for 4 tokens.  I fully agree with
this, and I think if we were to try to use 4 tokens, we'd want to address
this in tandem.  We can discuss how to better allocate tokens by default
(something more predictable than random), but I'd like to avoid the
specifics of that for the sake of this email.

To Alex's point, repairs are problematic with lower token counts due to
over streaming.  I think this is a pretty serious issue and I we'd have to
address it before going all the way down to 4.  This, in my opinion, is a
more complex problem to solve and I think trying to fix it here could make
shipping 4.0 take even longer, something none of us want.

For the sake of shipping 4.0 without adding extra overhead and time, I'm ok
with moving to 16 tokens, and in the process adding extensive documentation
outlining what we recommend for production use.  I think we should also try
to figure out something better than random as the default to fix the data
imbalance issues.  I've got a few ideas here I've been noodling on.

As long as folks are fine with potentially changing the default again in C*
5.0 (after another discussion / debate), 16 is enough of an improvement
that I'm OK with the change, and willing to author the docs to help people
set up their first cluster.  For folks that go into production with the
defaults, we're at least not setting them up for total failure once their
clusters get large like we are now.

In future versions, we'll probably want to address the issue of data
imbalance by building something in that shifts individual tokens around.  I
don't think we should try to do this in 4.0 either.

Jon



On Fri, Jan 31, 2020 at 2:04 PM Jeremy Hanna 
wrote:

> I think Mick and Anthony make some valid operational and skew points for
> smaller/starting clusters with 4 num_tokens. There’s an arbitrary line
> between small and large clusters but I think most would agree that most
> clusters are on the small to medium side. (A small nuance is afaict the
> probabilities have to do with quorum on a full token range, ie it has to do
> with the size of a datacenter not the full cluster
>
> As I read this discussion I’m personally more inclined to go with 16 for
> now. It’s true that if we could fix the skew and topology gotchas for those
> starting things up, 4 would be ideal from an availability perspective.
> However we’re still in the brainstorming stage for how to address those
> challenges. I think we should create tickets for those issues and go with
> 16 for 4.0.
>
> This is about an out of the box experience. It balances availability,
> operations (such as skew and general bootstrap friendliness and
> streaming/repair), and cluster sizing. Balancing all of those, I think for
> now I’m more comfortable with 16 as the default with docs on considerations
> and tickets to unblock 4 as the default for all users.
>
> >>> On Feb 1, 2020, at 6:30 AM, Jeff Jirsa  wrote:
> >> On Fri, Jan 31, 2020 at 11:25 AM Joseph Lynch 
> wrote:
> >> I think that we might be bikeshedding this number a bit because it is
> easy
> >> to debate and there is not yet one right answer.
> >
> >
> > https://www.youtube.com/watch?v=v465T5u9UKo
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>
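
To make the overlap numbers in this thread concrete, here is the arithmetic
using the (RF-1) * 2 * num_tokens approximation at RF=3. These are upper
bounds; actual overlap is capped by the number of nodes outside the local
rack:

    num_tokens = 256: 2 * 2 * 256 = 1024 distinct replica neighbours
    num_tokens = 16:  2 * 2 * 16  =   64 distinct replica neighbours
    num_tokens = 4:   2 * 2 * 4   =   16 distinct replica neighbours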


Re: [RELEASE] Apache Cassandra 4.0-alpha3 released

2020-02-07 Thread Jon Haddad
Thanks for handling this, Mick!

On Fri, Feb 7, 2020 at 12:02 PM Mick Semb Wever  wrote:

>
>
> The Cassandra team is pleased to announce the release of Apache Cassandra
> version 4.0-alpha3.
>
> Apache Cassandra is a fully distributed database. It is the right choice
> when you need scalability and high availability without compromising
> performance.
>
>  http://cassandra.apache.org/
>
> Downloads of source and binary distributions are listed in our download
> section:
>  http://cassandra.apache.org/download/
>
>
> Downloads of source and binary distributions:
>
> http://www.apache.org/dyn/closer.lua/cassandra/4.0-alpha3/apache-cassandra-4.0-alpha3-bin.tar.gz
>
> http://www.apache.org/dyn/closer.lua/cassandra/4.0-alpha3/apache-cassandra-4.0-alpha3-src.tar.gz
>
> Debian and Redhat configurations.
>
>   sources.list:
>   deb http://www.apache.org/dist/cassandra/debian 40x main
>
>   yum config:
>   baseurl=https://www.apache.org/dist/cassandra/redhat/40x/
>
> See http://cassandra.apache.org/download/ for full install instructions.
>
> This is an ALPHA version! It is not intended for production use, however
> the project would appreciate your testing and feedback to make the final
> release better. As always, please pay attention to the release notes[2]
> and let us know[3] if you encounter any problems.
>
> Enjoy!
>
> [1]: CHANGES.txt
> ?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/cassandra-4.0-alpha3
> [2]: NEWS.txt
> ?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/cassandra-4.0-alpha3
> [3]: https://issues.apache.org/jira/browse/CASSANDRA
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


Re: [Discuss] num_tokens default in Cassandra 4.0

2020-01-30 Thread Jon Haddad
Larger clusters is where high token counts do the most damage. That's why
it's such a problem. You start out with a small cluster using 256, as you
grow into the hundreds it becomes more and more unstable.


On Thu, Jan 30, 2020, 8:19 AM onmstester onmstester
 wrote:

> Shouldn't we consider the cluster size to configure num_tokens?
>
> For example is it OK to use num_tokens=4 for a cluster of more than 100 of
> nodes?
>
>
>
> Another question that is not so much relevant to this :
>
> When we use the token assignment algorithm (the new/non-random one) for a
> specific keyspace, why should we use initial token for all the seeds, isn't
> one seed enough and then just set the keyspace for all other nodes?
>
>
>
> Also i do not understand why should we consider rack topology and number
> of racks for configuration of num_tokens?
>
>
>
> Sent using https://www.zoho.com/mail/
>
>
>
>
>  On Thu, 30 Jan 2020 04:33:57 +0330 Jeremy Hanna <
> jeremy.hanna1...@gmail.com> wrote 
>
>
> The new default wouldn't be retroactively set for 3.x, but the same
> principles apply.  The new algorithm is in 3.x as well as the
> simplification of the configuration.  So no reason not to use the same
> configuration on 3.x.
>
> > > On Jan 30, 2020, at 4:34 AM, Chen-Becker, Derek <dchen...@amazon.com.INVALID> wrote:
> >
> > Does the same guidance apply to 3.x clusters? I read through the JIRA
> ticket linked below, along with tickets that it links to, but it's not
> clear that the new allocation algorithm is available in 3.x or if there are
> other reasons that this would be problematic.
> >
> > Thanks,
> >
> > Derek
> >
> > On 1/29/20, 9:54 AM, "Jon Haddad" <mailto:j...@jonhaddad.com> wrote:
> >
> >I've put a lot of my previous clients on 4 tokens, all of which have
> >resulted in a major improvement.
> >
> >I wouldn't use any more than 4 except under some pretty unusual
> >circumstances.
> >
> >Jon
> >
> >On Wed, Jan 29, 2020, 11:18 AM Ben Bromhead <b...@instaclustr.com> wrote:
> >
> >> +1 to reducing the number of tokens as low as possible for availability
> >> issues. 4 lgtm
> >>
> >> On Wed, Jan 29, 2020 at 1:14 AM Dinesh Joshi <mailto:djo...@apache.org>
> wrote:
> >>
> >>> Thanks for restarting this discussion Jeremy. I personally think 4 is
> a
> >>> good number as a default. I think whatever we pick, we should have
> enough
> >>> documentation for operators to make sense of the new defaults in 4.0.
> >>>
> >>> Dinesh
> >>>
> >>>> On Jan 28, 2020, at 9:25 PM, Jeremy Hanna <jeremy.hanna1...@gmail.com>
> >>> wrote:
> >>>>
> >>>> I wanted to start a discussion about the default for num_tokens that
> >>> we'd like for people starting in Cassandra 4.0.  This is for ticket
> >>> CASSANDRA-13701 <https://issues.apache.org/jira/browse/CASSANDRA-13701>
>
> >>> (which has been duplicated a number of times, most recently by me).
> >>>>
> >>>> TLDR, based on availability concerns, skew concerns, operational
> >>> concerns, and based on the fact that the new allocation algorithm can
> be
> >>> configured fairly simply now, this is a proposal to go with 4 as the
> new
> >>> default and the allocate_tokens_for_local_replication_factor set to 3.
> >>> That gives a good experience out of the box for people and is the most
> >>> conservative.  It does assume that racks and DCs have been configured
> >>> correctly.  We would, of course, go into some detail in the NEWS.txt.
> >>>>
> >>>> Joey Lynch and Josh Snyder did an extensive analysis of availability
> >>> concerns with high num_tokens/virtual nodes in their paper <
> >>>
> >>
> http://mail-archives.apache.org/mod_mbox/cassandra-dev/201804.mbox/%3CCALShVHcz5PixXFO_4bZZZNnKcrpph-=5QmCyb0M=w-mhdyl...@mail.gmail.com%3E
> >>> .
> >>> This worsens as clusters grow larger.  I won't quote the paper here
> but
> >> in
> >>> order to have a conservative default and with the accompanying new
> >>> allocation algorithm, I think it makes sense as a default.
> >>>>
> >>>> The difficulties have always been that virtual nodes have been
> >>> beneficial for operations but that 256 is too high for the purposes of
> >>> repair and as Joey and Josh cover, for availabi

Re: [Discuss] num_tokens default in Cassandra 4.0

2020-02-19 Thread Jon Haddad
Joey Lynch had a good idea - that if the allocate tokens for RF isn't set
we use 1 as the RF.  I suggested we take it a step further and use the rack
count as the RF if it's not set.
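
A minimal sketch of that fallback, with hypothetical names (this illustrates
the idea only, it is not the actual patch):

    // Illustrative only: honour an operator-configured target RF for token
    // allocation if present, otherwise fall back to the rack count (floor 1).
    static int effectiveAllocationRf(Integer configuredRf, int rackCount)
    {
        return configuredRf != null ? configuredRf : Math.max(1, rackCount);
    }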

This should take care of most clusters even if they don't set the RF, and
will handle the uneven distribution when provisioning a new cluster.

The only case where you'd want more tokens is to scale down, which I saw in
very few clusters of the hundreds I've worked on.



On Wed, Feb 19, 2020 at 4:35 AM Jeremiah Jordan 
wrote:

> If you don’t know what you are doing you will have one rack which will
> also be safe. If you are setting up racks then you most likely read
> something about doing that, and should also be fine.
> This discussion has gone off the rails 100 times with what ifs that are
> “letting perfect be the enemy of good”. The setting doesn’t need to be
> perfect. It just needs to be “good enough“.
>
> > On Feb 19, 2020, at 1:44 AM, Mick Semb Wever 
> wrote:
> >
> > Why do we have to assume random assignment?
> >
> >
> >
> > Because token allocation only works once you have a node in RF racks. If
> > you don't bootstrap nodes in alternating racks, or just never have RF
> racks
> > setup (but more than one rack) it's going to be random.
> >
> > Whatever default we choose should be a safe choice, not the best for
> > experts. Making it safe (4 as the default would be great) shouldn't be
> > difficult, and I thought Joey was building a  list of related issues?
> >
> > Seeing these issues put together summarised would really help build the
> > consensus IMHO.
> >
> >>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: Do we need Javadoc in binary distribution? Was: [RELEASE] Apache Cassandra 4.0-alpha3 released

2020-02-08 Thread Jon Haddad
+1 as well

On Sat, Feb 8, 2020, 12:25 PM Joshua McKenzie  wrote:

> +1 to removing javadoc from the distro from me.
>
> On Sat, Feb 8, 2020 at 9:24 AM Michael Shuler 
> wrote:
>
> > I like this idea for keeping binary deployment size down. I'm not sure
> > how to handle it for the tarballs, but we could certainly split the docs
> > out of the debian and rpm packages to add
> > cassandra-docs_.{deb,rpm} packages, so they are installable
> > separately, if the user wants them. This is common when docs get large.
> > I suppose the same could be done for
> > apache-cassandra-docs-.tar.gz, but I'm not sure about the
> > release policy part of things here. Needs research.
> >
> > Please, open a JIRA on this as a packaging improvement.
> >
> > Kind regards,
> > Michael
> >
> > On 2/8/20 3:06 AM, Alex Ott wrote:
> > > Hi
> > >
> > > I've unpacked the binary distribution & noticed that we ship many files
> > > in the javadoc directory - more than 5 thousand files, which occupy 99MB
> > > on disk out of 149MB for the whole unpacked Cassandra.
> > >
> > > If we look from a practical standpoint - do we expect that people who
> > > run Cassandra will use javadoc for any purpose?  I know that it often
> > > contains useful details about implementation, but if we talk about
> > > day-to-day work, imho, these files aren't required, at least not on
> > > every machine that has Cassandra on it.
> > >
> > > Maybe we can generate a separate artifact for Javadoc files?
> > >
> > > Mick Semb Wever  at "Fri, 07 Feb 2020 21:02:09 +0100" wrote:
> > >   MSW> The Cassandra team is pleased to announce the release of Apache
> > Cassandra version 4.0-alpha3.
> > >
> > >   MSW> Apache Cassandra is a fully distributed database. It is the
> right
> > choice when you need scalability and high availability without
> compromising
> > performance.
> > >
> > >   MSW>  http://cassandra.apache.org/
> > >
> > >   MSW> Downloads of source and binary distributions are listed in our
> > download section:
> > >   MSW>  http://cassandra.apache.org/download/
> > >
> > >
> > >   MSW> Downloads of source and binary distributions:
> > >   MSW>
> >
> http://www.apache.org/dyn/closer.lua/cassandra/4.0-alpha3/apache-cassandra-4.0-alpha3-bin.tar.gz
> > >   MSW>
> >
> http://www.apache.org/dyn/closer.lua/cassandra/4.0-alpha3/apache-cassandra-4.0-alpha3-src.tar.gz
> > >
> > >   MSW> Debian and Redhat configurations.
> > >
> > >   MSW>   sources.list:
> > >   MSW>   deb http://www.apache.org/dist/cassandra/debian 40x main
> > >
> > >   MSW>   yum config:
> > >   MSW>   baseurl=https://www.apache.org/dist/cassandra/redhat/40x/
> > >
> > >   MSW> See http://cassandra.apache.org/download/ for full install
> > instructions.
> > >
> > >   MSW> This is an ALPHA version! It is not intended for production use,
> > however
> > >   MSW> the project would appreciate your testing and feedback to make
> > the final
> > >   MSW> release better. As always, please pay attention to the release
> > notes[2]
> > >   MSW> and let us know[3] if you encounter any problems.
> > >
> > >   MSW> Enjoy!
> > >
> > >   MSW> [1]: CHANGES.txt
> >
> ?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/cassandra-4.0-alpha3
> > >   MSW> [2]: NEWS.txt
> >
> ?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/cassandra-4.0-alpha3
> > >   MSW> [3]: https://issues.apache.org/jira/browse/CASSANDRA
> > >
> > >   MSW>
> > -
> > >   MSW> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> > >   MSW> For additional commands, e-mail: user-h...@cassandra.apache.org
> > >
> > >
> > >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
> >
>


Re: [proposal] Introduce AssertJ in test framework

2020-03-10 Thread Jon Haddad
I've used assertj in a lot of projects, I prefer it by a wide margin over
using only junit.

On Tue, Mar 10, 2020 at 9:45 AM David Capwell  wrote:

> +1 from me
>
> In CASSANDRA-15564 I build my own assert chain to make the tests cleaner;
> did it since assertj wasn't there.
>
> On Tue, Mar 10, 2020, 9:28 AM Kevin Gallardo 
> wrote:
>
> > I would like to propose adding AssertJ 
> as
> > a test dependency and therefore have it available for writing
> > unit/distributed/any test assertions.
> >
> > In addition to the examples mentioned on the AssertJ docs page (allows to
> > do elaborate and comprehensible assertions on Collections, String, and
> > *custom
> > assertions*), here's an example of a dtest I was looking at, that could
> be
> > translated to AssertJ syntax, just to give an idea of how the syntax
> would
> > apply:
> >
> > *JUnit asserts*:
> > try {
> >     [...]
> > } catch (Exception e) {
> >     Assert.assertTrue(e instanceof RuntimeException);
> >     RuntimeException re = (RuntimeException) e;
> >     Assert.assertTrue(re.getCause() instanceof ReadTimeoutException);
> >     ReadTimeoutException rte = (ReadTimeoutException) e.getCause();
> >     Assert.assertTrue(rte.getMessage().contains("blabla")
> >                       && rte.getMessage().contains("andblablo"));
> > }
> >
> > *AssertJ style:*
> > try {
> >     [...]
> > } catch (Exception e) {
> >     Assertions.assertThat(e)
> >               .isInstanceOf(RuntimeException.class)
> >               .hasCauseExactlyInstanceOf(ReadTimeoutException.class)
> >               .hasMessageContaining("blabla")
> >               .hasMessageContaining("andblablo");
> > }
> >
> > The syntax is more explicit and more comprehensible, but more
> importantly,
> > when one of the JUnit assertTrue() fails, you don't know *why*, you only
> > know that the resulting boolean expression is false.
> > If a failure happened with the assertJ tests, the failure would say
> > "Exception
> > did not contain expected message, expected "blabla", actual "notblabla""
> > (same for a lot of other situations), this makes debugging a failure,
> after
> > a test ran and failed much easier. With JUnit asserts you would have to
> > additionally add a message explaining what the expected value is *and*
> > what the
> > actual value is, for each assert that is more complex than an assertEquals
> > on a number, I suppose. I have seen a lot of tests so far that only test
> > the expected behavior via assertTrue and does not show the incorrect
> values
> > when the test fails, which would come for free with AssertJ.
> >
> > Other examples randomly picked from the test suite:
> >
> >
> >
> > *org.apache.cassandra.repair.RepairJobTest#testNoTreeRetainedAfterDistance:*
> > Replace assertion:
> > assertTrue(messages.stream().allMatch(m -> m.verb() == Verb.SYNC_REQ));
> > With:
> > assertThat(messages)
> >     .extracting(Message::verb)
> >     .containsOnly(Verb.SYNC_REQ);
> >
> > As a result, if any of the messages is not a Verb.SYNC_REQ, the test
> > failure will show the actual Verbs of the messages.
> >
> > Replace:
> > assertTrue(millisUntilFreed < TEST_TIMEOUT_S * 1000);
> > With:
> > assertThat(millisUntilFreed)
> > .isLessThan(TEST_TIMEOUT_S * 1000);
> >
> > Same effect if the condition is not satisfied: a more explicit error
> > message explaining why the test failed.
> >
> > AssertJ also allows custom assertions, which are also very useful and
> > could potentially be leveraged in the future; a sketch follows below.
> >
> > This would only touch the tests' assertions; the rest of the test setup
> > and execution remains untouched (JUnit still runs the tests).
> >
> > Thanks.
> >
> > --
> > Kévin Gallardo.
> >
>
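
As a concrete illustration of the custom assertions mentioned at the end of
the proposal, here is a minimal sketch, assuming assertj-core on the test
classpath. Message and Verb refer to the Cassandra types used in the
RepairJobTest example; the MessageAssert class itself is hypothetical:

import org.assertj.core.api.AbstractAssert;

public class MessageAssert extends AbstractAssert<MessageAssert, Message>
{
    private MessageAssert(Message actual)
    {
        super(actual, MessageAssert.class);
    }

    public static MessageAssert assertThat(Message actual)
    {
        return new MessageAssert(actual);
    }

    public MessageAssert hasVerb(Verb expected)
    {
        // isNotNull() fails with a descriptive message when actual is null
        isNotNull();
        if (actual.verb() != expected)
            failWithMessage("Expected message verb to be <%s> but was <%s>",
                            expected, actual.verb());
        return this;
    }
}

With a static import of MessageAssert.assertThat, call sites shrink to
assertThat(message).hasVerb(Verb.SYNC_REQ), and a failure reads
"Expected message verb to be <SYNC_REQ> but was <...>" instead of
"expected true but was false".
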


Re: Simplify voting rules for in-jvm-dtest-api releases

2020-04-16 Thread Jon Haddad
I lean towards the snapshot builds as well.  I'd prefer we didn't introduce
git submodules. I have had enough facepalm experiences with them in
the past that I'd prefer not to see us go down that path.

On Thu, Apr 16, 2020 at 4:34 PM J. D. Jordan 
wrote:

> I was talking with Alex on Slack earlier today brainstorming ideas, and two
> that might work are using a git submodule to reference the code by git
> hash, so no release is needed, or using jitpack.io to be able to pull the
> jar down by git hash without doing a release.
>
> Does anyone find either of those options more appealing than 1/2/3?
>
> -Jeremiah
>
> > On Apr 16, 2020, at 6:14 PM, David Capwell  wrote:
> >
> > Not a fan of 2 or 3.  For #2 there is also talk about getting rid of the
> > jars in /lib so that would complicate things.
> >
> > I think frequent releases with snapshots per commit are good.  Agree with
> > Nate we should document this so we have something we can always point to.
> >
> >> On Thu, Apr 16, 2020 at 2:54 PM Nate McCall  wrote:
> >>
> >> (1) sounds reasonable to me. I'd like us to document the vote cycle and
> >> release train specifics on cassandra.a.o somewhere (developer and
> releases
> >> pages maybe?). Nothing exhaustive, just 'we do X with Y'.
> >>
> >>
> >> On Thu, Apr 16, 2020 at 11:03 PM Oleksandr Petrov <
> >> oleksandr.pet...@gmail.com> wrote:
> >>
> >>> I've posted the question on the legal-discussion mailing list, and got
> >> some
> >>> helpful responses.
> >>>
> >>> We can't work around the vote; the best we can do is make it shorter (3 +1
> >>> votes / 24 hours). We have several options now:
> >>>
> >>> 1. Release SNAPSHOT builds prefixed with in-jvm dtest commit SHAs and cut
> >>> a release every week or so (release-train if you wish)
> >>> 2. Avoid using Apache repository for releases altogether, and just push
> >>> jars to Cassandra repository
> >>> 3. Make this code "unofficial" (publish and manage outside Apache)
> >>>
> >>> I'm not a big fan of (2), since we already tried that with the Python and
> >>> Java drivers, and I'm also not sure about binaries in git. As regards (3),
> >>> I'm not sure if this makes it harder for the folks who rely on the Apache
> >>> legal framework for contributions.
> >>>
> >>> Unless there are strong opinions against (1), which seems to be a
> >>> reasonable middle ground, we can do it. Please let me know if you also
> >> have
> >>> other ideas.
> >>>
> >>> Thank you,
> >>> -- Alex
> >>>
> >>> On Wed, Apr 15, 2020 at 10:33 PM Jeremiah D Jordan <
> >> jerem...@datastax.com>
> >>> wrote:
> >>>
>  I think as long as we don’t publish the artifacts to Maven Central or some
>  other location that is for “anyone”, we do not need a formal release. Even
>  then, since the artifact is only meant for use by people developing C*,
>  that might be fine.
> 
>  If artifacts are only for use by individuals actively participating in
> >>> the
>  development process, then no formal release is needed.  See the
> >>> definition
>  of “release” and “publication” found here:
> 
>  http://www.apache.org/legal/release-policy.html#release-definition
> > DEFINITION OF "RELEASE" <
>  http://www.apache.org/legal/release-policy.html#release-definition>
> > Generically, a release is anything that is published beyond the group
>  that owns it. For an Apache project, that means any publication
> outside
> >>> the
>  development community, defined as individuals actively participating
> in
>  development or following the dev list.
> >
> > More narrowly, an official Apache release is one which has been
> >>> endorsed
>  as an "act of the Foundation" by a PMC.
> >
> >
> 
> > PUBLICATION <
> >>> http://www.apache.org/legal/release-policy.html#publication
> >
> > Projects SHALL publish official releases and SHALL NOT publish
>  unreleased materials outside the development community.
> >
> > During the process of developing software and preparing a release,
>  various packages are made available to the development community for
>  testing purposes. Projects MUST direct outsiders towards official
> >>> releases
>  rather than raw source repositories, nightly builds, snapshots,
> release
>  candidates, or any other similar packages. The only people who are
> >>> supposed
>  to know about such developer resources are individuals actively
>  participating in development or following the dev list and thus aware
> >> of
>  the conditions placed on unreleased materials.
> >
> 
> 
>  -Jeremiah
> 
> > On Apr 15, 2020, at 3:05 PM, Nate McCall  wrote:
> >
> > Open an issue with the LEGAL jira project and ask there.
> >
> > I'm like 62% sure they will say nope. The vote process and the time for
> > such is to allow the PMC to review the release, to give the ASF a
> > reasonable degree of assurance for indemnification. 

Re: Drivers support for Cassandra 4.0

2020-04-10 Thread Jon Haddad
Love it - thanks for the update Alex!  I agree with Jordan this will be a
big help with 4.0 adoption.

Jon



On Fri, Apr 10, 2020 at 10:40 AM Jordan West  wrote:

> On Thu, Apr 9, 2020 at 7:30 AM Alexandre Dutra <
> alexandre.du...@datastax.com>
> wrote:
>
> > * Java drivers 3.9.0 and 4.6.0 will be released in the next few weeks.
> > They will include
> > support for missing features (transient replication and
> > now-in-seconds), effectively
> > providing complete support for protocol v5 in its current state. To
> > make it as easy as
> > possible for users to adopt C* 4.0, we decided to release both major
> > branches of the Java
> > driver, including 3.x, even if this branch is now in maintenance mode.
>
>
>
> This is great to hear and I think will be a big benefit to 4.0 adoption.
>
> Thanks for the update!
> Jordan
>
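
For anyone wanting to kick the tires, a minimal sketch of connecting with the
4.x Java driver (com.datastax.oss:java-driver-core); the contact point,
datacenter name, and the explicit V5 pin are illustrative assumptions, and by
default the driver negotiates the protocol version on its own:

import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.config.DefaultDriverOption;
import com.datastax.oss.driver.api.core.config.DriverConfigLoader;
import java.net.InetSocketAddress;

public class DriverV5Example
{
    public static void main(String[] args)
    {
        // Pin the protocol version rather than letting the driver negotiate it.
        DriverConfigLoader loader = DriverConfigLoader.programmaticBuilder()
            .withString(DefaultDriverOption.PROTOCOL_VERSION, "V5")
            .build();

        // CqlSession is AutoCloseable, so try-with-resources closes it cleanly.
        try (CqlSession session = CqlSession.builder()
                                            .addContactPoint(new InetSocketAddress("127.0.0.1", 9042))
                                            .withLocalDatacenter("datacenter1")
                                            .withConfigLoader(loader)
                                            .build())
        {
            System.out.println(session.execute("SELECT release_version FROM system.local")
                                      .one()
                                      .getString("release_version"));
        }
    }
}
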


Re: Keeping test-only changes out of CHANGES.txt

2020-04-10 Thread Jon Haddad
In a conversation with Mick we discussed keeping doc changes out as well.
Anyone object to eliding documentation changes from CHANGES.txt?






On Thu, Apr 9, 2020 at 1:07 AM Benjamin Lerer 
wrote:

> +1
>
>
>
> On Thu, Apr 9, 2020 at 9:28 AM Eduard Tudenhoefner <
> eduard.tudenhoef...@datastax.com> wrote:
>
> > updated docs in https://github.com/apache/cassandra/pull/528
> >
> > On Wed, Apr 8, 2020 at 11:39 PM Jordan West  wrote:
> >
> > > +1 (nb) to the change and +1 (nb) to updating the docs to reflect this.
> > >
> > > Jordan
> > >
> > > On Wed, Apr 8, 2020 at 11:30 AM  wrote:
> > >
> > > > +1
> > > >
> > > > > El 8 abr 2020, a las 19:05, e.dimitr...@gmail.com escribió:
> > > > >
> > > > > +1
> > > > >
> > > > > Sent from my iPhone
> > > > >
> > > > >> On 8 Apr 2020, at 13:50, Joshua McKenzie 
> > > wrote:
> > > > >>
> > > > >> +1
> > > > >>
> > > >  On Wed, Apr 8, 2020 at 12:26 PM Sam Tunnicliffe  >
> > > > wrote:
> > > > >>>
> > > > >>> +1
> > > > >>>
> > > > > On 8 Apr 2020, at 15:08, Mick Semb Wever 
> wrote:
> > > > 
> > > >  Can we agree on keeping such test changes out of CHANGES.txt?
> > > >
> > > >  We already don't put entries into CHANGES.txt if it is not a change
> > > >  from any previous release.
> > > >
> > > >  There was some discussion before¹ about this, and the problem was
> > > >  that being selective meant what ended up there was arbitrary. I
> > > >  think this can be solved with an easy rule of thumb: if a patch only
> > > >  touches *Test.java classes, or it is only about fixing a test, then
> > > >  it shouldn't be in CHANGES.txt. That means if the patch does touch
> > > >  any runtime code then you do still need to add an entry to
> > > >  CHANGES.txt. This avoids the whole "arbitrary" problem, and maintains
> > > >  CHANGES.txt as user-facing formatted text to be searched through.
> > > > 
> > > >  If there's agreement I can commit to going through 4.0 changes
> and
> > > >  removing those that never touched runtime code.
> > > > 
> > > >  regards,
> > > >  Mick
> > > > 
> > > >  ¹)
> > > > >>>
> > > >
> > >
> >
> https://lists.apache.org/thread.html/a94946887081d8a408dd5cd01a203664f4d0197df713f0c63364a811%40%3Cdev.cassandra.apache.org%3E
> > > > 
> > > > 
> > > -
> > > >  To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > > >  For additional commands, e-mail: dev-h...@cassandra.apache.org
> > > > 
> > > > >>>
> > > > >>>
> > > > >>>
> > -
> > > > >>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > > > >>> For additional commands, e-mail: dev-h...@cassandra.apache.org
> > > > >>>
> > > > >>>
> > > > >
> > > > >
> -
> > > > > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > > > > For additional commands, e-mail: dev-h...@cassandra.apache.org
> > > > >
> > > >
> > > > -
> > > > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > > > For additional commands, e-mail: dev-h...@cassandra.apache.org
> > > >
> > > >
> > >
> >
> >
> > --
> > Eduard Tudenhoefner
> > e. eduard.tudenhoef...@datastax.com
> > w. www.datastax.com
> >
>
