Re: Jira Suggestion

2019-05-14 Thread Jeff Jirsa
Please

-- 
Jeff Jirsa


> On May 14, 2019, at 7:53 AM, Benedict Elliott Smith  
> wrote:
> 
> How would people feel about introducing a field for the (git) commit SHA, to 
> be required on (Jira) commit?
> 
> The norm is that we comment the SHA, but given this is the norm, perhaps we 
> should codify it instead while we have the chance?  It would also make it 
> easier to find.




Re: How Apache Cassandra handles flaky tests

2019-02-26 Thread Jeff Jirsa




> On Feb 26, 2019, at 8:26 AM, Stanislav Kozlovski 
>  wrote:
> 
> Hey there Cassandra community,
> 
> I work on a fellow open-source project - Apache Kafka - and there we have 
> been fighting flaky tests a lot. We run Java 8 and Java 11 builds on every 
> Pull Request and due to test flakiness, almost all of them turn out red with 
> 1 or 2 tests (completely unrelated to the change in the PR) failing. This has 
> resulted in committers ignoring them and merging the changes either way, or 
> in the worst case - rerunning the hour-long build until it becomes green.

I hope most committers won't commit unless the flaky test is definitely not in 
the subsystem they touched. But yes, one of the motivations for speeding up 
tests (parallelized on a containerized hosted CI platform) was to cut down the 
time for (re-)running them.
 
> This test flakiness has also slowed down our releases significantly.
> 
> In general, I was just curious to understand if this is a problem that 
> Cassandra faces as well.

Yes


> Does your project have a lot of intermittently failing tests,

Sometimes more than others. There were a few big pushes to get green, though it 
naturally regresses a bit over time.

> do you have any active process of addressing such tests (during the initial 
> review, after realizing it is flaky, etc). Any pointers will be greatly 
> appreciated!

I don’t think we’ve solved this convincingly. Different large (corporate) 
contributors have done long one-time passes, and that helped a ton, but I don’t 
think there are any silver bullets yet.



Re: CASSANDRA-14482

2019-02-15 Thread Jeff Jirsa
+1

-- 
Jeff Jirsa


> On Feb 15, 2019, at 9:35 AM, Jonathan Ellis  wrote:
> 
> IMO "add a new compression class that has demonstrable benefits to Sushma
> and Joseph" is sufficiently noninvasive that we should allow it into 4.0.
> 
> On Fri, Feb 15, 2019 at 10:48 AM Dinesh Joshi
>  wrote:
> 
>> Hey folks,
>> 
>> Just wanted to get a pulse on whether we can proceed with ZStd support.
>> The consensus on the ticket was that it’s a very valuable addition without
>> any risk of destabilizing 4.0. It’s ready to go if there aren’t any
>> objections.
>> 
>> Dinesh
>> 
>> 
>> 
> 
> -- 
> Jonathan Ellis
> co-founder, http://www.datastax.com
> @spyced
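
For the curious, the user-facing surface of the change is just a table-level
compression option. A rough sketch (keyspace and table names are hypothetical;
the class name and the compression_level option are as described on
CASSANDRA-14482, so verify against the ticket):

    public class ZstdCompressionExample {
        // compression_level trades CPU for ratio; 3 is the default cited on the ticket.
        static final String ENABLE_ZSTD =
            "ALTER TABLE ks.tbl WITH compression = " +
            "{'class': 'ZstdCompressor', 'compression_level': 3}";

        public static void main(String[] args) {
            System.out.println(ENABLE_ZSTD);
        }
    }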




Re: [VOTE] Release Apache Cassandra 3.0.18

2019-02-05 Thread Jeff Jirsa
+1

On Sat, Feb 2, 2019 at 4:32 PM Michael Shuler 
wrote:

> I propose the following artifacts for release as 3.0.18.
>
> sha1: edd52cef50a6242609a20d0d84c8eb74c580035e
> Git:
>
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/3.0.18-tentative
> Artifacts:
>
> https://repository.apache.org/content/repositories/orgapachecassandra-1171/org/apache/cassandra/apache-cassandra/3.0.18/
> Staging repository:
> https://repository.apache.org/content/repositories/orgapachecassandra-1171/
>
> The Debian and RPM packages are available here:
> http://people.apache.org/~mshuler
>
> The vote will be open for 72 hours (longer if needed).
>
> [1]: CHANGES.txt:
>
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/3.0.18-tentative
> [2]: NEWS.txt:
>
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/3.0.18-tentative
>
>


Re: [VOTE] Release Apache Cassandra 2.2.14

2019-02-05 Thread Jeff Jirsa
+1

On Sat, Feb 2, 2019 at 4:32 PM Michael Shuler 
wrote:

> I propose the following artifacts for release as 2.2.14.
>
> sha1: af91658353ba601fc8cd08627e8d36bac62e936a
> Git:
>
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/2.2.14-tentative
> Artifacts:
>
> https://repository.apache.org/content/repositories/orgapachecassandra-1172/org/apache/cassandra/apache-cassandra/2.2.14/
> Staging repository:
> https://repository.apache.org/content/repositories/orgapachecassandra-1172/
>
> The Debian and RPM packages are available here:
> http://people.apache.org/~mshuler
>
> The vote will be open for 72 hours (longer if needed).
>
> [1]: CHANGES.txt:
>
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/2.2.14-tentative
> [2]: NEWS.txt:
>
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/2.2.14-tentative
>
>


Re: [VOTE] Release Apache Cassandra 2.1.21

2019-02-05 Thread Jeff Jirsa
+1 (to the release, I see no reason to force this to be EOL; leaving the
branch open has zero cost, and if a serious enough patch comes up, we'll
likely be happy we have the option to fix it).


On Sat, Feb 2, 2019 at 4:32 PM Michael Shuler 
wrote:

> *EOL* release for the 2.1 series. There will be no new releases from the
> 'cassandra-2.1' branch after this release.
>
> 
>
> I propose the following artifacts for release as 2.1.21.
>
> sha1: 9bb75358dfdf1b9824f9a454e70ee2c02bc64a45
> Git:
>
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/2.1.21-tentative
> Artifacts:
>
> https://repository.apache.org/content/repositories/orgapachecassandra-1173/org/apache/cassandra/apache-cassandra/2.1.21/
> Staging repository:
> https://repository.apache.org/content/repositories/orgapachecassandra-1173/
>
> The Debian and RPM packages are available here:
> http://people.apache.org/~mshuler
>
> The vote will be open for 72 hours (longer if needed).
>
> [1]: CHANGES.txt:
>
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/2.1.21-tentative
> [2]: NEWS.txt:
>
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/2.1.21-tentative
>
>


Re: [VOTE] Release Apache Cassandra 3.11.4

2019-02-05 Thread Jeff Jirsa
+1

On Sat, Feb 2, 2019 at 4:38 PM Michael Shuler 
wrote:

> I propose the following artifacts for release as 3.11.4.
>
> sha1: fd47391aae13bcf4ee995abcde1b0e180372d193
> Git:
>
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/3.11.4-tentative
> Artifacts:
>
> https://repository.apache.org/content/repositories/orgapachecassandra-1170/org/apache/cassandra/apache-cassandra/3.11.4/
> Staging repository:
> https://repository.apache.org/content/repositories/orgapachecassandra-1170/
>
> The Debian and RPM packages are available here:
> http://people.apache.org/~mshuler
>
> The vote will be open for 72 hours (longer if needed).
>
> [1]: CHANGES.txt:
>
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/3.11.4-tentative
> [2]: NEWS.txt:
>
> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/3.11.4-tentative
>
>


Re: [VOTE] Release Apache Cassandra 3.0.18

2019-02-02 Thread Jeff Jirsa
There’s SO MUCH that needs to go out, let’s just get out what we have 

-- 
Jeff Jirsa


> On Feb 2, 2019, at 5:35 PM, Benedict Elliott Smith  
> wrote:
> 
> CASSANDRA-14812 should probably land in this release, given that it is a 
> critical bug, has been patch available for a while, and is relatively simple.
> 
> That said, we are sorely due a release.  But if we go ahead without it, we 
> should follow up soon after.
> 
> 
>> On 3 Feb 2019, at 00:32, Michael Shuler  wrote:
>> 
>> I propose the following artifacts for release as 3.0.18.
>> 
>> sha1: edd52cef50a6242609a20d0d84c8eb74c580035e
>> Git:
>> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/3.0.18-tentative
>> Artifacts:
>> https://repository.apache.org/content/repositories/orgapachecassandra-1171/org/apache/cassandra/apache-cassandra/3.0.18/
>> Staging repository:
>> https://repository.apache.org/content/repositories/orgapachecassandra-1171/
>> 
>> The Debian and RPM packages are available here:
>> http://people.apache.org/~mshuler
>> 
>> The vote will be open for 72 hours (longer if needed).
>> 
>> [1]: CHANGES.txt:
>> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/3.0.18-tentative
>> [2]: NEWS.txt:
>> https://gitbox.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/3.0.18-tentative
>> 
> 
> 
> 




Re: SSTable exclusion from read path based on sstable metadata marked by custom compaction strategies

2019-02-01 Thread Jeff Jirsa
Iterate over all of the possible time buckets.


On Fri, Feb 1, 2019 at 1:36 PM Carl Mueller
 wrote:

> I'd still need an "all events for app_id" query. We have seconds-level
> events :-(
>
>
> On Fri, Feb 1, 2019 at 3:02 PM Jeff Jirsa  wrote:
>
> > On Fri, Feb 1, 2019 at 12:58 PM Carl Mueller
> >  wrote:
> >
> > > Jeff: so the partition key with timestamp would then need a separate
> > index
> > > table to track the appid->partition keys. Which isn't horrible, but
> also
> > > tracks into another desire of mine: some way to make the replica
> mapping
> > > match locally between the index table and the data table:
> > >
> > > So in the composite partition key for the TWCS table, you'd have
> app_id +
> > > timestamp, BUT ONLY THE app_id GENERATES the hash/key.
> > >
> > >
> > Huh? No, you'd have a composite partition key of app_id + timestamp
> > ROUNDED/CEIL/FLOOR to some time window, and both would be used for
> > hash/key.
> >
> > And you don't need any extra table, because app_id is known and the
> > timestamp can be calculated (e.g., 4 digits of year + 3 digits for day of
> > year makes today 2019032 )
> >
> >
> >
> > > Thus it would match with the index table that is just partition key
> > app_id,
> > > column key timestamp.
> > >
> > > And then theoretically a node-local "join" could be done without an
> > > additional query hop, and batched updates would be more easily atomic
> to
> > a
> > > single node.
> > >
> > > Now how we would communicate all that in CQL/etc: who knows. Hm. Maybe
> > > materialized views cover this, but I haven't tracked that since we
> don't
> > > have versions that support them and they got "deprecated".
> > >
> > >
> > > On Fri, Feb 1, 2019 at 2:53 PM Carl Mueller <
> > carl.muel...@smartthings.com>
> > > wrote:
> > >
> > > > Interesting. Now that we have semiautomated upgrades, we are going to
> > > > hopefully get everything to 3.11.x once we get the intermediate hop to
> > > 2.2.
> > > >
> > > > I'm thinking we could also use sstable metadata markings + custom
> > > > compactors for things like multiple customers on the same table. So
> you
> > > > could sequester the data for a customer in their own sstables and
> then
> > > > queries could effectively be subdivided against only the sstables
> that
> > > had
> > > > that customer. Maybe the min and max would cover that, I'd have to
> look
> > > at
> > > > the details.
> > > >
> > > > On Thu, Jan 31, 2019 at 8:11 PM Jonathan Haddad 
> > > wrote:
> > > >
> > > >> In addition to what Jeff mentioned, there was an optimization in 3.4
> > > that
> > > >> can significantly reduce the number of sstables accessed when a
> LIMIT
> > > >> clause was used.  This can be a pretty big win with TWCS.
> > > >>
> > > >>
> > > >>
> > >
> >
> http://thelastpickle.com/blog/2017/03/07/The-limit-clause-in-cassandra-might-not-work-as-you-think.html
> > > >>
> > > >> On Thu, Jan 31, 2019 at 5:50 PM Jeff Jirsa 
> wrote:
> > > >>
> > > >> > In my original TWCS talk a few years back, I suggested that people
> > > make
> > > >> > the partitions match the time window to avoid exactly what you’re
> > > >> > describing. I added that to the talk because my first team that
> used
> > > >> TWCS
> > > >> > (the team for which I built TWCS) had a data model not unlike
> yours,
> > > and
> > > >> > the read-every-sstable thing turns out not to work that well if
> you
> > > have
> > > >> > lots of windows (or very large partitions). If you do this, you
> can
> > > fan
> > > >> out
> > > >> > a bunch of async reads for the first few days and ask for more as
> > you
> > > >> need
> > > >> > to fill the page - this means the reads are more distributed, too,
> > > >> which is
> > > >> > an extra bonus when you have noisy partitions.
> > > >> >
> > > >> > In 3.0 and newer (I think, don’t quote me in the specific
> version),
> > > the
> > > >> > sstable metadata has the min and max clustering which helps exclude
> > > >> > sstables from the read path quite well if everything in the table is
> > > >> > using timestamp clustering columns.

Re: SSTable exclusion from read path based on sstable metadata marked by custom compaction strategies

2019-02-01 Thread Jeff Jirsa
FWIW you can skip 2.2 and go 2.1 -> 3.11. I would wait for 3.11.4 though.



On Fri, Feb 1, 2019 at 12:53 PM Carl Mueller
 wrote:

> Interesting. Now that we have semiautomated upgrades, we are going to
> hopefully get everything to 3.11.x once we get the intermediate hop to 2.2.
>
> I'm thinking we could also use sstable metadata markings + custom
> compactors for things like multiple customers on the same table. So you
> could sequester the data for a customer in their own sstables and then
> queries could effectively be subdivided against only the sstables that had
> that customer. Maybe the min and max would cover that, I'd have to look at
> the details.
>
> On Thu, Jan 31, 2019 at 8:11 PM Jonathan Haddad  wrote:
>
> > In addition to what Jeff mentioned, there was an optimization in 3.4 that
> > can significantly reduce the number of sstables accessed when a LIMIT
> > clause was used.  This can be a pretty big win with TWCS.
> >
> >
> >
> http://thelastpickle.com/blog/2017/03/07/The-limit-clause-in-cassandra-might-not-work-as-you-think.html
> >
> > On Thu, Jan 31, 2019 at 5:50 PM Jeff Jirsa  wrote:
> >
> > > In my original TWCS talk a few years back, I suggested that people make
> > > the partitions match the time window to avoid exactly what you’re
> > > describing. I added that to the talk because my first team that used
> TWCS
> > > (the team for which I built TWCS) had a data model not unlike yours,
> and
> > > the read-every-sstable thing turns out not to work that well if you
> have
> > > lots of windows (or very large partitions). If you do this, you can fan
> > out
> > > a bunch of async reads for the first few days and ask for more as you
> > need
> > > to fill the page - this means the reads are more distributed, too,
> which
> > is
> > > an extra bonus when you have noisy partitions.
> > >
> > > In 3.0 and newer (I think, don’t quote me in the specific version), the
> > > sstable metadata has the min and max clustering which helps exclude
> > > sstables from the read path quite well if everything in the table is
> > using
> > > timestamp clustering columns. I know there was some issue with this and
> > RTs
> > > recently, so I’m not sure if it’s current state, but worth considering
> > that
> > > this may be much better on 3.0+
> > >
> > >
> > >
> > > --
> > > Jeff Jirsa
> > >
> > >
> > > > On Jan 31, 2019, at 1:56 PM, Carl Mueller <
> > carl.muel...@smartthings.com.invalid>
> > > wrote:
> > > >
> > > > Situation:
> > > >
> > > > We use TWCS for a task history table (partition is user, column key
> is
> > > > timeuuid of task, TWCS is used due to tombstone TTLs that rotate out
> > the
> > > > tasks every say month. )
> > > >
> > > > However, if we want to get a "slice" of tasks (say, tasks in the last
> > two
> > > > days and we are using TWCS sstable blocks of 12 hours).
> > > >
> > > > The problem is, this is a frequent user and they have tasks in ALL
> the
> > > > sstables that are organized by the TWCS into time-bucketed sstables.
> > > >
> > > > So Cassandra has to first read in, say 80 sstables to reconstruct the
> > > row,
> > > > THEN it can exclude/slice on the column key.
> > > >
> > > > Question:
> > > >
> > > > Or am I wrong that the read path needs to grab all relevant sstables
> > > before
> > > > applying column key slicing and this is possible? Admittedly we are
> in
> > > 2.1
> > > > for this table (we're in the process of upgrading now that we have an
> > > > automated upgrading program that seems to work pretty well)
> > > >
> > > > If my assumption is correct, then the compaction strategy knows as it
> > > > writes the sstables what it is bucketing them as (and could encode in
> > > > sstable metadata?). If my assumption about slicing is that the whole
> > row
> > > > needs reconstruction, if we had a perfect infinite monkey coding team
> > > that
> > > > could generate whatever we wanted within some feasibility, could we
> > > provide
> > > > special hooks to do sstable exclusion based on metadata if we know
> that
> > > > that the metadata will indicate exclusion/inclusion of columns based
> on
> > > > metadata?
> > > >
&

Re: SSTable exclusion from read path based on sstable metadata marked by custom compaction strategies

2019-02-01 Thread Jeff Jirsa
On Fri, Feb 1, 2019 at 12:58 PM Carl Mueller
 wrote:

> Jeff: so the partition key with timestamp would then need a separate index
> table to track the appid->partition keys. Which isn't horrible, but also
> tracks into another desire of mine: some way to make the replica mapping
> match locally between the index table and the data table:
>
> So in the composite partition key for the TWCS table, you'd have app_id +
> timestamp, BUT ONLY THE app_id GENERATES the hash/key.
>
>
Huh? No, you'd have a composite partition key of app_id + timestamp
ROUNDED/CEIL/FLOOR to some time window, and both would be used for hash/key.

And you don't need any extra table, because app_id is known and the
timestamp can be calculated (e.g., 4 digits of year + 3 digits for day of
year makes today 2019032 )
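
A minimal sketch of that bucket computation in plain Java, assuming the
composite partition key layout described above:

    import java.time.LocalDate;
    import java.time.format.DateTimeFormatter;

    public class TimeBucket {
        // 4 digits of year + 3 digits of day-of-year, e.g. 2019-02-01 -> 2019032
        private static final DateTimeFormatter YEAR_DAY = DateTimeFormatter.ofPattern("yyyyDDD");

        static int dayBucket(LocalDate date) {
            return Integer.parseInt(date.format(YEAR_DAY));
        }

        public static void main(String[] args) {
            System.out.println(dayBucket(LocalDate.of(2019, 2, 1))); // prints 2019032
        }
    }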



> Thus it would match with the index table that is just partition key app_id,
> column key timestamp.
>
> And then theoretically a node-local "join" could be done without an
> additional query hop, and batched updates would be more easily atomic to a
> single node.
>
> Now how we would communicate all that in CQL/etc: who knows. Hm. Maybe
> materialized views cover this, but I haven't tracked that since we don't
> have versions that support them and they got "deprecated".
>
>
> On Fri, Feb 1, 2019 at 2:53 PM Carl Mueller 
> wrote:
>
> > Interesting. Now that we have semiautomated upgrades, we are going to
> > hopefully get everything to 3.11.x once we get the intermediate hop to
> 2.2.
> >
> > I'm thinking we could also use sstable metadata markings + custom
> > compactors for things like multiple customers on the same table. So you
> > could sequester the data for a customer in their own sstables and then
> > queries could effectively be subdivided against only the sstables that
> had
> > that customer. Maybe the min and max would cover that, I'd have to look
> at
> > the details.
> >
> > On Thu, Jan 31, 2019 at 8:11 PM Jonathan Haddad 
> wrote:
> >
> >> In addition to what Jeff mentioned, there was an optimization in 3.4
> that
> >> can significantly reduce the number of sstables accessed when a LIMIT
> >> clause was used.  This can be a pretty big win with TWCS.
> >>
> >>
> >>
> http://thelastpickle.com/blog/2017/03/07/The-limit-clause-in-cassandra-might-not-work-as-you-think.html
> >>
> >> On Thu, Jan 31, 2019 at 5:50 PM Jeff Jirsa  wrote:
> >>
> >> > In my original TWCS talk a few years back, I suggested that people
> make
> >> > the partitions match the time window to avoid exactly what you’re
> >> > describing. I added that to the talk because my first team that used
> >> TWCS
> >> > (the team for which I built TWCS) had a data model not unlike yours,
> and
> >> > the read-every-sstable thing turns out not to work that well if you
> have
> >> > lots of windows (or very large partitions). If you do this, you can
> fan
> >> out
> >> > a bunch of async reads for the first few days and ask for more as you
> >> need
> >> > to fill the page - this means the reads are more distributed, too,
> >> which is
> >> > an extra bonus when you have noisy partitions.
> >> >
> >> > In 3.0 and newer (I think, don’t quote me in the specific version),
> the
> >> > sstable metadata has the min and max clustering which helps exclude
> >> > sstables from the read path quite well if everything in the table is
> >> using
> >> > timestamp clustering columns. I know there was some issue with this
> and
> >> RTs
> >> > recently, so I’m not sure if it’s current state, but worth considering
> >> that
> >> > this may be much better on 3.0+
> >> >
> >> >
> >> >
> >> > --
> >> > Jeff Jirsa
> >> >
> >> >
> >> > > On Jan 31, 2019, at 1:56 PM, Carl Mueller <
> >> carl.muel...@smartthings.com.invalid>
> >> > wrote:
> >> > >
> >> > > Situation:
> >> > >
> >> > > We use TWCS for a task history table (partition is user, column key
> is
> >> > > timeuuid of task, TWCS is used due to tombstone TTLs that rotate out
> >> the
> >> > > tasks every say month. )
> >> > >
> >> > > However, if we want to get a "slice" of tasks (say, tasks in the
> last
> >> two
> >> > > days and we are using TWCS sstable blocks of 12 hours).
> >> > >

Re: SSTable exclusion from read path based on sstable metadata marked by custom compaction strategies

2019-01-31 Thread Jeff Jirsa
In my original TWCS talk a few years back, I suggested that people make the 
partitions match the time window to avoid exactly what you’re describing. I 
added that to the talk because my first team that used TWCS (the team for which 
I built TWCS) had a data model not unlike yours, and the read-every-sstable 
thing turns out not to work that well if you have lots of windows (or very 
large partitions). If you do this, you can fan out a bunch of async reads for 
the first few days and ask for more as you need to fill the page - this means 
the reads are more distributed, too, which is an extra bonus when you have 
noisy partitions.
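
A sketch of that fan-out using plain JDK futures; fetchBucket is a hypothetical
stand-in for a per-bucket async SELECT through whatever driver is in use:

    import java.time.LocalDate;
    import java.time.format.DateTimeFormatter;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.CompletableFuture;

    public class BucketFanOut {
        static final DateTimeFormatter YEAR_DAY = DateTimeFormatter.ofPattern("yyyyDDD");

        // Hypothetical async read of one (app_id, bucket) partition; in real code
        // this would be a driver call returning rows for that time window.
        static CompletableFuture<List<String>> fetchBucket(String appId, int bucket, int limit) {
            return CompletableFuture.completedFuture(new ArrayList<>());
        }

        static List<String> readPage(String appId, LocalDate newestDay, int pageSize, int maxBuckets) {
            List<CompletableFuture<List<String>>> futures = new ArrayList<>();
            // Fan out the async reads up front, newest window first...
            for (int i = 0; i < maxBuckets; i++) {
                int bucket = Integer.parseInt(newestDay.minusDays(i).format(YEAR_DAY));
                futures.add(fetchBucket(appId, bucket, pageSize));
            }
            // ...then drain them in time order until the page is full.
            List<String> page = new ArrayList<>();
            for (CompletableFuture<List<String>> f : futures) {
                if (page.size() >= pageSize)
                    break;
                page.addAll(f.join());
            }
            return page;
        }
    }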

In 3.0 and newer (I think, don’t quote me in the specific version), the sstable 
metadata has the min and max clustering which helps exclude sstables from the 
read path quite well if everything in the table is using timestamp clustering 
columns. I know there was some issue with this and RTs recently, so I’m not 
sure if it’s current state, but worth considering that this may be much better 
on 3.0+
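
The exclusion itself is conceptually an interval-overlap test against each
sstable's min/max metadata. A toy model (not the actual SSTableReader API):

    import java.util.ArrayList;
    import java.util.List;

    public class ClusteringExclusion {
        static class SSTable {
            final long minClustering, maxClustering; // from the sstable's metadata
            SSTable(long min, long max) { minClustering = min; maxClustering = max; }
        }

        // Keep only sstables whose [min, max] clustering range intersects the slice.
        static List<SSTable> candidates(List<SSTable> all, long sliceStart, long sliceEnd) {
            List<SSTable> hits = new ArrayList<>();
            for (SSTable t : all)
                if (t.maxClustering >= sliceStart && t.minClustering <= sliceEnd)
                    hits.add(t);
            return hits;
        }

        public static void main(String[] args) {
            List<SSTable> windows = List.of(new SSTable(0, 10), new SSTable(11, 20), new SSTable(21, 30));
            System.out.println(candidates(windows, 12, 15).size()); // 1: only the middle window
        }
    }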



-- 
Jeff Jirsa


> On Jan 31, 2019, at 1:56 PM, Carl Mueller 
>  wrote:
> 
> Situation:
> 
> We use TWCS for a task history table (partition is user, column key is
> timeuuid of task, TWCS is used due to tombstone TTLs that rotate out the
> tasks every say month. )
> 
> However, if we want to get a "slice" of tasks (say, tasks in the last two
> days and we are using TWCS sstable blocks of 12 hours).
> 
> The problem is, this is a frequent user and they have tasks in ALL the
> sstables that are organized by the TWCS into time-bucketed sstables.
> 
> So Cassandra has to first read in, say 80 sstables to reconstruct the row,
> THEN it can exclude/slice on the column key.
> 
> Question:
> 
> Or am I wrong that the read path needs to grab all relevant sstables before
> applying column key slicing and this is possible? Admittedly we are in 2.1
> for this table (we're in the process of upgrading now that we have an
> automated upgrading program that seems to work pretty well)
> 
> If my assumption is correct, then the compaction strategy knows as it
> writes the sstables what it is bucketing them as (and could encode in
> sstable metadata?). If my assumption about slicing is that the whole row
> needs reconstruction, if we had a perfect infinite monkey coding team that
> could generate whatever we wanted within some feasibility, could we provide
> special hooks to do sstable exclusion based on metadata if we know that
> that the metadata will indicate exclusion/inclusion of columns based on
> metadata?
> 
> Goal:
> 
> The overall goal would be to support exclusion of sstables from a read
> path, in case we had compaction strategies hand-tailored for other queries.
> Essentially we would be doing a first-pass bucketsort exclusion with the
> sstable metadata marking the buckets. This might aid support of superwide
> rows and paging through column keys if we allowed the table creator to
> specify bucketing as flushing occurs. In general it appears query
> performance quickly degrades based on # sstables required for a lookup.
> 
> I still don't know the code nearly well enough to do patches, it would seem
> based on my looking at custom compaction strategies and the basic read path
> that this would be a useful extension for advanced users.
> 
> The fallback would be a set of tables to serve as buckets and we span the
> buckets with queries when one bucket runs out. The tables rotate.




Re: Warn about SASI usage and allow to disable them

2019-01-16 Thread Jeff Jirsa
The cost is in how many users you scare away 

-- 
Jeff Jirsa


> On Jan 16, 2019, at 2:34 PM, Brandon Williams  wrote:
> 
> Also it costs us nothing to add it.
> 
>> On Wed, Jan 16, 2019 at 4:29 PM Jonathan Haddad  wrote:
>> 
>> I'm +1 on the warning for two reasons.
>> 
>>> A cqlsh warning only applies to those that create the sasi via cqlsh.
>> 
>> 1. When people are creating their schemas in development, this is usually
>> the first step.  You use the REPL to figure out what you need, then you
>> copy your schema somewhere else.  The warning here should prevent a lot of
>> folks from making a serious mistake.
>> 
>> 2. It's consistent with how we warn when people try to use materialized
>> views.
>> 
>> 
>> 
>> 
>>> On Wed, Jan 16, 2019 at 2:15 PM Mick Semb Wever  wrote:
>>> 
>>> Regarding the warning, we might add it at least in 3.11, since for that
>>> version the property to enable SASI is going to be present but not
>> disabled
>>> by default. WDYT?
>>> 
>>> 
>>> I'm  -0 on this.
>>> 
>>> A single line warning in the logs on the sasi creation won't be noticed
>> by
>>> many users.
>>> A cqlsh warning only applies to those that create the sasi via cqlsh.
>>> And we're not talking about patching client drivers to generate a warning
>>> there.
>>> 
>>> So I'd be happy with a yaml comment on the config flag explaining that
>>> it's a beta feature and that users should check open tickets and
>> understand
>>> current limitations on sasi before using them.
>>> 
>>> regards,
>>> Mick
>>> 
>>> 
>>> 
>> 
>> --
>> Jon Haddad
>> http://www.rustyrazorblade.com
>> twitter: rustyrazorblade
>> 




Re: Warn about SASI usage and allow to disable them

2019-01-14 Thread Jeff Jirsa
+1 on config
-0 on warning 
-0 on disabling by default


-- 
Jeff Jirsa


> On Jan 14, 2019, at 9:22 PM, Taylor Cressy  wrote:
> 
> +1 on config. +1 on disabling. 
> 
> +1 on applying it to materialized views as well. 
> 
>> On Jan 14, 2019, at 17:29, Joshua McKenzie  wrote:
>> 
>> +1 on config change, +1 on disabling, and so long as the comments make the
>> limitations and risks extremely clear, I'm fine w/out the client warning.
>> 
>> On Mon, Jan 14, 2019 at 12:28 PM Andrés de la Peña 
>> wrote:
>> 
>>> I mean disabling the creation of new SASI indices with CREATE INDEX
>>> statement, the existing indexes would continue working. The CQL client
>>> warning will be thrown with that creation statement as well (if they are
>>> enabled).
>>> 
>>>> On Mon, 14 Jan 2019 at 20:18, Jeff Jirsa  wrote:
>>>> 
>>>> When we say disable, do you mean disable creation of new SASI indices, or
>>>> disable using existing ones? I assume it's just creation of new?
>>>> 
>>>> On Mon, Jan 14, 2019 at 11:19 AM Andrés de la Peña <
>>>> a.penya.gar...@gmail.com>
>>>> wrote:
>>>> 
>>>>> Hello all,
>>>>> 
>>>>> It is my understanding that SASI is still to be considered an
>>>>> experimental/beta feature, and they apparently are not being very
>>>> actively
>>>>> developed. Some highlighted problems in SASI are:
>>>>> 
>>>>> - OOMs during flush, as it is described in CASSANDRA-12662
>>>>> - General secondary index consistency problems described in
>>>> CASSANDRA-8272.
>>>>> There is a pending-review patch addressing the problem for regular 2i.
>>>>> However, the proposed solution is based on indexing tombstones. SASI
>>>>> doesn't index tombstones, so it wouldn't be entirely trivial to extend
>>>> the
>>>>> approach to SASI.
>>>>> - Probably insufficient testing. As far as I know, we don't have a
>>> single
>>>>> dtest for SASI nor tests dealing with large SSTables.
>>>>> 
>>>>> Similarly to what CASSANDRA-13959 did with materialized views,
>>>>> CASSANDRA-14866 aims to throw a native protocol warning about SASI
>>>>> experimental state, and to add a config property to disable them.
>>> Perhaps
>>>>> this property could be disabled by default in trunk. This should raise
>>>>> awareness about SASI maturity until we let them in a more stable state.
>>>>> 
>>>>> The purpose for this thread is discussing whether we want to add this
>>>>> warning, the config property and, more controversially, if we want to
>>> set
>>>>> SASI as disabled by default in trunk.
>>>>> 
>>>>> WDYT?
>>>>> 
>>>> 
>>> 
> 
> 




Re: Warn about SASI usage and allow to disable them

2019-01-14 Thread Jeff Jirsa
When we say disable, do you mean disable creation of new SASI indices, or
disable using existing ones? I assume it's just creation of new?
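
A toy sketch of the distinction being drawn here, as a creation-time guard
only, so existing indexes keep working; everything below is hypothetical
except the SASIIndex class name:

    public class SasiCreationGuard {
        // Stand-in for the proposed yaml property (name hypothetical).
        static volatile boolean sasiEnabled = false;

        static void validateCreateIndex(String usingClass) {
            if (!sasiEnabled && usingClass.endsWith("sasi.SASIIndex"))
                throw new IllegalStateException(
                    "SASI is experimental and disabled; existing indexes still serve reads");
        }

        public static void main(String[] args) {
            // As issued by: CREATE CUSTOM INDEX ... USING 'org.apache.cassandra.index.sasi.SASIIndex'
            validateCreateIndex("org.apache.cassandra.index.sasi.SASIIndex"); // throws
        }
    }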

On Mon, Jan 14, 2019 at 11:19 AM Andrés de la Peña 
wrote:

> Hello all,
>
> It is my understanding that SASI is still to be considered an
> experimental/beta feature, and they apparently are not being very actively
> developed. Some highlighted problems in SASI are:
>
> - OOMs during flush, as it is described in CASSANDRA-12662
> - General secondary index consistency problems described in CASSANDRA-8272.
> There is a pending-review patch addressing the problem for regular 2i.
> However, the proposed solution is based on indexing tombstones. SASI
> doesn't index tombstones, so it wouldn't be entirely trivial to extend the
> approach to SASI.
> - Probably insufficient testing. As far as I know, we don't have a single
> dtest for SASI nor tests dealing with large SSTables.
>
> Similarly to what CASSANDRA-13959 did with materialized views,
> CASSANDRA-14866 aims to throw a native protocol warning about SASI
> experimental state, and to add a config property to disable them. Perhaps
> this property could be disabled by default in trunk. This should raise
> awareness about SASI maturity until we let them in a more stable state.
>
> The purpose for this thread is discussing whether we want to add this
> warning, the config property and, more controversially, if we want to set
> SASI as disabled by default in trunk.
>
> WDYT?
>


Re: Who should be in our distribution KEYS file?

2019-01-07 Thread Jeff Jirsa
I don't think it's awkward; I think a lot of us know there are serious bugs
and we need a release, but we keep finding other bugs and it's super
tempting to say "one more fix".

We should probably just cut the next 3.0.x and 3.11.x though, because there are
some nasty bugs hiding in there that the testing for 4.0 has uncovered.


On Mon, Jan 7, 2019 at 2:14 PM Jonathan Haddad  wrote:

> > I don't understand how adding keys changes release frequency. Did
> someone request a release to be made or are we on some assumed date
> interval?
>
> I don't know if it would (especially by itself), I just know that if more
> people are able to do releases that's more opportunity to do so.
>
> I think getting more folks involved in the release process is a good idea
> for other reasons.  People take vacations, there's job conflicts, there's
> life stuff (kids usually take priority), etc.
>
> The last release of 3.11 was almost half a year ago, and there's 30+ bug
> fixes in the 3.11 branch.
>
> > Did someone request a release to be made or are we on some assumed date
> interval?
>
> I can't recall (and a search didn't find) anyone asking for a 3.11.4
> release, but I think part of the point is that requesting a release from a
> static release manager is a sign of a flaw in the release process.
>
> On a human note, it feels a little awkward asking for a release.  I might
> be alone on this though.
>
> Jon
>
>
> On Mon, Jan 7, 2019 at 1:16 PM Michael Shuler 
> wrote:
>
> > Mick and I have discussed this previously, but I don't recall if it was
> > email or irc. Apologies if I was unable to describe the problem to a
> > point of general understanding.
> >
> > To reiterate the problem, changing gpg signature keys screws our debian
> > and redhat package repositories for all users. Tarballs are not
> > installed with a client that checks signatures in a known trust
> > database. When the gpg key signer changes, users need to modify their trust
> > on every node, importing new key(s), in order for packages to
> > install/upgrade with apt or yum.
> >
> > I don't understand how adding keys changes release frequency. Did
> > someone request a release to be made or are we on some assumed date
> > interval?
> >
> > Michael
> >
> > On 1/7/19 2:30 PM, Jonathan Haddad wrote:
> > > That's a good point.  Looking at the ASF docs I had assumed the release
> > > manager was per-project, but on closer inspection it appears to be
> > > per-release.  You're right, it does say that it can be any committer.
> > >
> > > http://www.apache.org/dev/release-publishing.html#release_manager
> > >
> > > We definitely need more frequent releases, if this is the first step
> > > towards that goal, I think it's worth it.
> > >
> > > Glad you brought this up!
> > > Jon
> > >
> > >
> > > On Mon, Jan 7, 2019 at 11:58 AM Mick Semb Wever 
> wrote:
> > >
> > >>
> > >>
> > >>> I don't see any reason to have any keys in there, except from release
> > >>> managers who are signing releases.
> > >>
> > >>
> > >>> Shouldn't any PMC (or committer) be able to be a release
> manager?
> > >>
> > >> The release process should be reliable and reproducible enough to be
> > safe
> > >> for rotating release managers every release. I would have thought
> > security
> > >> concerns were better addressed by a more tested process? And AFAIK no
> > other
> > >> asf projects are as restrictive on who can be the release manager role
> > (but
> > >> i've only checked a few projects).
> > >>
> > >>
> > >>
> > >>
> > >>
> > >
> >
> >
> >
> >
>
> --
> Jon Haddad
> http://www.rustyrazorblade.com
> twitter: rustyrazorblade
>


Re: Git Repo Migration

2019-01-04 Thread Jeff Jirsa
+1



-- 
Jeff Jirsa


> On Jan 4, 2019, at 2:49 AM, Sam Tunnicliffe  wrote:
> 
> As per the announcement on 7th December 2018[1], ASF infra are planning to 
> shutdown the service behind git-wip-us.apache.org and migrate all existing 
> repos to gitbox.apache.org 
> 
> There are further details in the original mail, but apparently one of the 
> benefits of the migration is that we'll have full write access via Github, 
> including the ability finally to close PRs.

Fwiw we can sorta close PRs now (on commit via commit msg and through infra 
ticket)

> This affects the cassandra, cassandra-dtest and cassandra-build repos (but 
> not the new cassandra-sidecar repo).
> 
> A pre-requisite of the migration is to demonstrate consensus within the 
> community, so to satisfy that formality I'm starting this thread to gather 
> any objections or specific requests regarding the timing of the move.
> 
> I'll collate responses in a week or so and file the necessary INFRA Jira.
> 
> Thanks,
> Sam
> 
> [1] 
> https://lists.apache.org/thread.html/667772efdabf49a0a23d585539c127f335477e033f1f9b6f5079aced@%3Cdev.cassandra.apache.org%3E
> 
> 
> 




Re: Question about PartitionUpdate.singleRowUpdate()

2018-12-19 Thread Jeff Jirsa
Definitely worth a JIRA. Suspect it may be slow to get a response this
close to the holidays, but a JIRA will be a bit more durable than the
mailing list post.


On Wed, Dec 19, 2018 at 1:58 PM Sam Klock  wrote:

> Cassandra devs,
>
> I have a question about the implementation of
> PartitionUpdate.singleRowUpdate(), in particular the choice to use
> EncodingStats.NO_STATS when building the resulting PartitionUpdate.  Is
> there a functional reason for that -- i.e., is it safe to modify it to
> use an EncodingStats built from deletionInfo, row, and staticRow?
>
> Context: under 3.0.17, we have a table using TWCS and a secondary index.
> We've been having a problem with the sstables for the index lingering
> essentially forever, despite the correlated sstables for the parent
> table being removed pretty much when we expect them to.  We traced the
> problem to the use of EncodingStats.NO_STATS in singleRowUpdate(), which
> is being used to create the index updates when we write to the parent
> table.  It appears that NO_STATS is making Cassandra think the memtables
> for the index have data from September 2015 in them, which in turn
> prevents it from dropping expired sstables (all of which are much more
> recent than that) for the index.
>
> Experimentally, modifying singleRowUpdate() to build an EncodingStats
> from its inputs (plus the MutableDeletionInfo it creates) seems to fix
> the problem.  We don't have any insight into why the existing logic uses
> NO_STATS, however, so we don't know if this change is really safe.  Does
> it sound like we're on the right track?  (Also: I'm sure we'd be happy
> to open an issue and submit a patch if this sounds like it would be
> useful generally.)
>
> Thanks,
> SK
>
>
>
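
To make the failure mode concrete: encoding stats merge by taking minima, and
NO_STATS carries a fixed epoch of roughly 2015-09-22 (hence the "September
2015" data described above). A toy model in plain Java, not Cassandra code:

    import java.time.Instant;

    public class NoStatsToy {
        // EncodingStats' epoch is roughly 2015-09-22, matching the "September 2015" above.
        static final long EPOCH_MICROS = Instant.parse("2015-09-22T00:00:00Z").toEpochMilli() * 1000;

        final long minTimestampMicros;
        NoStatsToy(long minTimestampMicros) { this.minTimestampMicros = minTimestampMicros; }

        static final NoStatsToy NO_STATS = new NoStatsToy(EPOCH_MICROS);

        // Stats merge by taking minima, so NO_STATS drags the merged minimum back to 2015.
        NoStatsToy mergeWith(NoStatsToy other) {
            return new NoStatsToy(Math.min(minTimestampMicros, other.minTimestampMicros));
        }

        public static void main(String[] args) {
            NoStatsToy recent = new NoStatsToy(System.currentTimeMillis() * 1000L);
            // true: expired-sstable checks now think the index memtable overlaps
            // every old time window, so nothing ever looks fully expired.
            System.out.println(recent.mergeWith(NO_STATS).minTimestampMicros == EPOCH_MICROS);
        }
    }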


Re: [VOTE] Change Jira Workflow

2018-12-17 Thread Jeff Jirsa
+1

-- 
Jeff Jirsa


> On Dec 17, 2018, at 7:19 AM, Benedict Elliott Smith  
> wrote:
> 
> I propose these changes 
> <https://cwiki.apache.org/confluence/display/CASSANDRA/JIRA+Workflow+Proposals>*
>  to the Jira Workflow for the project.  The vote will be open for 72 hours**.
> 
> I am, of course, +1.
> 
> * With the addendum of the mailing list discussion 
> <https://lists.apache.org/thread.html/e4668093169aa4ef52f2bea779333f04a0afde8640c9a79a8c86ee74@%3Cdev.cassandra.apache.org%3E>;
>  in case of any conflict arising from a mistake on my part in the wiki, the 
> consensus reached by polling the mailing list will take precedence.
> ** I won’t be around to close the vote, as I will be on vacation.  Everyone 
> is welcome to ignore the result until I get back in a couple of weeks, or if 
> anybody is eager feel free to close the vote and take some steps towards 
> implementation.




Re: Inter-node messaging latency

2018-11-28 Thread Jeff Jirsa
Are you sure you’re blocked on internode and not commitlog? Batch is typically 
not what people expect (group commitlog in 4.0 is probably closer to what you 
think batch does).

-- 
Jeff Jirsa
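
A self-contained toy of the enqueue-to-receive measurement described in the
quoted mail below, with a single consumer thread standing in for the
one-thread-per-direction messaging; plain JDK, not Cassandra's MessagingService:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.atomic.LongAdder;

    public class HandoffLatencyToy {
        public static void main(String[] args) throws InterruptedException {
            ExecutorService inbound = Executors.newSingleThreadExecutor(); // one "small message" thread
            LongAdder totalNanos = new LongAdder();
            int messages = 100_000;
            for (int i = 0; i < messages; i++) {
                long enqueuedAt = System.nanoTime();
                inbound.submit(() -> totalNanos.add(System.nanoTime() - enqueuedAt));
            }
            inbound.shutdown();
            inbound.awaitTermination(1, TimeUnit.MINUTES);
            // Latency here is queueing delay; it climbs as submitters outpace the consumer.
            System.out.printf("avg enqueue->receive: %.1f us%n", totalNanos.sum() / 1e3 / messages);
        }
    }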


> On Nov 27, 2018, at 10:55 PM, Yuji Ito  wrote:
> 
> Hi,
> 
> Thank you for the reply.
> I've measured LWT throughput in 4.0.
> 
> I used the cassandra-stress tool to insert rows with LWT for 3 minutes on 
> i3.xlarge and i3.4xlarge.
> For 3.11, I modified the tool to support LWT.
> Before each measurement, I cleaned up all Cassandra data.
> 
> The throughput in 4.0 is 5% faster than 3.11.
> The CPU load of i3.4xlarge (16 vCPUs) is only up to 75% in both versions.
> Also, the i3.4xlarge throughput was less than 4 times that of i3.xlarge.
> I think the throughput wasn't bounded by CPU in 4.0 either.
> 
> The CPU load of i3.4xlarge is up to 80% with non-LWT writes.
> 
> I wonder what the bottleneck for writes on a many-core machine is, if the 
> messaging issue has been resolved in 4.0.
> Is there any parameter I can change so that inserts use the full CPU?
> 
> # LWT insert
> * Cassandra 3.11.3
> | instance type | # of threads | concurrent_writes | Throughput [op/s] |
> | i3.xlarge |   64 |32 |  2815 |
> |i3.4xlarge |  256 |   128 |  9506 |
> |i3.4xlarge |  512 |   256 | 10540 |
> 
> * Cassandra 4.0 (trunk)
> | instance type | # of threads | concurrent_writes | Throughput [op/s] |
> | i3.xlarge |   64 |32 |  2951 |
> |i3.4xlarge |  256 |   128 |  9816 |
> |i3.4xlarge |  512 |   256 | 11055 |
> 
> * Environment
> - 3 node cluster
> - Replication factor: 3
> - Node instance: AWS EC2 i3.xlarge / i3.4xlarge
> 
> * C* configuration
> - Apache Cassandra 3.11.3 / 4.0 (trunk)
> - commitlog_sync: batch
> - concurrent_writes: 32, 256
> - native_transport_max_threads: 128(default), 256 (when concurrent_writes is 
> 256)
> 
> Thanks,
> Yuji
> 
> 
> On Mon, Nov 26, 2018 at 17:27 sankalp kohli  wrote:
>> Inter-node messaging is rewritten using Netty in 4.0. It will be better to 
>> test it using that as potential changes will mostly land on top of that. 
>> 
>>> On Mon, Nov 26, 2018 at 7:39 AM Yuji Ito  wrote:
>>> Hi,
>>> 
>>> I'm investigating LWT performance with C* 3.11.3.
>>> It looks that the performance is bounded by messaging latency when many 
>>> requests are issued concurrently.
>>> 
>>> According to the source code, the number of messaging threads per node is 
>>> only 1 thread for incoming and 1 thread for outbound "small" messages to 
>>> another node.
>>> 
>>> I guess these threads are frequently preempted because many other threads 
>>> are running when many requests are issued.
>>> In particular, I think it affects LWT performance, since many LWT requests 
>>> mean lots of inter-node messaging.
>>> 
>>> I measured that latency. It took 2.5 ms on average to enqueue a message at 
>>> a node and to receive the message at the **same** node with 96 concurrent 
>>> LWT writes.
>>> Is that normal? That latency seems too high, given that the message was 
>>> sent to the same node.
>>> 
>>> Decreasing numbers of other threads like `concurrent_counter_writes`, 
>>> `concurrent_materialized_view_writes` reduced a bit the latency.
>>> Can I change any other parameter to reduce the latency?
>>> I've tried using message coalescing, but they didn't reduce that.
>>> 
>>> * Environment
>>> - 3 node cluster
>>> - Replication factor: 3
>>> - Node instance: AWS EC2 i3.xlarge
>>> 
>>> * C* configuration
>>> - Apache Cassandra 3.11.3
>>> - commitlog_sync: batch
>>> - concurrent_reads: 32 (default)
>>> - concurrent_writes: 32 (default)
>>> 
>>> Thanks,
>>> Yuji
>>> 
>>> 


Re: Request for reviewer: CASSANDRA-14829

2018-11-16 Thread Jeff Jirsa
The assignment is just so you get “credit” for the patch - asking for a 
reviewer is good but not strictly necessary. 

(Some of the committers will try to review it when we can, usually waiting for 
someone who’s comfortable with that code to come along)

-- 
Jeff Jirsa


> On Nov 16, 2018, at 11:33 AM, Georg Dietrich  wrote:
> 
> Hi here,
> 
> I've posted https://issues.apache.org/jira/browse/CASSANDRA-14829 together 
> with a pull request, now I've been assigned the task... I assume that means I 
> should go look for a reviewer?
> 
> Regards
> Georg
> 
> --
> 
> Georg Dietrich
> Senior System Developer
> imbus TestBench
> Tel. +49 9131 7518-944
> E-Mail: georg.dietr...@imbus.de
> 
> Tel. +49 9131 7518-0, Fax +49 9131 7518-50
> i...@imbus.de www.imbus.de
> 
> imbus AG, Kleinseebacher Str. 9, 91096 Möhrendorf, GERMANY
> Chairman of the supervisory board: Wolfgang Wieser
> Executive board: Tilo Linz, Bernd Nossem, Thomas Roßner
> Registered office: Möhrendorf; commercial register: Fürth/Bay, HRB 8365
> 
> Postal/visiting address: imbus AG, Hauptstraße 8a, 91096 Möhrendorf, Germany
> =
> 




Re: Deprecating/removing PropertyFileSnitch?

2018-10-29 Thread Jeff Jirsa
...@gmail.com>
> >>>>>>>> escreveu:
> >>>>>>>>
> >>>>>>>>> Yes it will happen. I am worried that same way DC or rack info
> can go
> >>>>>>>>> missing.
> >>>>>>>>>
> >>>>>>>>> On Mon, Oct 22, 2018 at 12:52 PM Paulo Motta <
> >>>>> pauloricard...@gmail.com>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>>> the new host won’t learn about the host whose status is
> missing and
> >>>>>>>> the
> >>>>>>>>>> view of this host will be wrong.
> >>>>>>>>>>
> >>>>>>>>>> Won't this happen even with PropertyFileSnitch as the token(s)
> for
> >>>>> this
> >>>>>>>>>> host will be missing from gossip/system.peers?
> >>>>>>>>>>
> >>>>>>>>>> Em sáb, 20 de out de 2018 às 00:34, Sankalp Kohli <
> >>>>>>>>> kohlisank...@gmail.com>
> >>>>>>>>>> escreveu:
> >>>>>>>>>>
> >>>>>>>>>>> Say you restarted all instances in the cluster and status for
> some
> >>>>>>>> host
> >>>>>>>>>>> goes missing. Now when you start a host replacement, the new
> host
> >>>>>>>> won’t
> >>>>>>>>>>> learn about the host whose status is missing and the view of
> this
> >>>>>>>> host
> >>>>>>>>>> will
> >>>>>>>>>>> be wrong.
> >>>>>>>>>>>
> >>>>>>>>>>> PS: I will be happy to be proved wrong as I can also start
> using
> >>>>>>>> Gossip
> >>>>>>>>>>> snitch :)
> >>>>>>>>>>>
> >>>>>>>>>>>> On Oct 19, 2018, at 2:41 PM, Jeremy Hanna <
> >>>>>>>>> jeremy.hanna1...@gmail.com>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Do you mean to say that during host replacement there may be
> a time
> >>>>>>>>>> when
> >>>>>>>>>>> the old->new host isn’t fully propagated and therefore
> wouldn’t yet
> >>>>>>>> be
> >>>>>>>>> in
> >>>>>>>>>>> all system tables?
> >>>>>>>>>>>>
> >>>>>>>>>>>>> On Oct 17, 2018, at 4:20 PM, sankalp kohli <
> >>>>>>>> kohlisank...@gmail.com>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> This is not the case during host replacement correct?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Tue, Oct 16, 2018 at 10:04 AM Jeremiah D Jordan <
> >>>>>>>>>>>>> jeremiah.jor...@gmail.com> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> As long as we are correctly storing such things in the
> system
> >>>>>>>>> tables
> >>>>>>>>>>> and
> >>>>>>>>>>>>>> reading them out of the system tables when we do not have
> the
> >>>>>>>>>>> information
> >>>>>>>>>>>>>> from gossip yet, it should not be a problem. (As far as I
> know
> >>>>>>>> GPFS
> >>>>>>>>>>> does
> >>>>>>>>>>>>>> this, but I have not done extensive code diving or testing
> to
> >>>>>>>> make
> >>>>>>>>>>> sure all
> >>>>>>>>>>>>>> edge cases are covered there)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> -Jeremiah
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Oct 16, 2018, at 11:56 AM, sankalp kohli <
> >>>>>>>>> kohlisank...@gmail.com
> >>>>>>>>>>>
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Will GossipingPropertyFileSnitch not be vulnerable to
> Gossip
> >>>>>>>> bugs
> >>>>>>>>>>> where
> >>>>>>>>>>>>>> we
> >>>>>>>>>>>>>>> lose hostId or some other fields when we restart C* for
> large
> >>>>>>>>>>>>>>> clusters(~1000 instances)?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Tue, Oct 16, 2018 at 7:59 AM Jeff Jirsa <
> jji...@gmail.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> We should, but the 4.0 features that log/reject verbs to
> >>>>>>>> invalid
> >>>>>>>>>>>>>> replicas
> >>>>>>>>>>>>>>>> solves a lot of the concerns here
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>> Jeff Jirsa
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Oct 16, 2018, at 4:10 PM, Jeremy Hanna <
> >>>>>>>>>>> jeremy.hanna1...@gmail.com>
> >>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> We have had PropertyFileSnitch for a long time even
> though
> >>>>>>>>>>>>>>>> GossipingPropertyFileSnitch is effectively a superset of
> what
> >>>>>>>> it
> >>>>>>>>>>> offers
> >>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>> is much less error prone.  There are some unexpected
> behaviors
> >>>>>>>>> when
> >>>>>>>>>>>>>> things
> >>>>>>>>>>>>>>>> aren’t configured correctly with PFS.  For example, if you
> >>>>>>>>> replace
> >>>>>>>>>>>>>> nodes in
> >>>>>>>>>>>>>>>> one DC and add those nodes to that DCs property files and
> not
> >>>>>>>> the
> >>>>>>>>>>> other
> >>>>>>>>>>>>>> DCs
> >>>>>>>>>>>>>>>> property files - the resulting problems aren’t very
> >>>>>>>>> straightforward
> >>>>>>>>>>> to
> >>>>>>>>>>>>>>>> troubleshoot.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> We could try to improve the resilience and fail fast
> error
> >>>>>>>>>> checking
> >>>>>>>>>>> and
> >>>>>>>>>>>>>>>> error reporting of PFS, but honestly, why wouldn’t we
> deprecate
> >>>>>>>>> and
> >>>>>>>>>>>>>> remove
> >>>>>>>>>>>>>>>> PropertyFileSnitch?  Are there reasons why GPFS wouldn’t
> be
> >>>>>>>>>>> sufficient
> >>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>> replace it?
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>
> >>>
> >>>
> >>
> >>
> >
> >
> >
>
>


Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-23 Thread Jeff Jirsa
My objection (-0.5) is based on the freeze, not on code complexity.



-- 
Jeff Jirsa
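
For scale, the in-memory cost being weighed here is the compressed-chunk offset
array, which grows inversely with chunk size. Back-of-envelope, assuming one
8-byte offset per compressed chunk (the structure the compact representation is
meant to shrink):

    public class ChunkOffsetMemory {
        static long offsetBytes(long dataBytes, int chunkKb) {
            long chunks = dataBytes / (chunkKb * 1024L);
            return chunks * Long.BYTES; // one 8-byte offset per compressed chunk
        }

        public static void main(String[] args) {
            long oneTb = 1L << 40;
            System.out.println(offsetBytes(oneTb, 64)); // 64 KB chunks: ~128 MB of offsets per TB
            System.out.println(offsetBytes(oneTb, 4));  //  4 KB chunks:   ~2 GB of offsets per TB
        }
    }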


> On Oct 23, 2018, at 8:59 AM, Benedict Elliott Smith  
> wrote:
> 
> To discuss the concerns about the patch for a more efficient representation:
> 
> The risk from such a patch is very low.  It’s a very simple in-memory data 
> structure that we can introduce thorough fuzz tests for.  The reason to 
> exclude it would be for reasons of wanting to begin strictly enforcing the 
> freeze only.  This is a good enough reason in my book, which is why I’m 
> neutral on its addition.  I just wanted to provide some context for everyone 
> else's voting intention.
> 
> 
>> On 23 Oct 2018, at 16:51, Ariel Weisberg  wrote:
>> 
>> Hi,
>> 
>> I just asked Jeff. He is -0 and -0.5 respectively.
>> 
>> Ariel
>> 
>>> On Tue, Oct 23, 2018, at 11:50 AM, Benedict Elliott Smith wrote:
>>> I’m +1 change of default.  I think Jeff was -1 on that though.
>>> 
>>> 
>>>> On 23 Oct 2018, at 16:46, Ariel Weisberg  wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> To summarize who we have heard from so far
>>>> 
>>>> WRT to changing just the default:
>>>> 
>>>> +1:
>>>> Jon Haddad
>>>> Ben Bromhead
>>>> Alain Rodriguez
>>>> Sankalp Kohli (not explicit)
>>>> 
>>>> -0:
>>>> Sylvain Lebresne
>>>> Jeff Jirsa
>>>> 
>>>> Not sure:
>>>> Kurt Greaves
>>>> Joshua Mckenzie
>>>> Benedict Elliott Smith
>>>> 
>>>> WRT to change the representation:
>>>> 
>>>> +1:
>>>> There are only conditional +1s at this point
>>>> 
>>>> -0:
>>>> Sylvain Lebresne
>>>> 
>>>> -.5:
>>>> Jeff Jirsa
>>>> 
>>>> This 
>>>> (https://github.com/aweisberg/cassandra/commit/a9ae85daa3ede092b9a1cf84879fb1a9f25b9dce)
>>>>  is a rough cut of the change for the representation. It needs better 
>>>> naming, unit tests, javadoc etc. but it does implement the change.
>>>> 
>>>> Ariel
>>>>> On Fri, Oct 19, 2018, at 3:42 PM, Jonathan Haddad wrote:
>>>>> Sorry, to be clear - I'm +1 on changing the configuration default, but I
>>>>> think changing the compression in-memory representation warrants further
>>>>> discussion and investigation before making a case for or against it yet.
>>>>> An optimization that reduces in memory cost by over 50% sounds pretty good
>>>>> and we never were really explicit that those sort of optimizations would 
>>>>> be
>>>>> excluded after our feature freeze.  I don't think they should necessarily
>>>>> be excluded at this time, but it depends on the size and risk of the 
>>>>> patch.
>>>>> 
>>>>>> On Sat, Oct 20, 2018 at 8:38 AM Jonathan Haddad  
>>>>>> wrote:
>>>>>> 
>>>>>> I think we should try to do the right thing for the most people that we
>>>>>> can.  The number of folks impacted by 64KB is huge.  I've worked on a lot
>>>>>> of clusters created by a lot of different teams, going from brand new to
>>>>>> pretty damn knowledgeable.  I can't think of a single time over the last 
>>>>>> 2
>>>>>> years that I've seen a cluster use non-default settings for compression.
>>>>>> With only a handful of exceptions, I've lowered the chunk size 
>>>>>> considerably
>>>>>> (usually to 4 or 8K) and the impact has always been very noticeable,
>>>>>> frequently resulting in hardware reduction and cost savings.  Of all the
>>>>>> poorly chosen defaults we have, this is one of the biggest offenders 
>>>>>> that I
>>>>>> see.  There's a good reason ScyllaDB  claims they're so much faster than
>>>>>> Cassandra - we ship a DB that performs poorly for 90+% of teams because 
>>>>>> we
>>>>>> ship for a specific use case, not a general one (time series on memory
>>>>>> constrained boxes being the specific use case)
>>>>>> 
>>>>>> This doesn't impact existing tables, just new ones.  More and more teams
>>>>>> are using Cassandra as a general purpose database, we should acknowledge
>>>>>> that adjusting our defaul

Re: Deprecating/removing PropertyFileSnitch?

2018-10-22 Thread Jeff Jirsa
On Mon, Oct 22, 2018 at 7:09 PM J. D. Jordan 
wrote:

> Do you have a specific gossip bug that you have seen recently which caused
> a problem that would make this happen?  Do you have a specific JIRA in mind?


Sankalp linked a few others, but also
https://issues.apache.org/jira/browse/CASSANDRA-13700


>   “We can’t remove this because what if there is a bug” doesn’t seem like
> a good enough reason to me. If that was a reason we would never make any
> changes to anything.
>

How about "we know that certain fields that are gossiped go missing even
after all of the known races are fixed, so removing an existing
low-maintenance feature and forcing users to rely on gossip for topology
may be worth some discussion".


> I think many people have seen PFS actually cause real problems, where with
> GPFS the issue being talked about is predicated on some theoretical gossip
> bug happening.
>

How many of those were actually caused by incorrect fallback from GPFS to
PFS, rather than PFS itself?


> In the past year at DataStax we have done a lot of testing on 3.0 and 3.11
> around adding nodes, adding DC’s, replacing nodes, replacing racks, and
> replacing DC’s, all while using GPFS, and as far as I know we have not seen
> any “lost” rack/DC information during such testing.
>

I've also run very large GPFS clusters in the past without much gossip
pain, and I'm in the "we should deprecate PFS" camp, but it is also true
that PFS is low maintenance and mostly works. Perhaps the first step is
breaking the GPFS->PFS fallback that people don't know about, maybe that'll
help?


Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-19 Thread Jeff Jirsa
Agree with Sylvain (and I think Benedict) - there’s no compelling reason to 
violate the freeze here. We’ve had the wrong default for years - add a note to 
the docs that we’ll be changing it in the future, but let’s not violate the 
freeze now.

-- 
Jeff Jirsa


> On Oct 19, 2018, at 10:06 AM, Sylvain Lebresne  wrote:
> 
> Fwiw, as much as I agree this is a change worth doing in general, I am
> -0 for 4.0 - both the "compact sequencing" and the change of default.
> We're closing in on 2 months into the freeze, and for me a freeze does
> include not changing defaults, because changing a default ideally implies a
> decent amount of analysis/benchmarking of the consequences of that change[1],
> and that doesn't fit my definition of a freeze.
> 
> [1]: to be extra clear, I'm not saying we've always done this, far from it.
> But I hope we can all agree we were wrong not to do it when we didn't, and
> should strive to improve, not repeat past mistakes.
> --
> Sylvain
> 
> 
>> On Thu, Oct 18, 2018 at 8:55 PM Ariel Weisberg  wrote:
>> 
>> Hi,
>> 
>> For those who were asking about the performance impact of block size on
>> compression I wrote a microbenchmark.
>> 
>> https://pastebin.com/RHDNLGdC
>> 
>> [java] Benchmark                                                Mode  Cnt          Score         Error  Units
>> [java] CompactIntegerSequenceBench.benchCompressLZ4Fast16k     thrpt   15  331190055.685 ±  8079758.044  ops/s
>> [java] CompactIntegerSequenceBench.benchCompressLZ4Fast32k     thrpt   15  353024925.655 ±  7980400.003  ops/s
>> [java] CompactIntegerSequenceBench.benchCompressLZ4Fast64k     thrpt   15  365664477.654 ± 10083336.038  ops/s
>> [java] CompactIntegerSequenceBench.benchCompressLZ4Fast8k      thrpt   15  305518114.172 ± 11043705.883  ops/s
>> [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast16k   thrpt   15  688369529.911 ± 25620873.933  ops/s
>> [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast32k   thrpt   15  703635848.895 ±  5296941.704  ops/s
>> [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast64k   thrpt   15  695537044.676 ± 17400763.731  ops/s
>> [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast8k    thrpt   15  727725713.128 ±  4252436.331  ops/s
>> 
>> To summarize, compression is 8.5% slower and decompression is 1% faster.
>> This is measuring the impact on compression/decompression only, not the huge
>> impact that would occur if we decompressed data we don't need less often.
>> 
>> I didn't test decompression of Snappy and LZ4 high, but I did test
>> compression.
>> 
>> Snappy:
>> [java] CompactIntegerSequenceBench.benchCompressSnappy16k   thrpt    2  196574766.116  ops/s
>> [java] CompactIntegerSequenceBench.benchCompressSnappy32k   thrpt    2  198538643.844  ops/s
>> [java] CompactIntegerSequenceBench.benchCompressSnappy64k   thrpt    2  194600497.613  ops/s
>> [java] CompactIntegerSequenceBench.benchCompressSnappy8k    thrpt    2  186040175.059  ops/s
>> 
>> LZ4 high compressor:
>> [java] CompactIntegerSequenceBench.bench16k   thrpt    2  20822947.578  ops/s
>> [java] CompactIntegerSequenceBench.bench32k   thrpt    2  12037342.253  ops/s
>> [java] CompactIntegerSequenceBench.bench64k   thrpt    2   6782534.469  ops/s
>> [java] CompactIntegerSequenceBench.bench8k    thrpt    2  32254619.594  ops/s
>> 
>> LZ4 high is the one instance where block size mattered a lot. It's a bit
>> suspicious really when you look at the ratio of performance to block size
>> being close to 1:1. I couldn't spot a bug in the benchmark though.
>> 
>> Compression ratios with LZ4 fast for the text of Alice in Wonderland were:
>> 
>> Chunk size 8192, ratio 0.709473
>> Chunk size 16384, ratio 0.667236
>> Chunk size 32768, ratio 0.634735
>> Chunk size 65536, ratio 0.607208
>> 
>> By way of comparison I also ran deflate with maximum compression:
>> 
>> Chunk size 8192, ratio 0.426434
>> Chunk size 16384, ratio 0.402423
>> Chunk size 32768, ratio 0.381627
>> Chunk size 65536, ratio 0.364865
>> 
>> Ariel
>> 
>>> On Thu, Oct 18, 2018, at 5:32 AM, Benedict Elliott Smith wrote:
>>> FWIW, I’m not -0, just think that long after the freeze date a change
>>> like this needs a strong mandate from the community.  I think the change
>>> is a good one.
>>> 
>>>
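For readers who want to poke at the chunk-size tradeoff without the full JMH setup in Ariel's pastebin, here is a rough single-threaded sketch of the same experiment using lz4-java (the compressor library Cassandra bundles). The class name and 64MB random input are illustrative assumptions, not Cassandra code; random bytes barely compress, so swap in real SSTable or text data for meaningful ratios, and use JMH as above for publishable numbers.

import java.util.Arrays;
import java.util.Random;

import net.jpountz.lz4.LZ4Compressor;
import net.jpountz.lz4.LZ4Factory;

// Toy harness: compress 64MB at several chunk sizes, report ratio and speed.
public class ChunkSizeSketch
{
    public static void main(String[] args)
    {
        LZ4Compressor compressor = LZ4Factory.fastestInstance().fastCompressor();
        byte[] data = new byte[64 * 1024 * 1024];
        new Random(42).nextBytes(data); // stand-in payload; use real data

        for (int chunkKb : new int[] { 4, 8, 16, 32, 64 })
        {
            int chunk = chunkKb * 1024;
            long compressedBytes = 0;
            long start = System.nanoTime();
            for (int off = 0; off < data.length; off += chunk)
                compressedBytes += compressor.compress(Arrays.copyOfRange(data, off, off + chunk)).length;
            double secs = (System.nanoTime() - start) / 1e9;
            System.out.printf("chunk %2dk: ratio %.3f, %.0f MB/s%n",
                              chunkKb,
                              compressedBytes / (double) data.length,
                              data.length / 1048576.0 / secs);
        }
    }
}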

Re: Built in trigger: double-write for app migration

2018-10-18 Thread Jeff Jirsa
Could be done with CDC
Could be done with triggers
(Could be done with vtables — double writes or double reads — if they were 
extended to be user facing)

Would be very hard to generalize properly, especially handling failure cases 
(write succeeds in one cluster/table but not the other) which are often app 
specific


-- 
Jeff Jirsa


> On Oct 18, 2018, at 6:47 PM, Jonathan Ellis  wrote:
> 
> Isn't this what CDC was designed for?
> 
> https://issues.apache.org/jira/browse/CASSANDRA-8844
> 
> On Thu, Oct 18, 2018 at 10:54 AM Carl Mueller
>  wrote:
> 
>> tl;dr: a generic trigger on TABLES that will mirror all writes to
>> facilitate data migrations between clusters or systems. What is necessary
>> to ensure full write mirroring/coherency?
>> 
>> When cassandra clusters have several "apps" aka keyspaces serving
>> applications colocated on them, but the app/keyspace bandwidth and size
>> demands begin impacting other keyspaces/apps, then one strategy is to
>> migrate the keyspace to its own dedicated cluster.
>> 
>> With backups/sstableloading, this will entail a delay and therefore a
>> "coherency" shortfall between the clusters. So typically one would employ a
>> "double write, read once":
>> 
>> - all updates are mirrored to both clusters
>> - writes come from the current most coherent.
>> 
>> Often two sstable loads are done:
>> 
>> 1) first load
>> 2) turn on double writes/write mirroring
>> 3) a second load is done to finalize coherency
>> 4) switch the app to point to the new cluster now that it is coherent
>> 
>> The double writes and read is the sticking point. We could do it at the app
>> layer, but if the app wasn't written with that, it is a lot of testing and
>> customization specific to the framework.
>> 
>> We could theoretically do some sort of proxying of the java-driver somehow,
>> but all the async structures and complex interfaces/apis would be difficult
>> to proxy. Maybe there is a lower level in the java-driver that is possible.
>> This also would only apply to the java-driver, and not
>> python/go/javascript/other drivers.
>> 
>> Finally, I suppose we could do a trigger on the tables. It would be really
>> nice if we could add to the cassandra toolbox the basics of a write
>> mirroring trigger that could be activated "fairly easily"... now I know
>> there are the complexities of inter-cluster access, and if we are even
>> using cassandra as the target mirror system (for example there is an
>> article on triggers write-mirroring to kafka:
>> https://dzone.com/articles/cassandra-to-kafka-data-pipeline-part-1).
>> 
>> And this starts to get into the complexities of hinted handoff as well. But
>> fundamentally this seems something that would be a very nice feature
>> (especially when you NEED it) to have in the core of cassandra.
>> 
>> Finally, is the mutation hook in triggers sufficient to track all incoming
>> mutations (outside of "shudder" other triggers generating data)
>> 
> 
> 
> -- 
> Jonathan Ellis
> co-founder, http://www.datastax.com
> @spyced

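To make the trigger option concrete: Cassandra's hook here is the ITrigger interface, whose augment method sees every partition update on the table it is attached to. The sketch below shows one hypothetical shape for a write-mirroring trigger - the class name, queue, and forwarding strategy are all assumptions, not a vetted implementation - and the thread's unanswered failure-handling question (what to do when the mirror target falls behind) shows up as the offer() branch.

import java.util.Collection;
import java.util.Collections;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

import org.apache.cassandra.db.Mutation;
import org.apache.cassandra.db.partitions.Partition;
import org.apache.cassandra.triggers.ITrigger;

// Hypothetical mirroring trigger: it never alters the local write, it only
// hands the update to a forwarder thread that replays it against the target
// cluster (via a driver) or a queue such as Kafka.
public class MirroringTrigger implements ITrigger
{
    // Bounded, so a slow mirror target cannot exhaust the heap.
    private static final BlockingQueue<Partition> PENDING = new LinkedBlockingQueue<>(10_000);

    public Collection<Mutation> augment(Partition update)
    {
        // offer(), not put(): dropping (and counting) beats blocking the
        // write path -- this is exactly the coherency-gap problem above.
        if (!PENDING.offer(update))
        {
            // TODO: bump a dropped-mirror-writes metric / log keys for re-sync
        }
        return Collections.emptyList(); // no additional local mutations
    }
}

Even this sketch leaves open the hardest parts raised in the thread: replay ordering, hints, and the case where only one cluster accepts the write.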



Re: Built in trigger: double-write for app migration

2018-10-18 Thread Jeff Jirsa
The write sampling is adding an extra instance with the same schema to test 
things like yaml params or compaction without impacting reads or correctness - 
it’s different than what you describe



-- 
Jeff Jirsa


> On Oct 18, 2018, at 5:57 PM, Carl Mueller 
>  wrote:
> 
> I guess there is also write-survey-mode from cass 1.1:
> 
> https://issues.apache.org/jira/browse/CASSANDRA-3452
> 
> Were triggers intended to supersede this capability? I can't find a lot of
> "user level" info on it.
> 
> 
> On Thu, Oct 18, 2018 at 10:53 AM Carl Mueller 
> wrote:
> 
>> tl;dr: a generic trigger on TABLES that will mirror all writes to
>> facilitate data migrations between clusters or systems. What is necessary
>> to ensure full write mirroring/coherency?
>> 
>> When cassandra clusters have several "apps" aka keyspaces serving
>> applications colocated on them, but the app/keyspace bandwidth and size
>> demands begin impacting other keyspaces/apps, then one strategy is to
>> migrate the keyspace to its own dedicated cluster.
>> 
>> With backups/sstableloading, this will entail a delay and therefore a
>> "coherency" shortfall between the clusters. So typically one would employ a
>> "double write, read once":
>> 
>> - all updates are mirrored to both clusters
>> - writes come from the current most coherent.
>> 
>> Often two sstable loads are done:
>> 
>> 1) first load
>> 2) turn on double writes/write mirroring
>> 3) a second load is done to finalize coherency
>> 4) switch the app to point to the new cluster now that it is coherent
>> 
>> The double writes and read is the sticking point. We could do it at the
>> app layer, but if the app wasn't written with that, it is a lot of testing
>> and customization specific to the framework.
>> 
>> We could theoretically do some sort of proxying of the java-driver
>> somehow, but all the async structures and complex interfaces/apis would be
>> difficult to proxy. Maybe there is a lower level in the java-driver that is
>> possible. This also would only apply to the java-driver, and not
>> python/go/javascript/other drivers.
>> 
>> Finally, I suppose we could do a trigger on the tables. It would be really
>> nice if we could add to the cassandra toolbox the basics of a write
>> mirroring trigger that could be activated "fairly easily"... now I know
>> there are the complexities of inter-cluster access, and if we are even
>> using cassandra as the target mirror system (for example there is an
>> article on triggers write-mirroring to kafka:
>> https://dzone.com/articles/cassandra-to-kafka-data-pipeline-part-1).
>> 
>> And this starts to get into the complexities of hinted handoff as well.
>> But fundamentally this seems something that would be a very nice feature
>> (especially when you NEED it) to have in the core of cassandra.
>> 
>> Finally, is the mutation hook in triggers sufficient to track all incoming
>> mutations (outside of "shudder" other triggers generating data)
>> 
>> 
>> 
>> 

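As a pointer for anyone evaluating the write-survey route: it is enabled with a startup system property - the node bootstraps and receives its share of writes but never serves reads or fully joins the ring. For example (placing it in cassandra-env.sh is just the usual convention):

# start the node in write-survey mode
JVM_OPTS="$JVM_OPTS -Dcassandra.write_survey=true"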



Re: Using Cassandra as local db without cluster

2018-10-18 Thread Jeff Jirsa
I can’t think of a situation where I’d choose Cassandra as a database in a 
single-host use case (if you’re sure it’ll never be more than one machine).

-- 
Jeff Jirsa


> On Oct 18, 2018, at 12:31 PM, Abdelkrim Fitouri  wrote:
> 
> Hello,
> 
> I am wondering whether using Cassandra as a single local database, without
> the cluster capabilities, makes sense (I cannot do a multi-node cluster due
> to a technical constraint).
> 
> I have an application that needs to store a dynamic number of columns on
> each row (something I cannot do with a classical relational database), and I
> don't want to use a document-based NoSQL database, to avoid JSON marshalling
> and unmarshalling overhead...
> 
> Can Cassandra with only one node, and a well-designed model based on queries
> and partition keys, deliver better performance than PostgreSQL?
> 
> Does Cassandra have limitations on data size, or on the number of partitions
> on a node?
> 
> Thanks for any details or help.
> 
> --
> 
> Best Regards.

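On the "dynamic number of columns" requirement specifically: in CQL this is usually modelled with a clustering column rather than by literally adding columns, so no schema change is needed per new attribute. A minimal sketch (keyspace, table, and column names are placeholders):

CREATE TABLE myks.dynamic_attrs (
    pk    text,
    name  text,   -- the "dynamic column" name
    value blob,
    PRIMARY KEY (pk, name)
);

Each distinct name under a partition key behaves like one of the old dynamic columns, and rows can hold arbitrarily many of them.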



Re: Deprecating/removing PropertyFileSnitch?

2018-10-16 Thread Jeff Jirsa
We should, but the 4.0 features that log/reject verbs sent to invalid replicas 
solve a lot of the concerns here.

-- 
Jeff Jirsa


> On Oct 16, 2018, at 4:10 PM, Jeremy Hanna  wrote:
> 
> We have had PropertyFileSnitch for a long time even though 
> GossipingPropertyFileSnitch is effectively a superset of what it offers and 
> is much less error prone.  There are some unexpected behaviors when things 
> aren’t configured correctly with PFS.  For example, if you replace nodes in 
> one DC and add those nodes to that DCs property files and not the other DCs 
> property files - the resulting problems aren’t very straightforward to 
> troubleshoot.
> 
> We could try to improve the resilience and fail fast error checking and error 
> reporting of PFS, but honestly, why wouldn’t we deprecate and remove 
> PropertyFileSnitch?  Are there reasons why GPFS wouldn’t be sufficient to 
> replace it?
> 

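For reference, the operational gap between the two snitches is small. With GPFS each node declares only its own location in cassandra-rackdc.properties and gossips it to the rest of the cluster, e.g.:

dc=DC1
rack=RACK1

PFS instead requires an identical cassandra-topology.properties on every node, enumerating every other node - the file that silently goes stale in exactly the replacement scenarios described above.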



Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-12 Thread Jeff Jirsa




> On Oct 12, 2018, at 6:46 AM, Pavel Yaskevich  wrote:
> 
>> On Thu, Oct 11, 2018 at 4:31 PM Ben Bromhead  wrote:
>> 
>> This is something that's bugged me for ages, tbh the performance gain for
>> most use cases far outweighs the increase in memory usage and I would even
>> be in favor of changing the default now, optimizing the storage cost later
>> (if it's found to be worth it).
>> 
>> For some anecdotal evidence:
>> 4kb is usually what we end setting it to, 16kb feels more reasonable given
>> the memory impact, but what would be the point if practically, most folks
>> set it to 4kb anyway?
>> 
>> Note that chunk_length will largely be dependent on your read sizes, but 4k
>> is the floor for most physical devices in terms of their block size.
>> 
> 
> It might be worth while to investigate how splitting chunk size into data,
> index and compaction sizes would affect performance.
> 

Data chunk and index chunk are already different (though one is table level and 
one is per instance), but I’m not parsing the compaction comment? 



Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-11 Thread Jeff Jirsa



I think 16k is a better default, but it should only affect new tables. Whoever 
changes it, please make sure you think about the upgrade path. 


> On Oct 12, 2018, at 2:31 AM, Ben Bromhead  wrote:
> 
> This is something that's bugged me for ages, tbh the performance gain for
> most use cases far outweighs the increase in memory usage and I would even
> be in favor of changing the default now, optimizing the storage cost later
> (if it's found to be worth it).
> 
> For some anecdotal evidence:
> 4kb is usually what we end setting it to, 16kb feels more reasonable given
> the memory impact, but what would be the point if practically, most folks
> set it to 4kb anyway?
> 
> Note that chunk_length will largely be dependent on your read sizes, but 4k
> is the floor for most physical devices in terms of their block size.
> 
> +1 for making this change in 4.0 given the small size and the large
> improvement to new users' experience (as long as we are explicit in the
> documentation about memory consumption).
> 
> 
>> On Thu, Oct 11, 2018 at 7:11 PM Ariel Weisberg  wrote:
>> 
>> Hi,
>> 
>> This is regarding https://issues.apache.org/jira/browse/CASSANDRA-13241
>> 
>> This ticket has languished for a while. IMO it's too late in 4.0 to
>> implement a more memory efficient representation for compressed chunk
>> offsets. However I don't think we should put out another release with the
>> current 64k default as it's pretty unreasonable.
>> 
>> I propose that we lower the value to 16kb. 4k might never be the correct
>> default anyways as there is a cost to compression and 16k will still be a
>> large improvement.
>> 
>> Benedict and Jon Haddad are both +1 on making this change for 4.0. In the
>> past there has been some consensus about reducing this value although maybe
>> with more memory efficiency.
>> 
>> The napkin math for what this costs is:
>> "If you have 1TB of uncompressed data, with 64k chunks that's 16M chunks
>> at 8 bytes each (128MB).
>> With 16k chunks, that's 512MB.
>> With 4k chunks, it's 2G.
>> Per terabyte of data (pre-compression)."
>> 
>> https://issues.apache.org/jira/browse/CASSANDRA-13241?focusedCommentId=15886621=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15886621
>> 
>> By way of comparison memory mapping the files has a similar cost per 4k
>> page of 8 bytes. Multiple mappings makes this more expensive. With a
>> default of 16kb this would be 4x less expensive than memory mapping a file.
>> I only mention this to give a sense of the costs we are already paying. I
>> am not saying they are directly related.
>> 
>> I'll wait a week for discussion and if there is consensus make the change.
>> 
>> Regards,
>> Ariel
>> 
>> 
> --
> Ben Bromhead
> CTO | Instaclustr 
> +1 650 284 9692
> Reliability at Scale
> Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer

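Ariel's napkin math is easy to sanity-check, since the cost is one 8-byte long of offset per compressed chunk. A few illustrative lines of Java reproduce the table:

public class ChunkOverhead
{
    public static void main(String[] args)
    {
        long dataBytes = 1L << 40; // 1TB, pre-compression
        for (int chunkKb : new int[] { 4, 16, 64 })
        {
            long chunks = dataBytes / (chunkKb * 1024L);
            // one 8-byte long per chunk
            System.out.printf("chunk %2dk -> %,d chunks -> %,d MB of offsets%n",
                              chunkKb, chunks, chunks * 8 / (1 << 20));
        }
    }
}

This prints 2048 MB, 512 MB and 128 MB for 4k, 16k and 64k respectively, matching the figures quoted above.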



Re: MD5 in the read path

2018-09-26 Thread Jeff Jirsa
In some installations, it's used for hashing the partition key to find the
host (RandomPartitioner).
It's used for prepared statement IDs
It's used for hashing the data for reads to know if the data matches on all
different replicas.

We don't use CRC because conflicts would be really bad. There's probably
something in the middle that's slightly faster than md5 without the
drawbacks of crc32


On Wed, Sep 26, 2018 at 3:47 PM Tyagi, Preetika 
wrote:

> Hi all,
>
> I have a question about MD5 being used in the read path in Cassandra.
> I wanted to understand what exactly it is being used for and why not
> something like CRC is used which is less complex in comparison to MD5.
>
> Thanks,
> Preetika
>
>
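To illustrate the third use - replica digests - here is a minimal sketch (not Cassandra's actual code path, which lives around ReadResponse): each replica hashes its serialized result and the coordinator compares hashes, so an accidental collision would silently hide a replica mismatch and suppress read repair. That risk is why a cryptographic hash beats CRC32 here despite the extra CPU cost.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class DigestSketch
{
    public static void main(String[] args) throws Exception
    {
        byte[] replicaA = "row bytes from replica A".getBytes(StandardCharsets.UTF_8);
        byte[] replicaB = "row bytes from replica B".getBytes(StandardCharsets.UTF_8);

        MessageDigest md = MessageDigest.getInstance("MD5");
        byte[] digestA = md.digest(replicaA); // digest() resets the instance
        byte[] digestB = md.digest(replicaB);

        // A mismatch is what triggers a full data read plus read repair.
        System.out.println("digests match: " + MessageDigest.isEqual(digestA, digestB));
    }
}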


Re: [DISCUSS] changing default token behavior for 4.0

2018-09-21 Thread Jeff Jirsa
Also agree it should be lowered, but definitely not to 1, and probably 
something closer to 32 than 4.

-- 
Jeff Jirsa


> On Sep 21, 2018, at 8:24 PM, Jeremy Hanna  wrote:
> 
> I agree that it should be lowered. What I’ve seen debated a bit in the past 
> is the number but I don’t think anyone thinks that it should remain 256.
> 
>> On Sep 21, 2018, at 7:05 PM, Jonathan Haddad  wrote:
>> 
>> One thing that's really, really bothered me for a while is how we default
>> to 256 tokens still.  There's no experienced operator that leaves it as is
>> at this point, meaning the only people using 256 are the poor folks that
>> just got started using C*.  I've worked with over a hundred clusters in the
>> last couple years, and I think I only worked with one that had lowered it
>> to something else.
>> 
>> I think it's time we changed the default to 4 (or 8, up for debate).
>> 
>> To improve the behavior, we need to change a couple other things.  The
>> allocate_tokens_for_keyspace setting is... odd.  It requires you have a
>> keyspace already created, which doesn't help on new clusters.  What I'd
>> like to do is add a new setting, allocate_tokens_for_rf, and set it to 3 by
>> default.
>> 
>> To handle clusters that are already using 256 tokens, we could prevent the
>> new node from joining unless a -D flag is set to explicitly allow
>> imbalanced tokens.
>> 
>> We've agreed to a trunk freeze, but I feel like this is important enough
>> (and pretty trivial) to do now.  I'd also personally characterize this as a
>> bug fix since 256 is horribly broken when the cluster gets to any
>> reasonable size, but maybe I'm alone there.
>> 
>> I honestly can't think of a use case where random tokens is a good choice
>> anymore, so I'd be fine / ecstatic with removing it completely and
>> requiring either allocate_tokens_for_keyspace (for existing clusters)
>> or allocate_tokens_for_rf
>> to be set.
>> 
>> Thoughts?  Objections?
>> -- 
>> Jon Haddad
>> http://www.rustyrazorblade.com
>> twitter: rustyrazorblade
> 
> 

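For anyone following along: everything discussed except the proposed allocate_tokens_for_rf already exists in cassandra.yaml on 3.x. Opting into fewer, balanced tokens today looks roughly like this (the keyspace name is a placeholder, and per Jon's complaint above the keyspace must already exist for the allocation to take effect):

num_tokens: 4
allocate_tokens_for_keyspace: my_keyspace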



Re: [VOTE] Development Approach for Apache Cassandra Management process

2018-09-12 Thread Jeff Jirsa
On Wed, Sep 12, 2018 at 12:41 PM Sylvain Lebresne 
wrote:

> That's probably a stupid question, and excuse me if it is, but what do those
> votes on the dev mailing list even mean?
>
> How do you count votes at the end? Just by counting all votes cast,
> regardless of who cast them? Or are we intending to only count PMC
> members', or maybe committers', votes?
>

I believe the intent is to try to see if there exists consensus.
Ultimately, PMC is going to matter more than random email addresses from
people nobody recognizes. This should be in public, though, not private, so
seeing what feedback is beyond the PMC is useful (primarily because it will
matter when it comes time to extend and maintain it - if people strongly
prefer one or the other, then maintenance is going to be a problem).

If there's 100 random non-contributor votes for one option and 20 pmc votes
for another options, I think the real answer will be "we don't have
consensus, and either we don't do it, or we do it the way the PMC thinks is
best", for all of the reasons you describe in the paragraphs below.


> If the former, that is a bit weird to me because we simply don't know who
> votes. And I don't mean to be rude towards anyone, but 1) someone could
> easily create 10 email addresses to vote 10 times (and sure, you could
> invoke trust, and I'm not entirely against trust in general, but it's the
> internet...) and 2) this kind of decision will have non-trivial
> consequences for the project, particularly on those that maintain it, so I
> admit I'm not entirely comfortable with "anyone's voice has the same
> weight".
> If the latter, then this makes more sense to me (why are we even bothering
> voting PMC members in if it's not to handle these kinds of decisions, which
> are very "project management" related), but we should be very clear about
> this from the get go (we could still use the dev list for transparency
> sake, that I don't mind)? We should probably also have some deadline to the
> vote, one that isn't too short.
>

Like releases, I think PMC votes count


>
> Anyway, fwiw, my opinion on this vote is not far from the one on the golang
> driver acceptance vote (for which my remark above also applies btw): not yet
> 100% convinced adding more pieces and scope to the project is what the
> project needs right now, but not strongly opposed if people really
> want this (and this one makes more sense to me than the golang driver
> actually). But if I'm to pick between a) and b), I'm leaning b).
>

FWIW, two of the main reasons I'm in favor is as a way to lower barrier to
entry to both using the software AND contributing to the project, so I
think your points are valid (both on gocql thread and on this note above),
but I think that's also part of why we should be encouraging both.

- Jeff


Re: [VOTE] Development Approach for Apache Cassandra Management process

2018-09-12 Thread Jeff Jirsa
d - good with either option, but would probably slightly prefer b, as it
can be built towards the design doc.



On Wed, Sep 12, 2018 at 8:19 AM sankalp kohli 
wrote:

> Hi,
> The community has been discussing the Apache Cassandra Management process
> since April, and we have had a lot of discussion about which approach to take
> to get started. Several contributors have been interested in doing this, and
> we need to decide which approach to take.
>
> The current approaches being evaluated are
> a. Donate an existing project to Apache Cassandra like Reaper. If this
> option is selected, we will evaluate various projects and see which one
> fits best.
> b. Take a piecemeal approach and use the features from different OSS
> projects and build a new project.
>
> Available options to vote
> a. +1 to use existing project.
> b. +1 to take piecemeal approach
> c  -1 to both
> d +0 I dont mind either option
>
> You can also just type a, b, c, or d to choose an option.
>
> Dev threads with discussions
>
>
> https://lists.apache.org/thread.html/4eace8cb258aab83fc3a220ff2203a281ea59f4d6557ebeb1af7b7f1@%3Cdev.cassandra.apache.org%3E
>
>
> https://lists.apache.org/thread.html/4a7e608c46aa2256e8bcb696104a4e6d6aaa1f302834d211018ec96e@%3Cdev.cassandra.apache.org%3E
>


Re: [VOTE] Accept GoCQL driver donation and begin incubation process

2018-09-12 Thread Jeff Jirsa
+1

(Incubation looks like it may be challenging to get acceptance from all 
existing contributors, though)

-- 
Jeff Jirsa


> On Sep 12, 2018, at 8:12 AM, Nate McCall  wrote:
> 
> This will be the same process used for dtest. We will need to walk
> this through the incubator per the process outlined here:
> 
> https://incubator.apache.org/guides/ip_clearance.html
> 
> Pending the outcome of this vote, we will create the JIRA issues for
> tracking and after we go through the process, and discuss adding
> committers in a separate thread (we need to do this atomically anyway
> per general ASF committer adding processes).
> 
> Thanks,
> -Nate
> 
> 




Re: UDF

2018-09-11 Thread Jeff Jirsa
+1 as well.

On Tue, Sep 11, 2018 at 10:27 AM Aleksey Yeschenko 
wrote:

> If this is about inclusion in 4.0, then I support it.
>
> Technically this is *mostly* just a move+donation of some code from
> java-driver to Cassandra. Given how important this seemingly is to the
> board and PMC for us to not have the dependency on the driver, the sooner
> it’s gone, the better.
>
> I’d be +1 for committing to trunk.
>
> —
> AY
>
> On 11 September 2018 at 14:43:29, Robert Stupp (sn...@snazy.de) wrote:
>
> The patch is technically complete - i.e. it works and does its thing.
>
> It's not strictly a bug fix but targets trunk. That's why I started the
> discussion.
>
>
> On 09/11/2018 02:53 PM, Jason Brown wrote:
> > Hi Robert,
> >
> > Thanks for taking on this work. Is this message a heads up that a patch
> is
> > coming/complete, or to spawn a discussion about including this in 4.0?
> >
> > Thanks,
> >
> > -Jason
> >
> > On Tue, Sep 11, 2018 at 2:32 AM, Robert Stupp  wrote:
> >
> >> In an effort to clean up our hygiene and limit the dependencies used
> by
> >> UDFs/UDAs, I think we should refactor the UDF code parts and remove
> the
> >> dependency to the Java Driver in that area without breaking existing
> >> UDFs/UDAs.
> >>
> >> A working prototype is in this branch:
> >> https://github.com/snazy/cassandra/tree/feature/remove-udf-driver-dep-trunk
> >> The changes are rather trivial and provide 100% backwards compatibility
> >> for existing UDFs.
> >>
> >> The prototype copies the necessary parts from the Java Driver into the
> C*
> >> source tree to org.apache.cassandra.cql3.functions.types and adopts
> its
> >> usages - i.e. UDF/UDA code plus CQLSSTableWriter +
> StressCQLSSTableWriter.
> >> The latter two classes have a reference to UDF’s UDHelper and had to
> be
> >> changed as well.
> >>
> >> Some functionality, like type parsing & handling, is duplicated in the
> >> code base with this prototype - once in the “current” source tree and
> once
> >> for UDFs. However, unifying the code paths is not trivial, since the
> UDF
> >> sandbox prohibits the use of internal classes (direct and likely
> indirect
> >> dependencies).
> >>
> >> Robert
> >>
> >> —
> >> Robert Stupp
> >> @snazy
> >>
> >>
>
> --
> Robert Stupp
> @snazy
>
>
>
>
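For readers unfamiliar with the compatibility surface being preserved here: UDFs are declared in CQL with an inline Java (or JavaScript) body, and those bodies compile against the driver-derived types now being copied into org.apache.cassandra.cql3.functions.types. A standard example, using a placeholder keyspace (and requiring enable_user_defined_functions: true in cassandra.yaml):

CREATE OR REPLACE FUNCTION ks.add_doubles(a double, b double)
    RETURNS NULL ON NULL INPUT
    RETURNS double
    LANGUAGE java
    AS 'return a + b;';

Existing function bodies like this must keep compiling and running unchanged after the refactoring, which is why the types are copied rather than redesigned.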


Re: Proposing an Apache Cassandra Management process

2018-09-07 Thread Jeff Jirsa
The benefit is that it more closely matches the design doc from 5 months ago, 
which is decidedly not about coordinating repair - it’s about a general purpose 
management tool, where repair is one of many proposed tasks.

https://docs.google.com/document/d/1UV9pE81NaIUF3g4L1wxq09nT11AkSQcMijgLFwGsY3s/edit


By starting with a tool that is built to run repair, you’re sacrificing 
generality and accepting something purpose-built for one subtask. It’s an 
important subtask, and it’s a nice tool, but it’s not an implementation of the 
proposal; it’s an alternative that happens to do some of what was proposed.

-- 
Jeff Jirsa


> On Sep 7, 2018, at 6:53 PM, Blake Eggleston  wrote:
> 
> What’s the benefit of doing it that way vs starting with reaper and 
> integrating the netflix scheduler? If reaper was just a really inappropriate 
> choice for the cassandra management process, I could see that being a better 
> approach, but I don’t think that’s the case.
> 
> If our management process isn’t a drop in replacement for reaper, then reaper 
> will continue to exist, which will split the user and developers base between 
> the 2 projects. That won't be good for either project.
> 
> On September 7, 2018 at 6:12:01 PM, Jeff Jirsa (jji...@gmail.com) wrote:
> 
> I’d also like to see the end state you describe: reaper UI wrapping the 
> Netflix management process with pluggable scheduling (either as is with 
> reaper now, or using the Netflix scheduler), but I don’t think that means we 
> need to start with reaper - if personally prefer the opposite direction, 
> starting with something small and isolated and layering on top.  
> 
> --  
> Jeff Jirsa  
> 
> 
>> On Sep 7, 2018, at 5:42 PM, Blake Eggleston  wrote:  
>> 
>> I think we should accept the reaper project as is and make that cassandra 
>> management process 1.0, then integrate the netflix scheduler (and other new 
>> features) into that.  
>> 
>> The ultimate goal would be for the netflix scheduler to become the default 
>> repair scheduler, but I think using reaper as the starting point makes it 
>> easier to get there.  
>> 
>> Reaper would bring a prod user base that would realistically take 2-3 years 
>> to build up with a new project. As an operator, switching to a cassandra 
>> management process that’s basically a re-brand of an existing and commonly 
>> used management process isn’t super risky. Asking operators to switch to a 
>> new process is a much harder sell.  
>> 
>> On September 7, 2018 at 4:17:10 PM, Jeff Jirsa (jji...@gmail.com) wrote:  
>> 
>> How can we continue moving this forward?  
>> 
>> Mick/Jon/TLP folks, is there a path here where we commit the  
>> Netflix-provided management process, and you augment Reaper to work with it? 
>>  
>> Is there a way we can make a larger umbrella that's modular that can  
>> support either/both?  
>> Does anyone believe there's a clear, objective argument that one is  
>> strictly better than the other? I haven't seen one.  
>> 
>> 
>> 
>> On Mon, Aug 20, 2018 at 4:14 PM Roopa Tangirala  
>>  wrote:  
>> 
>>> +1 to everything that Joey articulated with emphasis on the fact that  
>>> contributions should be evaluated based on the merit of code and their  
>>> value add to the whole offering. I hope it does not matter whether that  
>>> contribution comes from PMC member or a person who is not a committer. I  
>>> would like the process to be such that it encourages the new members to be  
>>> a part of the community and not shy away from contributing to the code  
>>> assuming their contributions are valued differently than committers or PMC  
>>> members. It would be sad to see the contributions decrease if we go down  
>>> that path.  
>>> 
>>> *Regards,*  
>>> 
>>> *Roopa Tangirala*  
>>> 
>>> Engineering Manager CDE  
>>> 
>>> *(408) 438-3156 - mobile*  
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Mon, Aug 20, 2018 at 2:58 PM Joseph Lynch   
>>> wrote:  
>>> 
>>>>> We are looking to contribute Reaper to the Cassandra project.  
>>>>> 
>>>> Just to clarify are you proposing contributing Reaper as a project via  
>>>> donation or you are planning on contributing the features of Reaper as  
>>>> patches to Cassandra? If the former how far along are you on the donation  
>>>> process? If the latter, when do you think you would have patches ready  
>>> for  
>>>> consideration / review?  
>>>> 

Re: Proposing an Apache Cassandra Management process

2018-09-07 Thread Jeff Jirsa
I’d also like to see the end state you describe: reaper UI wrapping the Netflix 
management process with pluggable scheduling (either as is with reaper now, or 
using the Netflix scheduler), but I don’t think that means we need to start 
with reaper - I personally prefer the opposite direction, starting with 
something small and isolated and layering on top. 

-- 
Jeff Jirsa


> On Sep 7, 2018, at 5:42 PM, Blake Eggleston  wrote:
> 
> I think we should accept the reaper project as is and make that cassandra 
> management process 1.0, then integrate the netflix scheduler (and other new 
> features) into that.
> 
> The ultimate goal would be for the netflix scheduler to become the default 
> repair scheduler, but I think using reaper as the starting point makes it 
> easier to get there. 
> 
> Reaper would bring a prod user base that would realistically take 2-3 years 
> to build up with a new project. As an operator, switching to a cassandra 
> management process that’s basically a re-brand of an existing and commonly 
> used management process isn’t super risky. Asking operators to switch to a 
> new process is a much harder sell. 
> 
> On September 7, 2018 at 4:17:10 PM, Jeff Jirsa (jji...@gmail.com) wrote:
> 
> How can we continue moving this forward?  
> 
> Mick/Jon/TLP folks, is there a path here where we commit the  
> Netflix-provided management process, and you augment Reaper to work with it?  
> Is there a way we can make a larger umbrella that's modular that can  
> support either/both?  
> Does anyone believe there's a clear, objective argument that one is  
> strictly better than the other? I haven't seen one.  
> 
> 
> 
> On Mon, Aug 20, 2018 at 4:14 PM Roopa Tangirala  
>  wrote:  
> 
>> +1 to everything that Joey articulated with emphasis on the fact that  
>> contributions should be evaluated based on the merit of code and their  
>> value add to the whole offering. I hope it does not matter whether that  
>> contribution comes from PMC member or a person who is not a committer. I  
>> would like the process to be such that it encourages the new members to be  
>> a part of the community and not shy away from contributing to the code  
>> assuming their contributions are valued differently than committers or PMC  
>> members. It would be sad to see the contributions decrease if we go down  
>> that path.  
>> 
>> *Regards,*  
>> 
>> *Roopa Tangirala*  
>> 
>> Engineering Manager CDE  
>> 
>> *(408) 438-3156 - mobile*  
>> 
>> 
>> 
>> 
>> 
>> 
>> On Mon, Aug 20, 2018 at 2:58 PM Joseph Lynch   
>> wrote:  
>> 
>>>> We are looking to contribute Reaper to the Cassandra project.  
>>>> 
>>> Just to clarify are you proposing contributing Reaper as a project via  
>>> donation or you are planning on contributing the features of Reaper as  
>>> patches to Cassandra? If the former how far along are you on the donation  
>>> process? If the latter, when do you think you would have patches ready  
>> for  
>>> consideration / review?  
>>> 
>>> 
>>>> Looking at the patch it's very similar in its base design already, but  
>>>> Reaper does has a lot more to offer. We have all been working hard to  
>>> move  
>>>> it to also being a side-car so it can be contributed. This raises a  
>>> number  
>>>> of relevant questions to this thread: would we then accept both works  
>> in  
>>>> the Cassandra project, and what burden would it put on the current PMC  
>> to  
>>>> maintain both works.  
>>>> 
>>> I would hope that we would collaborate on merging the best parts of all  
>>> into the official Cassandra sidecar, taking the always on, shared  
>> nothing,  
>>> highly available system that we've contributed a patchset for and adding  
>> in  
>>> many of the repair features (e.g. schedules, a nice web UI) that Reaper  
>>> has.  
>>> 
>>> 
>>>> I share Stefan's concern that consensus had not been met around a  
>>>> side-car, and that it was somehow default accepted before a patch  
>> landed.  
>>> 
>>> 
>>> I feel this is not correct or fair. The sidecar and repair discussions  
>> have  
>>> been anything _but_ "default accepted". The timeline of consensus  
>> building  
>>> involving the management sidecar and repair scheduling plans:  
>>> 
>>> Dec 2016: Vinay worked with Jon and Alex to try to collaborate on Reaper to
>>> come up with design goals for a repair scheduler that could work at Netflix
>>> scale.

Re: Proposing an Apache Cassandra Management process

2018-09-07 Thread Jeff Jirsa
How can we continue moving this forward?

Mick/Jon/TLP folks, is there a path here where we commit the
Netflix-provided management process, and you augment Reaper to work with it?
Is there a way we can make a larger umbrella that's modular that can
support either/both?
Does anyone believe there's a clear, objective argument that one is
strictly better than the other? I haven't seen one.



On Mon, Aug 20, 2018 at 4:14 PM Roopa Tangirala
 wrote:

> +1 to everything that Joey articulated with emphasis on the fact that
> contributions should be evaluated based on the merit of code and their
> value add to the whole offering. I  hope it does not matter whether that
> contribution comes from PMC member or a person who is not a committer. I
> would like the process to be such that it encourages the new members to be
> a part of the community and not shy away from contributing to the code
> assuming their contributions are valued differently than committers or PMC
> members. It would be sad to see the contributions decrease if we go down
> that path.
>
> *Regards,*
>
> *Roopa Tangirala*
>
> Engineering Manager CDE
>
> *(408) 438-3156 - mobile*
>
>
>
>
>
>
> On Mon, Aug 20, 2018 at 2:58 PM Joseph Lynch 
> wrote:
>
> > > We are looking to contribute Reaper to the Cassandra project.
> > >
> > Just to clarify are you proposing contributing Reaper as a project via
> > donation or you are planning on contributing the features of Reaper as
> > patches to Cassandra? If the former how far along are you on the donation
> > process? If the latter, when do you think you would have patches ready
> for
> > consideration / review?
> >
> >
> > > Looking at the patch it's very similar in its base design already, but
> > > Reaper does has a lot more to offer. We have all been working hard to
> > move
> > > it to also being a side-car so it can be contributed. This raises a
> > number
> > > of relevant questions to this thread: would we then accept both works
> in
> > > the Cassandra project, and what burden would it put on the current PMC
> to
> > > maintain both works.
> > >
> > I would hope that we would collaborate on merging the best parts of all
> > into the official Cassandra sidecar, taking the always on, shared
> nothing,
> > highly available system that we've contributed a patchset for and adding
> in
> > many of the repair features (e.g. schedules, a nice web UI) that Reaper
> > has.
> >
> >
> > > I share Stefan's concern that consensus had not been met around a
> > > side-car, and that it was somehow default accepted before a patch
> landed.
> >
> >
> > I feel this is not correct or fair. The sidecar and repair discussions
> have
> > been anything _but_ "default accepted". The timeline of consensus
> building
> > involving the management sidecar and repair scheduling plans:
> >
> > Dec 2016: Vinay worked with Jon and Alex to try to collaborate on Reaper
> to
> > come up with design goals for a repair scheduler that could work at
> Netflix
> > scale.
> >
> > ~Feb 2017: Netflix believes that the fundamental design gaps prevented us
> > from using Reaper as it relies heavily on remote JMX connections and
> > central coordination.
> >
> > Sep. 2017: Vinay gives a lightning talk at NGCC about a highly available
> > and distributed repair scheduling sidecar/tool. He is encouraged by
> > multiple committers to build repair scheduling into the daemon itself and
> > not as a sidecar so the database is truly eventually consistent.
> >
> > ~Jun. 2017 - Feb. 2018: Based on internal need and the positive feedback
> at
> > NGCC, Vinay and myself prototype the distributed repair scheduler within
> > Priam and roll it out at Netflix scale.
> >
> > Mar. 2018: I open a Jira (CASSANDRA-14346) along with a detailed 20 page
> > design document for adding repair scheduling to the daemon itself and
> open
> > the design up for feedback from the community. We get feedback from Alex,
> > Blake, Nate, Stefan, and Mick. As far as I know there were zero proposals
> > to contribute Reaper at this point. We hear the consensus that the
> > community would prefer repair scheduling in a separate distributed
> sidecar
> > rather than in the daemon itself and we re-work the design to match this
> > consensus, re-aligning with our original proposal at NGCC.
> >
> > Apr 2018: Blake brings the discussion of repair scheduling to the dev
> list
> > (
> >
> >
> https://lists.apache.org/thread.html/760fbef677f27aa5c2ab4c375c7efeb81304fea428deff986ba1c2eb@%3Cdev.cassandra.apache.org%3E
> > ).
> > Many community members give positive feedback that we should solve it as
> > part of Cassandra and there is still no mention of contributing Reaper at
> > this point. The last message is my attempted summary giving context on
> how
> > we want to take the best of all the sidecars (OpsCenter, Priam, Reaper)
> and
> > ship them with Cassandra.
> >
> > Apr. 2018: Dinesh opens CASSANDRA-14395 along with a public design document
> > for gathering feedback on a general purpose management sidecar.

Re: Request for post-freeze merge exception

2018-09-04 Thread Jeff Jirsa
Seems like a reasonable thing to merge to me. Nothing else has been
committed, it was approved pre-freeze, seems like the rush to merge was
bound to have some number of rebase casualties.

On Tue, Sep 4, 2018 at 11:15 AM Sam Tunnicliffe  wrote:

> Hey all,
>
> On 2018-08-31 CASSANDRA-14145 had been +1'd by two reviewers and CI was
> green, and so it was marked Ready To Commit. This was before the 4.0
> feature freeze, but before it landed, CASSANDRA-14408, which touched a few
> common areas of the code, was merged. I didn't have a chance to finish the
> rebase over the weekend, but in the end it turned out that most of the
> conflicts were in test code and were straightforward to resolve. I'd like
> to commit this now; the rebase is done (& has been re-reviewed), and the CI
> is still green so I suspect most of the community would probably be ok with
> that. We did vote for a freeze though and I don't want to subvert or
> undermine that decision, so I wanted to check and give a chance for anyone
> to raise objections before I did.
>
> I'll wait 24 hours, and if nobody objects before then I'll merge to trunk.
>
> Thanks,
> Sam
>


Re: Java 11 Z garbage collector

2018-08-31 Thread Jeff Jirsa
A read-heavy workload with wide partitions (like 1-2GB) and the key cache 
disabled will be the worst case for GC.




-- 
Jeff Jirsa


> On Aug 31, 2018, at 10:51 AM, Carl Mueller 
>  wrote:
> 
> I'm assuming the p99 that Rocksandra tries to target is caused by GC
> pauses; does anyone have data patterns or datasets that will generate GC
> pauses in Cassandra, to highlight the abilities of Rocksandra (and...
> Scylla?) and perhaps this GC approach?
> 
> On Thu, Aug 30, 2018 at 8:11 PM Carl Mueller 
> wrote:
> 
>> Oh nice, I'll check that out.
>> 
>> On Thu, Aug 30, 2018 at 11:07 AM Jonathan Haddad 
>> wrote:
>> 
>>> Advertised, yes, but so far I haven't found it to be any better than
>>> ParNew + CMS or G1 in the performance tests I did when writing
>>> http://thelastpickle.com/blog/2018/08/16/java11.html.
>>> 
>>> That said, I didn't try it with a huge heap (i think it was 16 or 24GB),
>>> so
>>> maybe it'll do better if I throw 50 GB RAM at it.
>>> 
>>> 
>>> 
>>> On Thu, Aug 30, 2018 at 8:42 AM Carl Mueller
>>>  wrote:
>>> 
>>>> https://www.opsian.com/blog/javas-new-zgc-is-very-exciting/
>>>> 
>>>> .. max of 4ms for stop the world, large terabyte heaps, seems promising.
>>>> 
>>>> Will this be a major boon to cassandra p99 times? Anyone know the
>>> aspects
>>>> of cassandra that cause the most churn and lead to StopTheWorld GC? I
>>> was
>>>> under the impression that bloom filters, caches, etc are statically
>>>> allocated at startup.
>>>> 
>>> 
>>> 
>>> --
>>> Jon Haddad
>>> http://www.rustyrazorblade.com
>>> twitter: rustyrazorblade
>>> 
>> 

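For anyone who wants to experiment: ZGC in JDK 11 is still experimental and must be unlocked explicitly. With 4.0's per-JDK option files that is roughly the following (the heap size is just an example):

# conf/jvm11-server.options (or jvm.options on older layouts)
-XX:+UnlockExperimentalVMOptions
-XX:+UseZGC
-Xmx31G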



Re: Reaper as cassandra-admin

2018-08-29 Thread Jeff Jirsa
Agreed here - combining effort and making things pluggable seems like a good 
solution


-- 
Jeff Jirsa


On Aug 28, 2018, at 11:44 PM, Vinay Chella  wrote:

>> I haven’t settled on a position yet (will have more time think about
> things after the 9/1 freeze), but I wanted to point out that the argument
> that something new should be written because an existing project has tech
> debt, and we'll do it the right way this time, is a pretty common software
> engineering mistake. The thing you’re replacing usually needs to have some
> really serious problems to make it worth replacing.
> 
> Agreed - I don't think we should write everything from scratch, but carrying
> forward tech debt (if any) and design decisions that make new features
> difficult to develop in the future is something that we need to consider. I
> second Dinesh's thought on taking the best parts from the available projects
> to move forward with a solution that works well and is easily pluggable.
> 
> -
> Vinay Chella
> 
> 
>> On Tue, Aug 28, 2018 at 10:03 PM Mick Semb Wever  wrote:
>> 
>> 
>>> the argument that something new should be written because an existing
>> project has tech debt, and we'll do it the right way this time, is a pretty
>> common software engineering mistake. The thing you’re replacing usually
>> needs to have some really serious problems to make it worth replacing.
>> 
>> 
>> Thanks for writing this Blake. I'm no fan of writing from scratch. Working
>> with other people's code is the joy of open-source, imho.
>> 
>> Reaper is not a big project. None of its java files are large or
>> complicated.
>> This is not the C* codebase we're talking about.
>> 
>> It comes with strict code style in place (which the build enforces), unit
>> and integration tests. The tech debt that I think of first is removing
>> stuff that we would no longer want to support if it were inside the
>> Cassandra project. A number of recent refactorings  have proved it's an
>> easy codebase to work with.
>> 
>> It's also worth noting that Cassandra-4.x adoption is still some away, in
>> which time Reaper will only continue to grow and gain users.
>> 
>> 
>> 




Re: Supporting multiple JDKs

2018-08-28 Thread Jeff Jirsa
+1 from me on both points below 

-- 
Jeff Jirsa


> On Aug 28, 2018, at 1:40 PM, Sumanth Pasupuleti 
>  wrote:
> 
> Correct me if I am wrong, but I see the following consensus so far, on the
> proposal.
> 
> C* 2.2
> AnimalSniffer
> Use AnimalSniffer for compile-time feedback on JDK 1.7 compatibility -
> complete consensus so far
> Circle CI Builds
> In addition to existing JDK 1.8 support, build against JDK 1.7, and
> [optionally] run unit tests and DTests against JDK 1.7 - Dinesh and
> Sumanth +1 so far. Mick - I am not sure if you are +0 or -1 on this.
> 
> C* 4.0
> Circle CI Builds
> In addition to existing JDK 1.8 support, build against JDK 11 and
> [optionally] run unit tests and DTests against JDK 11. - complete consensus
> so far
> 
> If anyone has any further feedback, please comment.
> 
> Thanks,
> Sumanth
> 
> On Fri, Aug 24, 2018 at 7:27 AM Sumanth Pasupuleti
>  wrote:
> 
>>> I'm still a bit confused as to what's the benefit in compiling with
>> jdk1.7 and then testing with jdk1.7 or jdk1.8
>> I meant two separate workflows for each JDK i.e.
>> Workflow1: Build against jdk1.7, and optionally run UTs and Dtests against
>> 1.7
>> Workflow2: Build against jdk1.8, and run UTs and DTests against 1.8.
>> 
>>> If you find breakages here that otherwise don't exist when it's compiled
>> with jdk1.8 then that's just false-positives. As well as generally wasting
>> CI resources.
>> If we find breakages in workflow 1 and not in workflow 2, how would they be
>> false positives? We would then need to look into what's causing the
>> breakages with 1.7, wouldn't we?
>> 
>> Thanks,
>> Sumanth
>> 
>> On Thu, Aug 23, 2018 at 7:59 PM, Mick Semb Wever  wrote:
>> 
>>>> However, in addition to using such a
>>>> tool, I believe, when we make a release, we should build against the
>>> actual
>>>> JDKs we support (that way, we are not making a release just based on
>> the
>>>> result of an external tool), and we should be able to optionally run
>> UTs
>>>> and DTests against the JDK  (i.e. Java7 and Java8 for C* 2.2).
>>> 
>>> 
>>> I'm still a bit confused as to what's the benefit in compiling with
>> jdk1.7
>>> and then testing with jdk1.7 or jdk1.8
>>> 
>>> If you find breakages here that otherwise don't exist when it's compiled
>>> with jdk1.8 then that's just false-positives. As well as generally
>> wasting
>>> CI resources.
>>> 
>>> Either way, there's not much point discussing this as Cassandra-2.1 is
>>> about EOL, and Cassandra-4.0 is stuck with a very specific compile.
>>> 
>>> 
>>> 
>> 

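For context on the AnimalSniffer option: it ships an Ant task that checks compiled classes against a published JDK API signature, which would slot into build.xml roughly as below. The task and attribute names here are from memory of animal-sniffer-ant-tasks and should be verified against its docs - treat this purely as a sketch:

<taskdef name="checksignature"
         classname="org.codehaus.mojo.animal_sniffer.ant.CheckSignatureTask"
         classpath="lib/animal-sniffer-ant-tasks.jar"/>

<checksignature>
    <signature src="java17-1.0.signature"/>
    <path path="build/classes/main"/>
</checksignature>

The build then fails at compile time on any call outside the JDK 1.7 API, giving the feedback discussed above without actually running a 1.7 JVM.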



Re: Reaper as cassandra-admin

2018-08-27 Thread Jeff Jirsa
As an aside, it’s frustrating that y’all would sit on this for months (first 
e-mail was April); you folks have enough people that know the process to know 
that communicating early and often helps avoid duplicating (expensive) work. 

The best tech needs to go in and we need to leave ourselves with the ability to 
meet the goals of the original proposal (and then some). The reaper UI is nice, 
I wish you’d have talked to the other group of folks to combine efforts in 
April - we’d be much further ahead. 

-- 
Jeff Jirsa


> On Aug 27, 2018, at 6:02 PM, Jeff Jirsa  wrote:
> 
> Can you get all of the contributors cleared?
> What’s the architecture? Is it centralized? Is there a sidecar?
> 
> 
>> On Aug 27, 2018, at 5:36 PM, Jonathan Haddad  wrote:
>> 
>> Hey folks,
>> 
>> Mick brought this up in the sidecar thread, but I wanted to have a clear /
>> separate discussion about what we're thinking with regard to contributing
>> Reaper to the C* project.  In my mind, starting with Reaper is a great way
>> of having an admin right now, that we know works well at the kind of scale
>> we need.  We've worked with a lot of companies putting Reaper in prod (at
>> least 50), running on several hundred clusters.  The codebase has evolved
>> as a direct result of production usage, and we feel it would be great to
>> pair it with the 4.0 release.  There was a LOT of work done on the repair
>> logic to make things work across every supported version of Cassandra, with
>> a great deal of documentation as well.
>> 
>> In case folks aren't aware, in addition to one off and scheduled repairs,
>> Reaper also does cluster wide snapshots, exposes thread pool stats, and
>> visualizes streaming (in trunk).
>> 
>> We're hoping to get some feedback on our side if that's something people
>> are interested in.  We've gone back and forth privately on our own
>> preferences, hopes, dreams, etc, but I feel like a public discussion would
>> be healthy at this point.  Does anyone share the view of using Reaper as a
>> starting point?  What concerns to people have?
>> -- 
>> Jon Haddad
>> http://www.rustyrazorblade.com
>> twitter: rustyrazorblade




Re: Reaper as cassandra-admin

2018-08-27 Thread Jeff Jirsa
Can you get all of the contributors cleared?
What’s the architecture? Is it centralized? Is there a sidecar?


> On Aug 27, 2018, at 5:36 PM, Jonathan Haddad  wrote:
> 
> Hey folks,
> 
> Mick brought this up in the sidecar thread, but I wanted to have a clear /
> separate discussion about what we're thinking with regard to contributing
> Reaper to the C* project.  In my mind, starting with Reaper is a great way
> of having an admin right now, that we know works well at the kind of scale
> we need.  We've worked with a lot of companies putting Reaper in prod (at
> least 50), running on several hundred clusters.  The codebase has evolved
> as a direct result of production usage, and we feel it would be great to
> pair it with the 4.0 release.  There was a LOT of work done on the repair
> logic to make things work across every supported version of Cassandra, with
> a great deal of documentation as well.
> 
> In case folks aren't aware, in addition to one off and scheduled repairs,
> Reaper also does cluster wide snapshots, exposes thread pool stats, and
> visualizes streaming (in trunk).
> 
> We're hoping to get some feedback on our side if that's something people
> are interested in.  We've gone back and forth privately on our own
> preferences, hopes, dreams, etc, but I feel like a public discussion would
> be healthy at this point.  Does anyone share the view of using Reaper as a
> starting point?  What concerns to people have?
> -- 
> Jon Haddad
> http://www.rustyrazorblade.com
> twitter: rustyrazorblade




Re: Side Car New Repo vs not

2018-08-23 Thread Jeff Jirsa
+1 for separate repo


-- 
Jeff Jirsa


> On Aug 23, 2018, at 1:00 PM, sankalp kohli  wrote:
> 
> Separate repo is in a majority so far. Please reply to this thread with
> your responses.
> 
> On Tue, Aug 21, 2018 at 4:34 PM Rahul Singh 
> wrote:
> 
>> +1 for separate repo. Especially on git. Maybe make it a submodule.
>> 
>> Rahul
>> On Aug 21, 2018, 3:33 PM -0500, Stefan Podkowinski ,
>> wrote:
>>> I'm also currently -1 on the in-tree option.
>>> 
>>> In addition to what Aleksey mentioned, I also don't see how we could
>>> make this work with the current build and release process. Our scripts
>>> [0] for creating releases (tarballs and native packages), would need
>>> significant work to add support for an independent side-car. Our ant
>>> based build process is also not a great start for adding new tasks, let
>>> alone integrating other tool chains for web components for a potential
>> UI.
>>> 
>>> [0] https://git-wip-us.apache.org/repos/asf?p=cassandra-builds.git
>>> 
>>> 
>>>> On 21.08.18 19:20, Aleksey Yeshchenko wrote:
>>>> Sure, allow me to elaborate - at least a little bit. But before I do,
>> just let me note that this wasn’t a veto -1, just a shorthand for “I don’t
>> like this option”.
>>>> 
>>>> It would be nice to have sidecar and C* version and release cycles
>> fully decoupled. I know it *can* be done when in-tree, but the way we vote
>> on releases with tags off current branches would have to change somehow.
>> Probably painfully. It would be nice to be able to easily enforce freezes,
>> like the upcoming one, on the whole C* repo, while allowing feature
>> development on the sidecar. It would be nice to not have sidecar commits in
>> emails from commits@ mailing list. It would be nice to not have C* CI
>> trigger necessarily on sidecar commits. Groups of people working on the two
>> repos will mostly be different too, so what’s the point in sharing the repo?
>>>> 
>>>> Having an extra repo with its own set of branches is cheap and easy -
>> we already do that with dtests. I like cleanly separated things when
>> coupling is avoidable. As such I would prefer the sidecar to live in a
>> separate new repo, while still being part of the C* project.
>>>> 
>>>> —
>>>> AY
>>>> 
>>>> On 21 August 2018 at 17:06:39, sankalp kohli (kohlisank...@gmail.com)
>> wrote:
>>>> 
>>>> Hi Aleksey,
>>>> Can you please elaborate on the reasons for your -1? This
>>>> way we can make progress towards any one approach.
>>>> Thanks,
>>>> Sankalp
>>>> 
>>>> On Tue, Aug 21, 2018 at 8:39 AM Aleksey Yeshchenko 
>>>> wrote:
>>>> 
>>>>> FWIW I’m strongly -1 on in-tree approach, and would much prefer a
>> separate
>>>>> repo, dtest-style.
>>>>> 
>>>>> —
>>>>> AY
>>>>> 
>>>>> On 21 August 2018 at 16:36:02, Jeremiah D Jordan (
>>>>> jeremiah.jor...@gmail.com) wrote:
>>>>> 
>>>>> I think the following is a very big plus of it being in tree:
>>>>>>> * Faster iteration speed in general. For example when we need to
>> add a
>>>>>>> new
>>>>>>> JMX endpoint that the sidecar needs, or change something from
>> JMX to a
>>>>>>> virtual table (e.g. for repair, or monitoring) we can do all
>> changes
>>>>>>> including tests as one commit within the main repository and
>> don't
>>>>> have
>>>>>>> to
>>>>>>> commit to main repo, sidecar repo,
>>>>> 
>>>>> I also don’t see a reason why the sidecar being in tree means it
>> would not
>>>>> work in a mixed version cluster. The nodes themselves must work in a
>> mixed
>>>>> version cluster during a rolling upgrade, I would expect any
>> management
>>>>> side car to operate in the same manner, in tree or not.
>>>>> 
>>>>> This tool will be pretty tightly coupled with the server, and as
>> someone
>>>>> with experience developing such tightly coupled tools, it is *much*
>> easier
>>>>> to make sure you don’t accidentally break them if they are in tree.
>> How
>>>>> many times has someone updated some JMX interface, updated nodetool,
>> and
>>>>> then moved on

Re: Proposing an Apache Cassandra Management process

2018-08-20 Thread Jeff Jirsa
On Mon, Aug 20, 2018 at 4:14 PM Roopa Tangirala
 wrote:

> contributions should be evaluated based on the merit of code and their
> value add to the whole offering. I hope it does not matter whether that
> contribution comes from a PMC member or a person who is not a committer.


I hope this goes without saying.


Re: NGCC 2018?

2018-07-26 Thread Jeff Jirsa
A Bay Area event is interesting to me, in any format.


On Thu, Jul 26, 2018 at 9:03 PM, Ben Bromhead  wrote:

> It sounds like there may be an appetite for something, but the NGCC in its
> current format is likely to not be that useful?
>
> Is a bay area event focused on C* developers something that is interesting
> for the broader dev community? In whatever format that may be?
>
> On Tue, Jul 24, 2018 at 5:02 PM Nate McCall  wrote:
>
> > This was discussed amongst the PMC recently. We did not come to a
> > conclusion and there were not terribly strong feelings either way.
> >
> > I don't feel like we need to hustle to get "NGCC" in place,
> > particularly given our decided focus on 4.0. However, that should not
> > stop us from doing an additional 'c* developer' event in sept. to
> > coincide with distributed data summit.
> >
> > On Wed, Jul 25, 2018 at 5:03 AM, Patrick McFadin 
> > wrote:
> > > Ben,
> > >
> > > Lynn Bender had offered a space the day before Distributed Data Summit
> in
> > > September (http://distributeddatasummit.com/) since we are both
> platinum
> > > sponsors. I thought he and Nate had talked about that being a good
> place
> > > for NGCC since many of us will be in town already.
> > >
> > > Nate, now that I've spoken for you, you can clarify, :D
> > >
> > > Patrick
> > >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
> > --
> Ben Bromhead
> CTO | Instaclustr 
> +1 650 284 9692
> Reliability at Scale
> Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer
>


Re: [VOTE] Release Apache Cassandra 3.11.3 (Take 2)

2018-07-25 Thread Jeff Jirsa
+1

On Wed, Jul 25, 2018 at 12:16 AM, Michael Shuler 
wrote:

> I propose the following artifacts for release as 3.11.3.
>
> sha1: 31d5d870f9f5b56391db46ba6cdf9e0882d8a5c0
> Git:
> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=
> shortlog;h=refs/tags/3.11.3-tentative
> Artifacts:
> https://repository.apache.org/content/repositories/
> orgapachecassandra-1164/org/apache/cassandra/apache-cassandra/3.11.3/
> Staging repository:
> https://repository.apache.org/content/repositories/
> orgapachecassandra-1164/
>
> The Debian and RPM packages are available here:
> http://people.apache.org/~mshuler
>
> The vote will be open for 72 hours (longer if needed).
>
> [1]: CHANGES.txt:
> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=
> blob_plain;f=CHANGES.txt;hb=refs/tags/3.11.3-tentative
> [2]: NEWS.txt:
> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=
> blob_plain;f=CHANGES.txt;hb=refs/tags/3.11.3-tentative
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: [VOTE] Release Apache Cassandra 2.2.13

2018-07-25 Thread Jeff Jirsa
+1

On Wed, Jul 25, 2018 at 12:17 AM, Michael Shuler 
wrote:

> I propose the following artifacts for release as 2.2.13.
>
> sha1: 3482370df5672c9337a16a8a52baba53b70a4fe8
> Git:
> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=
> shortlog;h=refs/tags/2.2.13-tentative
> Artifacts:
> https://repository.apache.org/content/repositories/
> orgapachecassandra-1167/org/apache/cassandra/apache-cassandra/2.2.13/
> Staging repository:
> https://repository.apache.org/content/repositories/
> orgapachecassandra-1167/
>
> The Debian and RPM packages are available here:
> http://people.apache.org/~mshuler
>
> The vote will be open for 72 hours (longer if needed).
>
> [1]: CHANGES.txt:
> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=
> blob_plain;f=CHANGES.txt;hb=refs/tags/2.2.13-tentative
> [2]: NEWS.txt:
> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=
> blob_plain;f=CHANGES.txt;hb=refs/tags/2.2.13-tentative
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: [VOTE] Release Apache Cassandra 3.0.17 (Take 2)

2018-07-25 Thread Jeff Jirsa
+1

On Wed, Jul 25, 2018 at 12:17 AM, Michael Shuler 
wrote:

> I propose the following artifacts for release as 3.0.17.
>
> sha1: d52c7b8c595cc0d06fc3607bf16e3f595f016bb6
> Git:
> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=
> shortlog;h=refs/tags/3.0.17-tentative
> Artifacts:
> https://repository.apache.org/content/repositories/
> orgapachecassandra-1165/org/apache/cassandra/apache-cassandra/3.0.17/
> Staging repository:
> https://repository.apache.org/content/repositories/
> orgapachecassandra-1165/
>
> The Debian and RPM packages are available here:
> http://people.apache.org/~mshuler
>
> The vote will be open for 72 hours (longer if needed).
>
> [1]: CHANGES.txt:
> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=
> blob_plain;f=CHANGES.txt;hb=refs/tags/3.0.17-tentative
> [2]: NEWS.txt:
> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=
> blob_plain;f=CHANGES.txt;hb=refs/tags/3.0.17-tentative
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: Scratch an itch

2018-07-12 Thread Jeff Jirsa
On Thu, Jul 12, 2018 at 10:54 AM, Michael Burman 
wrote:

> On 07/12/2018 07:38 PM, Stefan Podkowinski wrote:
>
>> this point? Also, if we tell someone that their contribution will be
>> reviewed and committed later after 4.0-beta, how is that actually making
>> a difference for that person, compared to committing it now for a 4.x
>> version. It may be satisfying to get a patch committed, but what matters
>> more is when the code will actually be released and deferring committing
>> contributions after 4.0-beta doesn't necessarily mean that there's any
>> disadvantage when it comes to that.
>>
>> Deferring a huge amount of commits creates rebase/redo hell. That's the
> biggest impact, and the order in which these deferred commits are then
> actually committed can make it more painful or less painful depending on
> the commit. Each contributor will then in turn have to wait to
> rebase/redo their commit, and those timings might create more rebase
> issues, assuming those contributors still want to rebase something after
> n months or have the time at that point.
>
>
This is true, but it's also part of the point - if the people fixing bugs
for 4.0 proper have to spend a bunch of time rebasing around 4.next
features, then that rebase hell gets in the way of fixing bugs for a
release (because we wouldn't commit just to 4.0 without also rebasing for
trunk).


> That's a problem for all Cassandra patches that take a huge amount of time
> to commit, and if this block takes a lot of time, then that will for sure be
> even more painful. I know products such as Kubernetes do the same (I guess that's
> where this idea might have come from) "trunk patches only", but their block
> is quite short.
>
> My wish is that this freeze does not last so long that it kills enthusiasm
> for committing to Cassandra. There are (I assume) many hobbyists who do
> this as a side-project instead of their daily work and might not have the
> capabilities to test 4.0 in a way that will trigger bugs (easy bugs are
> fixed quite quickly, I hope). And if they feel it's not worth it
> at this point to invest time in Cassandra (because nothing they do will get
> merged), they might move to another project. And there's no guarantee they
> will return. Getting stuff into the product is part of the satisfaction, and
> without satisfaction there's no interest in continuing.
>

I wish for this too.


Re: [VOTE] Branching Change for 4.0 Freeze

2018-07-11 Thread Jeff Jirsa
+1


-- 
Jeff Jirsa


> On Jul 11, 2018, at 2:46 PM, sankalp kohli  wrote:
> 
> Hi,
>As discussed in the thread[1], we are proposing that we will not branch
> on 1st September but will only allow following merges into trunk.
> 
> a. Bug and Perf fixes to 4.0.
> b. Critical bugs in any version of C*.
> c. Testing changes to help test 4.0
> 
> If someone has a change which does not fall under these three, we can
> always discuss it and have an exception.
> 
> Vote will be open for 72 hours.
> 
> Thanks,
> Sankalp
> 
> [1]
> https://lists.apache.org/thread.html/494c3ced9e83ceeb53fa127e44eec6e2588a01b769896b25867fd59f@%3Cdev.cassandra.apache.org%3E

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Testing 4.0 Post-Freeze

2018-07-10 Thread Jeff Jirsa
Ultimately, we have a consensus driven development. If Jonathan or Dave
strongly disagrees with this, they can share their strong disagreement.

Jonathan shared his concern about dissuading contributors.

What's absurd is trying the same thing we've tried for 10 years and
expecting things to magically change. We know that a lot of folks are
lining up to test the 4.0 release. If people who have contributed enough to
be able to commit have time to work on features, the proposal is that the
project make it known that we'd rather have them work on testing than
commit their patch, or hold their patch until testing is done. That doesn't
mean they're suddenly not allowed to commit, it's that we'd prefer they use
their time and attention in a more constructive manner.

- Jeff



On Tue, Jul 10, 2018 at 10:18 AM, Jonathan Haddad  wrote:

> I guess I look at the initial voting in of committers as the process
> by which people are trusted to merge things in.  This proposed process
> revokes that trust. If Jonathan Ellis or Dave Brosius (arbitrarily
> picked) wants to merge a new feature into trunk during the freeze, now
> they're not allowed?  That's absurd.  People have already met the bar
> and have been voted in by merit, they should not have their privilege
> revoked.
> On Tue, Jul 10, 2018 at 10:14 AM Ben Bromhead  wrote:
> >
> > Well put Mick
> >
> > +1
> >
> > On Tue, Jul 10, 2018 at 1:06 PM Aleksey Yeshchenko 
> > wrote:
> >
> > > +1 from me too.
> > >
> > > —
> > > AY
> > >
> > > On 10 July 2018 at 04:17:26, Mick Semb Wever (m...@apache.org) wrote:
> > >
> > >
> > > > We have done all this for previous releases and we know it has not
> > > worked
> > > > well. So how is giving it one more try going to help here? Can
> someone
> > > > outline what will change for 4.0 which will make it more successful?
> > >
> > >
> > > I (again) agree with you Sankalp :-)
> > >
> > > Why not try something new?
> > > It's easier to discuss these things more genuinely after trying it out.
> > >
> > > One of the differences in the branching approaches (feature-freezing
> on a
> > > 4.0 branch versus on trunk) is who it is that then has to merge and work
> with
> > > multiple branches.
> > >
> > > Where that small but additional effort is placed I think becomes a
> signal
> > > to what the community values most: new features or stability.
> > >
> > > I think most folk would vote for stability, so why not give this
> approach
> > > a go and to learn from it.
> > > It also creates an incentive to make the feature-freeze period as
> short as
> > > possible, moving us towards an eventual goal of not needing to
> > > feature-freeze at all.
> > >
> > > regards,
> > > Mick
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > > For additional commands, e-mail: dev-h...@cassandra.apache.org
> > >
> > > --
> > Ben Bromhead
> > CTO | Instaclustr 
> > +1 650 284 9692
> > Reliability at Scale
> > Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer
>
>
>
> --
> Jon Haddad
> http://www.rustyrazorblade.com
> twitter: rustyrazorblade
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: Testing 4.0 Post-Freeze

2018-07-03 Thread Jeff Jirsa
Yes?

-- 
Jeff Jirsa


> On Jul 3, 2018, at 2:29 PM, Jonathan Ellis  wrote:
> 
> Is that worth the risk of demotivating new contributors who might have
> other priorities?
> 
>> On Tue, Jul 3, 2018 at 4:22 PM, Jeff Jirsa  wrote:
>> 
>> I think there's value in the psychological commitment that if someone has
>> time to contribute, their contributions should be focused on validating a
>> release, not pushing future features.
>> 
>> 
>>> On Tue, Jul 3, 2018 at 1:03 PM, Jonathan Haddad  wrote:
>>> 
>>> I agree with Josh. I don’t see how changing the convention around trunk
>>> will improve the process, seems like it’ll only introduce a handful of
>>> rollback commits when people forget.
>>> 
>>> Other than that, it all makes sense to me.
>>> 
>>> I’ve been working on a workload centric stress tool on and off for a
>> little
>>> bit in an effort to create something that will help with wider adoption
>> in
>>> stress testing. It differs from the stress we ship by including fully
>>> functional stress workloads as well as a validation process. The idea
>> being
>>> to be flexible enough to test both performance and correctness in LWT and
>>> MVs as well as other arbitrary workloads.
>>> 
>>> https://github.com/thelastpickle/tlp-stress
>>> 
>>> Jon
>>> 
>>> 
>>> On Tue, Jul 3, 2018 at 12:28 PM Josh McKenzie 
>>> wrote:
>>> 
>>>> Why not just branch a 4.0-rel and bugfix there and merge up while still
>>>> accepting new features or improvements on trunk?
>>>> 
>>>> I don't think the potential extra engagement in testing will balance
>> out
>>>> the atrophy and discouraging contributions / community engagement we'd
>>> get
>>>> by deferring all improvements and new features in an open-ended way.
>>>> 
>>>> On Tue, Jul 3, 2018 at 1:33 PM sankalp kohli 
>>>> wrote:
>>>> 
>>>>> Hi cassandra-dev@,
>>>>> 
>>>>> With the goal of making Cassandra's 4.0 the most stable major release
>>> to
>>>>> date, we would like all committers of the project to consider joining
>>> us
>>>> in
>>>>> dedicating their time and attention to testing, running, and fixing
>>>> issues
>>>>> in 4.0 between the September freeze and the 4.0 beta release. This
>>> would
>>>>> result in a freeze of new feature development on trunk or branches
>>> during
>>>>> this period, instead focusing on writing, improving, and running
>> tests
>>> or
>>>>> fixing and reviewing bugs or performance regressions found in 4.0 or
>>>>> earlier.
>>>>> 
>>>>> How would this work?
>>>>> 
>>>>> We propose that between the September freeze date and beta, a new
>>> branch
>>>>> would not be created and trunk would only have bug fixes and
>>> performance
>>>>> improvements committed to it. At the same time we do not want to
>>>> discourage
>>>>> community contributions. Not all contributors can be expected to be
>>> aware
>>>>> of such a decision or may be new to the project. In cases where new
>>>>> features are contributed during this time, the contributor can be
>>>> informed
>>>>> of the current status of the release process, be encouraged to
>>> contribute
>>>>> to testing or bug fixing, and have their feature reviewed after the
>>> beta
>>>> is
>>>>> reached.
>>>>> 
>>>>> 
>>>>> What happens when beta is reached?
>>>>> 
>>>>> Ideally, contributors who have made significant contributions to the
>>>>> release will stick around to continue testing between beta and final
>>>>> release. Any additional folks who continue this focus would also be
>>>> greatly
>>>>> appreciated.
>>>>> 
>>>>> What about before the freeze?
>>>>> 
>>>>> Testing new features is of course important. This isn't meant to
>>>> discourage
>>>>> development – only to enable us to focus on testing and hardening 4.0
>>> to
>>>>> deliver Cassandra's most stable major release. We would like to see
>>>>> adoption of 4.0 happen much more quickly than its predecessor.
>>>>> 
>>>>> Thanks for considering this proposal,
>>>>> Sankalp Kohli
>>>> 
>>> --
>>> Jon Haddad
>>> http://www.rustyrazorblade.com
>>> twitter: rustyrazorblade
>>> 
>> 
> 
> 
> 
> -- 
> Jonathan Ellis
> co-founder, http://www.datastax.com
> @spyced

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Testing 4.0 Post-Freeze

2018-07-03 Thread Jeff Jirsa
I think there's value in the psychological commitment that if someone has
time to contribute, their contributions should be focused on validating a
release, not pushing future features.


On Tue, Jul 3, 2018 at 1:03 PM, Jonathan Haddad  wrote:

> I agree with Josh. I don’t see how changing the convention around trunk
> will improve the process, seems like it’ll only introduce a handful of
> rollback commits when people forget.
>
> Other than that, it all makes sense to me.
>
> I’ve been working on a workload centric stress tool on and off for a little
> bit in an effort to create something that will help with wider adoption in
> stress testing. It differs from the stress we ship by including fully
> functional stress workloads as well as a validation process. The idea being
> to be flexible enough to test both performance and correctness in LWT and
> MVs as well as other arbitrary workloads.
>
> https://github.com/thelastpickle/tlp-stress
>
> Jon
>
>
> On Tue, Jul 3, 2018 at 12:28 PM Josh McKenzie 
> wrote:
>
> > Why not just branch a 4.0-rel and bugfix there and merge up while still
> > accepting new features or improvements on trunk?
> >
> > I don't think the potential extra engagement in testing will balance out
> > the atrophy and discouraging contributions / community engagement we'd
> get
> > by deferring all improvements and new features in an open-ended way.
> >
> > On Tue, Jul 3, 2018 at 1:33 PM sankalp kohli 
> > wrote:
> >
> > > Hi cassandra-dev@,
> > >
> > > With the goal of making Cassandra's 4.0 the most stable major release
> to
> > > date, we would like all committers of the project to consider joining
> us
> > in
> > > dedicating their time and attention to testing, running, and fixing
> > issues
> > > in 4.0 between the September freeze and the 4.0 beta release. This
> would
> > > result in a freeze of new feature development on trunk or branches
> during
> > > this period, instead focusing on writing, improving, and running tests
> or
> > > fixing and reviewing bugs or performance regressions found in 4.0 or
> > > earlier.
> > >
> > > How would this work?
> > >
> > > We propose that between the September freeze date and beta, a new
> branch
> > > would not be created and trunk would only have bug fixes and
> performance
> > > improvements committed to it. At the same time we do not want to
> > discourage
> > > community contributions. Not all contributors can be expected to be
> aware
> > > of such a decision or may be new to the project. In cases where new
> > > features are contributed during this time, the contributor can be
> > informed
> > > of the current status of the release process, be encouraged to
> contribute
> > > to testing or bug fixing, and have their feature reviewed after the
> beta
> > is
> > > reached.
> > >
> > >
> > > What happens when beta is reached?
> > >
> > > Ideally, contributors who have made significant contributions to the
> > > release will stick around to continue testing between beta and final
> > > release. Any additional folks who continue this focus would also be
> > greatly
> > > appreciated.
> > >
> > > What about before the freeze?
> > >
> > > Testing new features is of course important. This isn't meant to
> > discourage
> > > development – only to enable us to focus on testing and hardening 4.0
> to
> > > deliver Cassandra's most stable major release. We would like to see
> > > adoption of 4.0 happen much more quickly than its predecessor.
> > >
> > > Thanks for considering this proposal,
> > > Sankalp Kohli
> >
> --
> Jon Haddad
> http://www.rustyrazorblade.com
> twitter: rustyrazorblade
>


Re: [VOTE] Release Apache Cassandra 3.0.17

2018-07-02 Thread Jeff Jirsa
+1

On Mon, Jul 2, 2018 at 1:10 PM, Michael Shuler 
wrote:

> I propose the following artifacts for release as 3.0.17.
>
> sha1: c4e6cd2a1aca84a88983192368bbcd4c8887c8b2
> Git: http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=sho
> rtlog;h=refs/tags/3.0.17-tentative
> Artifacts: https://repository.apache.org/content/repositories/orgapache
> cassandra-1160/org/apache/cassandra/apache-cassandra/3.0.17/
> Staging repository: https://repository.apache.org/
> content/repositories/orgapachecassandra-1160/
>
> The Debian and RPM packages are available here:
> http://people.apache.org/~mshuler/
>
> The vote will be open for 72 hours (longer if needed).
>
> [1]: (CHANGES.txt) http://git-wip-us.apache.org/r
> epos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/
> tags/3.0.17-tentative
> [2]: (NEWS.txt) http://git-wip-us.apache.org/r
> epos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tag
> s/3.0.17-tentative
>
> --
> Michael
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: [VOTE] Release Apache Cassandra 3.11.3

2018-07-02 Thread Jeff Jirsa
+1

On Mon, Jul 2, 2018 at 1:11 PM, Michael Shuler 
wrote:

> I propose the following artifacts for release as 3.11.3.
>
> sha1: aed1b5fdf1e953d19bdd021ba603618772208cdd
> Git: http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=sho
> rtlog;h=refs/tags/3.11.3-tentative
> Artifacts: https://repository.apache.org/content/repositories/orgapache
> cassandra-1161/org/apache/cassandra/apache-cassandra/3.11.3/
> Staging repository: https://repository.apache.org/
> content/repositories/orgapachecassandra-1161/
>
> The Debian and RPM packages are available here:
> http://people.apache.org/~mshuler/
>
> The vote will be open for 72 hours (longer if needed).
>
> [1]: (CHANGES.txt) http://git-wip-us.apache.org/r
> epos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/
> tags/3.11.3-tentative
> [2]: (NEWS.txt) http://git-wip-us.apache.org/r
> epos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tag
> s/3.11.3-tentative
>
> --
> Michael
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: [VOTE] Release Apache Cassandra 2.2.13

2018-07-02 Thread Jeff Jirsa
+1

On Mon, Jul 2, 2018 at 1:10 PM, Michael Shuler 
wrote:

> I propose the following artifacts for release as 2.2.13.
>
> sha1: 9ff78249a0a5e87bd04bf9804ef1a3b29b5e1645
> Git: http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=sho
> rtlog;h=refs/tags/2.2.13-tentative
> Artifacts: https://repository.apache.org/content/repositories/orgapache
> cassandra-1159/org/apache/cassandra/apache-cassandra/2.2.13/
> Staging repository: https://repository.apache.org/
> content/repositories/orgapachecassandra-1159/
>
> The Debian and RPM packages are available here:
> http://people.apache.org/~mshuler/
>
> The vote will be open for 72 hours (longer if needed).
>
> [1]: (CHANGES.txt) http://git-wip-us.apache.org/r
> epos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/
> tags/2.2.13-tentative
> [2]: (NEWS.txt) http://git-wip-us.apache.org/r
> epos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tag
> s/2.2.13-tentative
>
> --
> Michael
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: Tombstone passed GC period causes un-repairable inconsistent data

2018-06-21 Thread Jeff Jirsa
Think he's talking about
https://issues.apache.org/jira/browse/CASSANDRA-6434

Doesn't solve every problem if you don't run repair at all, but if you're
not running repairs, you're nearly guaranteed problems with resurrection
after gcgs anyway.
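
As a side note, a common guard against the resurrection scenario reproduced
below is to keep gc_grace_seconds above the hint delivery window rather than
shrinking it. A minimal sketch, reusing the foo.bar table from the
reproduction (864000 s / 10 days is the default gc_grace_seconds, comfortably
above the default 3 hour max_hint_window_in_ms in cassandra.yaml):

  -- hints written within the 3 hour window can then still deliver the
  -- tombstone long before it becomes purgeable
  ALTER TABLE foo.bar WITH gc_grace_seconds = 864000;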



On Thu, Jun 21, 2018 at 11:33 AM, Jay Zhuang  wrote:

> Yes, I also agree that the user should run (incremental) repair within GCGS
> to prevent it from happening.
>
> @Sankalp, would you please point us to the patch you mentioned from Marcus?
> The problem is basically the same as
> https://issues.apache.org/jira/browse/CASSANDRA-14145
>
> CASSANDRA-11427  is
> actually the opposite of this problem. As purgeable tombstone is repaired,
> this un-repairable problem cannot be reproduced. I tried 2.2.5 (before the
> fix), it's able to repair the purgeable tombstone from node1 to node2, so
> the data is deleted as expected. But it doesn't mean that's the right
> behavior, as it will also cause purgeable tombstones to keep bouncing around
> the nodes.
> I think https://issues.apache.org/jira/browse/CASSANDRA-14145 will fix the
> problem by detecting the repaired/un-repaired data.
>
> How about having hints dispatch to deliver/replay purgeable (not live)
> tombstones? It will reduce the chance to have this issue, especially when
> GCGS < hinted handoff window.
>
> On Wed, Jun 20, 2018 at 9:36 AM sankalp kohli 
> wrote:
>
> > I agree with Stefan that we should use incremental repair and use patches
> > from Marcus to drop tombstones only from repaired data.
> > Regarding deep repair, you can bump the read repair and run the repair.
> The
> > issue will be that you will stream a lot of data and also your blocking
> read
> > repair will go up when you bump the gc grace to higher value.
> >
> > On Wed, Jun 20, 2018 at 1:10 AM Stefan Podkowinski 
> > wrote:
> >
> > > Sounds like an older issue that I tried to address two years ago:
> > > https://issues.apache.org/jira/browse/CASSANDRA-11427
> > >
> > > As you can see, the result hasn't been as expected and we got some
> > > unintended side effects based on the patch. I'm not sure I'd be willing
> > > to give this another try, considering the behaviour we like to fix in
> > > the first place is rather harmless and the read repairs shouldn't
> happen
> > > at all to any users who regularly run repairs within gc_grace.
> > >
> > > What I'd suggest is to think more into the direction of a
> > > post-full-repair-world and to fully embrace incremental repairs, as
> > > fixed by Blake in 4.0. In that case, we should stop doing read repairs
> > > at all for repaired data, as described in
> > > https://issues.apache.org/jira/browse/CASSANDRA-13912. RRs are
> certainly
> > > useful, but can be very risky if not very very carefully implemented.
> So
> > > I'm wondering if we shouldn't disable RRs for everything but unrepaired
> > > data. I'd btw also be interested to hear any opinions on this in
> context
> > > of transient replicas.
> > >
> > >
> > > On 20.06.2018 03:07, Jay Zhuang wrote:
> > > > Hi,
> > > >
> > > > We know that the deleted data may re-appear if repair is not run
> within
> > > > gc_grace_seconds. When the tombstone is not propagated to all nodes,
> > the
> > > > data will re-appear. But it's also causing following 2 issues before
> > the
> > > > tombstone is compacted away:
> > > > a. inconsistent query result
> > > >
> > > > With consistency level ONE or QUORUM, it may or may not return the
> > value.
> > > > b. lots of read repairs, but doesn't repair anything
> > > >
> > > > With consistency level ALL, it always triggers a read repair.
> > > > With consistency level QUORUM, it also very likely (2/3) causes a
> read
> > > > repair. But it doesn't repair the data, so it's causing repair every
> > > time.
> > > >
> > > >
> > > > Here are the reproducing steps:
> > > >
> > > > 1. Create a 3 nodes cluster
> > > > 2. Create a table (with small gc_grace_seconds):
> > > >
> > > > CREATE KEYSPACE foo WITH replication = {'class': 'SimpleStrategy',
> > > > 'replication_factor': 3};
> > > > CREATE TABLE foo.bar (
> > > > id int PRIMARY KEY,
> > > > name text
> > > > ) WITH gc_grace_seconds=30;
> > > >
> > > > 3. Insert data with consistency all:
> > > >
> > > > INSERT INTO foo.bar (id, name) VALUES(1, 'cstar');
> > > >
> > > > 4. stop 1 node
> > > >
> > > > $ ccm node2 stop
> > > >
> > > > 5. Delete the data with consistency quorum:
> > > >
> > > > DELETE FROM foo.bar WHERE id=1;
> > > >
> > > > 6. Wait 30 seconds and then start node2:
> > > >
> > > > $ ccm node2 start
> > > >
> > > > Now the tombstone is on node1 and node3 but not on node2.
> > > >
> > > > With quorum read, it may or may not return value, and read repair
> will
> > > send
> > > > the data from node2 to node1 and node3, but it doesn't repair
> anything.
> > > >
> > > > I'd like to discuss a few potential solutions and workarounds:
> > > >
> > > > 1. Can hints replay send GCed tombstones?
> > > >
> > > > 

Re: secondary index table - tombstones surviving compactions

2018-05-18 Thread Jeff Jirsa
This would matter for the base table, but would be less likely for the 
secondary index, where the partition key is the value of the base row
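
To illustrate with a hypothetical table (the layout shown is a rough mental
model of the hidden index table, not its exact internal schema):

  CREATE TABLE ks.base (pk int PRIMARY KEY, val text);
  CREATE INDEX base_val_idx ON ks.base (val);
  -- conceptually, the index behaves like a hidden table keyed as
  --   PRIMARY KEY (val, pk)
  -- so index partitions are keyed by the indexed value, and recycling
  -- base-table primary keys only revisits an index partition when the
  -- indexed values repeat as well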

Roman: there’s a config option related to only purging repaired tombstones - do 
you have that enabled ? If so, are you running repairs?
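
(For anyone searching later: a minimal sketch of that option, reusing the
compaction settings from the original mail with a hypothetical table name;
only_purge_repaired_tombstones only has an effect if incremental repair is
actually being run.)

  ALTER TABLE ks.mytable WITH compaction = {
      'class': 'LeveledCompactionStrategy',
      'unchecked_tombstone_compaction': 'true',
      'only_purge_repaired_tombstones': 'true'
  };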

-- 
Jeff Jirsa


> On May 18, 2018, at 6:41 AM, Eric Stevens <migh...@gmail.com> wrote:
> 
> The answer to Question 3 is "yes."  One of the more subtle points about
> tombstones is that Cassandra won't remove them during compaction if there
> is a bloom filter on any SSTable on that replica indicating that it
> contains the same partition (not primary) key.  Even if it is older than
> gc_grace, and would otherwise be a candidate for cleanup.
> 
> If you're recycling partition keys, your tombstones may never be able to be
> cleaned up, because in this scenario there is a high probability that an
> SSTable not involved in that compaction also contains the same partition
> key, and so compaction cannot have confidence that it's safe to remove the
> tombstone (it would have to fully materialize every record in the
> compaction, which is too expensive).
> 
> In general it is an antipattern in Cassandra to write to a given partition
> indefinitely for this and other reasons.
> 
> On Fri, May 18, 2018 at 2:37 AM Roman Bielik <
> roman.bie...@openmindnetworks.com> wrote:
> 
>> Hi,
>> 
>> I have a Cassandra 3.11 table (with compact storage) and am using secondary
>> indices with rather unique data stored in the indexed columns. There are
>> many inserts and deletes, so in order to avoid tombstones piling up I'm
>> re-using primary keys from a pool (which works fine).
>> I'm aware that this design pattern is not ideal, but for now I can not
>> change it easily.
>> 
>> The problem is, the size of 2nd index tables keeps growing (filled with
>> tombstones) no matter what.
>> 
>> I tried some aggressive configuration (just for testing) in order to
>> expedite the tombstone removal but with little-to-zero effect:
>> COMPACTION = { 'class':
>> 'LeveledCompactionStrategy', 'unchecked_tombstone_compaction': 'true',
>> 'tombstone_compaction_interval': 600 }
>> gc_grace_seconds = 600
>> 
>> I'm aware that perhaps Materialized views could provide a solution to this,
>> but I'm bound to the Thrift interface, so cannot use them.
>> 
>> Questions:
>> 1. Is there something I'm missing? How come compaction does not remove the
>> obsolete indices/tombstones from 2nd index tables? Can I trigger the
>> cleanup manually somehow?
>> I have tried nodetool flush, compact, rebuild_index on both the data table
>> and the internal index table, but with no result.
>> 
>> 2. When deleting a record I'm deleting the whole row at once - which would
>> create one tombstone for the whole record if I'm correct. Would it help to
>> delete the indexed columns separately, creating an extra tombstone for each
>> cell?
>> As I understand the underlying mechanism, the indexed column value must be
>> read in order for a proper tombstone to be created for the index.
>> 
>> 3. Could the fact that I'm reusing the primary key of a deleted record
>> shortly afterwards for a new insert interact with the secondary index tombstone
>> removal?
>> 
>> Will be grateful for any advice.
>> 
>> Regards,
>> Roman
>> 
>> 

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Academic paper about Cassandra database compaction

2018-05-14 Thread Jeff Jirsa
Interesting!

I suspect I know what causes the increased disk usage in TWCS, and it's a
solvable problem. The problem is roughly something like this:
- Window 1 has sstables 1, 2, 3, 4, 5, 6
- We start compacting 1, 2, 3, 4 (using STCS-in-TWCS first window)
- The TWCS window rolls over
- We flush (sstable 7), and trigger the TWCS window major compaction, which
starts compacting 5, 6, 7 + any other sstable from that window
- If the first compaction (1, 2, 3, 4) has finished by the time sstable 7 is
flushed, we'll include its result in that compaction; if it hasn't, we'll
have to do the major compaction twice to guarantee we have exactly one
sstable per window, which will temporarily increase disk space

We can likely fix this by not scheduling the major compaction until we know
all of the sstables in the window are available to be compacted.

Also your data model is probably typical, but not well suited for time
series cases - if you find my 2016 Cassandra Summit TWCS talk (it's on
youtube), I mention aligning partition keys to TWCS windows, which involves
adding a second component to the partition key. This is hugely important in
terms of making sure TWCS data expires quickly and avoiding having to read
from more than one TWCS window at a time.
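
For anyone who hasn't seen the talk, a minimal sketch of that alignment (the
keyspace, table, and one-day window here are made up for the example):

  CREATE TABLE metrics.readings (
      sensor_id int,
      day date,   -- second partition key component, aligned to the window
      ts timestamp,
      value double,
      PRIMARY KEY ((sensor_id, day), ts)
  ) WITH compaction = {
      'class': 'TimeWindowCompactionStrategy',
      'compaction_window_unit': 'DAYS',
      'compaction_window_size': '1'
  };

Queries then have to name the time bucket explicitly, so a read only ever
touches the sstables of one window.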


- Jeff



On Mon, May 14, 2018 at 7:12 AM, Lucas Benevides <
lu...@maurobenevides.com.br> wrote:

> Dear community,
>
> I want to tell you about my paper published in a conference in March. The
> title is " NoSQL Database Performance Tuning for IoT Data - Cassandra
> Case Study"  and it is available (not for free) in
> http://www.scitepress.org/DigitalLibrary/Link.aspx?doi=
> 10.5220/0006782702770284 .
>
> TWCS is used and compared with DTCS.
>
> I hope you can download it, unfortunately I cannot send copies as the
> publisher has its copyright.
>
> Lucas B. Dias
>
>
>


Spring 2018 Cassandra Dev Wrap-up

2018-05-12 Thread Jeff Jirsa
Here's what's going on in the Cassandra world this spring:

Mailing list:
- Kurt sent out a call for reviewers:
https://lists.apache.org/thread.html/f1f7926d685b7f734edb180aeddc3014d79dc6e5f89e68b751b9eb5e@%3Cdev.cassandra.apache.org%3E
- Dinesh proposed a management sidecar:
https://lists.apache.org/thread.html/a098341efd8f344494bcd2761dba5125e971b59b1dd54f282ffda253@%3Cdev.cassandra.apache.org%3E
- Joey sent some math about the impact of vnodes on availability:
https://lists.apache.org/thread.html/54a9cb1d3eeed57cbe55f14aff2fb0030bce22b59d04b32d592da6b3@%3Cdev.cassandra.apache.org%3E
- We spent some time talking about feature freeze dates for 4.0, and seem
to have landed around Sept 1:
https://lists.apache.org/thread.html/eb9f5080fbab4f4e38266c7444b467ca1c54af787568321af56e8e4b@%3Cdev.cassandra.apache.org%3E

Activity:
- Some really crude git log | grep | cut | sort nonsense suggests 58
different patch authors / contributors so far in 2018. This may be
undercounted by a bit.
- Blake Eggleston, Sam Tunnicliffe, and Stefan Podkowinski were added to
the PMC (congrats!)
- We're up to about 45 changes pending in 3.11.3 and 30'ish in 3.0.17,
nearing time for some new release votes

Notable Commits to 4.0 since February

- CASSANDRA-12151 landed, bringing audit logs
- CASSANDRA-13910 landed, removing probabilistic read repair chance
- Pluggable storage interfaces are being added incrementally, with the
write path, repair, and streaming interfaces already committed

If you're bored this weekend and want something to do, here's Kurt's
list of patches that need reviews:

Bugs:
https://issues.apache.org/jira/browse/CASSANDRA-14365
https://issues.apache.org/jira/browse/CASSANDRA-14204
https://issues.apache.org/jira/browse/CASSANDRA-14162
https://issues.apache.org/jira/browse/CASSANDRA-14126
https://issues.apache.org/jira/browse/CASSANDRA-14365
https://issues.apache.org/jira/browse/CASSANDRA-14099
https://issues.apache.org/jira/browse/CASSANDRA-14073
https://issues.apache.org/jira/browse/CASSANDRA-14063
https://issues.apache.org/jira/browse/CASSANDRA-14056
https://issues.apache.org/jira/browse/CASSANDRA-14054
https://issues.apache.org/jira/browse/CASSANDRA-14013
https://issues.apache.org/jira/browse/CASSANDRA-13841
https://issues.apache.org/jira/browse/CASSANDRA-13698

Improvements:
https://issues.apache.org/jira/browse/CASSANDRA-14309
https://issues.apache.org/jira/browse/CASSANDRA-10789
https://issues.apache.org/jira/browse/CASSANDRA-14443
https://issues.apache.org/jira/browse/CASSANDRA-13010
https://issues.apache.org/jira/browse/CASSANDRA-11559
https://issues.apache.org/jira/browse/CASSANDRA-10789
https://issues.apache.org/jira/browse/CASSANDRA-10023
https://issues.apache.org/jira/browse/CASSANDRA-8460

And Josh's similar JIRA query:
https://issues.apache.org/jira/issues/?jql=project%20%3D%20cassandra%20AND%20resolution%20%3D%20unresolved%20AND%20status%20in%20(%22Patch%20Available%22%2C%20%22Awaiting%20Feedback%22)%20AND%20reviewer%20is%20EMPTY%20ORDER%20BY%20updated%20DESC


Re: Evolving the client protocol

2018-04-28 Thread Jeff Jirsa
On Sat, Apr 28, 2018 at 4:49 AM, mck  wrote:


> We should, as open source contributors, put business concerns to the side
> and welcome opportunities to work across company and product lines.
>


I resent the fact that you're calling this a business concern. This isn't a
business concern, and as a committer and ASF member you should be able to
discern the difference.

Sylvain said:

> The native protocol is the protocol of the Apache Cassandra project and
was
> never meant to be a standard protocol.

and

> Don't get me wrong, protocol-impacting changes/additions are very much
> welcome if reasonable for Cassandra, and both CASSANDRA-14311 and
CASSANDRA-2848 are
> certainly worthy. The definition of done of both those tickets certainly
> include the server implementation imo,

I said:

> So again: we have a Cassandra native protocol, and we have a process for
> changing it, and that process is contributor agnostic. Anyone who wants a
> change can submit a patch, and it'll get reviewed, and maybe if it's a
good
> idea, it'll get committed, but the chances of a review leading to a commit
> without an implementation are nearly zero.

The only reason business names came into it is that someone drew a false
equivalence between two businesses. They're not equivalent, and the lack of
equivalence likely explains why this thread keeps bouncing around -
Datastax would have written a patch and contributed it to the project, and
Scylla didn't. But again, the lack of protocol changes so far ISN'T because
the project somehow favors one company more than the other (it doesn't);
the protocol changes haven't happened because nobody's submitted a patch.

You're a committer Mick, if you think it belongs in the database, write the
patches and get them reviewed.  Until then, the project isn't going to be
bullied into changing the protocol without an implementation.

- Jeff


Re: Evolving the client protocol

2018-04-24 Thread Jeff Jirsa
They aren't even remotely similar; they're VERY different. Here are a few
starting points:

1) Most of Datastax's work for the first 5, 6, 8 years of existence focused
on driving users to cassandra from other DBs (see all of the "Cassandra
Summits" that eventually created trademark friction) ; Scylla's marketing
is squarely Scylla v  Cassandra. Ultimately they're both companies out to
make money, but one has a history of driving users to Cassandra, and the
other is trying to siphon users away from Cassandra.
2) Datastax may not be actively contributing as much as they used to, but
some ridiculous number of engineering hours got paid out of their budget -
maybe 80% of total lines of code? Maybe higher (though it's decreasing day
by day). By contrast, Scylla has exactly zero meaningful concrete code
contributions to the project, uses a license that makes even sharing
concepts prohibitive, only has a handful or so JIRAs opened (which is
better than zero), but has effectively no goodwill in the eyes of many of
the longer-term community members (in large part because of #1, and also
because of the way they positioned their talk-turned-product announcement
at the competitor-funded 2016 summit).
3) Datastax apparently respects the project enough that they'd NEVER come
in and ask for a protocol spec change without providing a reference
implementation.
4) To that end, native protocol changes aren't something anyone is anxious
to shove in without good reason. Even with a reference implementation, and
a REALLY GOOD REASON (namely data correctness / protection from
corruption), https://issues.apache.org/jira/browse/CASSANDRA-13304 has been
sitting patch available for OVER A YEAR.

So again: we have a Cassandra native protocol, and we have a process for
changing it, and that process is contributor agnostic.  Anyone who wants a
change can submit a patch, and it'll get reviewed, and maybe if it's a good
idea, it'll get committed, but the chances of a review leading to a commit
without an implementation are nearly zero.

Would be happy to see this thread die now. There's nothing new coming out
of it.

- Jeff


On Tue, Apr 24, 2018 at 8:30 AM, Eric Stevens  wrote:

> Let me just say that as an observer to this conversation -- and someone
> who believes that compatibility, extensibility, and frankly competition
> bring out the best in products -- I'm fairly surprised and disappointed
> with the apparent hostility many community members have shown toward a
> sincere attempt by another open source product to find common ground here.
>
> Yes, Scylla has a competing OSS project (albeit under a different
> license).  They also have a business built around it.  It's hard for me to
> see that as dramatically different than the DataStax relationship to this
> community.  Though I would love to be shown why.
>


Re: Evolving the client protocol

2018-04-23 Thread Jeff Jirsa
Respectfully, there’s pretty much already apparent consensus among those with a 
vote (unless I missed some dissenting opinion while I was on vacation).

It's been expressed multiple times by committers and members of the PMC that 
it's Cassandra's native protocol, and a change belongs in the protocol when it's 
implemented. I haven’t seen ANY committers or members of the PMC make an 
argument that we should alter the spec without a matching implementation. 

Unless a committer wants to make an argument that we should change the spec 
without changing the implementation, this conversation can end. 

The spec is what the server implements. Anything we don’t implement can use the 
arbitrary payload from the zipkin tracing ticket or fork.

-- 
Jeff Jirsa


> On Apr 23, 2018, at 6:18 PM, Nate McCall <zznat...@gmail.com> wrote:
> 
> Folks,
> Before this goes much further, let's take a step back for a second.
> 
> I am hearing the following: Folks are fine with CASSANDRA-14311 and
> CASSANDRA-2848 *BUT* they don't make much sense from the project's
> perspective without a reference implementation. I think the shard
> concept is too abstract for the project right now, so we should
> probably set that one aside.
> 
> Dor and Avi, I appreciate you both engaging directly on this. Where
> can we find common ground on this?
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
> 

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Evolving the client protocol

2018-04-22 Thread Jeff Jirsa



On Apr 20, 2018, at 5:03 AM, Sylvain Lebresne  wrote:

>> 
>> 
>> Those were just given as examples. Each would be discussed on its own,
>> assuming we are able to find a way to cooperate.
>> 
>> 
>> These are relatively simple and it wouldn't be hard for us to patch
>> Cassandra. But I want to find a way to make more complicated protocol
>> changes where it wouldn't be realistic for us to modify Cassandra.
>> 
> 
> That's where I'm confused about what you are truly asking.
> 
> The native protocol is the protocol of the Apache Cassandra project and was
> never meant to be a standard protocol. If the ask is to move towards
> treating the protocol as a standard that would evolve independently of
> whether Cassandra implements it (would the project commit to implement it
> eventually?), then let's be clear on what the concrete suggestion is and
> have this discussion (but to be upfront, the short version of my personal
> opinion is that this would likely be a big distraction with relatively low
> merits for the project, so I'm very unconvinced).
> 
> But if that's not the ask, what is it exactly? That we agree to commit
> changes
> to the protocol spec before we have actually implemented them? If so, I just
> don't get it. The downsides are clear (we risk that the feature is either never
> implemented due to lack of contributions/loss of interest, or that the
> protocol
> changes committed are not fully suitable to the final implementation) but
> what
> benefit to the project can that ever have?

Agree with everything here 

> 
> Don't get me wrong, protocol-impacting changes/additions are very much
> welcome
> if reasonable for Cassandra, and both CASSANDRA-14311 and CASSANDRA-2848 are
> certainly worthy. Both the definition of done of those ticket certainly
> include the server implementation imo,

Also agree here - any changes to the protocol on the Apache Cassandra side have to 
come with the implementation, otherwise you should consider using the optional 
arbitrary k/v map that zipkin tracing leverages for arbitrary payloads.


> not just changing the protocol spec
> file. As for the shard notion, it makes no sense for Cassandra at this point
> in time, so unless an additional contribution makes it so that it starts to
> make
> sense, I'm not sure why we'd add anything related to it to the protocol.
> 
> --
> Sylvain
> 
> 
> 
>> 
>>> RE #3,
>>> 
>>> It's hard to be +1 on this because we don't benefit by boxing ourselves
>> in by defining a spec we haven't implemented, tested, and decided we are
>> satisfied with. Having it in ScyllaDB de-risks it to a certain extent, but
>> what if Cassandra decides to go a different direction in some way?
>> 
>> Such a proposal would include negotiation about the sharding algorithm
>> used to prevent Cassandra being boxed in. Of course it's impossible to
>> guarantee that a new idea won't come up that requires more changes.
>> 
>>> I don't think there is much discussion to be had without an example of
>> the the changes to the CQL specification to look at, but even then if it
>> looks risky I am not likely to be in favor of it.
>>> 
>>> Regards,
>>> Ariel
>>> 
 On Thu, Apr 19, 2018, at 9:33 AM, glom...@scylladb.com wrote:
 
 On 2018/04/19 07:19:27, kurt greaves  wrote:
>> 1. The protocol change is developed using the Cassandra process in
>>a JIRA ticket, culminating in a patch to
>>doc/native_protocol*.spec when consensus is achieved.
> I don't think forking would be desirable (for anyone) so this seems
> the most reasonable to me. For 1 and 2 it certainly makes sense but
> can't say I know enough about sharding to comment on 3 - seems to me
> like it could be locking in a design before anyone truly knows what
> sharding in C* looks like. But hopefully I'm wrong and there are
> devs out there that have already thought that through.
 Thanks. That is our view and is great to hear.
 
 About our proposal number 3: In my view, good protocol designs are
 future proof and flexible. We certainly don't want to propose a design
 that works just for Scylla, but would support reasonable
 implementations regardless of how they may look.
 
> Do we have driver authors who wish to support both projects?
> 
> Surely, but I imagine it would be a minority.
> 
 -
 To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org For
 additional commands, e-mail: dev-h...@cassandra.apache.org
 
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>>> 
>> 
>> 
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>> For additional 

Re: Evolving the client protocol

2018-04-18 Thread Jeff Jirsa
Removed other lists (please don't cross post)





On Wed, Apr 18, 2018 at 3:47 AM, Avi Kivity  wrote:

> Hello Cassandra developers,
>
>
> We're starting to see client protocol limitations impact performance, and
> so we'd like to evolve the protocol to remove the limitations. In order to
> avoid fragmenting the driver ecosystem and reduce work duplication for
> driver authors, we'd like to avoid forking the protocol. Since these issues
> affect Cassandra, either now or in the future, I'd like to cooperate on
> protocol development.
>
>
> Some issues that we'd like to work on near-term are:
>
>
> 1. Token-aware range queries
>
>
> When the server returns a page in a range query, it will also return a
> token to continue on. In case that token is on a different node, the client
> selects a new coordinator based on the token. This eliminates a network hop
> for range queries.
>
>
> For the first page, the PREPARE message returns information allowing the
> client to compute where the first page is held, given the query parameters.
> This is just information identifying how to compute the token, given the
> query parameters (non-range queries already do this).
>
>
> https://issues.apache.org/jira/browse/CASSANDRA-14311
>
>
> 2. Per-request timeouts
>
>
> Allow each request to have its own timeout. This allows the user to set
> short timeouts on business-critical queries that are invalid if not served
> within a short time, long timeouts for scanning or indexed queries, and
> even longer timeouts for administrative tasks like TRUNCATE and DROP.
>
>
> https://issues.apache.org/jira/browse/CASSANDRA-2848
>
>
> 3. Shard-aware driver
>
>
> This admittedly is a burning issue for ScyllaDB, but not so much for
> Cassandra at this time.
>
>
> In the same way that drivers are token-aware, they can be shard-aware -
> know how many shards each node has, and the sharding algorithm. They can
> then open a connection per shard and send cql requests directly to the
> shard that will serve them, instead of requiring cross-core communication
> to happen on the server.
>
>
> https://issues.apache.org/jira/browse/CASSANDRA-10989
>
>
> I see three possible modes of cooperation:
>
>
> 1. The protocol change is developed using the Cassandra process in a JIRA
> ticket, culminating in a patch to doc/native_protocol*.spec when consensus
> is achieved.
>
>
> The advantage to this mode is that Cassandra developers can verify that
> the change is easily implementable; when they are ready to implement the
> feature, drivers that were already adapted to support it will just work.
>
>
> 2. The protocol change is developed outside the Cassandra process.
>
>
> In this mode, we develop the change in a forked version of
> native_protocol*.spec; Cassandra can still retroactively merge that change
> when (and if) it is implemented, but the ability to influence the change
> during development is reduced.
>
>
> If we agree on this, I'd like to allocate a prefix for feature names in
> the SUPPORTED message for our use.
>
>
> 3. No cooperation.
>
>
> This requires the least amount of effort from Cassandra developers (just
> enough to reach this point in this email), but will cause duplication of
> effort for driver authors who wish to support both projects, and may cause
> Cassandra developers to redo work that we already did.
>
>
> Looking forward to your views.
>
>
> Avi
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>
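
As a concrete illustration of item 2 in the quoted proposal, a per-request
timeout might surface in CQL roughly like this (purely hypothetical syntax,
sketched for discussion; CASSANDRA-2848 tracks the actual design):

  -- hypothetical: a short timeout for a latency-critical point read
  SELECT * FROM ks.events WHERE id = 42 USING TIMEOUT 500ms;
  -- hypothetical: a much longer timeout for an administrative operation
  TRUNCATE ks.events USING TIMEOUT 5m;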


Re: Quantifying Virtual Node Impact on Cassandra Availability

2018-04-17 Thread Jeff Jirsa
There are two huge advantages 

1) during expansion / replacement / decom, you stream from far more ranges. 
Since streaming is single threaded per stream, this enables you to max out 
machines during streaming where a single token doesn't

2) when adjusting the size of a cluster, you can often grow incrementally 
without rebalancing 

Streaming entire wholly covered/contained/owned sstables during range movements 
is probably a huge benefit in many use cases that may make the single threaded 
streaming implementation less of a concern, and likely works reasonably well 
without major changes to LCS in particular  - I’m fairly confident there’s a 
JIRA for this, if not it’s been discussed in person among various operators for 
years as an obvious future improvement. 

-- 
Jeff Jirsa


> On Apr 17, 2018, at 8:17 AM, Carl Mueller <carl.muel...@smartthings.com> 
> wrote:
> 
> Do Vnodes address anything besides relieving cluster planners of doing
> token range management on nodes manually? Do we have a centralized list of
> advantages they provide beyond that?
> 
> There seem to be lots of downsides. 2i index performance, the above
> availability, etc.
> 
> I also wonder if in vnodes (and manually managed tokens... I'll return to
> this) the node recovery scenarios are being hampered by sstables having the
> hash ranges of the vnodes intermingled in the same set of sstables. I
> wondered in another thread in vnodes why sstables are separated into sets
> by the vnode ranges they represent. For a manually managed contiguous token
> range, you could separate the sstables into a fixed number of sets, kind of
> vnode-light.
> 
> So if there was rebalancing or reconstruction, you could sneakernet or
> reliably send entire sstable sets that would belong in a range.
> 
> I also think this would improve compactions and repairs too. Compactions
> would be naturally parallelizable in all compaction schemes, and repairs
> would have natural subsets to do merkle tree calculations.
> 
> Granted sending sstables might result in "overstreaming" due to data
> replication across the sstables, but you wouldn't have CPU and random I/O
> to look up the data. Just sequential transfers.
> 
> For manually managed tokens with subdivided sstables, if there was
> rebalancing, you would have the "fringe" edges of the hash range subdivided
> already, and you would only need to deal with the data in the border areas
> of the token range, and again could sneakernet / dumb transfer the tables
> and then let the new node remove the unneeded in future repairs.
> (Compaction does not remove data that is no longer managed by a node, only
> repair does? Or does only nodetool clean do that?)
> 
> Pre-subdivided sstables for manually managed tokens would REALLY pay big
> dividends in large-scale cluster expansion. Say you wanted to double or
> triple the cluster. Since the sstables are already split by some numeric
> factor that has lots of even divisors (60 for RF 2,3,4,5), you simply bulk
> copy the already-subdivided sstables for the new nodes' hash ranges and
> you'd basically be done. In AWS EBS volumes, that could just be a drive
> detach / drive attach.
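[Worked out: 60 is the least common multiple of 2, 3, 4 and 5, so splitting
into 60 fixed sets means doubling, tripling, quadrupling or quintupling the
cluster reassigns whole sets evenly, with no sstable re-splitting.]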
> 
> 
> 
> 
>> On Tue, Apr 17, 2018 at 7:37 AM, kurt greaves <k...@instaclustr.com> wrote:
>> 
>> Great write up. Glad someone finally did the math for us. I don't think
>> this will come as a surprise for many of the developers. Availability is
>> only one issue raised by vnodes. Load distribution and performance are also
>> pretty big concerns.
>> 
>> I'm always a proponent for fixing vnodes, and removing them as a default
>> until we do. Happy to help on this and we have ideas in mind that at some
>> point I'll create tickets for...
>> 
>>> On Tue., 17 Apr. 2018, 06:16 Joseph Lynch, <joe.e.ly...@gmail.com> wrote:
>>> 
>>> If the blob link on github doesn't work for the pdf (looks like mobile
>>> might not like it), try:
>>> 
>>> 
>>> https://github.com/jolynch/python_performance_toolkit/
>> raw/master/notebooks/cassandra_availability/whitepaper/cassandra-
>> availability-virtual.pdf
>>> 
>>> -Joey
>>> <
>>> https://github.com/jolynch/python_performance_toolkit/
>> raw/master/notebooks/cassandra_availability/whitepaper/cassandra-
>> availability-virtual.pdf
>>>> 
>>> 
>>> On Mon, Apr 16, 2018 at 1:14 PM, Joseph Lynch <joe.e.ly...@gmail.com>
>>> wrote:
>>> 
>>>> Josh Snyder and I have been working on evaluating virtual nodes for
>> large
>>>> scale deployments and while it seems like there is a lot of anecdotal
>>>>

Re: Roadmap for 4.0

2018-04-12 Thread Jeff Jirsa
If we push it to Sept 1 freeze, I'll personally spend a lot of time testing.

What can I do to help convince the Jun1 folks that Sept1 is acceptable?



On Thu, Apr 12, 2018 at 12:57 PM, Ben Bromhead <b...@instaclustr.com> wrote:

> I would also suggest if you can't commit to June 2 due to timing or feature
> set. If you could provide the absolute minimum date / features that would
> let you commit to testing, that would be useful.
>
> On Thu, Apr 12, 2018 at 3:49 PM Ben Bromhead <b...@instaclustr.com> wrote:
>
> > We (Instaclustr) are also happy to get started testing. Including
> > (internal to Instaclustr) production workloads.
> >
> > On Thu, Apr 12, 2018 at 3:45 PM Nate McCall <zznat...@gmail.com> wrote:
> >
> >> To be clear, more who is willing to commit to testing should we go this
> >> route.
> >>
> >> On Fri, Apr 13, 2018, 7:41 AM Nate McCall <zznat...@gmail.com> wrote:
> >>
> >> > Ok. So who's willing to test 4.0 on June 2nd? Let's start a sign up.
> >> >
> >> > We (tlp) will put some resources on this via going through some canned
> >> > scenarios we have internally. We aren't in a position to test data
> >> validity
> >> > (yet) but we can do a lot around cluster behavior.
> >> >
> >> > Who else has specific stuff they are willing to do? Even if it's just
> >> > tee'ing prod traffic, that would be hugely valuable.
> >> >
> >> > On Fri, Apr 13, 2018, 6:15 AM Jeff Jirsa <jji...@gmail.com> wrote:
> >> >
> >> >> On Thu, Apr 12, 2018 at 9:41 AM, Jonathan Haddad <j...@jonhaddad.com>
> >> >> wrote:
> >> >>
> >> >> > It sounds to me (please correct me if I'm wrong) like Jeff is
> arguing
> >> >> that
> >> >> > releasing 4.0 in 2 months isn't worth the effort of evaluating it,
> >> >> because
> >> >> > it's a big task and there's not enough stuff in 4.0 to make it
> >> >> worthwhile.
> >> >> >
> >> >> >
> >> >> More like "not enough stuff in 4.0 to make it worthwhile for the
> >> people I
> >> >> personally know to be willing and able to find the weird bugs".
> >> >>
> >> >>
> >> >> > If that is the case, I'm not quite sure how increasing the surface
> >> area
> >> >> of
> >> >> > changed code which needs to be vetted is going to make the process
> >> any
> >> >> > easier.
> >> >>
> >> >>
> >> >> It changes the interest level of at least some of the people able to
> >> >> properly test it from "not willing" to "willing".
> >> >>
> >> >> Totally possible that there exist people who are willing and able to
> >> find
> >> >> and fix those bugs, who just haven't committed to it in this thread.
> >> >> That's
> >> >> probably why Sankalp keeps asking who's actually willing to do the
> >> testing
> >> >> on June 2 - if nobody's going to commit to doing real testing on June
> >> 2,
> >> >> all we're doing is adding inconvenience to those of us who'd be
> >> willing to
> >> >> do it later in the year.
> >> >>
> >> >
> >>
> > --
> > Ben Bromhead
> > CTO | Instaclustr <https://www.instaclustr.com/>
> > +1 650 284 9692
> > Reliability at Scale
> > Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer
> >
> --
> Ben Bromhead
> CTO | Instaclustr <https://www.instaclustr.com/>
> +1 650 284 9692
> Reliability at Scale
> Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer
>


Re: Roadmap for 4.0

2018-04-11 Thread Jeff Jirsa
One clarifying point, potentially trivial, but:

On Wed, Apr 11, 2018 at 9:42 AM, Ben Bromhead  wrote:

>
> We haven't seen any actual binding -1s yet on June 1, despite obvious
> concerns and plenty of +1s
>
>
Just to be clear: binding -1 votes are vetos for code changes, but they are
not vetos for procedural issues (
https://www.apache.org/foundation/voting.html ) .

(And at this point, I think it's clear that I'd be changing to -1 on the
June 1 date, but again, that's not a veto)


Re: Roadmap for 4.0

2018-04-10 Thread Jeff Jirsa


-- 
Jeff Jirsa


On Apr 10, 2018, at 5:24 PM, Josh McKenzie <jmcken...@apache.org> wrote:

>> 
>> 50'ish days is too short to draw a line in the sand,
>> especially as people balance work obligations with Cassandra feature
>> development.
> 
> What's a reasonable alternative / compromise for this? And what
> non-disruptive-but-still-large patches are in flight that we would want to
> delay the line in the sand for?

I don’t care about non-disruptive patches, to be really honest. Nobody’s running 
trunk now, so it doesn’t matter to me if the patch landed 6 months ago or Jun 
29, unless you can show me one person who’s ran a nontrivial multi-dc test 
cluster under real load that included correctness validation. Short of that, 
it’s untested, and the duration a patch has been in an untested repo is 
entirely irrelevant.

If there’s really someone already testing trunk in a meaningful way (real 
workloads, and verifying correctness), and that person is really able to find 
and fix bugs, then tell me who it is and I’ll change my opinion (and  I’m not 
even talking about thousand node clusters, just someone who’s actually using 
real data, like something upgraded from 2.1/3.0, and is checking to prove it 
matches expectations). 

Otherwise, when the time comes for real users to plan real upgrades to a 
hypothetical 4.1, they’ll have to do two sets of real, expensive, annoying 
testing - one for the stuff in 4.0 (chunk cache, file format changes, internode 
changes, etc), and a second for 4.0-4.1 changes for the invasive stuff I care 
about and you don’t want to wait for.

I’d rather see us get all this stuff in and then spend real time testing and 
fixing in a 4-6 month alpha/beta phase (where real users can help, because its 
one real dedicated validation phase) than push this into two (probably 
inadequately tested) releases.

But that’s just my opinion, and I’ll support it with my one vote, and I may get 
outvoted, but that’s what I’d rather see happen.


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Roadmap for 4.0

2018-04-10 Thread Jeff Jirsa
Seriously, what's the rush to branch? Do we all love merging so much we
want to do it a few more times just for the sake of merging? If nothing
diverges, there's nothing gained from the branch, and if it did diverge, we
add work for no real gain.

Beyond that, I still don't like June 1. Validating releases is hard. It
sounds easy to drop a 4.1 and ask people to validate again, but it's a hell
of a lot harder than it sounds. I'm not saying I'm a hard -1, but I really
think it's too soon. 50'ish days is too short to draw a line in the sand,
especially as people balance work obligations with Cassandra feature
development.




On Tue, Apr 10, 2018 at 3:18 PM, Nate McCall  wrote:

> A lot of good points and everyone's input is really appreciated.
>
> So it sounds like we are building consensus towards June 1 for 4.0
> branch point/feature freeze and the goal is stability. (No one has
> come with a hard NO anyway).
>
> I want to reiterate Sylvain's point that we can do whatever we want in
> terms of dropping a new feature 4.1/5.0 (or whatev.) whenever we want.
>
> In thinking about this, what is stopping us from branching 4.0 a lot
> sooner? Like now-ish? This will let folks start hacking on trunk with
> new stuff, and things we've gotten close on can still go in 4.0
> (Virtual tables). I guess I'm asking here if we want to disambiguate
> "feature freeze" from "branch point?" I feel like this makes sense.
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: Roadmap for 4.0

2018-04-09 Thread Jeff Jirsa
I'd like to see pluggable storage and transient replica tickets land, for
starters.

On Mon, Apr 9, 2018 at 10:17 AM, Ben Bromhead  wrote:

> >
> > For those wanting to delay, are we just dancing around inclusion of
> > some pet features? This is fine, I just think we need to communicate
> > what we are after if so.
> >
>
> +1 Some solid examples of tickets that won't make it with the proposed
> timeline and a proposed alternative would help.
>
> Otherwise if no one chimes in I would propose sticking with June 1.
>
>
>
>
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
> > --
> Ben Bromhead
> CTO | Instaclustr 
> +1 650 284 9692
> Reliability at Scale
> Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer
>


Re: Roadmap for 4.0

2018-04-04 Thread Jeff Jirsa
Earlier than I’d have personally picked, but I’m +1 too



-- 
Jeff Jirsa


> On Apr 4, 2018, at 5:06 PM, Nate McCall <zznat...@gmail.com> wrote:
> 
> Top-posting as I think this summary is on point - thanks, Scott! (And
> great to have you back, btw).
> 
> It feels to me like we are coalescing on two points:
> 1. June 1 as a freeze for alpha
> 2. "Stable" is the new "Exciting" (and the testing and dogfooding
> implied by such before a GA)
> 
> How do folks feel about the above points?
> 
> 
>> Re-raising a point made earlier in the thread by Jeff and affirmed by Josh:
>> 
>> –––
>> Jeff:
>>>> A hard date for a feature freeze makes sense, a hard date for a release
>>>> does not.
>> 
>> Josh:
>>> Strongly agree. We should also collectively define what "Done" looks like
>>> post freeze so we don't end up in bike-shedding hell like we have in the
>>> past.
>> –––
>> 
>> Another way of saying this: ensuring that the 4.0 release is of high quality 
>> is more important than cutting the release on a specific date.
>> 
>> If we adopt Sylvain's suggestion of freezing features on a "feature 
>> complete" date (modulo a "definition of done" as Josh suggested), that will 
>> help us align toward the polish, performance work, and dog-fooding needed to 
>> feel great about shipping 4.0. It's a good time to start thinking about the 
>> approaches to testing, profiling, and dog-fooding various contributors will 
>> want to take on before release.
>> 
>> I love how Ben put it:
>> 
>>> An "exciting" 4.0 release to me is one that is stable and usable
>>> with no perf regressions on day 1 and includes some of the big
>>> internal changes mentioned previously.
>>> 
>>> This will set the community up well for some awesome and exciting
>>> stuff that will still be in the pipeline if it doesn't make it to 4.0.
>> 
>> That sounds great to me, too.
>> 
>> – Scott
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
> 

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Roadmap for 4.0

2018-04-03 Thread Jeff Jirsa
A hard date for a feature freeze makes sense, a hard date for a release does 
not.

-- 
Jeff Jirsa


> On Apr 3, 2018, at 2:29 PM, Michael Shuler <mich...@pbandjelly.org> wrote:
> 
> On 04/03/2018 03:51 PM, Nate McCall wrote:
>>> My concrete proposal would be to declare a feature freeze for 4.0 in 2
>>> months,
>>> so say June 1th. That leave some time for finishing features that are in
>>> progress, but not too much to get derailed. And let's be strict on that
>>> freeze.
>> 
>> I quite like this suggestion. Thanks, Sylvain.
> 
> Should we s/TBD/somedate/ on the downloads page and get the word out?
> 
> Apache Cassandra 3.0 is supported until 6 months after 4.0 release (date
> TBD).
> Apache Cassandra 2.2 is supported until 4.0 release (date TBD).
> Apache Cassandra 2.1 is supported until 4.0 release (date TBD) with
> critical fixes only.
> 
> -- 
> Kind regards,
> Michael
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
> 

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Roadmap for 4.0

2018-04-02 Thread Jeff Jirsa
9608 (java9)

-- 
Jeff Jirsa


> On Apr 2, 2018, at 3:45 AM, Jason Brown <jasedbr...@gmail.com> wrote:
> 
> The only additional tickets I'd like to mention are:
> 
> https://issues.apache.org/jira/browse/CASSANDRA-13971 - Automatic
> certificate management using Vault
> - Stefan's Vault integration work. A sub-ticket, CASSANDRA-14102, addresses
> encryption at-rest, subsumes CASSANDRA-9633 (SSTable encryption) - which I
> doubt I would be able to get to any time this year. It would definitely be
> nice to have a clarified encryption/security story for 4.0.
> 
> https://issues.apache.org/jira/browse/CASSANDRA-11990 - Address rows rather
> than partitions in SASI
> - a nice update for SASI, but not critical.
> 
> -Jason
> 
>> On Sat, Mar 31, 2018 at 6:53 PM, Ben Bromhead <b...@instaclustr.com> wrote:
>> 
>> Apologies all, I didn't realize I was responding to this discussion only on
>> the @user list. One of the perils of responding to a thread that is on both
>> user and dev...
>> 
>> For context, I have included my response to Kurt's previous discussion on
>> this topic as it only ended up on the user list.
>> 
>> *After some further discussions with folks offline, I'd like to revive this
>> discussion. *
>> 
>> *As Kurt mentioned, to keep it simple I if we can simply build consensus
>> around what is in for 4.0 and what is out. We can then start the process of
>> working off a 4.0 branch towards betas and release candidates. Again as
>> Kurt mentioned, assigning a timeline to it right now is difficult, but
>> having a firm line in the sand around what features/patches are in, then
>> limiting future 4.0 work to bug fixes will give folks a less nebulous
>> target to work on. *
>> 
>> *The other thing to mention is that once we have a 4.0 branch to work off,
>> we at Instaclustr have a commitment to dogfooding the release candidates on
>> our internal staging and internal production workloads before 4.0 becomes
>> generally available. I know other folks have similar commitments and simply
>> having a 4.0 branch with a clear list of things that are in or out will
>> allow everyone to start testing and driving towards a quality release. *
>> 
>> *The other thing is that there are already a large number of changes ready
>> for 4.0, I would suggest not recommending tickets for 4.0 that have not yet
>> been finished/have outstanding work unless you are the person working on it
>> (or are offering to work on it instead) and can get it ready for review in
>> a timely fashion. That way we can build a more realistic working target.
>> For other major breaking changes, there is always 5.0 or 4.1 or whatever we
>> end up doing :)*
>> 
>> Thinking further about it, I would suggest a similar process that was
>> applied to releasing 3.0, in order to get to 4.0:
>> 
>>   - Clean up ticket labeling. Move tickets unlikely to make it / be worked
>>   on for 4.0 to something else (e.g. 4.x or whatever).
>>   - Tickets labeled 4.0 will be the line in the sand, with some trigger
>>   ("done") event where all features not done by a certain event will
>> simply
>>   move into the next release. For the 3.0 branch, this occurred after a
>>   large review of 8099. For 4.0 it could simply be resolving all current
>>   blockers/major tickets tagged 4.0... doesn't have to be / nor is it
>>   something I would strongly advocate.
>>   - Once we hit this "done" event. Cut a Cassandra-4.0 branch and start
>>   the alpha/beta/rc cycle from that branch, with only bugfixes going into
>>   it
>>   - This, in my mind, is similar to the 3.0 approach
>>   https://mail-archives.apache.org/mod_mbox/cassandra-dev/
>> 201503.mbox/%3CCALdd-zjAyiTbZksMeq2LxGwLF5LPhoi_
>> 4vsjy8JBHBRnsxH%3D8A%40mail.gmail.com%3E,
>>   but without the subsequent tick-tock :)
>> 
>> There are currently 3 open blockers tagged 4.0, some are old and probably
>> not really blockers anymore, there are other tickets that may/should be
>> blockers on 4.0:
>> 
>>   - https://issues.apache.org/jira/browse/CASSANDRA-13951
>>   - https://issues.apache.org/jira/browse/CASSANDRA-13994
>>   - https://issues.apache.org/jira/browse/CASSANDRA-12042
>> 
>> In terms of major tickets that I would like to see land:
>> 
>>   - https://issues.apache.org/jira/browse/CASSANDRA-7622 Virtual Tables
>>   - https://issues.apache.org/jira/browse/CASSANDRA-13628 Internode netty
>>   - https://issues.apache.org/jira/browse/CASSANDRA-13475 Pluggable
>> Storage
>>   - https://issues.apache.org/jira/browse/

Re: Paying off tech debt and correctly naming things

2018-03-21 Thread Jeff Jirsa
Please please please ping the list and ask if anyone has big commits ready to 
merge before actually committing any huge automated refactors - people who may 
be sitting on big patches will thank you if they don’t have to rebase against 
huge IntelliJ refactors.

-- 
Jeff Jirsa


> On Mar 21, 2018, at 5:49 PM, Jeremiah D Jordan <jeremiah.jor...@gmail.com> 
> wrote:
> 
> +1 if you are willing to take it on.  As the person who performed the 
> Table->Keyspace rename of 2.0, I say good luck!  From hindsight of doing 
> that, as others suggested, I would come at this in multiple tickets.
> I would suggest a simple class rename with intellij refactoring tools or 
> something as the first ticket.  This is going to touch the most files at 
> once, but will be mechanical, and for the most part if it compiles it was 
> right :).
> After you have done that you can take on other renaming of things with a 
> smaller scope.
> Also as others have said the main things to be wary of are the naming of 
> things in JMX metrics.  Ideally we would keep around deprecated aliases of 
> the old JMX names for a release before removing them.  The other thing is to 
> watch out for class names in byteman scripts in dtest.
> 
> -Jeremiah
> 
>> On Mar 21, 2018, at 4:48 AM, Sylvain Lebresne <lebre...@gmail.com> wrote:
>> 
>> I really don't think anyone has been recently against such renaming, and in
>> fact, a _lot_ of renaming *has* already happen over time. The problem, as
>> you carefully noted, is that it's such a big task that there is still a lot
>> to do. Anyway, I've yet to see a patch renaming things to match the CQL
>> naming scheme be rejected, so I'd personally encourage such submission. But
>> maybe with a few caveats (already mentioned largely, so repeating here to
>> signify my personal agreement with them):
>> - renaming with large surface area can be painful for ongoing patches or
>> even future merge. That's not a reason for not doing them, but that's imo a
>> good enough reason to do things incrementally/in as-small-as-reasonable
>> steps. Making sure a renaming commit only does renaming and doesn't change
>> the logic is also pretty nice when you rebase such things.
>> - breaking hundreds of tests is obviously not ok :)
>> - pure code renaming is one reasonably simple aspect, but quite a few
>> renaming may have user visible impact. Particularly around JMX where many
>> things are name based on their class, and to a lesser extend some of our
>> tools still use "old" naming. We can't and shouldn't ignore those impact:
>> such user visible changes should imo be documented, and we should make sure
>> we have a reasonably painless (and thus incremental) upgrade path. My hunch
>> is the latter isn't as simple as it seems.
>> 
>> 
>> --
>> Sylvain
>> 
>> 
>>> On Wed, Mar 21, 2018 at 9:06 AM kurt greaves <k...@instaclustr.com> wrote:
>>> 
>>> As someone who came to the codebase post CQL but prior to thrift being
>>> removed, +1 to refactor. The current mixing of terminology is a complete
>>> nightmare. This would also give a good opportunity document a lot of code
>>> that simply isn't documented (or incorrect). I'd say it's worth doing it in
>>> multiple steps though, such as refactor of a single class at a time, then
>>> followed by refactor of variable names. We've already done one pretty big
>>> refactor (InetAddressAndPort) for 4.0, I don't see how a few more could
>>> make it any worse (lol).
>>> 
>>> Row vs partition vs key vs PK is killing me
>>> 
>>>> On 20 March 2018 at 22:04, Jon Haddad <j...@jonhaddad.com> wrote:
>>>> 
>>>> Whenever I hop around in the codebase, one thing that always manages to
>>>> slow me down is needing to understand the context of the variable names
>>>> that I’m looking at.  We’ve now removed thrift the transport, but the
>>>> variables, classes and comments still remain.  Personally, I’d like to go
>>>> in and pay off as much technical debt as possible by refactoring the code
>>>> to be as close to CQL as possible.  Rows should be rows, not partitions,
>>>> I’d love to see the term column family removed forever in favor of always
>>>> using tables.  That said, it’s a big task.  I did a quick refactor in a
>>>> branch, simply changing the ColumnFamilyStore class to TableStore, and
>>>> pushed it up to GitHub. [1]
>>>> 
>>>> Didn’t click on the link?  That’s ok.  The TL;DR is that it’s almost 2K
>>>> LOC changed across 275 files.  I

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

2018-03-21 Thread Jeff Jirsa



> On Mar 21, 2018, at 7:21 AM, Gerald Henriksen  wrote:
> 
>> On Wed, 21 Mar 2018 14:04:39 +0100, you wrote:
>> Bundling a custom JRE along with Cassandra would be convenient in a way
>> that we can do all the testing against the bundled Java version. We
>> could also switch to a new Java version whenever it fits us.
> 
> To a certain extent though the issue isn't whether Cassandra works
> well with the given JRE but rather the issue of having a supported JRE
> in a production environment.
> 

This, plus the license issue, probably makes this a non-option.

(The license question is closed for the record:

https://www.apache.org/licenses/GPL-compatibility.html - The Apache Software 
Foundation does not allow its own projects to distribute software under 
licenses more restrictive than the Apache License)





Re: Debug logging enabled by default since 2.2

2018-03-18 Thread Jeff Jirsa
In Cassandra-10241 I said I was torn on this whole ticket, since most people 
would end up turning it off if it had a negative impact. You said:

“I'd like to emphasize that we're not talking about turning debug or trace on 
for client-generated request paths. There's way too much data generated and 
it's unlikely to be useful.
What we're proposing is enabling debug logging ONLY for cluster state changes 
like gossip and schema, and infrequent activities like repair.”

Clearly there’s a disconnect here - we’ve turned debug logging on for 
everything and shuffled some stuff to trace, which is a one time action but is 
hard to protect against regression. In fact, just looking at the read callback 
shows two instances of debug log in the client request path (exercise for the 
reader to “git blame”).

Either we can go clean up all the surprises that leaked through, or we can turn 
off debug and start backing out some of the changes in 10241. Putting stuff 
like compaction in the same bucket as digest mismatch and gossip state doesn’t 
make life materially better for most people.
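(For anyone wanting to back that out locally: in the stock conf/logback.xml 
shipped since 2.2, the relevant knob is the <logger name="org.apache.cassandra" 
level="DEBUG"/> element feeding the async debug.log appender; dropping that 
level to INFO quiets most of it, assuming your packaging hasn't customized the 
file.)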


-- 
Jeff Jirsa


> On Mar 18, 2018, at 11:21 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
> 
> That really depends on whether you're judicious in deciding what to log at
> debug, doesn't it?
> 
> On Sun, Mar 18, 2018 at 12:57 PM, Michael Kjellman <kjell...@apple.com>
> wrote:
> 
>> +1. this is how it works.
>> 
>> your computer doesn’t run at debug logging by default. your phone doesn’t
>> either. neither does your smart tv. your database can’t be running at debug
>> just because it makes our lives as engineers easier.
>> 
>>> On Mar 18, 2018, at 5:14 AM, Alexander Dejanovski <
>> a...@thelastpickle.com> wrote:
>>> 
>>> It's a tiny bit unusual to turn on debug logging for all users by default
>>> though, and there should be occasions to turn it on when facing issues
>> that
>>> you want to debug (if they can be easily reproduced).
>> 
> 
> 
> 
> -- 
> Jonathan Ellis
> co-founder, http://www.datastax.com
> @spyced


Re: Making RF4 useful aka primary and secondary ranges

2018-03-14 Thread Jeff Jirsa
Write at CL 3 and read at CL 2
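(Worked out: with RF=4, R + W = 2 + 3 = 5 > 4, so every read set overlaps every 
write set in at least one replica; that's the same intersection guarantee QUORUM 
gives at RF=3, while reads still wait on only two replies.)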

-- 
Jeff Jirsa


> On Mar 14, 2018, at 2:40 PM, Carl Mueller <carl.muel...@smartthings.com> 
> wrote:
> 
> Currently there is little use for RF4. You're getting the requirements of
> QUORUM-3 but only one extra backup.
> 
> I'd like to propose something that would make RF4 a sort of more heavily
> backed up RF3.
> 
> A lot of this is probably achievable with strictly driver-level logic, so
> perhaps it would belong more there.
> 
> Basically the idea is to have four replicas of the data, but only have to
> practically do QUORUM with three nodes. We consider the first three
> replicas the "primary replicas". On an ongoing basis for QUORUM reads and
> writes, we would rely on only those three replicas to satisfy
> two-out-of-three QUORUM. Writes are persisted to the fourth replica in the
> normal manner of cassandra, it just doesn't count towards the QUORUM write.
> 
> On reads, with token and node health awareness by the driver, if the
> primaries are all healthy, two-of-three QUORUM is calculated from those.
> 
> If however one of the three primaries is down, read QUORUM is a bit
> different:
> 1) if the first two replies come from the two remaining primaries and
> agree, that value is returned
> 2) if the first two replies are a primary and the "hot spare" and those
> agree, that is returned
> 3) if the primary and hot spare disagree, wait for the next primary to
> return, and then take the agreement (hopefully) that results
> 
> Then once the previous primary comes back online, the read quorum goes back
> to preferring that set, with the assuming hinted handoff and repair will
> get it back up to snuff.
> 
> There could also be some mechanism examining the hinted handoff status of
> the four to determine when to reactivate the primary that was down.
> 
> For mutations, one could prefer a "QUORUM plus" that was a quorum of the
> primaries plus the hot spare.
> 
> Of course one could do multiple hot spares, so RF5 could still be treated
> as RF3 + hot spares.
> 
> The goal here is more data resiliency but not having to rely on as many
> nodes for resiliency.
> 
> Since the data is ring-distributed, the primary owners of ranges should
> still be evenly distributed, so no hot nodes should result

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Cassandra Wrapup: Feb 2018 Edition

2018-03-04 Thread Jeff Jirsa
I'm late. Mea culpa. I blame February for only having 28 days.

The following contributors had their first ever commit into the project
(since the last time I made this list, which was late 2017)!

Johannes Grassler
Michael Burman
Nicolas GUYOMAR
Alex Ott
Samuel Roberts
Dinesh Joshi
Amichai Rothman
Vince White
Sumanth Pasupuleti
Samuel Fink
Alexander Dejanovski
Dimitar Dimitrov
Kevin Wern
Yuji Ito

Jay Zhuang was recently added as a committer. Congrats Jay!

There are some notably active topics to which I'd like to draw your
attention, in case you haven't been reading email or following JIRA:

1) There's been a lot of talk about docs. There are a lot of new JIRAs for
filling in the doc sections. Some of these could use review and commit (
https://issues.apache.org/jira/browse/CASSANDRA-14128?jql=project%20%3D%20CASSANDRA%20AND%20component%20%3D%20%22Documentation%20and%20Website%22%20and%20status%20%3D%20%22Patch%20Available%22
) , some of them still need content (
https://issues.apache.org/jira/issues/?jql=project%20%3D%20CASSANDRA%20AND%20component%20%3D%20%22Documentation%20and%20Website%22%20and%20status%20%3D%20Open
). A friendly reminder for anyone writing docs: respect other peoples'
copyrights. This hasn't been a problem (as far as I can tell), but while
other people/companies have written docs that are probably relevant, please
don't go copying them verbatim. Docs should be some of the lowest bar to
entry for new contributors - there's a nice howto here if you don't know
how to contribute docs
( http://cassandra.apache.org/doc/latest/development/documentation.html )

2) There's a lot of activity around audit logging (
https://issues.apache.org/jira/browse/CASSANDRA-12151 ) . There's a few
other related tickets (Stefan's internal auditing events
https://issues.apache.org/jira/browse/CASSANDRA-13668 , and the
full-query-log patch at
https://issues.apache.org/jira/browse/CASSANDRA-13983 ), but there's also a
few different goals (as Joseph Lynch pointed out, there's at least 4 -
security, compliance / SOX / PCI, replayability, debugging). If you're in
the class of user that cares about these features (any of the 4), you
should probably consider visiting that thread and reading the discussion.

3) Lerh Chuan Low has done a LOT of work on
https://issues.apache.org/jira/browse/CASSANDRA-8460 (tiered storage). If
you have any sort of desire to mix spinning+SSD disks in a single server,
you may want to weigh in on the design.

4) There was an interesting conversation about performance of latency
metrics. Started here:
https://lists.apache.org/thread.html/e7067b3a8048ee62d49320642203339dc0f466fec6b3fdf6db575ad6@%3Cdev.cassandra.apache.org%3E


5) There are two or three different JIRAs/fixes floating around for the 2
stupid MV unit tests that keep timing out. At least two of them were opened
by committers, and I think both were reviewed by committers - please settle
on one and commit it.

Finally: If you're interested in learning more about cassandra from other
users, the mailing list and JIRA have both been pretty busy this month, and
that's nice. There are also Cassandra meetup groups all over the world - if
you haven't ever attended one, I encourage you to find one (I'm not going
to link to any, because I don't want it to look like there are any
"official" groups, but search your favorite sites, you'll likely find one
near you).

I'm Jeff Jirsa, and this was the February 2018 Cassandra Dev Wrapup.


Re: penn state academic paper - "scalable" bloom filters

2018-02-22 Thread Jeff Jirsa
Potentially more interesting, range filters:

https://issues.apache.org/jira/plugins/servlet/mobile#issue/CASSANDRA-9843

And rocksdb has a prefix bloom filter

https://github.com/facebook/rocksdb/wiki/Prefix-Seek-API-Changes

Which we could potentially use to track partition:partial-clustering per sstable
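(A rough sketch of that idea, assuming a Guava-style BloomFilter and an
invented concat() helper; nothing like this exists in the codebase today:

    import com.google.common.hash.BloomFilter;

    // hypothetical: index both the bare partition key and the
    // partition + leading-clustering prefix, so a slice read can
    // skip sstables that cannot contain the requested prefix
    static void indexKeys(BloomFilter<byte[]> filter,
                          byte[] partitionKey, byte[] clusteringPrefix) {
        filter.put(partitionKey);                           // partition lookups
        filter.put(concat(partitionKey, clusteringPrefix)); // prefix lookups
    }
)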


-- 
Jeff Jirsa


> On Feb 22, 2018, at 5:47 PM, Jay Zhuang <z...@uber.com> wrote:
> 
> I think there's a similar idea here to dynamically resize the BF:
> https://issues.apache.org/jira/browse/CASSANDRA-6633, but I don't quite
> understand the idea there.
> 
> 
> On Thu, Feb 22, 2018 at 7:45 AM, Carl Mueller <carl.muel...@smartthings.com>
> wrote:
> 
>> http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.62.7953&rep=rep1&type=pdf
>> 
>> looks to be an adaptive approach where the "initial guess" bloom filters
>> are enhanced with more layers of ones generated after usage stats are
>> gained.
>> 
>> Disclaimer: I suck at reading academic papers.
>> 


Re: Why isn't there a separate JVM per table?

2018-02-22 Thread Jeff Jirsa
Bloom filters are offheap.

To be honest, there may come a time when it makes sense to move compaction
into its own JVM, but it would be FAR less effort to just profile what
exists now and fix the problems.



On Thu, Feb 22, 2018 at 2:52 PM, Carl Mueller 
wrote:

> Bloom filters... nevermind
>
>
> On Thu, Feb 22, 2018 at 4:48 PM, Carl Mueller <
> carl.muel...@smartthings.com>
> wrote:
>
> > Is the current reason for a large starting heap due to the memtable?
> >
> > On Thu, Feb 22, 2018 at 4:44 PM, Carl Mueller <
> > carl.muel...@smartthings.com> wrote:
> >
> >>  ... compaction on its own jvm was also something I was thinking about,
> >> but then I realized even more JVM sharding could be done at the table
> level.
> >>
> >> On Thu, Feb 22, 2018 at 4:09 PM, Jon Haddad  wrote:
> >>
> >>> Yeah, I’m in the compaction on it’s own JVM camp, in an ideal world
> >>> where we’re isolating crazy GC churning parts of the DB.  It would mean
> >>> reworking how tasks are created and removal of all shared state in
> favor of
> >>> messaging + a smarter manager, which imo would be a good idea
> regardless.
> >>>
> >>> It might be a better use of time (especially for 4.0) to do some GC
> >>> performance profiling and cut down on the allocations, since that
> doesn’t
> >>> involve a massive effort.
> >>>
> >>> I’ve been meaning to do a little benchmarking and profiling for a while
> >>> now, and it seems like a few others have the same inclination as well,
> >>> maybe now is a good time to coordinate that.  A nice perf bump for 4.0
> >>> would be very rewarding.
> >>>
> >>> Jon
> >>>
> >>> > On Feb 22, 2018, at 2:00 PM, Nate McCall  wrote:
> >>> >
> >>> > I've heard a couple of folks pontificate on compaction in its own
> >>> > process as well, given it has such a high impact on GC. Not sure
> about
> >>> > the value of individual tables. Interesting idea though.
> >>> >
> >>> > On Fri, Feb 23, 2018 at 10:45 AM, Gary Dusbabek  >
> >>> wrote:
> >>> >> I've given it some thought in the past. In the end, I usually talk
> >>> myself
> >>> >> out of it because I think it increases the surface area for failure.
> >>> That
> >>> >> is, managing N processes is more difficult that managing one
> process.
> >>> But
> >>> >> if the additional failure modes are addressed, there are some
> >>> interesting
> >>> >> possibilities.
> >>> >>
> >>> >> For example, having gossip in its own process would decrease the
> odds
> >>> that
> >>> >> a node is marked dead because STW GC is happening in the storage
> JVM.
> >>> On
> >>> >> the flipside, you'd need checks to make sure that the gossip process
> >>> can
> >>> >> recognize when the storage process has died vs just running a long
> GC.
> >>> >>
> >>> >> I don't know that I'd go so far as to have separate processes for
> >>> >> keyspaces, etc.
> >>> >>
> >>> >> There is probably some interesting work that could be done to
> support
> >>> the
> >>> >> orgs who run multiple cassandra instances on the same node (multiple
> >>> >> gossipers in that case is at least a little wasteful).
> >>> >>
> >>> >> I've also played around with using domain sockets for IPC inside of
> >>> >> cassandra. I never ran a proper benchmark, but there were some
> >>> throughput
> >>> >> advantages to this approach.
> >>> >>
> >>> >> Cheers,
> >>> >>
> >>> >> Gary.
> >>> >>
> >>> >>
> >>> >> On Thu, Feb 22, 2018 at 8:39 PM, Carl Mueller <
> >>> carl.muel...@smartthings.com>
> >>> >> wrote:
> >>> >>
> >>> >>> GC pauses may have been improved in newer releases, since we are in
> >>> 2.1.x,
> >>> >>> but I was wondering why cassandra uses one jvm for all tables and
> >>> >>> keyspaces, intermingling the heap for on-JVM objects.
> >>> >>>
> >>> >>> ... so why doesn't cassandra spin off a jvm per table so each jvm
> >>> can be
> >>> >>> tuned per table and gc tuned and gc impacts not impact other
> tables?
> >>> It
> >>> >>> would probably increase the number of endpoints if we avoid having
> an
> >>> >>> overarching query router.
> >>> >>>
> >>> >
> >>> > 
> -
> >>> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >>> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >>> >
> >>>
> >>>
> >>> -
> >>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >>> For additional commands, e-mail: dev-h...@cassandra.apache.org
> >>>
> >>>
> >>
> >
>


Re: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread Jeff Jirsa
On Wed, Feb 21, 2018 at 2:53 PM, Kenneth Brotman <
kenbrot...@yahoo.com.invalid> wrote:

> Hi Akash,
>
> I get the part about outside work which is why in replying to Jeff Jirsa I
> was suggesting the big companies could justify taking it on easy enough and
> you know actually pay the people who would be working at it so those people
> could have a life.
>
> The part I don't get is the aversion to usability.  Isn't that what you
> think about when you are coding?  "Am I making this thing I'm building easy
> to use?"  If you were programming for me, we would be constantly talking
> about what we are building and how we can make things easier for users.  If
> I had to fight with a developer, architect or engineer about usability all
> the time, they would be gone and quick.  How do you approach programming if you
> aren't trying to make things easy.
>


There's no aversion to usability, you're assuming things that just aren't
true. Nobody's against usability, we've just prioritized other things
HIGHER. We make those decisions in part by looking at open JIRAs and
determining what's asked for the most, what members of the community have
contributed, and then balance that against what we ourselves care about.
You're making a statement that it should be the top priority for the next
release, with no JIRA, and history of contributing (and indeed, no real
clear sign that you even understand the full extent of the database), no
sign that you're willing to do the work yourself, and making a ton of
assumptions about the level of effort and ROI.

I would love for Cassandra to be easier to use, I'm sure everyone does.
There's a dozen features I'd love to add if I had infinite budget and
infinite manpower. But what you're asking for is A LOT of effort and / or A
LOT of money, and you're assuming someone's going to step up and foot the
bill, but there's no real reason to believe that's the case.

In the mean time, everyone's spending hours replying to this thread that is
0% actionable. We would all have been objectively better off had everyone
ignored this thread and just spent 10 minutes writing some section of the
docs. So the next time I get the urge to reply, I'm just going to do that
instead.


Re: Cassandra Needs to Grow Up by Version Five!

2018-02-19 Thread Jeff Jirsa
There's a lot of things below I disagree with, but it's ok. I convinced
myself not to nit-pick every point.

https://issues.apache.org/jira/browse/CASSANDRA-13971 has some of Stefan's
work with cert management

Beyond that, I encourage you to do what Michael suggested: open JIRAs for
things you care strongly about, work on them if you have time. Sometime
this year we'll schedule a NGCC (Next Generation Cassandra Conference)
where we talk about future project work and direction, I encourage you to
attend if you're able (I encourage anyone who cares about the direction of
Cassandra to attend, it's probably be either free or very low cost, just to
cover a venue and some food). If nothing else, you'll meet some of the
teams who are working on the project, and learn why they've selected the
projects on which they're working. You'll have an opportunity to pitch your
vision, and maybe you can talk some folks into helping out.

- Jeff




On Mon, Feb 19, 2018 at 1:01 AM, Kenneth Brotman <
kenbrot...@yahoo.com.invalid> wrote:

> Comments inline
>
> >-Original Message-----
> >From: Jeff Jirsa [mailto:jji...@gmail.com]
> >Sent: Sunday, February 18, 2018 10:58 PM
> >To: u...@cassandra.apache.org
> >Cc: dev@cassandra.apache.org
> >Subject: Re: Cassandra Needs to Grow Up by Version Five!
> >
> >Comments inline
> >
> >
> >> On Feb 18, 2018, at 9:39 PM, Kenneth Brotman
> <kenbrot...@yahoo.com.INVALID> wrote:
> >>
> > >Cassandra feels like an unfinished program to me. The problem is not
> that it’s open source or cutting edge.  It’s an open source cutting edge
> program that lacks some of its basic functionality.  We are all stuck
> addressing fundamental mechanical tasks for Cassandra because the basic
> code that would do that part has not been contributed yet.
> >>
> >There’s probably 2-3 reasons why here:
> >
> >1) Historically the pmc has tried to keep the scope of the project very
> narrow. It’s a database. We don’t ship drivers. We don’t ship developer
> tools. We don’t ship fancy UIs. We ship a database. I think for the most
> part the narrow vision has been for the best, but maybe it’s time to
> reconsider some of the scope.
> >
> >Postgres will autovacuum to prevent wraparound (hopefully),  but everyone
> I know running Postgres uses flexible-freeze in cron - sometimes it’s ok to
> let the database have its opinions and let third party tools fill in the
> gaps.
> >
>
> I can appreciate the desire to stay in scope.  I believe usability is the
> King.  When users have to learn the database, then learn what they have to
> automate, then learn an automation tool and then use the automation tool to
> do something that is as fundamental as the fundamental tasks I described,
> then something is missing from the database itself that is adversely
> affecting usability - and that is very bad.  Where those big companies need
> to calculate the ROI is in the cost of acquiring or training the next group
> of users.  Consider how steep the learning curve is for new users.
> Consider the business case for improving ease of use.
>
> >2) Cassandra is, by definition, a database for large scale problems. Most
> of the companies working on/with it tend to be big companies. Big companies
> often have pre-existing automation that solved the stuff you consider
> fundamental tasks, so there’s probably nobody actively working on the
> solved problems that you may consider missing features - for many people
> they’re already solved.
> >
>
> I could be wrong but it sounds like a lot of the code work is done, and if
> the companies would take the time to contribute more code, then the rest of
> the code needed could be generated easily.
>
> >3) It’s not nearly as basic as you think it is. Datastax seemingly had a
> multi-person team on opscenter, and while it was better than anything else
> around last time I used it (before it stopped supporting the OSS version),
> it left a lot to be desired. It’s probably 2-3 engineers working for a
> month  to have any sort of meaningful, reliable, mostly trivial
> cluster-managing UI, and I can think of about 10 JIRAs I’d rather see that
> time be spent on first.
>
> How about 6-9 engineers working 12 months a year on it then.  I'm not
> kidding.  For a big company with revenues in the tens of billions or more,
> and a heavy use of Cassandra nodes, it's easy to make a case for having a
> full time person or more that involved.  They aren't paying for using the
> open source code that is Cassandra.  Let's see what would the licensing
> fees be for a big company if the costs where like Microsoft or Oracle would
> charge for their enterprise level relational database?   What's the
> co

Re: Cassandra Needs to Grow Up by Version Five!

2018-02-18 Thread Jeff Jirsa
Comments inline 


> On Feb 18, 2018, at 9:39 PM, Kenneth Brotman  
> wrote:
> 
> Cassandra feels like an unfinished program to me. The problem is not that 
> it’s open source or cutting edge.  It’s an open source cutting edge program 
> that lacks some of its basic functionality.  We are all stuck addressing 
> fundamental mechanical tasks for Cassandra because the basic code that would 
> do that part has not been contributed yet.
> 
There’s probably 2-3 reasons why here:

1) Historically the pmc has tried to keep the scope of the project very narrow. 
It’s a database. We don’t ship drivers. We don’t ship developer tools. We don’t 
ship fancy UIs. We ship a database. I think for the most part the narrow vision 
has been for the best, but maybe it’s time to reconsider some of the scope. 

Postgres will autovacuum to prevent wraparound (hopefully),  but everyone I 
know running Postgres uses flexible-freeze in cron - sometimes it’s ok to let 
the database have its opinions and let third party tools fill in the gaps.

2) Cassandra is, by definition, a database for large scale problems. Most of 
the companies working on/with it tend to be big companies. Big companies often 
have pre-existing automation that solved the stuff you consider fundamental 
tasks, so there’s probably nobody actively working on the solved problems that 
you may consider missing features - for many people they’re already solved.

3) It’s not nearly as basic as you think it is. Datastax seemingly had a 
multi-person team on opscenter, and while it was better than anything else 
around last time I used it (before it stopped supporting the OSS version), it 
left a lot to be desired. It’s probably 2-3 engineers working for a month  to 
have any sort of meaningful, reliable, mostly trivial cluster-managing UI, and 
I can think of about 10 JIRAs I’d rather see that time be spent on first. 

> Ease of use issues need to be given much more attention.  For an 
> administrator, the ease of use of Cassandra is very poor. 
> 
> Furthermore, currently Cassandra is an idiot.  We have to do everything for 
> Cassandra. Contrast that with the fact that we are in the dawn of artificial 
> intelligence.
> 

And for everything you think is obvious, there’s a 50% chance someone else will 
have already solved differently, and your obvious new solution will be seen as 
an inconvenient assumption and complexity they won’t appreciate. Open source 
projects get to walk a fine line of trying to be useful without making too many 
assumptions, being “too” opinionated, or overstepping bounds. We may be too 
conservative, but it’s very easy to go too far in the opposite direction. 

> Software exists to automate tasks for humans, not mechanize humans to 
> administer tasks for a database.  I’m an engineering type.  My job is to 
> apply science and technology to solve real world problems.  And that’s where 
> I need an organization’s I.T. talent to focus; not in crank starting an 
> unfinished database.
> 

And that’s why nobody’s done it - we all have bigger problems we’re being paid 
to solve, and nobody’s felt it necessary. Because it’s not necessary, it’s 
nice, but not required.

> For example, I should be able to go to any node, replace the Cassandra.yaml 
> file and have a prompt on the display ask me if I want to update all the yaml 
> files across the cluster.  I shouldn’t have to manually modify yaml files on 
> each node or have to create a script for some third party automation tool to 
> do it. 
> 
I don’t see this ever happening.  Your config management already pushes files 
around your infrastructure, Cassandra doesn’t need to do it. 

> I should not have to turn off service, clear directories, restart service in 
> coordination with the other nodes.  It’s already a computer system.  It can 
> do those things on its own.
> 

The only time you should be doing this is when you’re wiping nodes from failed 
bootstrap, and that stopped being required in 2.2.
> How about read repair.  First there is something wrong with the name.  Maybe 
> it should be called Consistency Repair.  An administrator shouldn’t have to 
> do anything.  It should be a behavior of Cassandra that is programmed in. It 
> should consider the GC setting of each node, calculate how often it has to 
> run repair, when it should run it so all the nodes aren’t trying at the same 
> time and when other circumstances indicate it should also run it.
> 
There’s a good argument to be made that something like Reaper should be shipped 
with Cassandra. There’s another good argument that most tools like this end up 
needing some sort of leader election for scheduling and that goes against a lot 
of the fundamental assumptions in Cassandra (all nodes are equal, etc) - 
solving that problem is probably at least part of why you haven’t seen them 
built into the db. “Leader election is easy” you’ll say, and I’ll laugh and 
tell you about users I know who have DCs go 

Re: scheduled work compaction strategy

2018-02-16 Thread Jeff Jirsa
There’s a company using TWCS in this config - I’m not going to out them, but I 
think they do it (or used to) with aggressive tombstone sub properties. They 
may have since extended/enhanced it somewhat.
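(For reference, the sub-properties in question look roughly like this in CQL;
the table name and values are illustrative, not a recommendation:

    ALTER TABLE ks.schedule_events WITH compaction = {
      'class': 'TimeWindowCompactionStrategy',
      'compaction_window_unit': 'DAYS',
      'compaction_window_size': '1',
      'unchecked_tombstone_compaction': 'true',
      'tombstone_threshold': '0.2',
      'tombstone_compaction_interval': '86400'
    };
)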

-- 
Jeff Jirsa


> On Feb 16, 2018, at 2:24 PM, Carl Mueller <carl.muel...@smartthings.com> 
> wrote:
> 
> Oh and as a further refinement outside of our use case.
> 
> If we could group/organize the sstables by the rowkey time value or
> inherent TTL value, the naive version would be evenly distributed buckets
> into the future.
> 
> But many/most data patterns like this have "busy" data in the near term.
> Far out scheduled stuff would be more sparse. In our case, 50% of the data
> is in the first 12 hours, 50% of the remaining in the next day or two, 50%
> of the remaining in the next week, etc etc.
> 
> So we could have a "long term" general bucket to take data far in the
> future. But here's the thing, if we could actively process the "long term"
> sstable on a regular basis into two sstables: the stuff that is still "long
> term" and sstables for the "near term", that could solve many general
> cases. The "long term" bucket could even be STCS by default, and as the
> near term comes into play, that is considered a different "level".
> 
> Of course all this relies on the ability to look at the data in the rowkey
> or the TTL associated with the row.
> 
> On Fri, Feb 16, 2018 at 4:17 PM, Carl Mueller <carl.muel...@smartthings.com>
> wrote:
> 
>> We have a scheduler app here at smartthings, where we track per-second
>> tasks to be executed.
>> 
>> These are all TTL'd to be destroyed once the second the event was
>> registered for has passed.
>> 
>> If the scheduling window was sufficiently small, say, 1 day, we could
>> probably use a time window compaction strategy with this. But the window is
>> one to two years' worth of ad hoc event registration, per the contract.
>> 
>> Thus, because events are registered at different times, data TTL'ing at
>> different times gets intermingled, and the sstables are not written with
>> data TTL'ing in the same rough time period. If they were, compaction would
>> be relatively easy, since the entire sstable would tombstone at once.
>> 
>> We could kind of do this by doing sharded tables for the time periods and
>> rotating the shards for duty, and truncating them as they are recycled.
>> 
>> But an elegant way would be a custom compaction strategy that would
>> "window" the data into clustered sstables that could be compacted with
>> other similarly time bucketed sstables.
>> 
>> This would require visibility into the rowkey when it came time to convert
>> the memtable data to sstables. Is that even possible with compaction
>> schemes? We would provide a requirement that the time-based data would be
>> in the row key if it is a composite row key, making it required.
>> 
>> 
>> 

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Release votes

2018-02-15 Thread Jeff Jirsa
Moving this to it’s own thread:

We’ve declared this a requirement multiple times and then we occasionally get a 
critical issue and have to decide whether it’s worth the delay. I assume 
Jason’s earlier -1 on attempt 1 was an enforcement of that earlier stated goal. 

It’s up to the PMC. We’ve said in the past that we don’t release without green 
tests. The PMC gets to vote and enforce it. If you don’t vote yes without 
seeing the test results, that enforces it. 

-- 
Jeff Jirsa


> On Feb 15, 2018, at 9:49 AM, Josh McKenzie <jmcken...@apache.org> wrote:
> 
> What would it take for us to get green utest/dtests as a blocking part of
> the release process? i.e. "for any given SHA, here's a link to the tests
> that passed" in the release vote email?
> 
> That being said, +1.
> 
>> On Wed, Feb 14, 2018 at 4:33 PM, Nate McCall <zznat...@gmail.com> wrote:
>> 
>> +1
>> 
>> On Thu, Feb 15, 2018 at 9:40 AM, Michael Shuler <mich...@pbandjelly.org>
>> wrote:
>>> I propose the following artifacts for release as 3.0.16.
>>> 
>>> sha1: 890f319142ddd3cf2692ff45ff28e71001365e96
>>> Git:
>>> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=
>> shortlog;h=refs/tags/3.0.16-tentative
>>> Artifacts:
>>> https://repository.apache.org/content/repositories/
>> orgapachecassandra-1157/org/apache/cassandra/apache-cassandra/3.0.16/
>>> Staging repository:
>>> https://repository.apache.org/content/repositories/
>> orgapachecassandra-1157/
>>> 
>>> Debian and RPM packages are available here:
>>> http://people.apache.org/~mshuler
>>> 
>>> *** This release addresses an important fix for CASSANDRA-14092 ***
>>>"Max ttl of 20 years will overflow localDeletionTime"
>>>https://issues.apache.org/jira/browse/CASSANDRA-14092
>>> 
>>> The vote will be open for 72 hours (longer if needed).
>>> 
>>> [1]: (CHANGES.txt) https://goo.gl/rLj59Z
>>> [2]: (NEWS.txt) https://goo.gl/EkrT4G
>>> 
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>>> 
>> 
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>> 
>> 

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: row tombstones as a separate sstable citizen

2018-02-15 Thread Jeff Jirsa
Worth a JIRA, yes


On Wed, Feb 14, 2018 at 9:45 AM, Carl Mueller <carl.muel...@smartthings.com>
wrote:

> So is this at least a decent candidate for a feature request ticket?
>
>
> On Tue, Feb 13, 2018 at 8:09 PM, Carl Mueller <
> carl.muel...@smartthings.com>
> wrote:
>
> > I'm particularly interested in getting the tombstones to "promote" up the
> > levels of LCS more quickly. Currently they get attached at the low level
> > and don't propagate up to higher levels until enough activity at a lower
> > level promotes the data. Meanwhile, LCS means compactions can occur in
> > parallel at each level. So row tombstones in their own sstable could be
> up
> > promoted the LCS levels preferentially before normal processes would move
> > them up.
> >
> > So if the delete-only sstables could move up more quickly, the compaction
> > at the levels would happen more quickly.
> >
> > The threshold stuff is nice if I read 7019 correctly, but what is the %
> > there? % of rows? % of columns? or % of the size of the sstable? Row
> > tombstones are pretty compact being just the rowkey and the tombstone
> > marker. So if 7019 is triggered at 10% of the sstable size, even a
> > crapton of tombstones deleting practically the entire database would
> > only be a small % of the sstable's size.
> >
> > Since the row tombstones are so compact, that's why I think they are good
> > candidates for special handling.
> >
> > On Tue, Feb 13, 2018 at 5:22 PM, J. D. Jordan <jeremiah.jor...@gmail.com
> >
> > wrote:
> >
> >> Have you taken a look at the new stuff introduced by
> >> https://issues.apache.org/jira/browse/CASSANDRA-7019 ?  I think it may
> >> go a ways to reducing the need for something complicated like this.
> >> Though it is an interesting idea as special handling for bulk deletes.
> >> If they were truly just sstables that only contained deletes, the logic
> >> from 7019 would probably go a long way. Though if you are bulk inserting
> >> deletes that is what you would end up with, so maybe it already works.
> >>
> >> -Jeremiah
> >>
> >> > On Feb 13, 2018, at 6:04 PM, Jeff Jirsa <jji...@gmail.com> wrote:
> >> >
> >> > On Tue, Feb 13, 2018 at 2:38 PM, Carl Mueller <
> >> > carl.muel...@smartthings.com> wrote:
> >> >
> >> >> In process of doing my second major data purge from a cassandra
> >> >> system.
> >> >>
> >> >> Almost all of my purging is done via row tombstones. While performing
> >> >> this the second time while trying to cajole compaction to occur (in
> >> >> 2.1.x, LevelledCompaction) to goddamn actually compact the data, I've
> >> >> been thinking as to why there isn't a separate set of sstable
> >> >> infrastructure setup for row deletion tombstones.
> >> >>
> >> >> I'm imagining that row tombstones are written to separate sstables
> >> >> from mainline data updates/appends and range/column tombstones.
> >> >>
> >> >> By writing them to separate sstables, the compaction systems can
> >> >> preferentially merge / process them when compacting sstables.
> >> >>
> >> >> This would create an additional sstable for lookup in the bloom
> >> >> filters, granted. I had visions of short circuiting the lookups to
> >> >> other sstables if a row tombstone was present in one of the special
> >> >> row tombstone sstables.
> >> >>
> >> >
> >> > All of the above sounds really interesting to me, but I suspect it's a
> >> > LOT of work to make it happen correctly.
> >> >
> >> > You'd almost end up with 2 sets of logs for the LSM - a tombstone
> >> > log/generation, and a data log/generation, and the tombstone logs
> >> > would be read-only inputs to data compactions.
> >> >
> >> >
> >> >> But that would only be possible if there was the notion of a "super
> >> >> row tombstone" that permanently deleted a rowkey and all future
> >> >> writes would be invalidated. Kind of like how a tombstone with a
> >> >> mistakenly huge timestamp becomes a sneaky permanent tombstone, but
> >> >> intended. There could be a special operation / statement to undo this
> >> >> permanent tombstone, and since the row tombstones would be in their
> >> >> own dedicated sstables, they could process and compact more quickly,
> >> >> with prioritization by the compactor.
> >> >>
> >> >
> >> > This part sounds way less interesting to me (other than the fact you
> >> > can already do this with a timestamp in the future, but it'll gc away
> >> > at gcgs).
> >> >
> >> >
> >> >> I'm thinking there must be something I am forgetting in the
> >> >> read/write/compaction paths that invalidates this.
> >> >>
> >> >
> >> > There are a lot of places where we do "smart" things to make sure we
> >> > don't accidentally resurrect data. The read path includes old sstables
> >> > for tombstones, for example. Those all need to be concretely
> >> > identified and handled (and tested).
> >>
> >
> >
>


Re: row tombstones as a separate sstable citizen

2018-02-13 Thread Jeff Jirsa
On Tue, Feb 13, 2018 at 2:38 PM, Carl Mueller 
wrote:

> In process of doing my second major data purge from a cassandra system.
>
> Almost all of my purging is done via row tombstones. While performing this
> the second time while trying to cajole compaction to occur (in 2.1.x,
> LevelledCompaction) to goddamn actually compact the data, I've been
> thinking as to why there isn't a separate set of sstable infrastructure
> setup for row deletion tombstones.
>
> I'm imagining that row tombstones are written to separate sstables from
> mainline data updates/appends and range/column tombstones.
>
> By writing them to separate sstables, the compaction systems can
> preferentially merge / process them when compacting sstables.
>
> This would create an additional sstable for lookup in the bloom filters,
> granted. I had visions of short circuiting the lookups to other sstables if
> a row tombstone was present in one of the special row tombstone sstables.
>
>
All of the above sounds really interesting to me, but I suspect it's a LOT
of work to make it happen correctly.

You'd almost end up with 2 sets of logs for the LSM - a tombstone
log/generation, and a data log/generation, and the tombstone logs would be
read-only inputs to data compactions.
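
To make the shape of that concrete, here's a purely illustrative sketch -
none of these types exist in Cassandra today, and all the names are invented
for discussion only:

    import java.util.List;

    public class TombstoneAwareCompactionSketch
    {
        static class SSTableRef {} // stand-in for a real sstable reference

        // Data inputs are merged and rewritten by the compaction; tombstone
        // inputs are consulted read-only to drop shadowed rows, and get
        // promoted / compacted on their own (faster) schedule.
        List<SSTableRef> dataInputs;
        List<SSTableRef> tombstoneInputs;

        // Standard last-write-wins reconciliation: a row is droppable if a
        // row tombstone with an equal-or-newer timestamp covers its key.
        static boolean shadowed(long rowTsMicros, long tombstoneTsMicros)
        {
            return tombstoneTsMicros >= rowTsMicros;
        }
    }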


> But that would only be possible if there was the notion of a "super row
> tombstone" that permanently deleted a rowkey and all future writes would be
> invalidated. Kind of like how a tombstone with a mistakenly huge timestamp
> becomes a sneaky permanent tombstone, but intended. There could be a
> special operation / statement to undo this permanent tombstone, and since
> the row tombstones would be in their own dedicated sstables, they could
> process and compact more quickly, with prioritization by the compactor.
>
>
This part sounds way less interesting to me (other than the fact you can
already do this with a timestamp in the future, but it'll gc away at gcgs).
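
For the curious, the future-timestamp trick looks roughly like this - a
hedged sketch assuming the 3.x DataStax Java driver; ks.tbl, pk, the key and
the contact point are all placeholder names:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;

    public class FutureTimestampTombstoneSketch
    {
        public static void main(String[] args)
        {
            try (Cluster cluster = Cluster.builder()
                                          .addContactPoint("127.0.0.1")
                                          .build();
                 Session session = cluster.connect())
            {
                // 4102444800000000 is 2100-01-01T00:00:00Z in microseconds.
                // Any later write with a normal (current-time) timestamp
                // loses reconciliation against this tombstone...
                session.execute("DELETE FROM ks.tbl USING TIMESTAMP 4102444800000000 "
                              + "WHERE pk = 'victim'");
                // ...but, as noted above, once gc_grace_seconds elapses the
                // tombstone is eligible to be purged by compaction, so it
                // is not truly permanent.
            }
        }
    }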


> I'm thinking there must be something I am forgetting in the
> read/write/compaction paths that invalidates this.
>

There are a lot of places where we do "smart" things to make sure we don't
accidentally resurrect data. The read path includes old sstables for
tombstones, for example. Those all need to be concretely identified and
handled (and tested).


Re: CASSANDRA-14183 review request -> logback upgrade to fix CVE

2018-02-13 Thread Jeff Jirsa
Using the internals in ThreadAwareSecurityManager has caused countless
problems, and needs to be fixed once and for all -

There are 2 different patches up for review in
https://issues.apache.org/jira/browse/CASSANDRA-13396 - would be nice if
one could be selected, and hopefully whichever is chosen can be a final
workaround for upgrading safely as well.
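
For anyone wondering what "using the internals" means in practice, the
contrast looks roughly like this - a hedged sketch, not the actual
ThreadAwareSecurityManager code:

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public class LoggingCouplingSketch
    {
        // Portable: only the SLF4J facade is referenced, so the backing
        // implementation can be upgraded or swapped without touching this.
        private static final Logger logger =
                LoggerFactory.getLogger(LoggingCouplingSketch.class);

        // Brittle: casting to the concrete logback type couples the caller
        // to a specific logback version; direct use like this is what makes
        // jar upgrades breaky.
        public static void setLevelViaInternals(String loggerName, String level)
        {
            ch.qos.logback.classic.Logger concrete =
                    (ch.qos.logback.classic.Logger) LoggerFactory.getLogger(loggerName);
            concrete.setLevel(ch.qos.logback.classic.Level.toLevel(level));
        }

        public static void main(String[] args)
        {
            logger.info("pure SLF4J call - safe across logging backends");
            setLevelViaInternals("org.apache.cassandra", "DEBUG");
        }
    }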




On Tue, Feb 13, 2018 at 9:41 AM, Jacques-Henri Berthemet <
jacques-henri.berthe...@genesys.com> wrote:

> Hi,
>
> I suppose upgrading Logback breaks Cassandra because some classes are used
> directly, as in StorageService, ThreadAwareSecurityManager and
> StorageServiceMBean.
> This was a problem in my case as we're embedding Cassandra for our
> functional tests; I had to stub it as it was conflicting with log4j2
> configuration.
>
> Ideally Cassandra should only use pure SLF4J so that logging can be easily
> upgraded or changed.
> --
> Jacques-Henri Berthemet
>
> -Original Message-
> From: Ariel Weisberg [mailto:ar...@weisberg.ws]
> Sent: Tuesday, February 13, 2018 6:28 PM
> To: dev@cassandra.apache.org
> Subject: Re: CASSANDRA-14183 review request -> logback upgrade to fix CVE
>
> Hi,
>
> So our options are:
>
> 1. Ignore it.
> Most people aren't using this functionality.
> Most people aren't and shouldn't be exposing the logging port to untrusted
> networks. But everyone loses at defense in depth (or is it breadth?) if they
> use this functionality and the port might be exposed.
>
> 2. Remove the offending classes from the 1.1.10 jar. My crazy idea: break
> it, but only for the people using the vulnerable functionality. Possibly no
> one, but probably someone. Maybe they can upgrade it manually for their
> usage?
> This also has an issue when working with maven.
>
> 3. Upgrade it.
> Definitely going to break some apps according to Michael Shuler. Happened
> when he tried it.
>
> Certainly we can upgrade in trunk? While we are at it, come up to the
> latest version.
>
> Ariel
>
> On Tue, Feb 13, 2018, at 12:03 PM, Ariel Weisberg wrote:
> > Hi,
> >
> > I don't think the fix is in 1.1.11, looking at the diff between 1.1.11
> > and 1.2.0:
> > https://github.com/qos-ch/logback/compare/v_1.1.11...v_1.2.0
> >
> > I looked at 1.1.11 and 1.1.10 and didn't see it there either.
> >
> > When you say stuff broke do you mean stuff not in the dtests or utests?
> >
> > Ariel
> >
> > On Tue, Feb 13, 2018, at 11:57 AM, Michael Shuler wrote:
> > > I tried a logback 1.2.x jar update a number of months ago to fix the
> > > broken log rotation (try setting rotation to a large number - you'll
> > > > find you only get (I think it was) 10 files, regardless of setting).
> > >
> > > Like we've found updating other jars in the past, this seemingly
> > > "simple" update broke a number of application components, so we
> > > rolled it back and worked out another log rotation method.
> > >
> > > Looking at the logback changelog, I cannot tell if version 1.1.11 is
> > > fixed for this, or if that might be less breakage? There are a
> > > pretty significant number of API-looking changes from 1.1.3 to
> > > > 1.2.3, so I do not wish to break other users' applications, as I have
> experienced.
> > >
> > > I do not think this should block the current releases, unless
> > > someone wants to do some significant testing and user outreach for
> > > tentatively breaking their applications.
> > >
> > > --
> > > Michael
> > >
> > > On 02/13/2018 10:48 AM, Jason Brown wrote:
> > > > Ariel,
> > > >
> > > > If this is a legit CVE, then we would want to patch all the
> > > > current versions we support - which is 2.1 and higher.
> > > >
> > > > Also, is this worth stopping the current open vote for this patch?
> > > > (Not in a place to look at the patch and affects to impacted
> branches right now).
> > > >
> > > > Jason
> > > >
> > > > On Tue, Feb 13, 2018 at 08:43 Ariel Weisberg 
> wrote:
> > > >
> > > >> Hi,
> > > >>
> > > >> Seems like users could conceivably be using the vulnerable
> > > >> component. Also seems like we potentially need to do this
> as far back as 2.1?
> > > >>
> > > >> Anyone else have an opinion before I commit this? What version to
> > > >> start from?
> > > >>
> > > >> Ariel
> > > >>
> > > >> On Tue, Feb 13, 2018, at 5:59 AM, Thiago Veronezi wrote:
> > > >>> Hi dev team,
> > > >>>
> > > >>> Sorry to keep bothering you.
> > > >>>
> > > >>> This is just a friendly reminder that I would like to contribute
> > > >>> to this project starting with a fix for CASSANDRA-14183.
> > > >>>
> > > >>> []s,
> > > >>> Thiago.
> > > >>>
> > > >>>
> > > >>>
> > > >>> On Tue, Jan 30, 2018 at 8:05 AM, Thiago Veronezi
> > > >>> 
> > > >>> wrote:
> > > >>>
> > >  Hi dev team,
> > > 
> > >  Can one of you guys take a look on this jira ticket?
> > >  https://issues.apache.org/jira/browse/CASSANDRA-14183
> > > 
> > >  

Re: Roadmap for 4.0

2018-02-12 Thread Jeff Jirsa
Advantages of cutting a release sooner than later:
1) The project needs to constantly progress forward. Releases are the most
visible part of that.
2) Having a huge changelog in a release increases the likelihood of bugs
that take time to find.

Advantages of a slower release:
1) We don't do major versions often, and when we do breaking changes
(protocol, file format, etc), we should squeeze in as many as possible to
avoid having to roll new majors
2) There are probably few people actually running 3.11 at scale, so
probably few people actually testing trunk.

In terms of "big" changes I'd like to see land, the ones that come to mind
are:

https://issues.apache.org/jira/browse/CASSANDRA-9754 - "Birch" (changes
file format)
https://issues.apache.org/jira/browse/CASSANDRA-13442 - Transient Replicas
(probably adds new replication strategy or similar)
https://issues.apache.org/jira/browse/CASSANDRA-13628 - Rest of the
internode netty stuff (no idea if this changes internode stuff, but I bet
it's a lot easier if it lands on a major)
https://issues.apache.org/jira/browse/CASSANDRA-7622 - Virtual Tables
(selfish inclusion, probably doesn't need to be a major at all, and I
wouldn't even lose sleep if it slips, but I'd like to see it land)

Stuff I'm ok with slipping to 4.X or 5.0, but probably needs to land on a
major because we'll change something big (like gossip, or the way schema is
passed, etc):

https://issues.apache.org/jira/browse/CASSANDRA-9667 - Strongly consistent
membership
https://issues.apache.org/jira/browse/CASSANDRA-10699 - Strongly consistent
schema

All that said, what I really care about is building confidence in the
release, which means an extended testing cycle. If all of those patches
landed tomorrow, I'd still expect us to be months away from a release,
because we need to bake the next major - there's too many changes to throw
out an alpha/beta/rc and hope someone actually runs it.

I don't believe Q3/Q4 is realistic, but I may be biased (or jaded). It's
possible Q3/Q4 alpha/beta is realistic, but definitely not a release.




On Sun, Feb 11, 2018 at 8:29 PM, kurt greaves  wrote:

> Hi friends,
> *TL;DR: Making a plan for 4.0, ideally everyone interested should provide
> up to two lists, one for tickets they can contribute resources to getting
> finished, and one for features they think would be desirable for 4.0, but
> not necessarily have the resources to commit to helping with.*
>
> So we had that Roadmap for 4.0 discussion last year, but there was never a
> conclusion or a plan that came from it. Time's getting on and the changes
> list for 4.0 is getting pretty big. I'm thinking it would probably make
> sense to define some goals to getting 4.0 released/have an actual plan. 4.0
> is already going to be quite an unwieldy release with a lot of testing
> required.
>
> Note: the following is open to discussion, if people don't like the plan
> feel free to speak up. But in the end it's a pretty basic plan and I don't
> think we should over-complicate it, I also don't want to end up in a
> discussion where we "make a plan to make a plan". Regardless of whatever
> plan we do end up following it would still be valuable to have a list of
> tickets for 4.0 which is the overall goal of this email - so let's not get
> too worked up on the details just yet (save that for after I
> summarise/follow up).
>
> // TODO
> I think the best way to go about this would be for us to come up with a
> list of JIRA's that we want included in 4.0, tag these as 4.0, and all
> other improvements as 4.x. We can then aim to release 4.0 once all the 4.0
> tagged tickets (+bug fixes/blockers) are complete.
>
> Now, the catch is that we obviously don't want to include too many tickets
> in 4.0, but at the same time we want to make sure 4.0 has an appealing
> feature set for both users/operators/developers. To minimise scope creep I
> think the following strategy will help:
>
> We should maintain two lists:
>
>1. JIRA's that people want in 4.0 and can commit resources to getting
>them implemented in 4.0.
>2. JIRA's that people simply think would be desirable for 4.0, but
>currently don't have anyone assigned to them or planned assignment. It
>would probably make sense to label these with an additional tag in JIRA. 
> *(Users,
>please feel free to point out what you want here)*
>
> From list 1 will come our source of truth for when we release 4.0. (after
> aggregating a list I will summarise and we can vote on it).
>
> List 2 would be the "hopeful" list, where stories can be picked up from if
> resourcing allows, or where someone comes along and decides it's good
> enough to work on. I guess we can also base this on a vote system if we
> reach the point of including some of them. (but for the moment it's purely
> to get an idea of what users actually want).
>
> Please don't refrain from listing something that's already been mentioned.
> The purpose is to get an idea of 

Re: [VOTE] Release Apache Cassandra 2.1.20

2018-02-12 Thread Jeff Jirsa
+1


On Mon, Feb 12, 2018 at 1:03 PM, Brandon Williams  wrote:

> +1
>
> On Mon, Feb 12, 2018 at 2:30 PM, Michael Shuler 
> wrote:
>
> > I propose the following artifacts for release as 2.1.20.
> >
> > sha1: b2949439ec62077128103540e42570238520f4ee
> > Git:
> > http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=
> > shortlog;h=refs/tags/2.1.20-tentative
> > Artifacts:
> > https://repository.apache.org/content/repositories/
> > orgapachecassandra-1152/org/apache/cassandra/apache-cassandra/2.1.20/
> > Staging repository:
> > https://repository.apache.org/content/repositories/
> > orgapachecassandra-1152/
> >
> > Debian and RPM packages are available here:
> > http://people.apache.org/~mshuler
> >
> > *** This release addresses an important fix for CASSANDRA-14092 ***
> > "Max ttl of 20 years will overflow localDeletionTime"
> > https://issues.apache.org/jira/browse/CASSANDRA-14092
> >
> > The vote will be open for 72 hours (longer if needed).
> >
> > [1]: (CHANGES.txt) https://goo.gl/5i2nw9
> > [2]: (NEWS.txt) https://goo.gl/i9Fg2u
> >
> >
>


Re: [VOTE] Release Apache Cassandra 2.2.12

2018-02-12 Thread Jeff Jirsa
+1


On Mon, Feb 12, 2018 at 1:02 PM, Brandon Williams  wrote:

> +1
>
> On Mon, Feb 12, 2018 at 2:30 PM, Michael Shuler 
> wrote:
>
> > I propose the following artifacts for release as 2.2.12.
> >
> > sha1: 1602e606348959aead18531cb8027afb15f276e7
> > Git:
> > http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=
> > shortlog;h=refs/tags/2.2.12-tentative
> > Artifacts:
> > https://repository.apache.org/content/repositories/
> > orgapachecassandra-1153/org/apache/cassandra/apache-cassandra/2.2.12/
> > Staging repository:
> > https://repository.apache.org/content/repositories/
> > orgapachecassandra-1153/
> >
> > Debian and RPM packages are available here:
> > http://people.apache.org/~mshuler
> >
> > *** This release addresses an important fix for CASSANDRA-14092 ***
> > "Max ttl of 20 years will overflow localDeletionTime"
> > https://issues.apache.org/jira/browse/CASSANDRA-14092
> >
> > The vote will be open for 72 hours (longer if needed).
> >
> > [1]: (CHANGES.txt) https://goo.gl/QkJeXH
> > [2]: (NEWS.txt) https://goo.gl/A4iKFb
> >
> >
>


Re: [VOTE] (Take 2) Release Apache Cassandra 3.11.2

2018-02-12 Thread Jeff Jirsa
+1

On Mon, Feb 12, 2018 at 7:40 PM, Michael Shuler 
wrote:

> I propose the following artifacts for release as 3.11.2.
>
> sha1: 8a5e88f635fdb984505a99a553b5799cedccd06d
> Git:
> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=
> shortlog;h=refs/tags/3.11.2-tentative
> Artifacts:
> https://repository.apache.org/content/repositories/
> orgapachecassandra-1156/org/apache/cassandra/apache-cassandra/3.11.2/
> Staging repository:
> https://repository.apache.org/content/repositories/
> orgapachecassandra-1156/
>
> Debian and RPM packages are available here:
> http://people.apache.org/~mshuler
>
> *** This release addresses an important fix for CASSANDRA-14092 ***
> "Max ttl of 20 years will overflow localDeletionTime"
> https://issues.apache.org/jira/browse/CASSANDRA-14092
>
> The vote will be open for 72 hours (longer if needed).
>
> [1]: (CHANGES.txt) https://goo.gl/RLZLrR
> [2]: (NEWS.txt) https://goo.gl/kpnVHp
>
>


Re: [VOTE] Release Apache Cassandra 3.0.16

2018-02-12 Thread Jeff Jirsa
+1

On Mon, Feb 12, 2018 at 1:03 PM, Brandon Williams  wrote:

> +1
>
> On Mon, Feb 12, 2018 at 2:31 PM, Michael Shuler 
> wrote:
>
> > I propose the following artifacts for release as 3.0.16.
> >
> > sha1: 91e83c72de109521074b14a8eeae1309c3b1f215
> > Git:
> > http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=
> > shortlog;h=refs/tags/3.0.16-tentative
> > Artifacts:
> > https://repository.apache.org/content/repositories/
> > orgapachecassandra-1154/org/apache/cassandra/apache-cassandra/3.0.16/
> > Staging repository:
> > https://repository.apache.org/content/repositories/
> > orgapachecassandra-1154/
> >
> > Debian and RPM packages are available here:
> > http://people.apache.org/~mshuler
> >
> > *** This release addresses an important fix for CASSANDRA-14092 ***
> > "Max ttl of 20 years will overflow localDeletionTime"
> > https://issues.apache.org/jira/browse/CASSANDRA-14092
> >
> > The vote will be open for 72 hours (longer if needed).
> >
> > [1]: (CHANGES.txt) https://goo.gl/rLj59Z
> > [2]: (NEWS.txt) https://goo.gl/EkrT4G
> >
> >
>


Re: Search in cassandra

2018-02-09 Thread Jeff Jirsa
Are you referencing a specific book or section of docs? Can you link that here 
so there’s context? 

-- 
Jeff Jirsa


> On Feb 8, 2018, at 8:21 AM, Mahdi Manavi <mahdi.manav...@gmail.com> wrote:
> 
> As in the "search in lost data" section, we should get a tombstone from
> at least one node. We should use a database strategy which keeps the
> deleted data in a table. In the filtering process it should trigger the
> bloom filter and search in that table, and if the search has a result,
> report it to the user. This approach would increase search speed and
> decrease search cost.
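
If I'm reading the idea right, the proposed short-circuit would look
something like this - a hedged sketch using Guava's BloomFilter, with all
names illustrative:

    import java.nio.charset.StandardCharsets;

    import com.google.common.hash.BloomFilter;
    import com.google.common.hash.Funnels;

    public class DeletedKeyFilterSketch
    {
        public static void main(String[] args)
        {
            // Filter over keys known to be deleted; sized here for 1M keys
            // at a 1% false positive rate.
            BloomFilter<String> deletedKeys = BloomFilter.create(
                    Funnels.stringFunnel(StandardCharsets.UTF_8), 1_000_000, 0.01);
            deletedKeys.put("some-deleted-key");

            String probe = "some-deleted-key";
            if (deletedKeys.mightContain(probe))
            {
                // Possible hit (~1% false positives): only now consult the
                // deleted-data table to confirm and report to the user.
                System.out.println("check the deleted-data table for " + probe);
            }
            // A definite miss skips the table lookup entirely, which is
            // where any claimed search speedup would come from.
        }
    }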




Cassandra Monthly Dev Roundup: Jan 2018 Edition

2018-01-31 Thread Jeff Jirsa
Happy 2018 Cassandra Developers,

I hope you all had a good holiday season. In going through some of the
tickets/emails, I'm pretty happy - we had some contributions from some big
and interesting companies I didn't even realize were using Cassandra, and
that's always fun to see [1].

If you haven't had time to keep up with hot issues this month, there are a
few hot topics that will cause us to issue a release in the very near
future:

1) https://issues.apache.org/jira/browse/CASSANDRA-14092
- We store TTL expiration times as 32-bit ints and cap users at 20-year
TTLs. If you set a TTL to 20 years, now + ttl started to overflow that
32-bit int not long ago (see the quick arithmetic sketch after item 2).
That's bad. Different versions have different impact, from annoying to very
bad. We'll probably cut a release as soon as this is done. There's some
active conversation on the list and on that JIRA - you should read it if
you care about how we handle data when we find a negative timestamp on disk
(read: there's some disagreement; if you have an opinion, chime in).

2) https://issues.apache.org/jira/browse/CASSANDRA-14173
- The JMX auth stuff used some JDK internals. Those JDK internals changed
with JDK8u161. Sam has a new patch, ready to commit. This probably will get
more and more attention as more and more people upgrade to the newest JDK
and find out Cassandra doesn't start.
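
To make the arithmetic in item 1 concrete, here's a minimal sketch - the
field names are illustrative, not Cassandra's actual internals:

    public class TtlOverflowSketch
    {
        public static void main(String[] args)
        {
            int nowSeconds = (int) (System.currentTimeMillis() / 1000L); // ~1.5 billion in early 2018
            int maxTtlSeconds = 20 * 365 * 24 * 60 * 60;                 // 630,720,000 (20 years, ignoring leap days)
            int localDeletionTime = nowSeconds + maxTtlSeconds;
            // Wraps negative once now + ttl passes 2^31 - 1 seconds, i.e.
            // 2038-01-19T03:14:07Z - which a 20-year TTL written in early
            // 2018 already does.
            System.out.println(localDeletionTime);
        }
    }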

In terms of big / interesting commits that landed since the last email:

CASSANDRA-7544 Configurable storage port per node. Huge patch, you probably
care about this if you ever tried to run multiple instances of Cassandra on
one IP (like on a laptop), or on different ports in a given cluster (port
7000 on some hosts, and 7001 on others), or similar.

CASSANDRA-14134 upgraded dtests to python3, getting rid of old dependencies
on pycassa (unmaintained), an ancient version of thrift, etc. Another huge
patch: if you're developing locally and running dtests yourself, you now
need python3. Some extra good news - docs are now much improved.

CASSANDRA-14190 is a patch from a new contributor that did something most
operators probably really wish existed 10 years ago - "nodetool
reloadseeds". Really should have existed long ago.

CASSANDRA-9067 sped up bloom filter serialization by 3-7x

CASSANDRA-13867 isn't flashy, but is another step in making more things
immutable for safety - huge patch for PartitionUpdate and Mutation, for
those of you who pay attention to the deep, dark internals.


On the mailing list, a user asked about plans for CDC. If you have an
opinion, it's not too late to chime in:
https://lists.apache.org/thread.html/aaa82c7dab534c3a35cfd1c4a082cb3a8f6bbf97e3efe960fa2342d0@%3Cdev.cassandra.apache.org%3E

Patches that could use reviews:
- https://issues.apache.org/jira/browse/CASSANDRA-14205 (Missing CQL
reserved keywords)
- https://issues.apache.org/jira/browse/CASSANDRA-14201 (new options to
nodetool verify)
- https://issues.apache.org/jira/browse/CASSANDRA-14204 (nodetool
garbagecollect assertion error)
- https://issues.apache.org/jira/browse/CASSANDRA-13981 (changes for
running on systems with persistent memory)
- https://issues.apache.org/jira/browse/CASSANDRA-14197 (more automatic
upgradesstables)
- https://issues.apache.org/jira/browse/CASSANDRA-14176 (2 line python fix
for making COPY work)
- https://issues.apache.org/jira/browse/CASSANDRA-14102 (transparent data
encryption)
- https://issues.apache.org/jira/browse/CASSANDRA-14107 (key rotation for
transparent data encryption)
- https://issues.apache.org/jira/browse/CASSANDRA-14160 (speeding up
compaction by keeping overlapping sstables ordered by time)
- https://issues.apache.org/jira/browse/CASSANDRA-12763 (make compaction
much faster for cases with lots of sstables)
- https://issues.apache.org/jira/browse/CASSANDRA-14126 (fixing javascript
UDFs)
- https://issues.apache.org/jira/browse/CASSANDRA-14070 (exposing primary
key column values in a different way)

I'd like to pretend that that's all the patch-available-needing-review
tickets, but I'd be lying - there's a LOT of patches waiting for reviews.
If you're able, please review a ticket this week. I'll personally buy you a
drink next time I bump into you if you do it and remind me about it.

Until February,
- Jeff



Footnote 1: I'm super tempted to name them, but I know some companies don't
like the attention, and I don't want everyone to feel like they have to
post with personal emails.

