Re: [ANNOUNCE] Evolving governance in the Cassandra Ecosystem

2023-01-26 Thread C. Scott Andreas

Josh and all PMC members, thank you for your work on this!

Supportive of the changes and grateful to have scaffolding in place to
accommodate current/incoming subprojects.

- Scott

On Jan 26, 2023, at 1:21 PM, Josh McKenzie wrote:

The Cassandra PMC is pleased to announce that 
we're evolving our governance procedures to better foster subprojects under the 
Cassandra Ecosystem's umbrella. Astute observers among you may have noticed that the 
Cassandra Sidecar is already a subproject of Apache Cassandra as of CEP-1 
(https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95652224) and 
Cassandra-14395 (https://issues.apache.org/jira/browse/CASSANDRASC-24), however up 
until now we haven't had any structure to accommodate raising committers on specific 
subprojects or clarity on the addition or governance of future subprojects.

Further, with the CEP for the driver donation in motion 
(https://docs.google.com/document/d/1e0SsZxjeTabzrMv99pCz9zIkkgWjUd4KL5Yp0GFzNnY/edit#heading=h.xhizycgqxoyo),
 the need for a structured and sustainable way to expand the Cassandra 
Ecosystem is pressing.

We'll document these changes in the confluence wiki as well as the sidecar as 
our first formal subproject after any discussion on this email thread. The new 
governance process is as follows:
-

Subproject Governance
1. The Apache Cassandra PMC is responsible for governing the broad Cassandra 
Ecosystem.
2. The PMC will vote on inclusion of new interested subprojects using the 
existing procedural change vote process documented in the confluence wiki 
(Super majority voting: 66% of votes must be in favor to pass. Requires 50% 
participation of roll call).
3. New committers for these subprojects will be nominated and raised, both at inclusion as a subproject and over time. Nominations can be brought to priv...@cassandra.apache.org. Typically we're looking for a mix of commitment and contribution to the community and project, be it through code, documentation, presentations, or other significant engagement with the project. 
4. While the commit-bit is ecosystem wide, code modification rights and voting rights (technical contribution, binding -1, CEPs) are granted per subproject.

4a. Individuals are trusted to exercise prudence and only commit or claim 
binding votes on approved subprojects. Repeated violations of this social 
contract will result in losing committer status.
4b. Members of the PMC have commit and voting rights on all subprojects.
5. For each subproject, the PMC will determine a trio of PMC members that will 
be responsible for all PMC specific functions (release votes, driving CVE 
response, marketing, branding, policing marks, etc) on the subproject.
-

Curious to see what thoughts we have as a community!

Thanks!

~Josh

Re: [DISCUSS] Formation of Apache Cassandra Publicity & Marketing Group

2023-01-26 Thread Patrick McFadin
Thanks for the positive reception on email and slack.

We are going to have our first gathering next Wednesday at 8AM PT.

Link to calendar event:
https://calendar.google.com/calendar/event?action=TEMPLATE&tmeid=MDVoY3VucnMwaWViaXA1amFmdXAzcnN0dTYga2w5cHVoZ2s3cXRkdXFhdHRlOHRmZDVtcHNAZw&tmsrc=kl9puhgk7qtduqatte8tfd5mps%40group.calendar.google.com



On Tue, Jan 24, 2023 at 3:35 AM Mick Semb Wever wrote:

> The market...@cassandra.apache.org list is created.
>
> To subscribe send an email to marketing-subscr...@cassandra.apache.org
> from the email address you want to subscribe from.
>
> If you are a committer you can alternately use Whimsy:
> https://whimsy.apache.org/committers/subscribe
>
> regards,
> Mick
>
>
> On Fri, 20 Jan 2023 at 00:31, Patrick McFadin wrote:
>
> > Hello Cassandra Community!
> >
> > We are at a pivotal moment for the Cassandra community, with the first
> > Cassandra Summit in 7 years coming up on March 13th, and a major release
> > coming later this year with Cassandra 5.0. It is important that we come
> > together to set the publicity strategy and direction for these important
> > moments, and that we work together to define how Cassandra shows up
> > across the technology industry.
> >
> > To achieve this, we are proposing the formation of a Publicity &
> > Marketing Working Group, and we are requesting your participation.
> >
> > What is the Publicity & Marketing Working Group?
> >
> > This is a working group open to community members who have the insight
> > and skills to help define Cassandra’s public narrative and participate
> > in our marketing strategy and execution. The group will meet once a
> > month for an hour to discuss important marketing topics. You can find us
> > on #cassandra-events. We also propose adding a mailing list,
> > marketing@cassandra.a.o, to handle day-to-day marketing needs and async
> > communication. Our publicity and marketing partners from Constantia -
> > Molly Monroy and Melissa Logan - will work with us to build this working
> > group.
> >
> > What will this group be responsible for?
> >
> > Our initial vision for this group is to accelerate how we do marketing &
> > publicity for Cassandra. We will refine and advance Cassandra’s public
> > perception of the tech industry, to show how Cassandra has grown,
> > innovated, and revitalized itself as a community. We will do this
> > through:
> >
> > - Participating in marketing strategy for major moments (in particular,
> >   C* Summit in March and Cassandra 5.0 release later this year)
> > - Expanding our local meetup and events presence
> > - Sourcing end-user case studies for marketing and PR collateral
> > - Making sure the Cassandra community shows up at third-party events
> > - Contributing content - from blogs to documentation - to ensure we have
> >   a robust stream of content for our end users
> >
> > Our first two orders of business will be:
> >
> > 1. Jointly determine operating model and governance, and get input and
> >    alignment on the above goals/responsibilities.
> > 2. Discuss marketing for Cassandra Summit, primarily defining the news
> >    we will share at the event from the project directly and from our
> >    sponsors. This is coming up quickly and we will need community
> >    assistance to achieve our publicity goals.
> >
> > As this is a community-driven group, please share ideas and feedback on
> > the purpose of this group and what we need to achieve.
> >
> > When is the meeting?
> >
> > We are proposing the meetings take place on the 4th Wednesday of each
> > month. We will alternate times of the day to try to accommodate. We can
> > adjust based on member attendance.
> >
> > - Jan, March, May, July, Sept, Nov: 4th Wed of the month, 8a PT
> > - Feb, April, June, August, October, Dec: 4th Wed of the month, 4p PT
> >
> > We will create a centralized document to share and document information
> > about the working group, including meeting minutes, monthly tasks, and
> > priorities. Decisions will be discussed and finalized using the project
> > mailing list.
> >
> > Patrick
> >
>
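
For anyone who wants to put the proposed cadence from the quoted message on a
calendar, here is a minimal sketch of the rule (4th Wednesday of each month,
8a PT in odd-numbered months and 4p PT in even-numbered months). The function
names `fourth_wednesday` and `meeting_slot` are made up for illustration, and
the returned time is a naive wall-clock time meant to be read as Pacific Time;
it is not an official schedule.

```python
import calendar
from datetime import date, datetime, time


def fourth_wednesday(year: int, month: int) -> date:
    """Return the date of the 4th Wednesday in the given month."""
    # monthcalendar() yields weeks as lists of day numbers (Monday first),
    # with 0 marking days that belong to the neighbouring month.
    wednesdays = [
        week[calendar.WEDNESDAY]
        for week in calendar.monthcalendar(year, month)
        if week[calendar.WEDNESDAY] != 0
    ]
    return date(year, month, wednesdays[3])


def meeting_slot(year: int, month: int) -> datetime:
    """Return the proposed meeting start as a naive Pacific wall-clock time.

    Assumption: Jan/Mar/May/Jul/Sep/Nov meet at 08:00 and the other months
    at 16:00, following the proposal in the thread.
    """
    start = time(8, 0) if month % 2 == 1 else time(16, 0)
    return datetime.combine(fourth_wednesday(year, month), start)


if __name__ == "__main__":
    for month in range(1, 13):
        print(meeting_slot(2023, month).strftime("%a %Y-%m-%d %H:%M PT"))
```

The sketch ignores any adjustments the group may make based on attendance.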


Cassandra project status, 2023-01-26

2023-01-26 Thread Josh McKenzie
After a bit of time away, I'm ready to regale you with tales of things you've 
already seen on the dev list and JIRA. ;)

Let's start with calling out that registrations for the Cassandra Summit are 
open. Patrick did a better job than I ever could summarizing this in his email 
poetically titled "Cassandra Summit update for 2023-01-24", which you can find 
here: https://lists.apache.org/thread/7roz6z8nvj9cz8o2jwwo1httl85mwjcs. If you 
haven't registered yet and are in the area or receptive to travel, you should 
seriously consider going - it's always great to be at a conference with other 
people brainstorming, lamenting, and celebrating our shared experiences with 
this software project.

From a technical perspective, there are 2 things I want to call out. One: I 
want to draw everyone's attention to the epic Mick has put together for an 
effort to make ASF CI not only stable, but also repeatable on other 
containerized cloud-native environments: 
https://issues.apache.org/jira/browse/CASSANDRA-18137

There's a lot of context there, but the 4 high-level goals are:
1. Reproducible reference ASF CI environment so contributors can clone it.
2. An accepted “test result output” format that will certify a commit 
regardless of CI env.
3. Turnaround times as fast as circleci (cloned environment scales to capacity).
4. Intuitive CI implementation accessible to new contributors.

Ultimately, the ideal best-case would be that we could get away from having 2 
CI systems, one of which is a paid-for service, and have a "reproducible 
runnable CI" deterministic Thing contributors can run to get insight into their 
contributions and their stability. Taking this a logical step further, those of 
us that are currently spending money on a paid-for CI system could potentially 
better spend that money on a shared CI infrastructure that the entire project 
could use and benefit from.

Quite a bit of work has fallen out from that epic and is linked from the 
ticket; please take 5 minutes to scan through the ticket and some of the 
sub-tasks so it's at least on your radar. Stable public CI is something we've 
struggled with for _years_, but we've made huge strides in the past year and my 
intuition tells me there's a light at the end of this tunnel.

Mick also hit up the dev ML w/a thread on this offering more context: 
https://lists.apache.org/thread/fqdvqkjmz6w8c864vw98ymvb1995lcy4

The second thing: The Build Lead role! We need volunteers: 
https://cwiki.apache.org/confluence/display/CASSANDRA/Build+Lead. So the TL;DR 
on this and why you should consider it: it takes 30-60 minutes *for the entire 
week*, it helps us stay on top of our CI infrastructure and test failures, 
you'll receive the undying gratitude of many of us on the project, and you also 
get some insight into interesting dark corners of the CI infra and testing 
system you might otherwise never have known about. You don't need to triage or 
attribute failures in the role unless you really want to; getting them 
reflected in JIRA is the low hanging fruit here.


[New Contributors Getting Started]
(Unassigned) (Starter Tickets): this is the set of filters you want to pull 
from on our project's Kanban board: 
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484&quickFilter=2162&quickFilter=2160

We have 26 issues in 4.x (22 really; looks like there's 4 either in progress or 
review that need assignee tidied up). 8 issues in 4.0.x, and another 5 floating 
around there.

If any of those catch your fancy, join us in the #cassandra-dev channel on 
https://the-asf.slack.com (reply to me on this email if you need an invite for 
your account), and hit up the @cassandra_mentors alias to reach 13 of us just 
waiting with bated breath to help you get oriented. :)

And hey, if any of these _don't_ catch your fancy but you're still interested 
in the project and are looking for something interesting to get involved with, 
just hop in the slack channel and raise the :batsignal:


[Dev mailing list]
So it's been... a bit. Since I sent the last project status update. Thankfully 
it's the holiday season so we didn't accumulate a crushing load of things I 
have to summarize for us here: 
https://lists.apache.org/list?dev@cassandra.apache.org:dfr=2022-12-19|dto=2023-1-23:

The vote for the Trie-indexed SSTable format passed about a month ago - 
congratulations Branimir and team! 
https://lists.apache.org/thread/d4sr3jkt4xjn86xrf9h708y6s7lc53v5

I sent out an email discussing taking the smallest conceivable baby steps in 
formalizing performance testing for the project here: 
https://lists.apache.org/thread/kzbv632tm0j99mg10z24wb8f09z0r81z. It seems like 
the general consensus is that there's _a lot_ of appetite to engage on this 
topic and interesting ideas, and most people aren't all that interested in (nor 
disagreeing with) the bare bones v1 of "get a repeatable test with a repeatable 
runtime env setup and iterate from there". I think the real challenge h

[ANNOUNCE] Evolving governance in the Cassandra Ecosystem

2023-01-26 Thread Josh McKenzie
The Cassandra PMC is pleased to announce that we're evolving our governance 
procedures to better foster subprojects under the Cassandra Ecosystem's 
umbrella. Astute observers among you may have noticed that the Cassandra 
Sidecar is already a subproject of Apache Cassandra as of CEP-1 
(https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95652224) and 
Cassandra-14395 (https://issues.apache.org/jira/browse/CASSANDRASC-24), however 
up until now we haven't had any structure to accommodate raising committers on 
specific subprojects or clarity on the addition or governance of future 
subprojects.

Further, with the CEP for the driver donation in motion 
(https://docs.google.com/document/d/1e0SsZxjeTabzrMv99pCz9zIkkgWjUd4KL5Yp0GFzNnY/edit#heading=h.xhizycgqxoyo),
 the need for a structured and sustainable way to expand the Cassandra 
Ecosystem is pressing.

We'll document these changes in the confluence wiki as well as the sidecar as 
our first formal subproject after any discussion on this email thread. The new 
governance process is as follows:
-

Subproject Governance
1. The Apache Cassandra PMC is responsible for governing the broad Cassandra 
Ecosystem.
2. The PMC will vote on inclusion of new interested subprojects using the 
existing procedural change vote process documented in the confluence wiki 
(Super majority voting: 66% of votes must be in favor to pass. Requires 50% 
participation of roll call).
3. New committers for these subprojects will be nominated and raised, both at 
inclusion as a subproject and over time. Nominations can be brought to 
priv...@cassandra.apache.org. Typically we're looking for a mix of commitment 
and contribution to the community and project, be it through code, 
documentation, presentations, or other significant engagement with the project. 
4. While the commit-bit is ecosystem wide, code modification rights and voting 
rights (technical contribution, binding -1, CEPs) are granted per subproject.
 4a. Individuals are trusted to exercise prudence and only commit or claim 
binding votes on approved subprojects. Repeated violations of this social 
contract will result in losing committer status.
 4b. Members of the PMC have commit and voting rights on all subprojects.
5. For each subproject, the PMC will determine a trio of PMC members that will 
be responsible for all PMC specific functions (release votes, driving CVE 
response, marketing, branding, policing marks, etc) on the subproject.
-

Curious to see what thoughts we have as a community!

Thanks!

~Josh
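
One way to read the vote rule in item 2 of the governance list above (66% of
votes in favor, 50% participation of roll call) is sketched below. This is an
illustration under stated assumptions, not the PMC's tooling: the function
name `vote_passes` is hypothetical, only +1 and -1 votes are counted as votes
cast, and both thresholds are treated as inclusive. The wiki's procedural
change vote process remains the authority on how abstentions and edge cases
are handled.

```python
def vote_passes(in_favor: int, against: int, roll_call: int) -> bool:
    """Return True if a vote passes under the assumed reading of the rule.

    in_favor / against: number of +1 / -1 votes cast (assumption: other
    votes are not counted here).
    roll_call: number of PMC members who answered the roll call.
    """
    cast = in_favor + against
    if roll_call <= 0 or cast == 0:
        return False
    # Participation: at least 50% of the roll call must have voted.
    enough_participation = 2 * cast >= roll_call
    # Supermajority: at least 66% of the votes cast must be in favor.
    # Integer arithmetic avoids floating-point rounding at the boundary.
    supermajority = 100 * in_favor >= 66 * cast
    return enough_participation and supermajority


if __name__ == "__main__":
    print(vote_passes(in_favor=14, against=6, roll_call=30))  # True: 70% in favor
    print(vote_passes(in_favor=13, against=7, roll_call=30))  # False: 65% in favor
    print(vote_passes(in_favor=10, against=0, roll_call=30))  # False: 1/3 participation
```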


Re: [DISCUSSION] Framework for Internal Collection Exposure and Monitoring API Alignment

2023-01-26 Thread David Capwell
I took a look and I see the result is an interface that looks like the vtable 
interface, that is then used by vtables and JMX?  My first thought is why not 
just use the vtable logic?

I also wonder whether we should care about JMX.  I know many wish to migrate 
(it's going to be a very long time) away from JMX, so do we need a wrapper to 
make JMX and vtables consistent?  I am cool with something like the following:

registerWithJMX(jmxName, query("SELECT * FROM system_views.streaming"));

So if we want to have a JMX view that matches the table then that’s cool by me, 
but one thing that has been brought up in reviews is backwards compatibility 
with regard to adding columns… If we add a column to the end of the JMX row did 
we just break users?  

> Considering that JMX is usually not used and disabled in production 
> environments for various performance and security reasons, the operator may 
> not see the same picture from various of Dropwizard's metrics exporters

If this is a real problem people are hitting, we can always add the ability to 
push metrics to common systems with a pluggable way to add non-standard 
solutions.  Dropwizard already supports this, so it would be low-hanging fruit 
to address.

> To make the proposed changes backwards compatible with the previous version 
> of Cassandra, all MBeans and Virtual Tables we already have will remain 
> unchanged


If this is for new JMX endpoints moving forward, I am not sure of the benefit 
for the same reason listed above; we wish to move away from JMX

> On Jan 25, 2023, at 10:51 AM, Maxim Muzafarov  wrote:
> 
> Hello Cassandra Community,
> 
> 
> I've been faced with a number of inconsistencies in the user APIs of
> the internal data collections representation exposed through the
> Cassandra monitoring interfaces that need to be fully aligned from an
> operator perspective. First of all, I'm highlighting JMX, Dropwizard
> Metrics, and Virtual Tables user interfaces. In order to address all
> these inconsistencies, I have created a draft enhancement proposal
> that describes everything I have found and how we can fix it once and
> for all.
> 
> I'd like to hear your opinion and thoughts on it. Please take a look:
> https://docs.google.com/document/d/1j4J3bPWjQkAU9x4G-zxKObxPrKg36jLRT6xpUoNJa8Q
> 
> 
> -- 
> Maxim Muzafarov



Upgrading sstables and default partitioner.

2023-01-26 Thread Claude Warren, Jr via dev
Greetings,

I am working on porting a fix for table upgrade order into V3.0 and have
come across the following issue:

ERROR 10:23:31 Cannot open
/home/claude/apache/cassandra/build/test/cassandra/data/system_schema/keyspaces-abac5682dea631c5b535b3d6cffd0fb6/me-89-big;
partitioner org.apache.cassandra.dht.ByteOrderedPartitioner does not match
system partitioner org.apache.cassandra.dht.Murmur3Partitioner.  Note that
the default partitioner starting with Cassandra 1.2 is Murmur3Partitioner,
so you will need to edit that to match your old partitioner if upgrading.

Now I know that this can be corrected by setting the default partitioner in
the test code for later versions but in 3.0 we are simply calling the
bin/sstableupgrade script.  This got me wondering.


   1. Should the upgrade fail if the partitioner is different?  I think
   that the tool should simply upgrade the format and leave the
   specified partitioner as it is.  If we need to change the partitioner then
   we need a way to do it with a command line/environment option for
   sstableupgrade to function.
   2. At what point did the system move from being ByteOrdered to Murmur3?
   Wouldn't the upgradetables script have failed at that point?
   3. #2 leads me to ask, who uses the upgradetables script?  Since later
   Cassandra versions can read earlier versions it must only be used when
   skipping entire versions.

Discussion/solutions for these questions/problems is greatly appreciated,
Claude