Re: [DISCUSS] CEP-15: General Purpose Transactions

2021-10-07 Thread C. Scott Andreas

Hi Jonathan,

Following up on my message yesterday as it looks like our replies may have
crossed en route.

Thanks for bumping your message from earlier in our discussion. I believe we
have addressed most of these questions on the thread, in addition to offering a
presentation on this and related work at ApacheCon, a discussion hosted
following that presentation at ApacheCon, and in ASF Slack. Contributors have
further offered an opportunity to discuss specific questions via
videoconference if it helps to speak live. I'd be happy to do so as well.

Since your original message, discussion has covered a lot of ground on the
related databases you've mentioned:

– Henrik has shared expertise related to MongoDB and its implementation.
– You've shared an overview of Calvin.
– Alex Miller has helped us review the work relative to other Paxos algorithms
  and identified a few great enhancements to incorporate.
– The paper discusses related approaches in FoundationDB, CockroachDB, and
  Yugabyte.
– Subsequent discussion has contrasted the implementation to DynamoDB, Google
  Cloud BigTable, and Google Cloud Spanner (noting specifically that the
  protocol achieves Spanner's 1x round-trip without requiring specialized
  hardware).

In my reply yesterday, I've attempted to crystallize what becomes possible via
CQL: one-shot multi-partition transactions in the first implementation and a 4x
latency reduction on writes / 2x latency reduction on reads relative to today;
along with the ability to build upon this work to enable interactive
transactions in the future.

I believe we've exercised the questions you've raised and am grateful for the
ground we've covered. If you have further questions that are difficult to
exercise via email, please let me know if you'd like to arrange a call
(open-invite); we'd be happy to discuss live as well.

With the proposal hitting the one-month mark, the contributors are interested
in gauging the developer community's response to the proposal. We warrant our
ability to focus durably on the project; execute this development on ASF JIRA
in collaboration with other contributors; engage with members of the developer
and user community on feedback, enhancements, and bugs; and intend to deliver it
to completion at a standard of readiness suitable for production transactional
systems of record.

Thanks,

– Scott

On Oct 6, 2021, at 8:25 AM, C. Scott Andreas wrote:

Hi folks,

Thanks for discussion on this proposal, and also to Benedict who’s been
fielding questions on the list!

I’d like to restate the goals and problem statement captured by this proposal
and frame context.

Today, lightweight transactions limit users to transacting over a single
partition. This unit of atomicity has a very low upper limit in terms of the
amount of data that can be CAS’d over; and doing so leads many to design
contorted data models to cram different types of data into one partition for
the purposes of being able to CAS over it. We propose that Cassandra can and
should be extended to remove this limit, enabling users to issue one-shot
transactions that CAS over multiple keys – including CAS batches, which may
modify multiple keys.

To enable this, the CEP authors have designed a novel, leaderless Paxos-based
protocol unique to Cassandra, offered a proof of its correctness, a whitepaper
outlining it in detail, along with a prototype implementation to incubate
development, and integrated it with Maelstrom from jepsen.io to validate
linearizability as more specific test infrastructure is developed. This rigor
is remarkable, and I’m thrilled to see such a degree of investment in the area.

Even users who do not require the capability to transact across partition
boundaries will benefit. The protocol reduces message/WAN round-trips by 4x on
writes (4 → 1) and 2x on reads (2 → 1) in the common case against today’s
baseline. These latency improvements coupled with the enhanced flexibility of
what can be transacted over in Cassandra enable new classes of applications to
use the database.

In particular, 1xRTT read/write transactions across partitions enable Cassandra
to be thought of not just as a strongly consistent database, but even a
transactional database - a mode many may even prefer to use by default. Given
this capability, Apache Cassandra has an opportunity to become one of – or
perhaps the only – database in the industry that can store multiple petabytes
of data in a single database; replicate it across many regions; and allow users
to transact over any subset of it. These are capabilities that can be met by no
other system I’m aware of on the market. Dynamo’s transactions are single-DC.
Google Cloud BigTable does not support transactions. Spanner, Aurora, CloudSQL,
and RDS have far lower scalability limits or require specialized hardware, etc.

This is an incredible opportunity for Apache Cassandra - to surpass the
scalability and transactional capability of some of the most advanced systems in our

Re: [VOTE] Release dtest-api 0.0.10

2021-10-07 Thread Oleksandr Petrov
With 6 +1s, and no -1s, the vote passes.

On Wed, Oct 6, 2021 at 8:21 AM Dinesh Joshi wrote:

> +1
>
> Dinesh
>
> > On Oct 5, 2021, at 7:27 PM, Joshua McKenzie wrote:
> >
> > +1
> >
> >> On Tue, Oct 5, 2021 at 2:15 PM Brandon Williams wrote:
> >>
> >> +1
> >>
> >>> On Tue, Oct 5, 2021 at 11:47 AM Oleksandr Petrov wrote:
> >>>
> >>> Proposing the test build of in-jvm dtest API 0.0.10 for release.
> >>>
> >>> Repository:
> >>> https://gitbox.apache.org/repos/asf?p=cassandra-in-jvm-dtest-api.git;a=shortlog;h=refs/tags/0.0.10
> >>>
> >>> Candidate SHA:
> >>> https://github.com/apache/cassandra-in-jvm-dtest-api/commit/2139b4c85e319b17afbdea2f653152d1e1895fc6
> >>> tagged with 0.0.10
> >>>
> >>> Artifacts:
> >>> https://repository.apache.org/content/repositories/orgapachecassandra-1249/org/apache/cassandra/dtest-api/0.0.10/
> >>>
> >>> Key signature: A4C465FEA0C552561A392A61E91335D77E3E87CB
> >>>
> >>> Changes since last release:
> >>>  * CASSANDRA-17013: CEP-10 Simulator Improvements
> >>>
> >>>
> >>> The vote will be open for 24 hours. Everyone who has tested the build
> >>> is invited to vote. Votes by PMC members are considered binding. A
> >>> vote passes if there are at least three binding +1s.
> >>
> >> -
> >> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >> For additional commands, e-mail: dev-h...@cassandra.apache.org
> >>
> >>
>

-- 
alex p


Re: [DISCUSS] Cleaning up docs, completing CASSANDRA-16763

2021-10-07 Thread Stefan Miklosovic
Hi Lorina, Ekaterina,

In general your approach sounds good to me. I am also +1 on not
creating too many tickets, as I can see it would be easy to get lost in
them.

If it were feasible to gather all related changes touching a subsystem
under one umbrella ticket, that would be very nice, but I do not know
if that makes sense from your point of view (given your workflow).

Regards

On Wed, 6 Oct 2021 at 23:56, Ekaterina Dimitrova  wrote:
>
> Hey Lorina,
>
> First of all - thank you so much for all the work done by you and the rest
> of the people! The website and the docs are our front door as a project!
>
> +1 on your proposal. My understanding is that we need (1) and (5), and then
> everything else will be able to roll out and more people will be able to
> join the effort so we can knock out (2), which seems to be the biggest piece
> of work here. Did I get that right?
>
> My only comment is about the tickets we will have to open. I suggest we
> don’t open a 1:1 ticket for every small backport change, but use 1:1 tickets
> for bigger bodies of work and 1:many where we can combine a few smaller
> changes, so we don’t have to deal with too many tickets. Does this sound
> reasonable? Is there a different suggestion or plan?
>
> Thank you one more time. I will be happy to help with what I can in
> order to bring this to the finish line. I am sure others will too, even
> with a ticket or two :-) In OSS every single contribution matters, right?
>
> Best regards,
> Ekaterina
>
> On Wed, 6 Oct 2021 at 8:22, Benjamin Lerer  wrote:
>
> > Thanks Lorina for all your work.
> >
> > +1 Your proposal makes sense to me.
> >
> > On Wed, 6 Oct 2021 at 00:34, Lorina Poland wrote:
> >
> > > > This is a discussion about how to tackle getting the docs “fixed”.
> > > >
> > > > As many of you know, I started months ago to convert the Apache Cassandra
> > > > in-tree docs from reStructuredText (rST) to AsciiDoc. [1]
> > > > The conversion required not only the docs source files to be converted,
> > > > but also the cassandra-website source to be updated to build the docs
> > > > from AsciiDoc. [2] You all have seen the results of that conversion and
> > > > the beautiful new design work accomplished.
> > > > When Apache Cassandra 4.0 was ready to GA, we used my private repo
> > > > (polandll/cassandra) to build the docs for publication. (The new
> > > > cassandra-website procedure allows any repo to be used to build.)
> > > > Due to a series of interruptions affecting virtually all the people on
> > > > the project (myself, Anthony Grasso, Mick Semb Wever, Paul Lau) in the
> > > > time leading up to the GA or right after, we never got my repo work
> > > > committed and merged to the official source (apache/cassandra).
> > > > So, here is the proposal for a plan of action:
> > > >
> > > > (1) Anthony and Lorina get the 4.0/trunk and 3.11 branches that Lorina
> > > > worked on for the last 18 months ready for merge from
> > > > polandll/cassandra -> apache/cassandra.
> > > > (2) There are changes that were made in the last 18 months to docs (in
> > > > the current rST docs) that need to be backported to the new adoc docs.
> > > > We can use the commit history to hunt down those changes and make
> > > > tickets for each of them. Those tickets can be listed under an umbrella
> > > > ticket.
> > > > (3) There are tickets that already exist that I asked people to wait on
> > > > merging during the conversion. Those tickets also need to be completed.
> > > > (4) There are a few tickets for PRs people submitted to my private repo
> > > > (oh my!) that should be completed again in the official repo.
> > > > (5) I will write a “how to contribute to docs” that gives people a
> > > > rundown of how to write AsciiDoc.
> > >
> > > > We would like to merge the docs in their current state, step (1), then
> > > > make the backports, rather than make the backports then merge to the
> > > > apache/cassandra repo. The main reason for this order is that at least
> > > > the docs and website could be built from official repos once that is
> > > > done. Until the adoc conversion is merged, the docs and website can only
> > > > be built from my personal repo, which is a sad situation.
> > >
> > > > Lastly, just to clarify the work we want to merge: I modified the trunk
> > > > for 4.0 and made all the changes required (750+ files). Then, rather
> > > > than modify the 3.11 branch, I wrote trunk to 3.11 and removed the
> > > > “What’s new” folder (called /new, unimaginatively). I had planned to
> > > > then go back and incorporate the "What’s new" material into the
> > > > appropriate places in the 4.0 docs, because, in short order, those
> > > > changes are no longer what’s new.
> > >
> > > [1]
> > >
> > >
> > https://lists.apache.org/thread.html/r42802f86d7893c42b5091fe7f7d4b048a63cbe0fd11fadcd120596e3%40%3Cdev.cassandra.apache.org%3E
> > > [2]
> > >
> > >
> > https://lists.apache.org/thread.html/r961c52f58a42a3b2cae7299244a525311283cd2758d0201f8b0feb83%40%3Cdev.cassandr