Re: Jepsen testing

2018-11-09 Thread Oleksandr Shulgin
On Thu, Nov 8, 2018 at 10:42 PM Yuji Ito wrote:

>
> We are working on Jepsen testing for Cassandra.
> https://github.com/scalar-labs/jepsen/tree/cassandra/cassandra
>
> As you may know, Jepsen is a framework for distributed systems
> verification.
> It can inject faults such as network failures and check data consistency.
> https://github.com/jepsen-io/jepsen
>
> Our tests are based on riptano's great work.
> https://github.com/riptano/jepsen/tree/cassandra/cassandra
>
> I refined it for the latest Jepsen and removed some tests.
> Next, I'll fix clock-drift tests.
>
> I would like to get your feedback.
>

Cool stuff!  Do you run the Jepsen tests as part of regular testing in
scalardb?  How long does it take to run all of them on average?

I wonder if Apache Cassandra would be willing to include this as part of
its regular testing drill as well.

Cheers,
--
Alex


Re: Cassandra Needs to Grow Up by Version Five!

2018-02-22 Thread Oleksandr Shulgin
On Thu, Feb 22, 2018 at 9:50 AM, Eric Plowe wrote:

> Cassandra, hard to use? I disagree completely. With that said, there are
> definitely deficiencies in certain parts of the documentation, but nothing
> that is a show stopper.


True, there are no show-stoppers on the docs side; it's just that all those
little things add up.

> We’ve been using Cassandra since the sub 1.0 days and have had nothing but
> great things to say about it.
>
> With that said, it's an open source project; you get from it what you’re
> willing to put in. If you just expect something that installs, asks a
> couple of questions and you’re off to the races, Cassandra might not be for
> you.
>
> If you’re willing to put in the time to understand how Cassandra works,
> and how it fits into your use case, and if it is the right fit for your use
> case, you’ll be more than happy, I bet.
>

We have been using Cassandra since v2.1, for more than 2 years now, and
installing was never a problem.  It does work and allows us to sleep well,
which should not be underappreciated.

The problems begin when you need to do operations.  You never know what
exactly will happen when you start a certain repair command, or how the
streaming will happen in case of bootstrap/rebuild, and the docs just
aren't detailed enough, so you have to go down the trial-and-error path
most of the time.

Regards,
--
Alex


Re: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread Oleksandr Shulgin
On Wed, Feb 21, 2018 at 7:54 PM, Durity, Sean R wrote:

>
>
> However, I think the shots at Cassandra are generally unfair. When I
> started working with it, the DataStax documentation was some of the best
> documentation I had seen on any project, especially an open source one.
>

Oh, don't get me started on documentation, especially the DataStax one.  I
come from Postgres.  In comparison, Cassandra documentation is mostly
non-existent (and this is just a way to avoid listing other uncomfortable
epithets).

Not sure if I would be able to submit patches to improve that, however,
since most of the time it would require me to already know the answer to my
questions when the doc is incomplete.

The move from DataStax to Apache.org for docs is actually good, IMO, since
the docs were maintained very poorly and there was no real leverage to
influence that.

Cheers,
--
Alex


Re: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread Oleksandr Shulgin
On Mon, Feb 19, 2018 at 10:01 AM, Kenneth Brotman <
kenbrot...@yahoo.com.invalid> wrote:

>
> >> Cluster wide management should be a big theme in any next major release.
> >>
> >Na. Stability and testing should be a big theme in the next major release.
> >
>
> Double Na on that one Jeff.  I think you have a concern there about the
> need to test sufficiently to ensure the stability of the next major
> release.  That makes perfect sense, for every release, especially the
> major ones.  Continuous improvement is not a phase of development, for
> example.  CI should be in everything, in every phase.  Stability and
> testing are a part of every release, not just one.  A major release
> should be a nice step from the previous major release though.
>

I guess what Jeff refers to is the tick-tock release cycle experiment,
which has proven to be a complete disaster by popular opinion.

There's also the "materialized views" feature which failed to materialize
in the end (pun intended) and had to be declared experimental retroactively.

Another prominent example is incremental repair, which was introduced as the
default option in 2.2 and is now not recommended for use because of the many
corner cases where it can fail.  So again, experimental as an afterthought.

Not to mention that even if you are aware of the incremental default and go
with full repair instead, you're still in for a sad surprise:
anti-compaction will be triggered despite the "full" repair, because
anti-compaction is only skipped in case of sub-range repair (don't ask
why).  So you need to use something advanced like Reaper if you want to
avoid that.  I don't think you'll ever find this in the documentation.
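
To make the sub-range workaround a bit more concrete, here is a minimal
sketch in Python.  It assumes you already know one non-wrapping token range
owned by the node (the range bounds and the keyspace name below are
hypothetical) and only prints the commands instead of running them.  It
relies on the -full, -st and -et options of nodetool repair; this is not how
Reaper actually schedules its segments, just an illustration of the idea.

```python
# Sketch only: split one token range into smaller sub-ranges and build a
# full, sub-range `nodetool repair` command for each of them.


def split_range(start, end, segments):
    """Split the non-wrapping token range (start, end] into contiguous parts."""
    if segments < 1:
        raise ValueError("segments must be >= 1")
    if start >= end:
        raise ValueError("wrapping or empty ranges are not handled in this sketch")
    span = end - start
    # Integer arithmetic keeps the bounds exact even for 64-bit tokens.
    bounds = [start + span * i // segments for i in range(segments + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def repair_commands(keyspace, start, end, segments):
    """Return one repair command line per sub-range."""
    return [
        f"nodetool repair -full -st {st} -et {et} {keyspace}"
        for st, et in split_range(start, end, segments)
    ]


if __name__ == "__main__":
    # Hypothetical keyspace name and token range, for illustration only.
    for cmd in repair_commands("my_keyspace", 0, 1000, 4):
        print(cmd)
```

A real tool would take the ranges from the cluster itself and handle the
range that wraps around the ring, which this sketch deliberately leaves out.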

Honestly, for an eventually-consistent system like Cassandra, anti-entropy
repair is one of the most important pieces to get right.  And Cassandra
fails really badly on that one: the feature is poorly designed, poorly
implemented and under-documented.

In summary, IMO, Cassandra is a poor implementation of some good ideas.
It is a collection of hacks, not features.  They sometimes play together
accidentally, and rarely by design.

Regards,
--
Alex


Re: Pluggable throttling of read and write queries

2017-02-20 Thread Oleksandr Shulgin
On Sat, Feb 18, 2017 at 3:12 AM, Abhishek Verma wrote:

> Cassandra is being used on a large scale at Uber. We usually create
> dedicated clusters for each of our internal use cases, however that is
> difficult to scale and manage.
>
> We are investigating the approach of using a single shared cluster with
> 100s of nodes and handle 10s to 100s of different use cases for different
> products in the same cluster. We can define different keyspaces for each of
> them, but that does not help in case of noisy neighbors.
>
> Does anybody in the community have similar large shared clusters and/or
> face noisy neighbor issues?
>

Hi,

We've never tried this approach, and given my limited experience, I would
consider it a terrible idea from the perspective of maintenance (remember
the old saying about putting all your eggs in one basket?)

What potential benefits do you see?

Regards,
--
Alex


Re: data not replicated on new node

2016-11-23 Thread Oleksandr Shulgin
On Tue, Nov 22, 2016 at 5:23 PM, Bertrand Brelier <
bertrand.brel...@gmail.com> wrote:

> Hello Shalom.
>
> No, I really went from 3.1.1 to 3.0.9.
>

So you just installed version 3.0.9 and restarted with it?  I
wonder if that's really supported.

Regards,
--
Alex


Re: [RELEASE] Apache Cassandra 3.0.10 released

2016-11-17 Thread Oleksandr Shulgin
On Wed, Nov 16, 2016 at 9:17 PM, Michael Shuler wrote:
>
> The Cassandra team is pleased to announce the release of Apache
> Cassandra version 3.0.10.
>
> Apache Cassandra is a fully distributed database. It is the right choice
> when you need scalability and high availability without compromising
> performance.
>
>  http://cassandra.apache.org/
>
> Downloads of source and binary distributions are listed in our download
> section:
>
>  http://cassandra.apache.org/download/
>
> This version is a bug fix release[1] on the 3.0 series. As always,
> please pay attention to the release notes[2] and let us know[3] if you
> encounter any problem.

Hello,

From the NEWS file:

> 3.0.10
> ======
> Upgrading
> ---------
>    - memtable_allocation_type: offheap_buffers is no longer allowed to be
>      specified in the 3.0 series.
>      This was an oversight that can cause segfaults. Offheap was
>      re-introduced in 3.4 see CASSANDRA-11039
>      and CASSANDRA-9472 for details.


Does this mean that offheap_objects is still available or that there is no
longer support for offheap memtables in version 3.0?

--
Alex