Re: unsubscribe

2024-01-10 Thread Erick Ramirez
Sorry to see you go. If you'd like to unsubscribe from the user ML, please
email user-unsubscr...@cassandra.apache.org. Cheers!


Re: java driver with cassandra proxies (option: -Dcassandra.join_ring=false)

2023-10-12 Thread Erick Ramirez
Those nodes are not in the peers table(s) because you told them NOT to join
the ring with `join_ring=false` so it is working by design.
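
For context, this is roughly how such a coordinator-only node is started -- a minimal sketch, assuming the property is passed on the command line (it can equally go into the jvm options file):

$ cassandra -Dcassandra.join_ring=false   # start without joining the ring (no token ownership)

A node started this way won't show up in the peers tables of the other nodes, which is the behaviour you're describing.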

I'm not really sure what you're trying to achieve but if you want to
separate the coordinator functions from the storage then what you probably
want is to deploy Stargate nodes. Stargate is a data API gateway that sits
between the app instances and the Cassandra database. It decouples client
request coordination from the storage aspects of C*. It also allows you to
perform CRUD operations against C* using APIs -- REST, JSON, gRPC, GraphQL.

See the docs on Using the Stargate CQL API for details on how to set up
Stargate nodes as coordinators for your C* database.

If you want to see it in action, you can try it free on Astra DB
(Cassandra-as-a-service). Cheers!


Re: Questions about high read latency and related metrics

2023-05-11 Thread Erick Ramirez
Is it the concept of histograms that's not clear? Something else?


Re: Questions about high read latency and related metrics

2023-05-11 Thread Erick Ramirez
The min/max/mean partition sizes are the sizes in bytes which are the same
statistics reported by nodetool tablestats.

EstimatedPartitionSizeHistogram is the distribution of partition sizes
within specified ranges (percentiles) and is the same histogram reported by
nodetool tablehistograms (in the Partition Size column). Cheers!
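
For anyone following along, these are the commands in question (keyspace and table names below are just placeholders):

$ nodetool tablestats my_keyspace.my_table       # min/max/mean partition size in bytes
$ nodetool tablehistograms my_keyspace my_table  # percentile distribution, incl. the Partition Size column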


Re: JIRA account creation request for new contributor

2023-02-15 Thread Erick Ramirez
Mick has provisioned your account and you should receive a separate email
from the jira server with further instructions. Cheers!


Re: JIRA account creation request for new contributor

2023-02-15 Thread Erick Ramirez
Welcome! Sorry for the delay. Let me see if I can get your request
expedited. 

On Tue, 14 Feb 2023 at 07:31, Omair Muhi  wrote:

> Greetings,
>
> I would like to request a new JIRA account as I am interested in
> contributing to the project. Here is my information:
>
>
>- email address: omairm...@icloud.com
>- preferred username: OMAIRMUHI
>- alternate username: OMUHI
>
> Please let me know if there is any other information needed from my side.
>
> Cheers,
> Omair
>


Re: Unsubscribe

2023-01-24 Thread Erick Ramirez
Sorry to see you go. If you'd like to unsubscribe from the user ML, please
email user-unsubscr...@cassandra.apache.org. Cheers!


Re: Change IP address (on 3.11.14)

2022-12-06 Thread Erick Ramirez
If (a) the node is part of the cluster, and (b) is running and operational,
then (c) the cluster will recognise that the node has a new IP when you
restart the node and there's nothing to do on the C* side.

A new IP will be handled by C* automatically. Think of situations where a
node experiences a hardware failure and you move the data disk to a new
server which has a new IP. When you start C* on that node, it will be
recognised as an existing node and that's normal. Cheers!


Re: Errors while attempting snapshot

2022-11-02 Thread Erick Ramirez
Can you help us out by providing more details? When asking questions, it's
always a good idea to include background info such as versions and steps to
replicate the issue. Cheers!


Re: cassandra.yaml

2022-11-02 Thread Erick Ramirez
They are really in mebibytes (MiB). In the upcoming release of Cassandra,
the configuration is getting standardised to KiB, MiB, etc, to remove
ambiguity (CASSANDRA-15234 [1]). For more info, see Ekaterina
Dimitrova's blog post [2]. Cheers!

[1] https://issues.apache.org/jira/browse/CASSANDRA-15234
[2]
https://cassandra.apache.org/_/blog/Apache-Cassandra-4.1-Configuration-Standardization.html
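
As a rough illustration of the change (the parameter below is just one example; check your own cassandra.yaml for the exact names in your version):

# 4.0 and earlier -- the unit (MiB) is implied by the parameter name:
commitlog_segment_size_in_mb: 32

# 4.1 onwards -- the unit is spelled out in the value:
commitlog_segment_size: 32MiB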


CASSANDRA DAY - Santa Clara, Bellevue, Houston + FREE private screening of Wakanda Forever! 

2022-10-27 Thread Erick Ramirez
[image: cday-20221110-wakanda_forever.png]
Calling all developers!

The Apache Cassandra community invites you to join an action-packed day of
superhero events held simultaneously across 3 cities on November 10 — Santa
Clara CA, Bellevue WA and Houston TX!
Event info

WORKSHOP - Attend in person in one of the cities and join us for a
complimentary two-hour hands-on workshop to learn how to build applications
on Cassandra. Participants receive a voucher to take the Cassandra test at
a later date to get certified.

MEETUP - We celebrate the return of in-person Cassandra meetups with
speakers from Netflix, Intel, Quine & DataStax. There will be lots of food,
drinks and time to network PLUS t-shirts and prizes!

MOVIE - Bring a guest to end the day with an exclusive private
screening of Black
Panther: Wakanda Forever — a Cassandra appreciation event sponsored by
DataStax and Intel!

For agenda and venue details, visit the registration page. This event is
expected to sell out soon so be quick and register today!
Background

Cassandra Days focus on the open source Apache Cassandra project and the
community that supports the project. Everyone (whether they are an
individual user, contributor, or company) is welcome to attend and help
organize these events.

These events are an opportunity for Apache Cassandra users, enthusiasts,
and community members to share their experiences working with Cassandra
daily, hear talks and participate in workshops for NoSQL beginners &
experts.


Re: cassandra 4.0.6 files removed from archive

2022-10-23 Thread Erick Ramirez
redhat.cassandra.apache.org/40x/ redirects to
apache.jfrog.io/artifactory/cassandra-rpm/40x/. When I curl it on the
command line, I can see that the cassandra-tools package for 4.0.6 is
there. Cheers!

cassandra-4.0.6-1.noarch.rpm        25-Aug-2022 09:05  45.43 MB
cassandra-4.0.6-1.src.rpm           25-Aug-2022 09:05  12.20 MB
cassandra-tools-4.0.6-1.noarch.rpm  25-Aug-2022 09:05   7.34 KB
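
For example, something along these lines (following the redirect with curl) produces the listing above:

$ curl -sL https://redhat.cassandra.apache.org/40x/ | grep 4.0.6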


Re: Upgrade

2022-10-12 Thread Erick Ramirez
That's correct. Cheers!


Re: Upgrade

2022-10-12 Thread Erick Ramirez
It's just a minor patch upgrade so all you're really upgrading is the
binaries. In any case, switching off replication is not the recommended
approach. The recommended pre-upgrade procedure is to take backups of the
data on your nodes with nodetool snapshot. Cheers!
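
A minimal sketch of that pre-upgrade backup, with a tag so the snapshot is easy to find (and clear) later:

$ nodetool snapshot -t pre-upgrade        # snapshots all keyspaces on the node
$ nodetool clearsnapshot -t pre-upgrade   # run once the upgrade has been verified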


Re: Cassandra java > 15

2022-09-26 Thread Erick Ramirez
Correction -- It has been brought to my attention that there are no plans
for 5.0 yet so Java 17 support might be added sooner in the next 4.x
release if it's ready. Cheers!


> There is no support for Java 17 yet. The plan is to add it in Cassandra
> 5.0 [1].
>
> [1] https://issues.apache.org/jira/browse/CASSANDRA-16895
>


Re: Cassandra java > 15

2022-09-26 Thread Erick Ramirez
There is no support for Java 17 yet. The plan is to add it in Cassandra 5.0
[1]. By default, builds are done with Java 8. You can build with Java 11 by
setting the flags documented on the site [2]. Cheers!

[1] https://issues.apache.org/jira/browse/CASSANDRA-16895
[2] https://cassandra.apache.org/_/development/ide.html
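
As a rough sketch of the flags described in [2], assuming you're building from source with ant:

$ export CASSANDRA_USE_JDK11=true   # or pass -Duse.jdk11=true to ant
$ ant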


Re: Local read request going across DC

2022-09-21 Thread Erick Ramirez
Just to circle back here, I've reviewed the trace output and it shows
multiple requests fired off for lookups on the `roles` table which
indicated to me that the default `cassandra` superuser is being used. To be
clear, the original read request is being executed at the configured
consistency BUT authenticating with the default `cassandra` superuser
requires `QUORUM` so it spans across DCs.

For this reason, we recommend that the default `cassandra` superuser only
be used for provisioning a new admin role then should not be used again.
Cheers!
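
A minimal sketch of that provisioning step (the role name and passwords below are placeholders):

cqlsh> CREATE ROLE dba WITH SUPERUSER = true AND LOGIN = true AND PASSWORD = 'strong-password-here';
-- log back in as the new role, then neutralise the default superuser:
cqlsh> ALTER ROLE cassandra WITH PASSWORD = 'long-random-password' AND SUPERUSER = false;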


Re: Local read request going across DC

2022-09-21 Thread Erick Ramirez
Would you be open to temporarily posting the full CQL command + full trace
output to gist.github.com? I'd like to see what it shows. Cheers!


Re: Local read request going across DC

2022-09-20 Thread Erick Ramirez
It sounds like read-repair chance is enabled on the table. Check the table
schema for a non-zero read_repair_chance. Cheers!
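
A quick way to check and, on versions where the option still exists, turn it off (keyspace/table names are placeholders):

cqlsh> DESCRIBE TABLE my_keyspace.my_table;   -- look for a non-zero read_repair_chance
cqlsh> ALTER TABLE my_keyspace.my_table
       WITH read_repair_chance = 0.0 AND dclocal_read_repair_chance = 0.0;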


Re: netty connection reset by peer errors in logs

2022-09-01 Thread Erick Ramirez
That error message indicates that 2 nodes are unable to communicate with
each other over the internode (gossip) port. It makes no sense to suppress
it since it's an indication that there's a problem that you need to
address. Cheers!


Re: Cassandra 4.0 upgrade - Upgradesstables

2022-08-16 Thread Erick Ramirez
As convenient as it is, there are a few caveats and it isn't a silver
bullet. The automatic feature will only kick in if there are no other
compactions scheduled. Also, it is going to be single-threaded by default
so it will take a while to get through all the sstables on dense nodes.

In contrast, you'll have a bit more control if you manually upgrade the
sstables. For example, you can schedule the upgrade during low traffic
periods so reads are not competing with compactions for IO. Cheers!
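
A minimal sketch of the manual approach during a quiet window:

$ nodetool upgradesstables        # rewrites only sstables that are not on the current version
$ nodetool upgradesstables -j 4   # optionally raise concurrency; -j 0 uses all available compaction threads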


Re: Cassandra Client compatibility

2022-08-02 Thread Erick Ramirez
In the context of driver compatibility, Cassandra "3.0+" means C* 3.0 and
newer releases which include C* 3.0.x, 3.11.x, 4.0.x and [soon] 4.1.x.

To answer your question directly, version 2.6 of the C++ driver works with
C* 3.11.11 but we don't recommend you use it since it's an ancient release
that's over 5 years old. If you run into issues, the first response you'll
get is to upgrade. Instead, we recommend using the latest release of the
drivers if you're about to start a new project. Cheers!


Re: Slow unit tests with Cassandra 4.x on macOS

2022-06-02 Thread Erick Ramirez
Johannes, I've copied the Dev ML to hopefully get a wider audience. Cheers!

On Wed, 1 Jun 2022 at 21:25, Johannes Weißl  wrote:

> Hello,
>
> We noticed that our unit tests are way slower on macOS after the upgrade
> from Cassandra 3.11.x to 4.x, e.g. over 8 minutes instead of 30 seconds
> (!).
> On Linux the duration stays more or less the same.
>
> After debugging, we found that operations like "DROP KEYSPACE" seem
> responsible for the increase. Also interesting: If Cassandra is started
> via Docker on macOS, the tests run as fast as on Cassandra 3.11.x again.
>
> Is this a known phenomenon? Do others experience it as well?
>
> Thanks,
> Johannes
>


Re:

2022-05-27 Thread Erick Ramirez
Sorry to see you go. If you wish to unsubscribe from the mailing list,
please email user-unsubscr...@cassandra.apache.org. Cheers!

On Sat, 28 May 2022 at 09:27, Prachi Rath  wrote:

> unsubscribe
>
>


Re: Upgrade 3.11.6 to 4.0.3, commitlog warns

2022-05-13 Thread Erick Ramirez
>
> Thank you for that clarification, Erick. So do i understand correctly,
> that because of the upgrade the host id changed and therefore differs from
> the ones in the sstables where the old host id is still sitting until a
> sstable upgrade?
>

Not quite. :) The host ID will never change for the lifetime of a node. As
I said:

... the SSTables now contain the host ID on which they were created ...


the implication being that the older versions didn't store the host ID with
the SSTables hence leading to the "origin unknown" warning in newer
versions. Cheers!


Re: Upgrade 3.11.6 to 4.0.3, commitlog warns

2022-05-13 Thread Erick Ramirez
It's expected and is nothing to worry about. From C* 3.0.25/3.11.11/4.0,
the SSTables now contain the host ID on which they were created to prevent
loss of commitlog data when SSTables are moved/copied to other nodes
(CASSANDRA-16619). That's why the message is logged at WARN level instead
of ERROR. Cheers!


Re: Malformed IPV6 address

2022-04-27 Thread Erick Ramirez
This issue was reported in https://community.datastax.com/questions/13764/
as well. TL;DR the URL parser for JNDI providers was made stricter in
Oracle Java 8u331 and brackets are only allowed around IPv6 addresses. The
URL format in NodeProbe.java wraps the host in square brackets so nodetool
fails with the syntax exception.

Jermy Li posted PR #1586 and I've requested him to log a ticket for it
(CASSANDRA-17581). Israel Fuchter and penky28 posted the following
workarounds:

OPTION 1 - Add a legacy flag to disable the new validation, for example:

$ nodetool -Dcom.sun.jndi.rmiURLParsing=legacy status

OPTION 2 - Specify the hostname with an IPv6 subnet prefix, for example:

$ nodetool -h ::ffff:127.0.0.1 status

Would you please try both workarounds and let us know if either of them
work for you? Cheers!


Re: Cassandra 4.X CqlBulkOutputFormat - mapreduce

2022-04-13 Thread Erick Ramirez
>
> than you for your response .. But I’m looking for this source in cassandra
> 4.x release:
>
>
> https://github.com/AndyHu19900119/cassandra/blob/trunk/examples/hadoop_word_count/src/WordCount.java


That was removed a long time ago when Hadoop code was removed in C* 3.0
(CASSANDRA-9353). Cheers!


Re: Cassandra 4.X CqlBulkOutputFormat - mapreduce

2022-04-08 Thread Erick Ramirez
Is this what you are looking for?

https://github.com/apache/cassandra/blob/cassandra-4.0.3/src/java/org/apache/cassandra/hadoop/cql3/CqlBulkOutputFormat.java


CALLING ALL CASSANDRA USERS - The Community would love to hear from you!

2022-04-05 Thread Erick Ramirez
Are you using Cassandra in production today?
Is your organisation building an app with Cassandra as the backend?
Is your organisation evaluating or building a POC on Cassandra?

If you answered yes to ANY of the questions above, we want to hear from
you! The Cassandra community is always interested in hearing users &
developers talk about their Cassandra experiences, use cases, projects,
production deployments, and lessons learned.

Can you help? Let's talk -- https://calendly.com/erickramirezau/catchup.
You don't have to make a commitment, just a quick initial 10-15 minute
chat. Cheers!


Re: Cassandra commitlog corruption on hard shutdown

2022-04-05 Thread Erick Ramirez
Thanks for circling back and posting your experience!

>


Re: upgrade from 3.11 to 4.0

2022-03-26 Thread Erick Ramirez
The general advice is to always upgrade to 3.11.latest before upgrading to
4.0.latest. It is possible to upgrade from an older 3.11 version but you'll
probably run into known issues already fixed in the latest version.

Also, we recommend you run upgradesstables BEFORE upgrading to 4.0.latest
-- this is to make sure there are no old sstables lying around. If there
are no old sstables to upgrade, the upgradesstables is a no-op so it's no
big deal. And definitely run it AFTER you've completed the binary upgrade
to 4.0. It doesn't need to happen immediately and can be postponed for
off-peak periods. Cheers!


Re: Cassandra 3.0.14 transport completely blocked

2022-03-22 Thread Erick Ramirez
>
> Thanks, Scott, for the prompt response! We will apply this patch and see
> how it goes.
> Also, in the near future, we will consider upgrading to 3.0.26 and
> eventually to 4.0
>

We would really discourage you from just upgrading to C* 3.0.21. There
really is no logical reason for doing that. If you're going to the trouble
of upgrading the binaries, you might as well go all the way to C* 3.0.26
since it's a prerequisite to eventually upgrading to C* 4.0. Cheers!


Re: Unsolicited emails from IRONMAN Monttremblant

2022-02-11 Thread Erick Ramirez
Thanks for bringing it up. I've been meaning to look into this. It got
annoying enough for me today that I got the ASF Infra team to investigate,
and the offending address has now been removed from the list
(https://issues.apache.org/jira/browse/INFRA-22879). Cheers!


Re: [RELEASE] Apache Cassandra 4.0.2 released

2022-02-11 Thread Erick Ramirez
(moved dev@ to BCC)


> It looks like the otc_coalescing_strategy config key is no longer
> supported in cassandra.yaml in 4.0.2, despite this not being mentioned
> anywhere in CHANGES.txt or NEWS.txt.
>

James, you're right -- it was removed by CASSANDRA-17132 in 4.0.2 and 4.1.

I agree that the CHANGES.txt entry should be clearer and we'll improve it
plus add detailed info in NEWS.txt. I'll get this done soon in
CASSANDRA-17135. Thanks for the feedback. Cheers!


Re: Cassandra 4.0 upgrade from Cassandra 3x

2022-02-10 Thread Erick Ramirez
Make sure you go through all the instructions in
https://github.com/apache/cassandra/blob/trunk/NEWS.txt. It's also highly
recommended that you upgrade to the latest 3.0.x or 3.11.x version before
upgrading to 4.0.

Generally there are no changes required on the client side apart from
setting the protocol version depending on the driver version. Cheers!


Re: Running enablefullquerylog crashes cassandra

2022-02-09 Thread Erick Ramirez
Are there really no entries after those INFO messages? That indicates that
a person/script/daemon/tool/process killed Cassandra. Perhaps check the OS
logs to see whether oom-killer kicked in and terminated the C* process.
Cheers!


Re: TLS/SSL overhead

2022-02-05 Thread Erick Ramirez
The 3-5% penalty range is consistent with what other users have reported
over the years but I'm sorry that I can't seem to find the
threads/references so my response is unfortunately anecdotal.

More importantly, would you be interested in sharing your data? It would be
great to feature it as a blog post and I'm sure a lot of users are going to
be very interested. It doesn't have to be a polished write up and we've got
other contributors who'd be happy to help with the draft if that's a
concern. Cheers!


Re: Cassandra internal bottleneck

2022-02-05 Thread Erick Ramirez
How many clients do you have sending write requests? In several cases I've
worked on, the bottleneck is on the client side.

Try increasing the number of app instances and you might find that the
combined throughput increases significantly. Cheers!


Re: Problem on setup Cassandra v4.0.1 cluster

2022-01-31 Thread Erick Ramirez
TP stats indicate pending gossip. Check that the clocks are synchronised on
both nodes (use NTP) since clock drift can prevent gossip from working.

I'd also suggest looking at the logs on both nodes to see what other WARN
and ERROR messages are being reported. Cheers!


Re: Cassandra 4.0 hanging on restart

2022-01-26 Thread Erick Ramirez
I just came across this thread and noted that you're running repairs with
-pr which are not incremental repairs. Was that a typo? Cheers!


Re: Migration between Apache 4.x and DSE 6+?

2022-01-18 Thread Erick Ramirez
DSE 6.x is compatible with C* 3.11. In any case, there are a lot of sharp
edges with mixing OSS C* and DSE nodes so it's not recommended.

It is going to be addressed in a future release. Cheers!


Re: Error in bootstrapping node

2021-12-16 Thread Erick Ramirez
The error you're seeing is specific to DSE so your best course of action is
to log a ticket with DataStax Support (https://support.datastax.com).
Cheers!


Re: Which source replica does rebuild stream from?

2021-11-25 Thread Erick Ramirez
Yes, you are correct that the source may not necessarily be fully
consistent. But this risk is negligible if your cluster is sized correctly
and nodes are not dropping mutations.

If your nodes are dropping mutations because they're overloaded and cannot
keep up with writes, rebuild is probably the least of your problems. Cheers!


Re: unsubscribe

2021-11-22 Thread Erick Ramirez
Hey, mate. You'll need to email user-unsubscr...@cassandra.apache.org to
unsubscribe from the list. Cheers!


Re: 4.0.1 - adding a node

2021-10-29 Thread Erick Ramirez
Out of curiosity, what's up with hercules and chaos? Do you have different
hardware deployed in your cluster? Cheers!


Re: Schema collision results in multiple data directories per table

2021-10-18 Thread Erick Ramirez
>
> Erick, one last question: Is there a quick and easy way to extract the
> date from a time UUID?
>

Yeah, just use any online converters on the web. Cheers!


Re: Schema collision results in multiple data directories per table

2021-10-15 Thread Erick Ramirez
I agree with Jeff that this isn't related to ALTER TABLE. FWIW, the
original table was created in 2017 but a new version got created on August
5:

   - 20739eb0-d92e-11e6-b42f-e7eb6f21c481 - Friday, January 13, 2017 at
   1:18:01 GMT
   - 8ad72660-f629-11eb-a217-e1a09d8bc60c - Thursday, August 5, 2021 at
   20:13:04 GMT

Would that have been when you added the new nodes? Any possibility that you
"merged" two clusters together?


Re: Problem with www.apache.org/dist/cassandra/KEYS?

2021-10-07 Thread Erick Ramirez
There was a problem with keys reported yesterday on ASF Slack that Brandon
Williams (driftx) fixed early in my morning (I'm based in Australia).
Perhaps try again. For reference, the Slack conversation is available here
-- https://the-asf.slack.com/archives/CJZLTM05A/p1633538881180500. Cheers!


Re: Vulnerability in libthrift library (CVE-2019-0205)

2021-10-04 Thread Erick Ramirez
See https://issues.apache.org/jira/browse/CASSANDRA-15420. It only applies
if you're still using Thrift in 2021. Cheers!


Re: Change of Cassandra TTL

2021-09-30 Thread Erick Ramirez
That's an awesome tool! I forgot that it's included in
https://cassandra.apache.org/_/ecosystem.html. 

On Thu, 30 Sept 2021 at 16:27, Stefan Miklosovic <
stefan.mikloso...@instaclustr.com> wrote:

> Hi Raman,
>
> we at Instaclustr have created a CLI tool (1) which can strip TTLs
> from your SSTables and you can import that back to your node. Maybe
> that is something you find handy.
>
> We had some customers who had data which expired and they wanted to
> resurrect them - so they took SSTables with expired TTLs, removed them
> and voila, they had it back. So I can imagine you do this and then you
> re-enable TTL on it which is different.
>
> (1) https://github.com/instaclustr/cassandra-ttl-remover
>
> Regards.
>


Re: COUNTER timeout

2021-09-14 Thread Erick Ramirez
The obvious conclusion is to say that the nodes can't keep up so it would
be interesting to know how often you're issuing the counter updates. Also,
how are the commit log disks performing on the nodes? If you have
monitoring in place, check the IO stats/metrics. And finally, review the
logs on the nodes to see if they are indeed dropping mutations. Cheers!
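
Something along these lines is usually enough to confirm the last point (the log path may differ on your install):

$ nodetool tpstats                 # check the "Dropped" message counts at the bottom of the output
$ grep -i "messages were dropped" /var/log/cassandra/system.log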


Re: Change of Cassandra TTL

2021-09-14 Thread Erick Ramirez
You'll need to write an ETL app (most common case is with Spark) to scan
through the existing data and update it with a new TTL. You'll need to make
sure that the ETL job is throttled down so it doesn't overload your
production cluster. Cheers!


Re: hints for a node that was removed from cluster

2021-09-12 Thread Erick Ramirez
Hints for a removed node should have been dropped. Out of curiosity, how do
you know they belong to the removed node?


Re: Question related to nodetool repair options

2021-09-07 Thread Erick Ramirez
No, I'm just saying that [-pr] is the same as [-pr -full], NOT the same as
just [-full] on its own. Primary range repairs are not compatible with
incremental repairs so by definition, -pr is a [-pr -full] repair. I think
you're confusing the concept of a full repair vs incremental. This document
might help you understand the concepts --
https://docs.datastax.com/en/cassandra-oss/3.x/cassandra/operations/opsRepairNodesManualRepair.html.
Cheers!


Re: Question related to nodetool repair options

2021-09-07 Thread Erick Ramirez
   1. Will perform a full repair vs incremental which is the default in
   some later versions.
   2. As you said, will only repair the token range(s) on the node for
   which it is a primary owner.
   3. The -full flag with -pr is redundant -- primary range repairs are
   always done as a full repair because -pr is not compatible with incremental
   repairs, i.e. -pr doesn't care that an SSTable is already marked as
   repaired (see the example below).
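
For example, a primary-range repair (implicitly a full repair) on a single node looks like this; the keyspace name is just a placeholder:

$ nodetool repair -pr my_keyspace    # equivalent to: nodetool repair -pr -full my_keyspace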


Re: Migrating Cassandra from 3.11.11 to 4.0.0 vs num_tokens

2021-09-04 Thread Erick Ramirez
It isn't possible to change the tokens on a node once it is already part of
the cluster. Cassandra won't allow you to do it because it will make the
data already on disk unreadable. You'll need to either configure new nodes
or add a new DC. I've answered an identical question in
https://community.datastax.com/questions/12213/ where I've provided steps
for the 2 options. I hope to draft a runbook and get it published on the
Apache website in the coming days. Cheers!


Re: Looking for pointers about replication internal working

2021-09-02 Thread Erick Ramirez
So that you don't go without any response at all: I'm not personally aware
of such references existing. That's not to say they don't, but I just don't
know about them because that was a long time ago and it was before my time.
If you don't get any other responses, perhaps you could search through the
Dev mailing list archive. Cheers!


Re: Large number of tiny sstables flushed constantly

2021-08-11 Thread Erick Ramirez
4 flush writers isn't bad since the default is 2. It doesn't make a
difference if you have fast disks (like NVMe SSDs) because only 1 thread
gets used.

But if flushes are slow, the work gets distributed to 4 flush writers so
you end up with smaller flush sizes although it's difficult to tell how
tiny the SSTables would be without analysing the logs and overall
performance of your cluster.

Was there a specific reason you decided to bump it up to 4? I'm just trying
to get a sense of why you did it since it might provide some clues. Out of
curiosity, what do you have set for the following?
- max heap size
- memtable_heap_space_in_mb
- memtable_offheap_space_in_mb


Re: nodetool listsnapshots and auto snapshots from dropped tables

2021-08-11 Thread Erick Ramirez
Awesome! Thanks, mate! 


Re: New Servers - Cassandra 4

2021-08-11 Thread Erick Ramirez
That's 430TB of eggs in the one 4U basket so consider that against your
MTTR requirements. I fully understand the motivation for that kind of
configuration but *personally*, I wouldn't want to be responsible for its
day-to-day operation but maybe that's just me. 


Re: Large number of tiny sstables flushed constantly

2021-08-10 Thread Erick Ramirez
Is it possible that you've got memtable_cleanup_threshold set in
cassandra.yaml with a low value? It's been deprecated in C* 3.10
(CASSANDRA-12228).
If you do have it configured, I'd recommend removing it completely and
restart C* when you can. Cheers!


Re: nodetool listsnapshots and auto snapshots from dropped tables

2021-08-10 Thread Erick Ramirez
Dropped tables used to be handled differently and were no longer tracked
once they were dropped. The clearsnapshot command was fixed
(CASSANDRA-6418) but listsnapshots
doesn't appear to be (for whatever reason). It looks like an oversight to
me. Would you mind logging a ticket for it --
https://issues.apache.org/jira/secure/CreateIssue.jspa?pid=12310865? Cheers!


Re: Validation of NetworkTopologyStrategy data center name in Cassandra 4.0

2021-08-10 Thread Erick Ramirez
You are correct. Cassandra no longer allows invalid DC names for
NetworkTopologyStrategy in CREATE KEYSPACE or ALTER KEYSPACE from 4.0
(CASSANDRA-12681). FWIW, here is the NEWS.txt entry for reference. I'm not
aware of a hack that would circumvent the validation. Cheers!
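
For example (the DC name must match exactly what nodetool status reports; 'DC1' and the keyspace name below are placeholders):

cqlsh> CREATE KEYSPACE my_ks
       WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3};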


Re: WARN on free space across data volumes (false positive)

2021-08-09 Thread Erick Ramirez
Out of curiosity, does that mean that your `data_file_directories` is
different to `storageDir`?


Re: WARN on free space across data volumes (false positive)

2021-08-06 Thread Erick Ramirez
I'd say that your `data_file_directories` is pointing at
/var/lib/cassandra/data and is mounted on the root volume instead of
/srv/var. Cheers!


Re: Issue with native protocol

2021-07-29 Thread Erick Ramirez
Then that's the cause: the node is negotiating down to an older protocol
version, which is by design for dealing with mixed-version clusters as Sam
described in his response. As Bowen stated, you must have had an old node
back from when it was still a C* 2.2 cluster that you probably tried to
remove/decommission but ran into issues, so it's still hanging around in
gossip.

You can manually delete that node to get rid of it with:

cqlsh> DELETE FROM system.peers WHERE peer = '10.39.36.152';


There's a good chance that you need to delete it multiple times -- it's a
race with gossip re-populating the table. Also check that it's completely
gone from nodetool gossipinfo. Once you're convinced that it's no longer in
gossip and in peers, you'll need to restart the node so it defaults back to
v4. Good luck!


Re: Issue with native protocol

2021-07-29 Thread Erick Ramirez
Is 10.39.36.152 part of the cluster or is it dead?


Re: Issue with native protocol

2021-07-29 Thread Erick Ramirez
Thanks, Pekka. But we know from an earlier post from Srinivas that the
driver is trying to negotiate with v4 but the node wouldn't:

[2021-07-09 23:26:52.382 -0700] 
com.datastax.driver.core.Connection - DEBUG: Got unsupported protocol
version error from /: for version V4 server supports version V3
[2021-07-09 23:26:52.382 -0700] 
com.datastax.driver.core.Connection - DEBUG: Connection[//: -1,
inFlight=0, closed=true] closing connection
[2021-07-09 23:26:52.382 -0700] 
com.datastax.driver.core.Host.STATES - DEBUG: [//:]
Connection[/10.39.38.166:9042-1, inFlight=0, closed=true] closed, remaining
= 0
[2021-07-09 23:26:52.383 -0700]  com.datastax.driver.core.Cluster -
DEBUG: Cannot connect with protocol V4, trying V3

So we know it's just the one problematic node in the cluster which won't
negotiate. The SHOW VERSION in cqlsh also indicates v3 but I can't figure
out what could be triggering it. Cheers!


Re: Issue with native protocol

2021-07-29 Thread Erick Ramirez
When you restart C*, you should have an entry in the logs which looks like
this and indicates it defaults to v4:

INFO  [main] 2021-07-28 20:45:31,178 StorageService.java:650 - Native
protocol supported versions: 3/v3, 4/v4, 5/v5-beta (default: 4/v4)

I'm hoping someone else here on the mailing list can give pointers as to
why a 3.11 node would advertise v3. I've been code-diving and scratching my
head. I can't think of a scenario that would lead to this:

[cqlsh 5.0.1 | Cassandra 3.11.5 | CQL spec 3.4.4 | Native protocol v3]


Re: High memory usage during nodetool repair

2021-07-28 Thread Erick Ramirez
Based on the symptoms you described, it's most likely caused by SSTables
being mmap()ed as part of the repairs.

Set `disk_access_mode: mmap_index_only` so only index files get mapped and
not the data files. I've explained it in a bit more detail in this article
-- https://community.datastax.com/questions/6947/. Cheers!


Re: Issue with native protocol

2021-07-28 Thread Erick Ramirez
Someone asked me about the same issue a couple of months ago and we never
managed to figure out why the wrong version is being displayed.

Could you try to run `SELECT native_protocol_version FROM system.local`? It
should come back with 4. Cheers!


Re: cassandra 4.0 java 11 support

2021-07-27 Thread Erick Ramirez
There's been some discussion around removing the "experimental" tag for C*
4.0 + Java 11 so by all means, we encourage everyone to try it and report
back to the community if you run into issues. Java 11 support was added 2
years ago so I think most of the issues have been ironed out. Now that 4.0
is out, we're hoping there would be more users testing it out. Cheers!


Re: Permission/Role Cache causing timeouts in apps.

2021-07-27 Thread Erick Ramirez
Are you using the default `cassandra` superuser role? Because that would be
expensive. Also confirm if you've set the replication for the `system_auth`
keyspace to NTS because if you have multiple DCs, the request could be
going to another DC.

It's interesting that you've set validity to over 3 days but you update
them every 6 hours. Is that intentional? Cheers!
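
In case it helps, switching system_auth to NTS looks roughly like this (DC names and replication factors below are placeholders; repair system_auth afterwards):

cqlsh> ALTER KEYSPACE system_auth
       WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3};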


Re: R/W timeouts VS number of tables in keyspace

2021-07-22 Thread Erick Ramirez
I wanted to add a word of warning that switching to G1 won't necessarily
give you breathing space. In fact, I know it definitely won't.

In your original post, it looked like the node had a very small heap (2GB).
In my experience, you need to allocate at least 8GB of memory to the heap
for production workloads. You might be able to get away with 4GB for apps
with very low traffic but 8GB should really be the minimum. For real
production workloads, 16-24GB is ideal when using CMS. But once you're in
the 20GB+ territory, I recommend switching to G1 since it performs well for
large heap sizes and it is the collector we recommend for heaps between
20-31GB (a 32GB heap can address fewer objects than a 31GB one because it
loses compressed object pointers).

It's really important to note that G1 doesn't do well with small heap sizes
and you're better off sticking with CMS in that case. As always, YMMV. I'm
sure others will chime in with their own opinions/experiences. Cheers!


Re: Adding new DC

2021-07-22 Thread Erick Ramirez
I wouldn't use either of the steps you outlined. Neither of them is
correct.

Follow the procedure documented here instead --
https://docs.datastax.com/en/cassandra-oss/3.x/cassandra/operations/opsAddDCToCluster.html.
Cheers!


Re: Number of DCs in Cassandra

2021-07-14 Thread Erick Ramirez
You can really have as many as you need. The most unusual clusters I've
worked on had about 12 DCs mostly because they had different workloads that
needed to be isolated into their own DCs so one workload didn't affect
another. FWIW by "workloads" I mean OLTP, analytics, reporting, etc.

How many did you have in mind?


Re: TWCS repair and compact help

2021-06-29 Thread Erick Ramirez
You definitely shouldn't perform manual compactions -- you should let the
normal compaction tasks take care of it. It is unnecessary to manually run
compactions since it creates more problems than it solves as I've explained
in this post -- https://community.datastax.com/questions/6396/. Cheers!


Re: Cassandra-stress tool creating Static columns in COMPACT STORAGE Table

2021-06-18 Thread Erick Ramirez
Noting here that this was already answered by Benjamin on the dev ML here
[1]. Cheers!

[1]
https://lists.apache.org/thread.html/rf9c74e8e3d431ff395f73677a616a9fcd70a46f3ea0500b71ca4d91d%40%3Cdev.cassandra.apache.org%3E


Re: On-prem backup options ... Medusa?

2021-06-11 Thread Erick Ramirez
There are a lot of companies who use Medusa in production, yes. Archiving
to NFS is a good option and works in a lot of use cases. You'd just want to
make sure that there's limited access to the backups so they can't be
accidentally deleted or result in a security breach.

We would love to hear about any issues you're running into. And big +1 on
submitting PRs. We're happy to help and you can reach the Medusa
engineers/contributors directly on ASF Slack in the #cassandra-medusa
channel.
Cheers!


Re: Turn off automatic granting

2021-06-08 Thread Erick Ramirez
There's definitely a case for separation of duties. For example, admin
roles who have DDL permissions should not have DML access. To achieve this,
you'll need to manage the permissions at a granular level and revoke
permissions from the role. Cheers!
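
A minimal sketch of that granular split (role and keyspace names are placeholders):

cqlsh> GRANT CREATE ON KEYSPACE my_ks TO schema_admin;    -- DDL
cqlsh> GRANT ALTER ON KEYSPACE my_ks TO schema_admin;     -- DDL
cqlsh> GRANT DROP ON KEYSPACE my_ks TO schema_admin;      -- DDL
cqlsh> REVOKE MODIFY ON KEYSPACE my_ks FROM schema_admin; -- strip DML if it was ever granted
cqlsh> GRANT SELECT ON KEYSPACE my_ks TO app_user;        -- DML-only role
cqlsh> GRANT MODIFY ON KEYSPACE my_ks TO app_user;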


Re: multiple clients making schema changes at once

2021-06-03 Thread Erick Ramirez
Having said that, I'm still not a fan of making schema changes
programmatically. I spend way too much time helping users unscramble their
schema after they've hit multiple disagreements. I do understand the need
for it but avoid it if you can particularly in production.

On Fri, 4 Jun 2021 at 09:41, Erick Ramirez 
wrote:

> I wonder if there’s a way to query the driver to see if your schema change
>> has fully propagated.  I haven’t looked into this.
>>
>
> Yes, the drivers have APIs for this. For example, the Java driver has
> isSchemaInAgreement() and checkSchemaAgreement().
>
> See
> https://docs.datastax.com/en/developer/java-driver/latest/manual/core/metadata/schema/.
> Cheers!
>
>


Re: multiple clients making schema changes at once

2021-06-03 Thread Erick Ramirez
>
> I wonder if there’s a way to query the driver to see if your schema change
> has fully propagated.  I haven’t looked into this.
>

Yes, the drivers have APIs for this. For example, the Java driver has
isSchemaInAgreement() and checkSchemaAgreement().

See
https://docs.datastax.com/en/developer/java-driver/latest/manual/core/metadata/schema/.
Cheers!


Re: Memory requirements for Cassandra reaper

2021-05-04 Thread Erick Ramirez
2GB is allocated to the Reaper JVM on startup (see
https://github.com/thelastpickle/cassandra-reaper/blob/2.2.4/src/packaging/bin/cassandra-reaper#L90-L91
).

If you just want to test it out on a machine with only 8GB, you can update
the cassandra-reaper script to only use 1GB by setting -Xms1G and -Xmx1G
but you won't be able to do much with it. It might also be necessary to
reduce the heap allocated to Cassandra down to 2GB so there's enough RAM
left for the operating system.

For test and production environments, I recommend deploying Reaper on a
dedicated machine so it doesn't affect the performance of whatever cluster
it is connecting to. Reaper needs a minimum of 2 vCPUs + 2GB of RAM and
this works in most cases.

As a side note, if you just want to play around with the likes of Reaper
and Medusa (backups) then I'd recommend having a look at deploying
https://k8ssandra.io/ -- it's a production-ready platform for running
Apache Cassandra on Kubernetes with all the tools bundled in:

   - Reaper for automated repairs
   - Medusa for backups and restores
   - Metrics Collector for monitoring with Prometheus + Grafana
   - Stargate.io for accessing your data using REST, GraphQL and JSON Doc APIs
   - Traefik templates for k8s cluster ingress

Cheers!


Re: V3.11.10 Docker uses Java 1.8-282, why not Java 3.11

2021-05-03 Thread Erick Ramirez
There are lots of vendors who will continue to support Java 8 given it's
LTS until 2024(?). Discussion has started around officially supporting Java
11 for 4.x but that won't happen until after 4.0 GA.

We encourage everyone in the community to actively test C* 4.0 with Java 11
so it becomes a known quantity as more and more organisations test it at
scale. But as far as C* 3.11 is concerned, it will only be supported on
Java 8 (but never say never ).

I'm not close to the action but FWIW, I know that there's a lot of
excitement around ZGC and Java 17 being LTS (but don't quote me). Cheers!


Re: io.netty.channel.unix.Errors$NativeIoException: Connection reset by peer

2021-04-26 Thread Erick Ramirez
That message gets logged when the node tries to respond back to the client
but the driver has already given up waiting for the cluster to respond so
the connection is no longer active.

It typically happens when running an expensive query and the coordinator is
still waiting for the replicas to respond but the driver already reached
the client-side timeout. It can also happen when the driver has been
configured with a very low timeout value so the coordinator never gets a
chance to respond back.

Check for the timeouts configured on the driver. I'd also recommend
reviewing the app queries for clues. Cheers!


Re: Datastax Java Driver Compatibility Matrix

2021-04-19 Thread Erick Ramirez
>
> Is there a Datastax Java Driver Compatibility matrix available for
> Cassandra 4.0?
>

No, there isn't but the same driver versions apply to C* 4.0 under the
column 3.0+.

Thanks for bringing this up as it has prompted me to consider its inclusion
in the official Apache Cassandra website and I've logged CASSANDRA-16617.
Cheers!


Re: New open-source CQL driver for Rust released - 0.1.0

2021-04-08 Thread Erick Ramirez
Thanks, Piotr & team. Fantastic contribution!

I'll request Constantia.io to get in contact with you to shortlist it in
next month's Changelog blog post. Cheers!


Re: Log Rotation of Extended Compaction Logging

2021-04-07 Thread Erick Ramirez
As far as I'm aware, the compaction logs don't get rotated. It looks like
it just increments the sequence number by 1.

You can have a look at the logic here --
https://github.com/apache/cassandra/blob/cassandra-3.11.6/src/java/org/apache/cassandra/db/compaction/CompactionLogger.java#L303-L318.
Cheers!


Re: Backup cassandra and restore. Best practices

2021-04-06 Thread Erick Ramirez
Minio is a supported type --
https://github.com/apache/libcloud/blob/trunk/libcloud/storage/types.py#L108

On Tue, 6 Apr 2021 at 20:29, Erick Ramirez 
wrote:

> This is a useful tool, but we look for smth that could store backups in
>> local S3 (like minio), not Amazon or else..
>>
>
> As I stated in my response, Medusa supports any S3-like storage that the
> Apache Libcloud API can access. See the docs I linked. Cheers!
>


Re: Backup cassandra and restore. Best practices

2021-04-06 Thread Erick Ramirez
>
> This is a useful tool, but we look for smth that could store backups in
> local S3 (like minio), not Amazon or else..
>

As I stated in my response, Medusa supports any S3-like storage that the
Apache Libcloud API can access. See the docs I linked. Cheers!


Re: Backup cassandra and restore. Best practices

2021-04-06 Thread Erick Ramirez
I'd recommend using Medusa (
https://github.com/thelastpickle/cassandra-medusa/wiki) -- an open-source
tool which automates backups and has support for archiving to S3, Google
Cloud and any S3-like storage. Cheers!


Re: Ansible Cassandra Collection

2021-03-19 Thread Erick Ramirez
Fantastic, Rhys! Thanks very much. I'm sure it will prove very useful for
users in the community. Cheers!


Re: Barman equivalent for Cassandra?

2021-03-12 Thread Erick Ramirez
I'm not familiar with Barman but if you're looking for a backup software
for Cassandra, have a look at Medusa from The Last Pickle --
https://github.com/thelastpickle/cassandra-medusa/wiki.

It's open-source and is also used for https://k8ssandra.io/ -- the platform
for deploying Cassandra on Kubernetes with tools for repairs, backups and
monitoring built-in. Cheers!


Re: No node was available to execute query error

2021-03-12 Thread Erick Ramirez
Does it get returned by the driver every single time? The
NoNodeAvailableException gets thrown when (1) all nodes are down, or (2)
all the contact points are invalid from the driver's perspective.

Is it possible there's no route/connectivity from your app server(s) to the
172.16.x.x network? If you post the full error message + full stacktrace,
it might provide clues. Cheers!


Re: underutilized servers

2021-03-05 Thread Erick Ramirez
The tpstats you posted show that the node is dropping reads and writes,
which means your disk can't keep up with the load, i.e. the disk is the
bottleneck. If you haven't already, place data and commitlog on separate
disks so they're not competing for the same IO bandwidth. Note that it's OK
to have them on the same disk/volume if you have NVMe SSDs since it's a lot
more difficult to saturate them.
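
A minimal cassandra.yaml sketch of that layout (the mount points below are placeholders):

data_file_directories:
    - /mnt/data1/cassandra/data
commitlog_directory: /mnt/commitlog/cassandra/commitlog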

The challenge with monitoring is that it typically only checks disk stats
every 5 minutes (for example). But your app traffic is bursty in nature, so
stats averaged out over a period of time are misleading because the only
thing that matters is what the disk IO is at the time you hit peak loads.

The dropped reads and mutations tell you the node is overloaded. Provided
your nodes are configured correctly, the only way out of this situation is
to correctly size your cluster and add more nodes -- your cluster needs to
be sized for peak loads, not average throughput. Cheers!


Re: MISSING keyspace

2021-03-01 Thread Erick Ramirez
The timestamp (1614575293790) in the snapshot directory name is equivalent
to 1 March 2021 05:08:13 GMT:

actually I found a lot of .db files in the following directory:
>
> /var/lib/cassandra/data/mykespace/mytable-2795c0204a2d11e9aba361828766468f/snapshots/dropped-1614575293790-
> mytable
>

which lines up nicely with this log entry:


>  2021-03-01 06:08:08,864 INFO  [Native-Transport-Requests-1]
> MigrationManager.java:542 announceKeyspaceDrop Drop Keyspace 'mykeyspace'
>
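
For reference, the suffix in the snapshot directory name is epoch milliseconds and can be converted like this (assuming GNU date):

$ date -u -d @"$((1614575293790 / 1000))"
Mon Mar  1 05:08:13 UTC 2021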

In any case, those 2 pieces of information are evidence that the keyspace
didn't get randomly dropped -- some operator/developer/daemon/orchestration
tool/whatever initiated it either intentionally or by accident.

I've seen this happen a number of times where a developer thought they were
connecting to dev/staging/test environment and issued a DROP or TRUNCATE
not realising they were connected to production. Not saying this is what
happened in your case but I'm just giving you ideas for your investigation.
Cheers!


Re: Cassandra on arm aws instances

2021-03-01 Thread Erick Ramirez
>
> it's not the same, notice I wrote r6gd, these are the ones with nvme, i'm
> looking just at those.
>

I'm aware. I did use r6gd.2xlarge in my example. :)


> I do not need all the space that i3en gives me (and probably won't be able
> to use it all due to memory usage, or have other issues just like you
> mention), so the plan is use the big enough r6gd nodes, such as
> r6gd.8xlarge, it has 1.9tb nvme, it should good enough for my needs
>

I feel like we have a disconnect here. :) You won't get value from the
r6gd.8xlarge. You're paying for 32 cores + 256GB RAM which are mostly
unusable to you unless you have a configuration where Spark is co-located
with C* on the servers. It's the equivalent of using a truck to transport 2
boxes when a car will suffice.

From a dollar perspective, you're opting to pay $10,714/yr for
a r6gd.8xlarge (I arbitrarily picked a standard 1-year term in West coast)
versus $3839/year for an i3.2xlarge just because you want Arm but will end
up using just a quarter (maybe half if I'm generous) of the compute power.
It doesn't stack up for me. But YMMV. :)


> (I would also add that a big chunk of the data that is not read that
> frequently, so I might be ok with putting a specific set of tables on EBS)
>

Interestingly, how do you plan to configure that? Unless I'm mistaken, C*
doesn't support tiered storage. Cheers!


Re: Cassandra on arm aws instances

2021-03-01 Thread Erick Ramirez
The instance types you refer to are contradictory so I'm not really sure if
this is really about Arm-based servers. The i3en-vs-r6 is not an
apples-for-apples comparison.

The R6g type is EBS-only so they will perform significantly worse than i3
instances. R6gd come with NVMe SSDs but they are disproportionately small
compared to the CPU+RAM they have. For example, a r6gd.2xlarge which has 8
cores + 64GB RAM only has a 474GB NVMe SSD so they're not a good bang for
the buck.

On the other hand, i3en instances are intended for dense storage. I'd
discourage you from choosing this type since it will be tempting to have
dense nodes and are problematic when it comes to operations such as
bootstrapping, decommissions and running repairs. For example, an
i3en.2xlarge with 8 cores + 64GB RAM can potentially have 5TB of disks (2 x
2.5TB NVMe SSDs).

In my experience, i3 instances are the optimal choice such as i3.2xlarge. I
think 8 cores + 61GB RAM + 1.9TB NVMe SSD is the sweet spot for price and
performance. Cheers!


Re: MISSING keyspace

2021-03-01 Thread Erick Ramirez
As the warning message suggests, you need to check for schema disagreement.
My suspicion is that someone made a schema change and possibly dropped the
problematic keyspace.

FWIW I suspect the keyspace was dropped because the table isn't new -- CF
ID cba90a70-5c46-11e9-9e36-f54fe3235e69 is equivalent to 11 Apr 2019.

Check for the existence of the keyspace via cqlsh on other nodes in the
cluster (not the node which ran out of disk space). Cheers!

