Hi Cassandra Users,
Before I go ahead and create my own solution... are there any tools that
exist to help with the management of a Cassandra cluster?
For example, if I want to make some changes to the configuration file that
resides on each node, is there a tool that will propagate the change
Hi Patricia,
Thank you for the feedback. It has been helpful.
On Tue, Aug 27, 2013 at 12:02 AM, Patricia Gorla
gorla.patri...@gmail.com wrote:
Anthony,
We use a number of tools to manage our Cassandra cluster.
* DataStax OpsCenter [0] for at-a-glance information and trending
statistics.
Thanks Nate! We will look into this one to see if we can use it.
Regards,
Anthony
On Tue, Aug 27, 2013 at 12:22 AM, Nate McCall n...@thelastpickle.com wrote:
For example, if I want to make some changes to the configuration file
that resides on each node, is there a tool that will propagate
Hi Robert,
We found having about 50% free disk space is a good rule of thumb.
Cassandra will typically use less than that when running compactions;
however, it is good to have free space available just in case it compacts
some of the larger SSTables in the keyspace. More information can be found
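As a rough illustration of that rule of thumb (a sketch with invented numbers; real compaction headroom depends on the compaction strategy and SSTable sizes):

```shell
# A compaction can temporarily need about as much free space as the data
# it rewrites, so keeping usage at or below ~50% of the disk is the safe bet.
DISK_GB=1000
USED_GB=450

if [ "${USED_GB}" -le $(( DISK_GB / 2 )) ]; then
    echo "OK: $(( DISK_GB - USED_GB )) GB free covers a worst-case compaction"
else
    echo "WARN: under 50% free; compacting the largest SSTables may fail"
fi
```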
Hi Robert,
In this case would it be possible to do the following to replace a seed
node?
nodetool disablethrift
nodetool disablegossip
nodetool drain
stop Cassandra
deep copy /var/lib/cassandra/* on old seed node to new seed node
start Cassandra on new seed node
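The steps above could be sketched as a dry-run script like the following (it only prints the commands it would run; the new node's hostname, the service commands, and the data path are assumptions):

```shell
#!/usr/bin/env bash
# Dry-run sketch of the seed replacement steps above: each function
# echoes the commands it would run instead of executing them.
NEW_SEED="new-seed.example.com"   # assumed hostname
DATA_DIR="/var/lib/cassandra"

drain_old_seed() {
    echo "nodetool disablethrift"   # stop client (Thrift) traffic
    echo "nodetool disablegossip"   # stop gossiping with the ring
    echo "nodetool drain"           # flush memtables, stop accepting writes
    echo "sudo service cassandra stop"
}

copy_data_to_new_seed() {
    # deep copy the data directory, preserving attributes and hard links
    echo "rsync -aH ${DATA_DIR}/ ${NEW_SEED}:${DATA_DIR}/"
}

drain_old_seed
copy_data_to_new_seed
echo "ssh ${NEW_SEED} sudo service cassandra start"
```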
Regards,
Anthony
On Wed,
Hi Robert,
It sounds like you have done a fair bit investigating and testing already.
Have you considered using a time based data model to avoid doing deletions
in the database?
Regards,
Anthony
On Thu, Jan 9, 2014 at 1:26 PM, sankalp kohli kohlisank...@gmail.com wrote:
With Level compaction,
Keep in mind there are side effects to increasing to RF = 4
- Storage requirements for each node will increase. Depending on the
number of nodes in the cluster and the size of the data this could be
significant.
- Whilst the number of available coordinators increases, the number of
Hi Dhruva,
There are definitely some performance improvements to Storage Engine in
Cassandra 3.10 which make it worth the upgrade. Note that Cassandra 3.11
has further bug fixes and it may be worth considering a migration to that
version.
Regarding the issue of building a Cassandra 3.10 RPM, it
Hi Daniel,
Yes, you are right, it does require some additional work to rsync just the
snapshots.
What about doing something like this to make rsync syntax for the backup
easier?
# in the Cassandra data directory, iterate through the backup directories
for ks in $(find . -type d -iname backup)
do
  # copy each backup directory to the remote host, preserving its
  # relative path ("backup-host" and the destination path are illustrative)
  rsync -aR "$ks" backup-host:/cassandra-backups/
done
Hi Nandan,
If there is a requirement to answer a query "What are the changes to a book
made by a particular user?", then yes the schema you have proposed can
work. To obtain the list of updates for a book by a user from the
*book_title_by_user* table will require the partition key (*book_title*),
Hi Nandan,
Interesting project!
One thing that helps define the schema is knowing what queries will be made
to the database up front. It sounds like you have an idea already of what
those queries will be. I want to confirm that these are the queries that
the database needs to answer.
- *What
Hi Surbhi,
Please see my comment inline below.
On 28 May 2017 at 12:11, Jeff Jirsa wrote:
>
>
> On 2017-05-27 18:04 (-0700), Surbhi Gupta
> wrote:
> > Thanks a lot for all of your reply.
> > Our requirement is :
> > Our company releases AMI almost
Hi Daniel,
When you say that the nodes have to be restarted, are you just restarting
the Cassandra service or are you restarting the machine?
How are you reclaiming disk space at the moment? Does disk space free up
after the restart?
Regarding storage on nodes, keep in mind the more data stored
Hi Eduardo,
Please see my comment inline below regarding your third question.
Regards,
Anthony
On 28 April 2017 at 21:26, Eduardo Alonso wrote:
> Hi to all:
>
> I am having some problems with two clients' cassandra:3.0.8 clusters that I
> want to share with you. These
Hi Lydia,
Yes. This will define the *x* and *y* columns as the components of the
partition key. Note that by doing this, both *x* and *y* values will be
required, at a minimum, to perform a valid query.
Alternatively, the *x* and *y* values could be combined into a single
text field as Jon has
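The two options above read roughly as follows in CQL (a sketch; the table and column names are invented for illustration):

```shell
# Option 1: x and y together form a composite partition key; every
# SELECT must then supply both values in the WHERE clause.
DDL_COMPOSITE='CREATE TABLE points_by_xy (
    x int,
    y int,
    value text,
    PRIMARY KEY ((x, y))
);'

# Option 2: x and y concatenated by the application (e.g. "x:y") into a
# single text partition key.
DDL_SINGLE='CREATE TABLE points_by_key (
    xy text,
    value text,
    PRIMARY KEY (xy)
);'

echo "${DDL_COMPOSITE}"
echo "${DDL_SINGLE}"
```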
As Kurt mentioned, you definitely need to pick a partition key that ensures
data is uniformly distributed.
If you want to redistribute the data in the cluster and move tokens
around, you could decommission the node with the tokens you want to
redistribute and then bootstrap a new node into the
Hi Pradeep,
If you are going to copy N snapshots to N nodes, you will need to make sure
the System keyspace is part of each snapshot. The System keyspace is local
to each node and contains the token allocations for that particular node.
This allows the node to work out what data it is
Hi Anshu,
To add to Erick's comment, remember to remove the *replace_address* option
from the *cassandra-env.sh* file once the node has rejoined successfully.
The node will fail the next restart otherwise.
Alternatively, use the *replace_address_first_boot* option, which works
exactly the same
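For reference, this is roughly what the line looks like in *cassandra-env.sh* (a sketch; the IP address of the dead node is a placeholder):

```shell
# The replace_address_first_boot variant is ignored after the first
# successful start, so it does not need to be removed afterwards.
REPLACE_LINE='JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address_first_boot=10.0.0.5"'
echo "${REPLACE_LINE}"
```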
Hi Kenneth,
Fantastic idea!
One thing that came to mind from my reading of the proposed setup was rack
awareness of each node. Given that the proposed setup contains three DCs, I
assume that each node will be made rack aware? If not, consider defining
three racks for each DC and placing two
The speed at which compactions operate is also physically restricted by the
speed of the disk. If the disks used on the new node are HDDs, then
increasing the compaction throughput will be of little help. However, if
the disks on the new node are SSDs then increasing the compaction
throughput to
Hi Abdul,
There is no mechanism offered in Cassandra to bind a keyspace (when
created) to a specific filesystem or directory. If multiple filesystems or
directories are specified in the *data_file_directories* property in
*cassandra.yaml*, then Cassandra will attempt to evenly distribute data from
Hi,
Yes, you can use nodetool status to inspect the health/status of the
cluster. Using *nodetool status <keyspace>* will show the cluster
health/status as well as the amount of data that each node holds for the
specified *keyspace*. Using *nodetool status* without the
argument will only show the cluster
Hi Kenneth,
In addition to CASSANDRA-7622, it may help to inspect the Cassandra
*system.log* and look for the following entry:
INFO [main] ... - Node configuration:[...]
The content of "Node configuration" will have the settings the node is
using.
Regards,
Anthony
On Tue, 13 Mar 2018 at
Hi Oliver,
I was in a similar situation to you and Matija a few years back as well and
can vouch for what Matija has said. Some data sets are more suitable for
Cassandra than others; so the answer to your question depends on the type
of data and how it is modelled in Cassandra. The data model
Hi Peng,
Depending on the hardware failure you can do one of two things:
1. If the disks are intact and uncorrupted you could just use the disks
with the current data on them in the new node. Even if the IP address
changes for the new node that is fine. In that case all you need to do is
run
nt_window_in_ms, we must run
> repair to make the replaced node consistent again, since it missed ongoing
> writes during bootstrapping. But for a large cluster, repair is a painful
> process.
>
> Thanks,
> Peng Xiao
>
>
>
> ------ Original Message --
> *From
Hi Alex,
We wrote a blog post on this topic late last year:
http://thelastpickle.com/blog/2018/09/18/assassinate.html.
In short, you will need to run the assassinate command on each node
simultaneously, a number of times in quick succession. This will generate a
number of messages requesting all
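A dry-run sketch of issuing the command from several nodes at once (hostnames and the dead node's IP are placeholders; the script only prints the ssh commands it would run):

```shell
NODES="node1 node2 node3"   # placeholder hostnames
DEAD_NODE_IP="10.0.0.99"    # placeholder IP of the node to assassinate

assassinate_everywhere() {
    for node in ${NODES}; do
        # backgrounding the calls makes the invocations overlap in time
        echo "ssh ${node} nodetool assassinate ${DEAD_NODE_IP}" &
    done
    wait
}

assassinate_everywhere
```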
Hi Abdul,
Usually we see no noticeable improvement when tuning concurrent_reads and
concurrent_writes above 128. I generally try to keep concurrent_reads no
higher than 64 and concurrent_writes no higher than 128. Increasing the
values beyond that, you might start running into issues where the
Hi Robert,
Your action plan looks good.
You can think of the *cassandra-topology.properties* file as a map of the
cluster. The map must be consistent between the nodes, because each node
uses it to determine where it is meant to be located logically.
It is good hygiene to maintain the
Hi Thomas,
The process you suggested to get around the issue should work with the
system.keyspaces table.
Make sure to back up the original *system.keyspaces* table files on the node
that fails to start. Then, copy only the *system.keyspaces* table files
from a working node into the
Did you roll back to OpenJDK 1.7u181 or did you upgrade to a more recent
version?
On Thu, 16 May 2019 at 13:43, keshava wrote:
> The java version that we were using and which turns out to be causing this
> issue was OpenJdk 1.7 u191
>
> On 16-May-2019 06:02, "sankalp kohli" wrote:
>
>> which
;
> "The best way to predict the future is to invent it" Alan Kay
>
>
> On Mon, Apr 29, 2019 at 2:45 AM Anthony Grasso
> wrote:
>
>> Hi Jean,
>>
>> It sounds like there are no nodes in one of the racks for the eu-west-3
>> datacenter. What does th
Good idea Jeff. I can add that in if you like? Do we have a ticket for it
or should I just raise one?
On Mon, 6 May 2019 at 03:49, Jeff Jirsa wrote:
> Picking an ideal allocation for N seed nodes and M vnodes per seed is
> probably something we should add as a little python script or similar in
Hi
If you are planning on setting up a new cluster with
allocate_tokens_for_keyspace, then yes, you will need one seed node per
rack. As Jon mentioned in a previous email, you must manually specify the
token range for *each* seed node. This can be done using the initial_token
setting.
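A sketch of computing evenly spaced initial tokens (shown via a python3 one-liner for the big-integer arithmetic; 4 tokens are used for brevity, where a real seed node would use its num_tokens value):

```shell
# Evenly space N tokens across the Murmur3 range [-2^63, 2^63).
NUM_TOKENS=4
python3 -c "print(','.join(str(i * (2**64 // ${NUM_TOKENS}) - 2**63) for i in range(${NUM_TOKENS})))"
# -> -9223372036854775808,-4611686018427387904,0,4611686018427387904
```

The resulting comma-separated list is what would go into a seed node's initial_token setting.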
The
Hi Jean,
It sounds like there are no nodes in one of the racks for the eu-west-3
datacenter. What does the output of nodetool status look like currently?
Note, you will need to start a node in each rack before creating the
keyspace. I wrote a blog post with the procedure to set up a new cluster
I thought it was needed only for new
> "clusters", not also for new "DCs"; but RF is per DC so it makes sense.
>
> You TLP guys are doing a great job for Cassandra community.
>
> Thank you,
> Enrico
>
>
> On Fri, 29 Nov 2019 at 05:09, Anthony Grasso
> wro
Hi Enrico,
This is a classic chicken and egg problem with the
allocate_tokens_for_keyspace setting.
The allocate_tokens_for_keyspace setting uses the replication factor of a
DC keyspace to calculate the token allocation when a node is added to the
cluster for the first time.
Nodes need to be
Hi Tobias,
I have had a similar experience to Jon, where I have seen Materialized
Views cause major issues in clusters. I too recommend avoiding them.
Regards,
Anthony
On Sat, 29 Feb 2020 at 07:37, Jon Haddad wrote:
> I also recommend avoiding them. I've seen too many clusters fall over as
>
Manish is correct.
Upgrade the Cassandra version on a single node only. If that node is
behaving as expected (i.e. it is in an Up/Normal state with no errors in
the logs), then upgrade the Cassandra version on each node, one at a time.
Be sure to check that each node is running as expected. Once the
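The rolling procedure above could be sketched as follows (a dry run with placeholder hostnames and an assumed apt-based install; it only prints what it would do):

```shell
NODES="cassandra1 cassandra2 cassandra3"   # placeholder hostnames

upgrade_one_node() {
    echo "ssh $1 sudo apt-get install -y cassandra"   # or yum, per distro
    echo "ssh $1 sudo service cassandra restart"
    # confirm the node comes back Up/Normal (UN) and check its logs
    echo "ssh $1 nodetool status"
}

# canary node first, then the rest one at a time, checking between each
for node in ${NODES}; do
    upgrade_one_node "${node}"
done
```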
stand this part of the process. Why do tokens conflict if the
> nodes owning them are in a different datacenter ?
>
> Regards,
>
> Leo
>
> On Thu, Dec 5, 2019 at 1:00 AM Anthony Grasso
> wrote:
>
>> Hi Enrico,
>>
>> Glad to hear the problem has been resolved and t
Hi Maxim,
Basically, what Sean suggested is the way to do this without downtime.
To clarify, the *three* steps following the "Decommission each node in
the DC you are working on" step should be applied to *only* the
decommissioned nodes. So where it says "*all nodes*" or "*every node*" it
>>> Is the new recommendation 4 now even in version 3.x (asking for 3.11)?
>>> Thanks
>>>
>>> On Fri, Jan 31, 2020 at 9:49 AM Durity, Sean R <
>>> sean_r_dur...@homedepot.com> wrote:
>>>
>>>> These are good clarificat
Hi Kornel,
Great use of the script for generating initial tokens! I agree that you can
achieve an optimal token distribution in a cluster using such a method.
One thing to think about is the process for expanding the size of the
cluster in this case. For example consider the scenario where you
Hi Jean,
This is a really good question.
As Erick mentioned, if you want to change your cluster's *num_tokens* to 16
to match the 4.0 default, you will need to perform a datacenter migration.
Feel free to read over this blog post
Hi Arvinder,
You are correct; tlp-stress includes Log4j as one of its libraries and
users will need to update the JAR file.
On 16th December 2021, tlp-stress was updated [1] to include Log4j 2.16.0
which fixed CVE-2021-45046. Version 5.0.0 was released which included this
change.
Unfortunately,