New node is stuck in JOINING state

2023-04-05 Thread Eunsu Kim
Hi all, I recently encountered this behavior when adding new nodes to my Apache Cassandra 4.1.0 cluster. When I checked the system.log of the newly added node, I found the following messages logged repeatedly. -- WARN [OptionalTasks:1] 2023-04-05 18:50:26,722

Change the compression algorithm on a production table at runtime

2022-09-20 Thread Eunsu Kim
Hi all, According to https://docs.datastax.com/en/cql-oss/3.3/cql/cql_reference/cqlAlterTable.html , it can be very problematic to modify the compaction strategy on a table running in production. Similarly, is it

Re: Using zstd compression on Cassandra 3.x

2022-09-12 Thread Eunsu Kim
ly upgrade you can extract the implementation > from 4.0 and use it. I would advise against this path though as zstd > implementation is nuanced. > > Dinesh > >> On Sep 12, 2022, at 7:09 PM, Eunsu Kim wrote: >> >> Hi all, >> >> Since zstd co

Using zstd compression on Cassandra 3.x

2022-09-12 Thread Eunsu Kim
Hi all, zstd is a very good compression algorithm, with excellent overall performance and compression ratio, and it is available in Cassandra 4.0. There is also an open source implementation available for Cassandra 3.x. https://github.com/MatejTymes/cassandra-zstd Do you have any experience applying this

Re: about memory problem in write heavy system..

2022-01-10 Thread Eunsu Kim
t/documentation/cassandra/cassandra-cluster-operations/cassandra-version-upgrades/>, > and you should always read the release notes > <https://github.com/apache/cassandra/blob/trunk/NEWS.txt> which includes > breaking changes and new features before you perform an upgrade. > > &

Re: about memory problem in write heavy system..

2022-01-06 Thread Eunsu Kim
Looking at the memory usage chart, it seems that the physical memory usage of the existing node has increased since the new node was added with auto_bootstrap=false. > > On Fri, Jan 7, 2022 at 1:11 AM Eunsu Kim <mailto:eunsu.bil...@gmail.com>> wrote: > Hi, > > I

about memory problem in write heavy system..

2022-01-06 Thread Eunsu Kim
Hi, I have a Cassandra cluster (3.11.4) that handles a heavy write workload. (14k~16k writes per second per node) The nodes are physical machines in a data center. There are 30 nodes. Each node has three data disks mounted. A few days ago, a QueryTimeout problem occurred due to Full GC. So,

remove dead node without streaming

2021-03-25 Thread Eunsu Kim
Hi all, Is it possible to remove a dead node directly from the cluster without streaming? My Cassandra cluster is quite large and streaming takes too long. (nodetool removenode) It's okay if my data is temporarily inconsistent. Thanks in advance.

Re: various TTL datas in one table (TWCS)

2020-10-28 Thread Eunsu Kim
e the 2w ttl data before the highest ttl time you chose > >> On Oct 28, 2020, at 5:58 PM, Eunsu Kim wrote: >> >> Hello, >> >> I have a table with a default TTL(2w). I'm using TWCS(window size : 12h) on >> the recommendation of experts. This table is quit

various TTL datas in one table (TWCS)

2020-10-28 Thread Eunsu Kim
Hello, I have a table with a default TTL (2w). I'm using TWCS (window size: 12h) on the recommendation of experts. This table is quite big, with high WPS. I would like to insert data with TTLs different from the default into this table, according to the type of data. About four different TTLs (4w, 6w, 8w,
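As a rough back-of-the-envelope sketch for the question above: the number of 12-hour TWCS windows that a given TTL spans is simple arithmetic. The Python below is illustrative only. The function name `windows_spanned` is hypothetical, and only the TTL values visible in the message (2w, 4w, 6w, 8w) are used. It says nothing about how many SSTables actually exist on disk, which depends on compaction behavior.

```python
from datetime import timedelta

# TWCS window size taken from the message (12h).
WINDOW = timedelta(hours=12)

def windows_spanned(ttl: timedelta, window: timedelta = WINDOW) -> int:
    """Return how many compaction windows a TTL covers, rounded up."""
    # Ceiling division on timedeltas: negate, floor-divide, negate again.
    return -(-ttl // window)

if __name__ == "__main__":
    for weeks in (2, 4, 6, 8):
        ttl = timedelta(weeks=weeks)
        print(f"TTL {weeks}w spans {windows_spanned(ttl)} windows of 12h")
```

One point to keep in mind when reading these numbers: TWCS can drop an SSTable as a whole only once everything in it has expired, so if rows with different TTLs land in the same window, that window is likely to stay on disk until the longest TTL in it has passed.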

new node stops streaming..

2020-01-27 Thread Eunsu Kim
Hi experts, I had a problem adding a new node. The joining node in datacenterA stops streaming while joining, so it stays in the UJ state. (datacenterB is fine.) I ran 'nodetool netstats' on the stalled node and it looks like this: Mode: JOINING Not sending any streams. Read Repair Statistics: Attempted: 0

Curiosity in adding nodes

2019-10-21 Thread Eunsu Kim
Hi experts, When a new node is added, how can the coordinator find data that has not yet been streamed? Or are new nodes not used until all data is streamed? Thanks in advance

Re: What happens if my Cassandra cluster's certificate expires?

2019-09-25 Thread Eunsu Kim
ew connection and re-connection (in case of service restart) will not establish but requests on existing connections should work. On Wed, Sep 25, 2019 at 3:19 PM Eunsu Kim mailto:eunsu.bil...@gmail.com>> wrote: Hi all I recently enabled client_encryption_options on a cassandra.yaml client_encry

What happens if my Cassandra cluster's certificate expires?

2019-09-25 Thread Eunsu Kim
Hi all, I recently enabled client_encryption_options in cassandra.yaml: client_encryption_options: enabled: true optional: true keystore: conf/my-keystore.jks keystore_password: password require_client_auth: false What happens if the certificate expires while in operation?
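For monitoring the situation described above, a minimal sketch of how the remaining validity of a certificate could be computed, assuming the certificate's notAfter date has already been obtained as a string in the usual certificate time format. The function name `days_until_expiry` and the example dates are hypothetical, and reading the date out of the JKS keystore is not shown. Only `ssl.cert_time_to_seconds` from the Python standard library is used.

```python
import ssl

SECONDS_PER_DAY = 86400

def days_until_expiry(not_after: str, now: str) -> float:
    """Days between 'now' and the certificate's notAfter date.

    Both arguments use the certificate time format,
    e.g. "Jan  5 09:34:43 2018 GMT".
    """
    expiry = ssl.cert_time_to_seconds(not_after)
    current = ssl.cert_time_to_seconds(now)
    return (expiry - current) / SECONDS_PER_DAY

if __name__ == "__main__":
    # Made-up dates for illustration only.
    remaining = days_until_expiry("Jan 11 00:00:00 2020 GMT",
                                  "Jan  1 00:00:00 2020 GMT")
    print(f"certificate expires in {remaining:.1f} days")
```

A negative result would mean the certificate has already expired.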

Re: about remaining data after adding a node

2019-09-05 Thread Eunsu Kim
e.org" Subject: Re: about remaining data after adding a node Hi Eunsu, Are you using DateTieredCompactionStrategy? It optimises the deletion of expired data from disks. If minor compactions are not solving the problem, I suggest to run nodetool compact. Federico On Thu, 5 Sep 2019

about remaining data after adding a node

2019-09-05 Thread Eunsu Kim
Hi all, After adding a new node, all the data for its newly allocated tokens was streamed to it. Since nodetool cleanup has not yet been performed on the existing nodes, the total size has increased. All data has a short TTL. In this case, will the data remaining on the existing node be deleted after

Re: Data growth is abnormal

2018-12-26 Thread Eunsu Kim
I solved this problem with the compaction subproperties. (unchecked_tombstone_compaction, tombstone_threshold, tombstone_compaction_interval) It took time. Eventually, the two datacenters were balanced again. Thank you. > On 24 Dec 2018, at 3:48 PM, Eunsu Kim wrote: > > Oh

Re: Data growth is abnormal

2018-12-23 Thread Eunsu Kim
Oh I’m sorry. It is marked as included in 3.11.1. It seems to be confused with other comments in the middle. However, I am not sure what to do with this page.. > On 24 Dec 2018, at 3:35 PM, Eunsu Kim wrote: > > Thank you for your response. > > The patch for the issue page yo

Re: Data growth is abnormal

2018-12-23 Thread Eunsu Kim
> > > > -- > Jeff Jirsa > > > On Dec 24, 2018, at 12:05 AM, Eunsu Kim <mailto:eunsu.bil...@gmail.com>> wrote: > >> I’m using TimeWindowCompactionStrategy. >> >> All consistency levels are ONE. >> >>> On 24 Dec 2018, at 2:01 P

Re: Data growth is abnormal

2018-12-23 Thread Eunsu Kim
I’m using TimeWindowCompactionStrategy. All consistency levels are ONE. > On 24 Dec 2018, at 2:01 PM, Jeff Jirsa wrote: > > What compaction strategy are you using ? > > What consistency level do you use on writes? Reads? > > -- > Jeff Jirsa > > >> On

Data growth is abnormal

2018-12-23 Thread Eunsu Kim
Merry Christmas. The Cassandra cluster (3.11.3) I operate consists of two datacenters. Most data has a TTL of 14 days, and one replica is stored in each datacenter. (NetworkTopologyStrategy, datacenter1: 1, datacenter2: 1) However, since a few days ago, only the datacenter1 disk usage has been increasing

Re: Data storage space unbalance issue

2018-12-04 Thread Eunsu Kim
be before running cleanup. Somewhat related, if you're > not running regular repairs already, you should be. You can do it via cron, > but I strongly suggest checking out Reaper. > > On Wed, Nov 28, 2018, 8:05 PM Eunsu Kim <mailto:eunsu.bil...@gmail.com> wrote: > Thank

Re: Data storage space unbalance issue

2018-11-28 Thread Eunsu Kim
repair first for safety on datacenter2, then a "nodetool > cleanup" on those hosts. > > Also run "nodetool snapshot" to make sure you don't have any old snapshots > sitting around taking up space. > > On Wed, Nov 28, 2018 at 5:29 AM Eunsu Kim <mailto:

Data storage space unbalance issue

2018-11-28 Thread Eunsu Kim
(I am sending the previous mail again because it seems that it was not sent properly.) Hi experts, I am running 2 datacenters, each containing five nodes. (total 10 nodes, all 3.11.3) One replica of my data is stored in each datacenter. (REPLICATION = { 'class' :

Re: Adding datacenter and data verification

2018-09-17 Thread Eunsu Kim
ou sure that you changed the configuration of system_auth keyspace > before adding the new datacenter using this: > > ALTER KEYSPACE system_auth WITH REPLICATION = {'class': > 'NetworkTopologyStrategy', 'datacenter1': '3'}; > > Regards, > Pradeep > > > > On T

Re: Adding datacenter and data verification

2018-09-17 Thread Eunsu Kim
In my case, there were authentication issues when adding datacenters. I was using PasswordAuthenticator. As soon as the datacenter was added, the following authentication error was recorded in the client log file. com.datastax.driver.core.exceptions.AuthenticationException:

Re: Default Single DataCenter -> Multi DataCenter

2018-09-11 Thread Eunsu Kim
(datacenter1). Please fix the snitch configuration, decommission and rebootstrap this node or use the flag -Dcassandra.ignore_dc=true. > On 11 Sep 2018, at 2:25 PM, Eunsu Kim wrote: > > Hello > > Thank you for your responses. > > I’ll share my adding datacenter plan

Re: Default Single DataCenter -> Multi DataCenter

2018-09-10 Thread Eunsu Kim
lan, as I did years ago in the mail previously linked. > People here should be able to confirm the process is ok before you move > forward, giving you an extra confidence. > > C*heers, > --- > Alain Rodriguez - @arodream - al...@thelastpickle.com > <mailto:al...@thel

Default Single DataCenter -> Multi DataCenter

2018-09-10 Thread Eunsu Kim
Hello everyone, I operate a 5-node cluster (3.11.0) in a single data center with SimpleSnitch and SimpleStrategy, and all clients use the RoundRobin policy. At this point, I am going to create clusters of the same size in different data centers. I think these two documents are appropriate, but there is

about cassandra..

2018-08-08 Thread Eunsu Kim
Hi all. I’m worried about the amount of disk I use, so I’m curious about compression. We are currently using 3.11.0 with the default LZ4 compressor ('chunk_length_in_kb': 64). Is there a setting that gives stronger compression? Because most of the data is time series data with TTL, we

Re: cassandra cluser sizing

2018-07-17 Thread Eunsu Kim
Can I ask you an additional question here? How much free space should I have if most tables use TimeWindowCompactionStrategy? > On 13 Jul 2018, at 10:09 PM, Vitaliy Semochkin wrote: > > Jeff, thank you very much for reply. > Will try to use 4TB per instance. > > If I understand it correctly

Re: What will happen after adding another data disk

2018-06-12 Thread Eunsu Kim
In my experience, adding a new disk and restarting the Cassandra process slowly distributes the disk usage evenly, so that existing disks have less disk usage > On 12 Jun 2018, at 11:09 AM, wxn...@zjqunshuo.com wrote: > > Hi, > I know Cassandra can make use of multiple disks. My data disk is

Re: GUI clients for Cassandra

2018-04-22 Thread Eunsu Kim
I am now using DBeaver EE, but I’m waiting for TeamSQL (https://teamsql.io) to support Cassandra. > On 23 Apr 2018, at 7:56 AM, Tim Moore wrote: > > I use the command-line too, but have heard some recommendations for DBeaver > EE as a cross-database GUI with support

Can I sort it as a result of group by?

2018-04-09 Thread Eunsu Kim
Hello, everyone. I am using 3.11.0 and I have the following table. CREATE TABLE summary_5m ( service_key text, hash_key int, instance_hash int, collected_time timestamp, count int, PRIMARY KEY ((service_key), hash_key, instance_hash, collected_time) ) And I can sum

Re: Self read throughput increased rapidly

2018-03-11 Thread Eunsu Kim
that will increase if people > read the data > > > -- > Jeff Jirsa > > > On Mar 11, 2018, at 7:38 PM, Eunsu Kim <eunsu.bil...@gmail.com > <mailto:eunsu.bil...@gmail.com>> wrote: > >> No I didn’t >> >> Do you mean that thi

Re: Self read throughput increased rapidly

2018-03-11 Thread Eunsu Kim
ing counter writes? > > > -- > Jeff Jirsa > > > On Mar 11, 2018, at 7:30 PM, Eunsu Kim <eunsu.bil...@gmail.com > <mailto:eunsu.bil...@gmail.com>> wrote: > >> We monitored the write/read throughput through the Cassandra cluster via >> JMX. There w

Re: Adding disk to operating C*

2018-03-08 Thread Eunsu Kim
ately nodetool drain. Beyond that I’d expect you to be fine. > > -- > Jeff Jirsa > > >> On Mar 8, 2018, at 9:52 PM, Eunsu Kim <eunsu.bil...@gmail.com> wrote: >> >> There are currently 5 writes per second. I was worried that the server >> downti

Re: Adding disk to operating C*

2018-03-08 Thread Eunsu Kim
eff Jirsa <jji...@gmail.com> wrote: > > I see no reason to believe you’d lose data doing this - why do you suspect > you may? > > -- > Jeff Jirsa > > >> On Mar 8, 2018, at 8:36 PM, Eunsu Kim <eunsu.bil...@gmail.com> wrote: >> >>

Re: Adding disk to operating C*

2018-03-08 Thread Eunsu Kim
t want to have a data > density over 1.5 TB per node. > > -- > Rahul Singh > rahul.si...@anant.us > > Anant Corporation > > On Mar 7, 2018, 1:26 AM -0500, Eunsu Kim <eunsu.bil...@gmail.com>, wrote: >> Hello, >> >> I use 5 nodes to create a

Adding disk to operating C*

2018-03-06 Thread Eunsu Kim
Hello, I use 5 nodes to create a cluster of Cassandra. (SSD 1TB) I'm trying to mount an additional disk (SSD 1TB) on each node because the disk usage on each node is growing faster than I expected. Then I will add the directory to data_file_directories in cassandra.yaml. Can I get advice from who

if the heap size exceeds 32GB..

2018-02-12 Thread Eunsu Kim
https://www.elastic.co/guide/en/elasticsearch/guide/current/heap-sizing.html#compressed_oops According to the article above, if the heap size of the JVM is about 32GB, it is a waste of memory because
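The roughly 32GB threshold mentioned in the linked article can be illustrated with simple arithmetic: a compressed ordinary object pointer is a 32-bit value scaled by the object alignment, so the largest heap it can address is 2^32 times the alignment. The sketch below assumes the HotSpot default alignment of 8 bytes (that is, -XX:ObjectAlignmentInBytes left unchanged); the function name is hypothetical.

```python
# Illustrative arithmetic only, not a JVM tuning tool.
OOP_BITS = 32                # width of a compressed object pointer
OBJECT_ALIGNMENT_BYTES = 8   # assumption: HotSpot default object alignment
GIB = 2 ** 30

def max_compressed_oops_heap_bytes(alignment: int = OBJECT_ALIGNMENT_BYTES,
                                   bits: int = OOP_BITS) -> int:
    """Largest heap addressable with compressed pointers of the given width."""
    return (2 ** bits) * alignment

if __name__ == "__main__":
    limit = max_compressed_oops_heap_bytes()
    print(f"addressable with compressed oops: {limit / GIB} GiB")
```

In practice the usable cutoff tends to sit slightly below this figure, so it is worth checking on the actual JVM whether compressed oops are still enabled at the chosen heap size.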

Re: Even after the drop table, the data actually was not erased.

2018-01-14 Thread Eunsu Kim
- al...@thelastpickle.com > <mailto:al...@thelastpickle.com> > France / Spain > > The Last Pickle - Apache Cassandra Consulting > http://www.thelastpickle.com <http://www.thelastpickle.com/> > > > > 2018-01-12 7:14 GMT+00:00 Eunsu Kim <eunsu.bil...@gmail.co

Even after the drop table, the data actually was not erased.

2018-01-11 Thread Eunsu Kim
Hi everyone, On the development server, I dropped all the tables and even the keyspace in order to change the table schema. Then I recreated the keyspace and the table. However, the actual size of the data directory did not decrease at all. The disk load monitored by JMX has decreased. After

Re: default_time_to_live setting in time series data

2018-01-11 Thread Eunsu Kim
Thanks for the quick response. TWCS is used. > On 12 Jan 2018, at 11:38 AM, Jeff Jirsa <jji...@gmail.com> wrote: > > Probably not in any measurable way. > > -- > Jeff Jirsa > > >> On Jan 11, 2018, at 6:16 PM, Eunsu Kim <eunsu.bil...@gmail.c

default_time_to_live setting in time series data

2018-01-11 Thread Eunsu Kim
Hi everyone We are collecting monitoring data in excess of 100K TPS in Cassandra. All data is time series data and must have a TTL. Currently we have set default_time_to_live on the table. Does this have a negative impact on Cassandra throughput performance? Thank you in advance.

Re: How to get page id without transmitting data to client

2018-01-01 Thread Eunsu Kim
://docs.datastax.com/en/developer/java-driver/3.3/manual/paging/#saving-and-reusing-the-paging-state> > for storing and reusing the page id later, but not sure if that helps for > your particular use case. > > Thanks, > Andy > > On Thu, Dec 28, 2017 at 9:11 PM, Eunsu

How to get page id without transmitting data to client

2017-12-28 Thread Eunsu Kim
Hello everybody, I am using the DataStax Java driver (3.3.0). When querying large amounts of data, we set the fetch size (1) and transmit the data to the browser on a page-by-page basis. I am wondering if I can get the page id without receiving the real rows from Cassandra on my server.

about write performance

2017-12-07 Thread Eunsu Kim
There is a table with a timestamp as a clustering key, sorted in ASC order on that column. When inserting data into this table, is it better for insertion performance to insert in time order? Or does it not matter? Thank you.