CDC and TTL

2018-06-18 Thread Joy Gao
Hi all! I recently started to look into the Cassandra CDC implementation. One question that occurred to me is how/if TTL is handled for CDC. For example, if I insert some data with TTL enabled and expiring in 60 seconds, will CDC be aware of these changes 60 seconds later when the TTL expires? If

Re: saving distinct data in cassandra result in many tombstones

2018-06-18 Thread onmstester onmstester
Two other questions: 1. How can I shard the partition key so that partitions end up on different nodes? 2. If I set gc_grace_seconds to 0, would the row be replaced in the memtable (so that repeated rows are not saved in SSTables), or would that happen at the first compaction?

Re: Write performance degradation

2018-06-18 Thread onmstester onmstester
I think that could have pinpointed the problem. I have a table with a partition key related to the timestamp, so for one hour a large amount of data is inserted at one single node. This table creates very big partitions (300MB-600MB); whatever node the current partition of that table would be inserted
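
The hot-partition problem described in this message is commonly addressed by adding a shard component to the time-based partition key, so that one hour of writes is spread over several partitions. The following is a minimal sketch of that idea, not the poster's schema: the function name, the `source_id` argument, and the shard count of 16 are all hypothetical. Because Cassandra places a partition by hashing its full partition key, partitions with different shard values for the same hour will generally fall on different token ranges.

```python
from datetime import datetime, timezone
from typing import Tuple
import zlib

# Hypothetical shard count; an assumption for illustration only.
NUM_SHARDS = 16


def partition_key(event_time: datetime, source_id: str,
                  num_shards: int = NUM_SHARDS) -> Tuple[str, int]:
    """Return (hour_bucket, shard) for use as a composite partition key."""
    # Bucket by UTC hour, mirroring the "one partition per hour" design.
    hour_bucket = event_time.astimezone(timezone.utc).strftime("%Y-%m-%dT%H")
    # crc32 is stable across processes, unlike Python's built-in hash() on str.
    shard = zlib.crc32(source_id.encode("utf-8")) % num_shards
    return hour_bucket, shard


if __name__ == "__main__":
    now = datetime(2018, 6, 18, 7, 30, tzinfo=timezone.utc)
    print(partition_key(now, "sensor-1"))
```

The trade-off is on the read side: fetching everything for one hour now means querying every shard value for that hour and merging the results.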

Re: saving distinct data in cassandra result in many tombstones

2018-06-18 Thread onmstester onmstester
Can I set gc_grace_seconds to 0 in this case? Reappearing deleted data has no impact on my business logic, because I'm either creating a new row or replacing exactly the same row. On Wed, 13 Jun 2018 03:41:51 +0430 Elliott Sims elli...@backblaze.com

Re: Timestamp on hints file and system.hints table data

2018-06-18 Thread kurt greaves
Send through some examples (and any errors)? Sounds like the file might be corrupt. Not that there's much you can do about that. You can try stopping C*, deleting the file, then starting C* again. You'll have to repair, assuming you haven't repaired already since that hint file was created. On 18

Re: RE: [EXTERNAL] Cluster is unbalanced

2018-06-18 Thread learner dba
Hi Sean, Are you using any rack aware topology? --> we are using gossip file What are your partition keys? --> Partition key is uniq Is it possible that your partition keys do not divide up as cleanly as you would like across

RE: [EXTERNAL] Cluster is unbalanced

2018-06-18 Thread Durity, Sean R
Are you using any rack aware topology? What are your partition keys? Is it possible that your partition keys do not divide up as cleanly as you would like across the cluster because the data is not evenly distributed (by partition key)? Sean Durity lord of the (C*) rings (Staff Systems

Cluster is unbalanced

2018-06-18 Thread learner dba
Hi, Data volume varies a lot in our two DC cluster:
Load       Tokens  Owns
20.01 GiB  256     ?
65.32 GiB  256     ?
60.09 GiB  256     ?
46.95 GiB  256     ?
50.73 GiB  256     ?
kaiprodv2 = /Leaving/Joining/Moving

Re: Timestamp on hints file and system.hints table data

2018-06-18 Thread learner dba
Yes Kurt, system log is flooded with hints sent and replayed messages.  On Monday, June 18, 2018, 7:30:34 AM EDT, kurt greaves wrote: Not sure what to make of that. Are there any log messages regarding the file and replaying hints? Sounds like maybe it's corrupt (although not sure why

Re: How do you monitoring Cassandra Cluster?

2018-06-18 Thread Felipe Esteves
Hi, everyone, I'm running some tests to monitor Cassandra 3.x with jmx_exporter + prometheus + grafana. I've managed to configure it all and use the dashboard https://grafana.com/dashboards/5408. However, I still can't aggregate metrics across my whole cluster, only for individual nodes. Any tips on how

Re: Write performance degradation

2018-06-18 Thread DuyHai Doan
Maybe the disk I/O cannot keep up with the high mutation rate? Check the number of pending compactions. On Sun, Jun 17, 2018 at 9:24 AM, onmstester onmstester wrote: > Hi, > > I was doing 500K inserts + 100K counter update in seconds on my cluster of > 12 nodes (20 core/128GB ram/4 * 600 HDD

Re: Timestamp on hints file and system.hints table data

2018-06-18 Thread kurt greaves
Not sure what to make of that. Are there any log messages regarding the file and replaying hints? Sounds like maybe it's corrupt (although not sure why it keeps getting rewritten). On 14 June 2018 at 13:19, Nitan Kainth wrote: > Kurt, > > Hint file matches UUID matches with another node in the

Re:

2018-06-18 Thread kurt greaves
> > 1) Am I correct to assume that the larger page size some user session has > set - the larger portion of cluster/coordinator node resources will be > hogged by the corresponding session? > 2) Do I understand correctly that page size (imagine we have no timeout > settings) is limited by RAM and

Re: C* data modeling for time series

2018-06-18 Thread mm
Hi, we're currently evaluating KairosDB for time series, which looks quite nice. https://kairosdb.github.io/ The cool thing with KairosDB is that it uses Cassandra as its storage engine and provides additional features (mainly a REST-based API for accessing data). Maybe you can take a look the

Re: C* data modeling for time series

2018-06-18 Thread Affan Syed
I have looked at this problem for a good year now. My feeling is that Cassandra alone as the sole underlying DB for time series just does not cut it. I am starting to look at C* along with another DB for executing the sort of queries we want here. Currently I am evaluating Druid vs Kudu to be this