Re: Cassandra2.0.14 : Obsolete files not being deleted after compaction

2020-02-06 Thread Laxmikant Upadhyay
Hi, just an update: we deleted the obsolete sstables and it worked fine. However, I am not able to find any Jira for the same issue. On Wed, Jan 22, 2020 at 3:58 PM manish khandelwal < manishkhandelwa...@gmail.com> wrote: > Thanks Jeff. > > There was no restart between "Compacting" and "Compacted"

Re: sstableloader: How much does it actually need?

2020-02-06 Thread manish khandelwal
Yes, you will have all the data in two nodes, provided there are no dropped mutations at node level or the data is repaired. For example, if you have data A, B, C and D with RF=3 and 4 nodes (node1, node2, node3 and node4): Data A is in node1, node2 and node3. Data B is in node2, node3 and node4. Data C is in node3,
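
The placement argument above can be checked with a small sketch. It assumes a simplified ring with one token range per node, where each range is replicated to the owning node and the next RF-1 nodes in order. The wrap-around placement for data C and D is an assumption, since the message is cut off; the helper names `replicas` and `any_subset_covers` are hypothetical and are not part of Cassandra or sstableloader. It also assumes every replica is complete (no dropped mutations, or the data has been repaired).

```python
from itertools import combinations


def replicas(nodes, rf):
    """Map each range (keyed by the index of its first owner) to its replica set.

    Assumption: range i lives on node i and the next rf-1 nodes, wrapping around.
    """
    n = len(nodes)
    return {i: {nodes[(i + k) % n] for k in range(rf)} for i in range(n)}


def any_subset_covers(nodes, rf, subset_size):
    """Return True if every subset of the given size holds a replica of every range."""
    placement = replicas(nodes, rf)
    return all(
        all(owners & set(subset) for owners in placement.values())
        for subset in combinations(nodes, subset_size)
    )


nodes = ["node1", "node2", "node3", "node4"]
placement = replicas(nodes, 3)

# Range 0 corresponds to "data A" and range 1 to "data B" in the message.
covers_with_two = any_subset_covers(nodes, 3, 2)
covers_with_one = any_subset_covers(nodes, 3, 1)
print(covers_with_two, covers_with_one)
```

Under these assumptions each range is missing from exactly one node, so any pair of nodes includes at least one replica of every range, while a single node does not.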

Re: Query timeouts after Cassandra Migration

2020-02-06 Thread Erick Ramirez
> > So do you advise copying tokens in such cases ? What procedure is > advisable ? > Specifically for your case with 3 nodes + RF=3, it won't make a difference so leave it as it is. > Latency increased on target cluster. > Have you tried to run a trace of the queries which are slow? It will

Re: Running select against cassandra

2020-02-06 Thread Erick Ramirez
> > Also is materialized view good for production? I agree with Sean's and Reid's sentiments about MVs. I still think of MVs as being experimental and not ready for primetime. I would wait for the improvements which may be coming in C* 4.0 but no promises there... yet. :) Cheers!

Re: Query timeouts after Cassandra Migration

2020-02-06 Thread Ankit Gadhiya
Thanks Erick. So do you advise copying tokens in such cases? What procedure is advisable? Latency increased on target cluster. I’d double check on storage disks but it should be same. -- Ankit On Thu, Feb 6, 2020 at 9:07 PM Erick Ramirez wrote: > I didn’t copy tokens since it’s an identical

Re: Query timeouts after Cassandra Migration

2020-02-06 Thread Erick Ramirez
> > I didn’t copy tokens since it’s an identical cluster and we have RF as 3 > on 3 node cluster. Is it still needed , why? > In C*, same number of nodes alone isn't enough. Clusters aren't really identical unless token assignments are the same. In your case though since each node has a full copy

Re: Running select against cassandra

2020-02-06 Thread Abdul Patel
Thanks all for the valuable inputs. I agree we need to have the query defined and then plan the table schema, but the server has been live in production for 2 years now and this is a new requirement, so changing the schema is not an option and a secondary index is also a bad idea. I was thinking to go with materialized view

Re: Nodes becoming unresponsive

2020-02-06 Thread Erick Ramirez
> > I tried to debug more and could see using top that Command is > MutationStage in top output. Any clue we get from this? > That just means there are lots of writes hitting your cluster. Without the thread dump, it would be difficult to know if the threads are blocked by futex_wait or whatever

Re: Nodes becoming unresponsive

2020-02-06 Thread Surbhi Gupta
I have limited options for using JDK-based tools because in our environment we are running a JRE. I tried to debug more and could see using top that the Command is MutationStage in the top output. Any clue we can get from this? top - 16:30:47 up 94 days, 5:33, 1 user, load average: 134.83, 142.48, 144.75

Re: [EXTERNAL] Re: Running select against cassandra

2020-02-06 Thread Reid Pinchback
I defer to Sean’s comment on materialized views. I’m more familiar with DynamoDB on that front, where you do this pretty routinely. I was curious so I went looking. This appears to be the C* Jira that points to many of the problem points: https://issues.apache.org/jira/browse/CASSANDRA-13826

RE: [EXTERNAL] Re: Running select against cassandra

2020-02-06 Thread Durity, Sean R
Reid is right. You build the tables to easily answer the queries you want. So, start with the query! I inferred a query for you based on what you mentioned. If my inference is wrong, the table structure is likely wrong, too. So, what kind of query do you want to run? (NOTE: a select count(*)

RE: [EXTERNAL] Re: Running select against cassandra

2020-02-06 Thread Durity, Sean R
From reports on this mailing list, I do not allow materialized views. Sean Durity From: Reid Pinchback Sent: Thursday, February 6, 2020 4:10 PM To: user@cassandra.apache.org Subject: Re: [EXTERNAL] Re: Running select against cassandra Abdul, When in doubt, have a query model that immediately

Re: [EXTERNAL] Re: Running select against cassandra

2020-02-06 Thread Reid Pinchback
Abdul, When in doubt, have a query model that immediately feeds you exactly what you are looking for. That’s kind of the data model philosophy that you want to shoot for as much as feasible with C*. The point of Sean’s table isn’t the similarity to yours, it is how he has it keyed because it

Re: [EXTERNAL] Re: Running select against cassandra

2020-02-06 Thread Abdul Patel
This is a schema similar to what we have. They want to get the connected-user (concurrent) count every 1 to 5 minutes, say. I am wondering whether a simple select will have performance issues, or whether we can go for materialized views? CREATE TABLE usr_session ( userid bigint, session_usr text,

Re: sstableloader: How much does it actually need?

2020-02-06 Thread Voytek Jarnot
Been thinking about it, and I can't really see how with 4 nodes and RF=3, any 2 nodes would *not* have all the data; but am more than willing to learn. On the other thing: that's an attractive option, but in our case, the target cluster will likely come into use before the source-cluster data is

RE: [EXTERNAL] Re: Running select against cassandra

2020-02-06 Thread Durity, Sean R
Do you only need the current count or do you want to keep the historical counts also? By active users, does that mean some kind of user that the application tracks (as opposed to the Cassandra user connected to the cluster)? I would consider a table like this for tracking active users through

Re: Running select against cassandra

2020-02-06 Thread Abdul Patel
Also, are materialized views good for production? We are on 3.11.4. On Thursday, February 6, 2020, Abdul Patel wrote: > It's sort of users connected; the app team needs a number of active users > connected, say every 1 to 5 mins. > The timeout at the app end is 120ms. > > > > On Thursday, February 6, 2020,

Re: Query timeouts after Cassandra Migration

2020-02-06 Thread Ankit Gadhiya
Hi Michael, Thanks for your response. I didn’t copy tokens since it’s an identical cluster and we have RF as 3 on 3 node cluster. Is it still needed, and why? I don’t see anything in the Cassandra log as such. I don’t have debug logging enabled. Thanks & Regards, Ankit On Thu, Feb 6, 2020 at 1:47 PM Michael

Re: Running select against cassandra

2020-02-06 Thread Abdul Patel
It's sort of users connected; the app team needs a number of active users connected, say every 1 to 5 mins. The timeout at the app end is 120ms. On Thursday, February 6, 2020, Michael Shuler wrote: > You'll have to be more specific. What is your table schema and what is the > SELECT query? What is the

Re: Query timeouts after Cassandra Migration

2020-02-06 Thread Michael Shuler
Did you copy the tokens from cluster1 to new cluster2? Same Cassandra version, same instance type/size? What do the logs say on cluster2 that looks different from the cluster1 norm? There are a number of possible `nodetool` utilities that may help see what is happening on new cluster2. Michael

Re: Running select against cassandra

2020-02-06 Thread Michael Shuler
You'll have to be more specific. What is your table schema and what is the SELECT query? What is the normal response time? As a basic guide for your general question, if the query is something sort of irrelevant that should be stored some other way, like a total row count, or most any SELECT

Running select against cassandra

2020-02-06 Thread Abdul Patel
Hi, is it advisable to run a select query every minute to grab data from Cassandra for reporting purposes? If not, what's the alternative?

Re: Nodes becoming unresponsive

2020-02-06 Thread Elliott Sims
Async-profiler (https://github.com/jvm-profiling-tools/async-profiler) flamegraphs can also be a really good tool to figure out the exact callgraph that's leading to the futex_wait, both in and out of the JVM.

Query timeouts after Cassandra Migration

2020-02-06 Thread Ankit Gadhiya
Hi Folks, I recently migrated Cassandra keyspace data from one Azure cluster (3 nodes) to another (3 nodes, different region) using a simple sstable copy. Since then, we are observing that overall response time has increased, with timeouts every 20 mins. Has anyone faced this in their experience? Do I