Re: Materialized View's additional PrimaryKey column

2019-07-25 Thread mehmet bursali
Thank you again for the clear information, Jon! I give up. Sent from Yahoo Mail on Android. On Fri, 26 Jul 2019 at 0:53, Jon Haddad wrote: The issues I have with MVs aren't related to how they aren't correctly synchronized, although I'm not happy about that either. My

Re: Performance impact with ALLOW FILTERING clause.

2019-07-25 Thread Jon Haddad
If you're thinking about rewriting your data to be more performant when doing analytics, you might as well go the distance and put it in an analytics friendly format like Parquet. My 2 cents. On Thu, Jul 25, 2019 at 11:01 AM ZAIDI, ASAD A wrote: > Thank you all for your insights. > > > > When

Re: Materialized View's additional PrimaryKey column

2019-07-25 Thread Jon Haddad
The issues I have with MVs aren't related to how they aren't correctly synchronized, although I'm not happy about that either. My issue with them is that in every cluster I've seen that uses them, the cluster has been unstable, and I've put a lot of time into helping teams undo them. You will almost

RE: Performance impact with ALLOW FILTERING clause.

2019-07-25 Thread ZAIDI, ASAD A
Thank you all for your insights. When the spark-connector adds ALLOW FILTERING to a query, it makes the query just ‘run’, whether it is expensive for a larger table or not so expensive for a table with fewer rows. In my particular case, nodes are reaching 2 TB of load per node in a 50-node cluster.

Re: Dropped mutations

2019-07-25 Thread Ayub M
Thanks Jeff. Does internal mean local node operations (in this case, the mutation response from the local node), and does cross node mean the time it took to get a response back from other nodes, depending on the consistency level chosen? On Thu, Jul 25, 2019 at 11:51 AM Jeff Jirsa wrote: > This means your

Differing snitches in different datacenters

2019-07-25 Thread Voytek Jarnot
Quick and hopefully easy question for the list. Background: an existing cluster (1 DC) will be migrated to an AWS-hosted cluster by standing up a second datacenter; the existing cluster will subsequently be decommissioned. We currently use GossipingPropertyFileSnitch and are thinking about using

Re: Unable to integrate jmx_prometheus_javaagent

2019-07-25 Thread Marc Richter
Never mind, after hours of investigation, I found the solution myself just after sending the mail to the list ... Even though some resources on the web highlight the importance of wrapping what follows "-javaagent:" in double quotes (""), this seems to be the issue; note that the log complains about it

Unable to integrate jmx_prometheus_javaagent

2019-07-25 Thread Marc Richter
Hi everyone, I have an existing Cassandra node (3.7). Now, I'd like to be able to grab metrics from it for my Prometheus + Grafana based monitoring. I downloaded "jmx_prometheus_javaagent-0.12.0.jar" from [1], copied it to "/etc/cassandra/prometheus/jmx_prometheus_javaagent-0.12.0.jar". I

Re: Dropped mutations

2019-07-25 Thread Rajsekhar Mallick
Hello Jeff, Could you help me understand how to visualise these terms: 1. Internal mutations 2. Cross node mutations 3. Mean internal dropped latency 4. Cross node dropped latency Thanks, Rajsekhar On Thu, 25 Jul, 2019, 9:21 PM Jeff Jirsa, wrote: > This means your database is seeing commands that have

Re: Dropped mutations

2019-07-25 Thread Jeff Jirsa
This means your database is seeing commands that have already timed out by the time it goes to execute them, so it ignores them and gives up instead of working on work items that have already expired. The first log line shows 5 second latencies, the second line 6s and 8s latencies, which sounds

Dropped mutations

2019-07-25 Thread Ayub M
Hello, how do I read dropped mutations error messages - what's internal and cross node? Mutations fail on cross-node, while read_repair/read fail on internal. What does it mean? INFO [ScheduledTasks:1] 2019-07-21 11:44:46,150 MessagingService.java:1281 - MUTATION messages were dropped in

Re: Performance impact with ALLOW FILTERING clause.

2019-07-25 Thread Jeff Jirsa
"unpredictable" is such a loaded word. It's quite predictable, but it's often mispredicted by users. "ALLOW FILTERING" basically tells the database you're going to do a query that will require scanning a bunch of data to return some subset of it, and you're not able to provide a WHERE clause

Re: Performance impact with ALLOW FILTERING clause.

2019-07-25 Thread Jacques-Henri Berthemet
Hi Asad, That’s because of the way Spark works. Essentially, when you execute a Spark job, it pulls the full content of the datastore (Cassandra in your case) into its RDDs and works with it “in memory”. While Spark uses “data locality” to read data from the nodes that have the required data on

Performance impact with ALLOW FILTERING clause.

2019-07-25 Thread ZAIDI, ASAD A
Hello Folks, I was going through the documentation and saw in many places that ALLOW FILTERING causes performance unpredictability. Our developers say the ALLOW FILTERING clause is implicitly added to a bunch of queries by the spark-Cassandra connector and they cannot control it; however at the same time

Cassandra OutOfMemoryError

2019-07-25 Thread raman gugnani
Hi, I am using Apache Cassandra version: [cqlsh 5.0.1 | Cassandra 3.11.2 | CQL spec 3.4.4 | Native protocol v4] I am running a 5 node cluster and recently added one node to the cluster. The cluster is running with the G1 garbage collector with 16GB -Xmx. The cluster also has one materialized view;

Re: high write latency on a single table

2019-07-25 Thread mehmet bursali
Awesome! So we can investigate further by using the Cassandra exporter at this link: https://github.com/criteo/cassandra_exporter This exporter gives detailed information for read/write operations on each column by using the metrics below: org:apache:cassandra:metrics:columnfamily:.* (

Re: Materialized View's additional PrimaryKey column

2019-07-25 Thread mehmet bursali
Hi Jon, thanks for your suggestion (or warning :) ). Yes, I've read something about your point, and I know that, just because of using MVs, there are several issues open in JIRA on bootstrapping, compaction, and incremental repair. But after reading almost all JIRA tickets (with comments