Re: Materialized View's additional PrimaryKey column

2019-07-25 Thread mehmet bursali
Thank you again for the clear information, Jon! I give up.

Sent from Yahoo Mail on Android

On Fri, 26 Jul 2019 at 00:53, Jon Haddad wrote:
The issues I have with MVs aren't related to how they aren't correctly
synchronized, although I'm not happy about that either.  My issue with them is
that in every cluster I've seen that uses them, the cluster has been unstable, and
I've put a lot of time into helping teams undo them.  You will almost certainly 
have several hours or days of downtime as a result of using them.
There's a good reason they're marked as experimental (and disabled by default). 
 You should maintain the other tables yourself.
Jon


On Thu, Jul 25, 2019 at 12:22 AM mehmet bursali  
wrote:

Hi Jon, thanks for your suggestion (or warning :) ).
Yes, I've read something about your point, and I know that there really are
several issues open in JIRA on bootstrapping, compaction and incremental repair
caused purely by using MVs. But after reading almost all of the JIRA tickets
(with comments and history) related to MVs, as far as I understand, all of those
issues come from either losing synchronization between the base table and the MV
(by deleting columns or row values on the base table) or from having a huge
system with a large and dynamic number of nodes/data/workloads. We use version
3.11.3 and most of the critical issues were fixed in 3.10, but of course I might
be missing something, so I'll be glad if you point me to a specific JIRA ticket.
We have a certain use case that requires updates on filtering (clustering)
columns. Our motivation for using an MV was to avoid updates (delete + create)
on primary key columns, because we assume the Cassandra developers can manage
this less-preferred operation better than we can. I'm really confused now.
 

On Wednesday, July 24, 2019, 11:30:15 PM GMT+3, Jon Haddad 
 wrote:  
 
 I really, really advise against using MVs.  I've had to help a number of teams 
move off them.  Not sure what list of bugs you read, but if the list didn't 
include "will destabilize your cluster to the point of constant downtime" then 
the list was incomplete.
Jon
On Wed, Jul 24, 2019 at 6:32 AM mehmet bursali  
wrote:

+ additional info: our production environment is a multi-DC cluster that
consists of 6 nodes in 2 data centers


 

On Wednesday, July 24, 2019, 3:35:11 PM GMT+3, mehmet bursali 
 wrote:  
 
Hi Cassandra folks,
I'm planning to use Materialized Views (MV) in production for some specific
cases.  I've read a lot of blogs and technical documents about the risks of
using them, and everything seems OK for our use case.
My question is about the consistency (and durability) evaluation of MV usage
with an additional primary key column.  In one of our cases, we select a UDT
column of the base table as the additional primary key column of the MV. (The
UDT's possible values are non-nullable and restricted to a domain.) After
inserting a record into the base table, this additional column (the MV's
primary key column) will also be updated once or twice. So in our case, for
each update operation that occurs on the base table there will be delete and
create operations inside the MV.
Does it matter, from a consistency (and durability) perspective, whether the
additional primary key column is used as a partition column or as a clustering
column?
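For reference, a minimal CQL sketch of the kind of schema described above. All
keyspace, type, table and column names are made up, the replication settings
are placeholders, and it assumes cqlsh can reach a test node:

```
cqlsh <<'EOF'
CREATE KEYSPACE IF NOT EXISTS mv_demo
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
USE mv_demo;

CREATE TYPE IF NOT EXISTS delivery_state (code text, changed_at timestamp);

CREATE TABLE IF NOT EXISTS messages (
    conversation_id uuid,
    message_id      timeuuid,
    state           frozen<delivery_state>,   -- the UDT column in question
    body            text,
    PRIMARY KEY (conversation_id, message_id)
);

-- The view promotes "state" to a clustering column; making it part of the
-- partition key instead is the alternative the question asks about. Every
-- update of "state" on the base table becomes a delete + insert in the view.
CREATE MATERIALIZED VIEW IF NOT EXISTS messages_by_state AS
    SELECT * FROM messages
    WHERE conversation_id IS NOT NULL
      AND message_id IS NOT NULL
      AND state IS NOT NULL
    PRIMARY KEY (conversation_id, state, message_id);
EOF
```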

  
  
  


Re: Performance impact with ALLOW FILTERING clause.

2019-07-25 Thread Jon Haddad
If you're thinking about rewriting your data to be more performant when
doing analytics, you might as well go the distance and put it in an
analytics-friendly format like Parquet.  My 2 cents.

On Thu, Jul 25, 2019 at 11:01 AM ZAIDI, ASAD A  wrote:

> Thank you all for your insights.
>
>
>
> When the spark-connector adds ALLOW FILTERING to a query, it makes the query
> just ‘run’, no matter whether it is expensive (for a larger table) or not so
> expensive (for a table with fewer rows).
>
> In my particular case, nodes are reaching a 2 TB per-node load in a 50-node
> cluster. When a bunch of such queries run, they impact server
> resources.
>
>
>
> Since ALLOW FILTERING is an expensive operation, I’m trying to find knobs
> which, if I turn them, mitigate the impact.
>
>
>
> What I think, correct me if I am wrong, is that it is the query design itself
> which is not optimized for the table design, and that in turn causes the
> connector to add ALLOW FILTERING implicitly.  I’m not thinking of adding
> secondary indexes on tables because they have their own overheads.  Kindly
> share if there are other means we can use to influence the connector not to
> use ALLOW FILTERING.
>
>
>
> Thanks again.
>
> Asad
>
>
>
>
>
>
>
> *From:* Jeff Jirsa [mailto:jji...@gmail.com]
> *Sent:* Thursday, July 25, 2019 10:24 AM
> *To:* cassandra 
> *Subject:* Re: Performance impact with ALLOW FILTERING clause.
>
>
>
> "unpredictable" is such a loaded word. It's quite predictable, but it's
> often mispredicted by users.
>
>
>
> "ALLOW FILTERING" basically tells the database you're going to do a query
> that will require scanning a bunch of data to return some subset of it, and
> you're not able to provide a WHERE clause that's sufficiently fine grained
> to avoid the scan. It's a loose equivalent of doing a full table scan in
> SQL databases - sometimes it's a valid use case, but it's expensive, you're
> ignoring all of the indexes, and you're going to do a lot more work.
>
>
>
> It's predictable, though - you're probably going to walk over some range
> of data. Spark is grabbing all of the data to load into RDDs, and it
> probably does it by slicing up the range, doing a bunch of range scans.
>
>
>
> It's doing that so it can get ALL of the data and do the filtering /
> joining / searching in-memory in spark, rather than relying on cassandra to
> do the scanning/searching on disk.
>
>
>
> On Thu, Jul 25, 2019 at 6:49 AM ZAIDI, ASAD A  wrote:
>
> Hello Folks,
>
>
>
> I was going through the documentation and saw in many places that ALLOW
> FILTERING causes performance unpredictability.  Our developers say the ALLOW
> FILTERING clause is implicitly added to a bunch of queries by the
> spark-Cassandra connector and they cannot control it; at the same time we see
> unpredictability in application performance – just as the documentation says.
>
>
>
> I’m trying to understand why a connector would add a clause to a query when
> it can negatively impact database/application performance. Is it the data
> model that drives the connector to add ALLOW FILTERING to queries
> automatically, or are there other reasons this clause is added? I’m not a
> developer, but I want to know why developers don’t have any control over
> this.
>
>
>
> I’ll appreciate your guidance here.
>
>
>
> Thanks
>
> Asad
>
>
>
>
>
>


Re: Materialized View's additional PrimaryKey column

2019-07-25 Thread Jon Haddad
The issues I have with MVs aren't related to how they aren't correctly
synchronized, although I'm not happy about that either.  My issue with them is
that in every cluster I've seen that uses them, the cluster has been
unstable, and I've put a lot of time into helping teams undo them.  You
will almost certainly have several hours or days of downtime as a result of
using them.

There's a good reason they're marked as experimental (and disabled by
default).  You should maintain the other tables yourself.
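For what it's worth, a minimal sketch of what "maintain the other tables
yourself" can look like: a second, query-specific table written together with
the base row in a logged batch, with the application doing the delete + insert
that a view would otherwise do when the key column changes. All names and
values below are hypothetical:

```
cqlsh <<'EOF'
CREATE KEYSPACE IF NOT EXISTS manual_demo
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
USE manual_demo;
CREATE TABLE IF NOT EXISTS messages (
    conversation_id uuid, message_id timeuuid, state text, body text,
    PRIMARY KEY (conversation_id, message_id));
CREATE TABLE IF NOT EXISTS messages_by_state (
    conversation_id uuid, state text, message_id timeuuid, body text,
    PRIMARY KEY (conversation_id, state, message_id));

-- Initial write: the client generates one timeuuid and writes both copies in
-- a logged batch so the two tables stay in step (at the cost of a batchlog write).
BEGIN BATCH
  INSERT INTO messages (conversation_id, message_id, state, body)
    VALUES (11111111-1111-1111-1111-111111111111,
            50554d6e-29bb-11e5-b345-feff819cdc9f, 'R', 'hello');
  INSERT INTO messages_by_state (conversation_id, state, message_id, body)
    VALUES (11111111-1111-1111-1111-111111111111, 'R',
            50554d6e-29bb-11e5-b345-feff819cdc9f, 'hello');
APPLY BATCH;

-- State change: update the base row, delete the old query-table row, insert
-- the new one.
BEGIN BATCH
  UPDATE messages SET state = 'D'
    WHERE conversation_id = 11111111-1111-1111-1111-111111111111
      AND message_id = 50554d6e-29bb-11e5-b345-feff819cdc9f;
  DELETE FROM messages_by_state
    WHERE conversation_id = 11111111-1111-1111-1111-111111111111
      AND state = 'R' AND message_id = 50554d6e-29bb-11e5-b345-feff819cdc9f;
  INSERT INTO messages_by_state (conversation_id, state, message_id, body)
    VALUES (11111111-1111-1111-1111-111111111111, 'D',
            50554d6e-29bb-11e5-b345-feff819cdc9f, 'hello');
APPLY BATCH;
EOF
```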

Jon



On Thu, Jul 25, 2019 at 12:22 AM mehmet bursali 
wrote:

> Hi Jon, thanks for your suggestion (or warning :) ).
> Yes, I've read something about your point, and I know that there really are
> several issues open in JIRA on bootstrapping, compaction and incremental
> repair caused purely by using MVs. But after reading almost all of the JIRA
> tickets (with comments and history) related to MVs, as far as I understand,
> all of those issues come from either losing synchronization between the base
> table and the MV (by deleting columns or row values on the base table) or
> from having a huge system with a large and dynamic number of
> nodes/data/workloads. We use version 3.11.3 and most of the critical issues
> were fixed in 3.10, but of course I might be missing something, so I'll be
> glad if you point me to a specific JIRA ticket.
> We have a certain use case that requires updates on filtering (clustering)
> columns. Our motivation for using an MV was to avoid updates (delete +
> create) on primary key columns, because we assume the Cassandra developers
> can manage this less-preferred operation better than we can. I'm really
> confused now.
>
>
>
> On Wednesday, July 24, 2019, 11:30:15 PM GMT+3, Jon Haddad <
> j...@jonhaddad.com> wrote:
>
>
> I really, really advise against using MVs.  I've had to help a number of
> teams move off them.  Not sure what list of bugs you read, but if the list
> didn't include "will destabilize your cluster to the point of constant
> downtime" then the list was incomplete.
>
> Jon
>
> On Wed, Jul 24, 2019 at 6:32 AM mehmet bursali 
> wrote:
>
> + additional info: our production environment is a multi-DC cluster that
> consists of 6 nodes in 2 data centers
>
>
>
>
> On Wednesday, July 24, 2019, 3:35:11 PM GMT+3, mehmet bursali
>  wrote:
>
>
> Hi Cassandra folks,
> I'm planning to use Materialized Views (MV) in production for some specific
> cases.  I've read a lot of blogs and technical documents about the risks of
> using them, and everything seems OK for our use case.
> My question is about the consistency (and durability) evaluation of MV usage
> with an additional primary key column.  In one of our cases, we select a UDT
> column of the base table as the additional primary key column of the MV.
> (The UDT's possible values are non-nullable and restricted to a domain.)
> After inserting a record into the base table, this additional column (the
> MV's primary key column) will also be updated once or twice. So in our case,
> for each update operation that occurs on the base table there will be delete
> and create operations inside the MV.
> Does it matter, from a consistency (and durability) perspective, whether the
> additional primary key column is used as a partition column or as a
> clustering column?
>
>


RE: Performance impact with ALLOW FILTERING clause.

2019-07-25 Thread ZAIDI, ASAD A
Thank you all for your insights.

When the spark-connector adds ALLOW FILTERING to a query, it makes the query
just ‘run’, no matter whether it is expensive (for a larger table) or not so
expensive (for a table with fewer rows).
In my particular case, nodes are reaching a 2 TB per-node load in a 50-node
cluster. When a bunch of such queries run, they impact server resources.

Since ALLOW FILTERING is an expensive operation, I’m trying to find knobs
which, if I turn them, mitigate the impact.

What I think, correct me if I am wrong, is that it is the query design itself
which is not optimized for the table design, and that in turn causes the
connector to add ALLOW FILTERING implicitly.  I’m not thinking of adding
secondary indexes on tables because they have their own overheads.  Kindly
share if there are other means we can use to influence the connector not to
use ALLOW FILTERING.

Thanks again.
Asad



From: Jeff Jirsa [mailto:jji...@gmail.com]
Sent: Thursday, July 25, 2019 10:24 AM
To: cassandra 
Subject: Re: Performance impact with ALLOW FILTERING clause.

"unpredictable" is such a loaded word. It's quite predictable, but it's often 
mispredicted by users.

"ALLOW FILTERING" basically tells the database you're going to do a query that 
will require scanning a bunch of data to return some subset of it, and you're 
not able to provide a WHERE clause that's sufficiently fine grained to avoid 
the scan. It's a loose equivalent of doing a full table scan in SQL databases - 
sometimes it's a valid use case, but it's expensive, you're ignoring all of the 
indexes, and you're going to do a lot more work.

It's predictable, though - you're probably going to walk over some range of 
data. Spark is grabbing all of the data to load into RDDs, and it probably does 
it by slicing up the range, doing a bunch of range scans.

It's doing that so it can get ALL of the data and do the filtering / joining / 
searching in-memory in spark, rather than relying on cassandra to do the 
scanning/searching on disk.

On Thu, Jul 25, 2019 at 6:49 AM ZAIDI, ASAD A  wrote:
Hello Folks,

I was going through the documentation and saw in many places that ALLOW FILTERING
causes performance unpredictability.  Our developers say the ALLOW FILTERING
clause is implicitly added to a bunch of queries by the spark-Cassandra connector
and they cannot control it; at the same time we see unpredictability in
application performance – just as the documentation says.

I’m trying to understand why a connector would add a clause to a query when it
can negatively impact database/application performance. Is it the data model
that drives the connector to add ALLOW FILTERING to queries automatically, or
are there other reasons this clause is added? I’m not a developer, but I want to
know why developers don’t have any control over this.

I’ll appreciate your guidance here.

Thanks
Asad




Re: Dropped mutations

2019-07-25 Thread Ayub M
Thanks Jeff. Does internal mean local-node operations - in this case the
mutation response from the local node - and does cross-node mean the time it
took to get responses back from other nodes, depending on the consistency
level chosen?

On Thu, Jul 25, 2019 at 11:51 AM Jeff Jirsa  wrote:

> This means your database is seeing commands that have already timed out by
> the time it goes to execute them, so it ignores them and gives up instead
> of working on work items that have already expired.
>
> The first log line shows 5 second latencies, the second line 6s and 8s
> latencies, which sounds like either really bad disks or really bad JVM GC
> pauses.
>
>
> On Thu, Jul 25, 2019 at 8:45 AM Ayub M  wrote:
>
>> Hello, how do I read dropped mutation error messages - what's internal
>> and cross-node? For mutations it fails on cross-node and for read_repair/read
>> it fails on internal. What does it mean?
>>
>> INFO  [ScheduledTasks:1] 2019-07-21 11:44:46,150
>> MessagingService.java:1281 - MUTATION messages were dropped in last 5000
>> ms: 0 internal and 65 cross node. Mean internal dropped latency: 0 ms and
>> Mean cross-node dropped latency: 4966 ms
>> INFO  [ScheduledTasks:1] 2019-07-19 05:01:10,620
>> MessagingService.java:1281 - READ_REPAIR messages were dropped in last 5000
>> ms: 9 internal and 8 cross node. Mean internal dropped latency: 6013 ms and
>> Mean cross-node dropped latency: 8164 ms
>>
>> --
>>
>> Regards,
>> Ayub
>>
>

-- 
Regards,
Ayub


Differing snitches in different datacenters

2019-07-25 Thread Voytek Jarnot
Quick and hopefully easy question for the list. Background is existing
cluster (1 DC) will be migrated to AWS-hosted cluster via standing up a
second datacenter, existing cluster will be subsequently decommissioned.

We currently use GossipingPropertyFileSnitch and are thinking about using
Ec2MultiRegionSnitch in the new AWS DC - that'd position us nicely if in
the future we want to run a multi-DC cluster in AWS. My question is: are
there any issues with one DC using GossipingPropertyFileSnitch and the
other using Ec2MultiRegionSnitch? This setup would be temporary, existing
until the new DC nodes have rebuilt and the old DC is decommissioned.
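In case it's useful, a rough sketch of the two configurations side by side.
Values and paths are examples only, and the AWS naming assumes the usual
Ec2MultiRegionSnitch convention of region-as-DC and availability-zone-as-rack:

```
# Existing DC (GossipingPropertyFileSnitch): DC/rack names stay whatever they
# are today, defined per node in cassandra-rackdc.properties, e.g.:
cat /etc/cassandra/cassandra-rackdc.properties
# dc=DC1
# rack=RACK1

# New AWS DC (Ec2MultiRegionSnitch): DC/rack are derived from the region and
# availability zone (e.g. dc=us-east, rack=1a), so cassandra-rackdc.properties
# is not used there, but cassandra.yaml needs the public IP as
# broadcast_address and the private IP as listen_address:
grep -E '^(endpoint_snitch|listen_address|broadcast_address):' /etc/cassandra/cassandra.yaml
```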

Thanks,
Voytek Jarnot


Re: Unable to integrate jmx_prometheus_javaagent

2019-07-25 Thread Marc Richter

Nevermind,

after hours of investigation, I found the solution myself just after 
having the mail sent to the list ...


Even though some resources on the web highlight the importance of wrapping
what follows "-javaagent:" in double quotes, that quoting turned out to be the
issue; note that the log complains that it could not find:


"/etc/cassandra/prometheus/jmx_prometheus_javaagent-0.12.0.jar

Note the leading double-quote here ...

Removing the quotes makes it work like a charm.
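For the archive, the working form of the snippet is the same as before, just
without the inner quotes; the endpoint can then be checked once the node is
back up:

```
PROMETHEUS_AGENT='-javaagent:/etc/cassandra/prometheus/jmx_prometheus_javaagent-0.12.0.jar=7070:/etc/cassandra/prometheus/jmx_prometheus_javaagent_cassandra.yml'
JVM_OPTS="$JVM_OPTS $PROMETHEUS_AGENT"

# quick sanity check after restart
curl -s http://localhost:7070/metrics | head
```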

Sorry for bothering!

BR,
Marc

On 25.07.19 18:02, Marc Richter wrote:

Hi everyone,

I have an existing Cassandra node (3.7). Now, I'd like to be able to 
grab metrics from it for my Prometheus + Grafana based monitoring.


I downloaded "jmx_prometheus_javaagent-0.12.0.jar" from [1], copied it 
to "/etc/cassandra/prometheus/jmx_prometheus_javaagent-0.12.0.jar".
I also downloaded "cassandra.yml" from [2] and saved it to 
"/etc/cassandra/prometheus/jmx_prometheus_javaagent_cassandra.yml".

Next, I appended the following to my cassandra-env.sh:


```
PROMETHEUS_AGENT='-javaagent:"/etc/cassandra/prometheus/jmx_prometheus_javaagent-0.12.0.jar=7070:/etc/cassandra/prometheus/jmx_prometheus_javaagent_cassandra.yml"' 


JVM_OPTS="$JVM_OPTS $PROMETHEUS_AGENT"
```


When I now try to start my Cassandra node, it fails and writes this to 
my logfile:



```
Error opening zip file or JAR manifest missing : 
"/etc/cassandra/prometheus/jmx_prometheus_javaagent-0.12.0.jar

Error occurred during initialization of VM
agent library failed to init: instrument
```


I'm using the official Cassandra Docker image [3], tag 3.7.


I found the steps I did here in many online resources. I could not find 
any issue which matches what I'm facing.


Does anybody have an idea?

BR,
Marc


[1] 
https://repo1.maven.org/maven2/io/prometheus/jmx/jmx_prometheus_javaagent/0.12.0/jmx_prometheus_javaagent-0.12.0.jar 

[2] 
https://raw.githubusercontent.com/prometheus/jmx_exporter/master/example_configs/cassandra.yml 


[3] https://hub.docker.com/_/cassandra




Unable to integrate jmx_prometheus_javaagent

2019-07-25 Thread Marc Richter

Hi everyone,

I have an existing Cassandra node (3.7). Now, I'd like to be able to 
grab metrics from it for my Prometheus + Grafana based monitoring.


I downloaded "jmx_prometheus_javaagent-0.12.0.jar" from [1], copied it 
to "/etc/cassandra/prometheus/jmx_prometheus_javaagent-0.12.0.jar".
I also downloaded "cassandra.yml" from [2] and saved it to 
"/etc/cassandra/prometheus/jmx_prometheus_javaagent_cassandra.yml".

Next, I appended the following to my cassandra-env.sh:


```
PROMETHEUS_AGENT='-javaagent:"/etc/cassandra/prometheus/jmx_prometheus_javaagent-0.12.0.jar=7070:/etc/cassandra/prometheus/jmx_prometheus_javaagent_cassandra.yml"'
JVM_OPTS="$JVM_OPTS $PROMETHEUS_AGENT"
```


When I now try to start my Cassandra node, it fails and writes this to 
my logfile:



```
Error opening zip file or JAR manifest missing : 
"/etc/cassandra/prometheus/jmx_prometheus_javaagent-0.12.0.jar

Error occurred during initialization of VM
agent library failed to init: instrument
```


I'm using the official Cassandra Docker image [3], tag 3.7.


I found the steps I did here in many online resources. I could not find 
any issue which matches what I'm facing.


Does anybody have an idea?

BR,
Marc


[1] 
https://repo1.maven.org/maven2/io/prometheus/jmx/jmx_prometheus_javaagent/0.12.0/jmx_prometheus_javaagent-0.12.0.jar
[2] 
https://raw.githubusercontent.com/prometheus/jmx_exporter/master/example_configs/cassandra.yml

[3] https://hub.docker.com/_/cassandra




Re: Dropped mutations

2019-07-25 Thread Rajsekhar Mallick
Hello Jeff,

Request you to help on how to visualise the terms
1. Internal mutations
2. Cross node mutations
3. Mean internal dropped latency
4. Cross node dropped latency

Thanks,
Rajsekhar

On Thu, 25 Jul, 2019, 9:21 PM Jeff Jirsa,  wrote:

> This means your database is seeing commands that have already timed out by
> the time it goes to execute them, so it ignores them and gives up instead
> of working on work items that have already expired.
>
> The first log line shows 5 second latencies, the second line 6s and 8s
> latencies, which sounds like either really bad disks or really bad JVM GC
> pauses.
>
>
> On Thu, Jul 25, 2019 at 8:45 AM Ayub M  wrote:
>
>> Hello, how do I read dropped mutation error messages - what's internal
>> and cross-node? For mutations it fails on cross-node and for read_repair/read
>> it fails on internal. What does it mean?
>>
>> INFO  [ScheduledTasks:1] 2019-07-21 11:44:46,150
>> MessagingService.java:1281 - MUTATION messages were dropped in last 5000
>> ms: 0 internal and 65 cross node. Mean internal dropped latency: 0 ms and
>> Mean cross-node dropped latency: 4966 ms
>> INFO  [ScheduledTasks:1] 2019-07-19 05:01:10,620
>> MessagingService.java:1281 - READ_REPAIR messages were dropped in last 5000
>> ms: 9 internal and 8 cross node. Mean internal dropped latency: 6013 ms and
>> Mean cross-node dropped latency: 8164 ms
>>
>> --
>>
>> Regards,
>> Ayub
>>
>


Re: Dropped mutations

2019-07-25 Thread Jeff Jirsa
This means your database is seeing commands that have already timed out by
the time it goes to execute them, so it ignores them and gives up instead
of working on work items that have already expired.

The first log line shows 5 second latencies, the second line 6s and 8s
latencies, which sounds like either really bad disks or really bad JVM GC
pauses.
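A few standard commands that help tell those apart (the log path assumes a
default package install, and iostat needs the sysstat package):

```
nodetool tpstats | grep -A 20 'Message type'              # dropped counts per message type
grep 'GCInspector' /var/log/cassandra/system.log | tail   # long GC pauses logged by Cassandra
nodetool proxyhistograms                                   # coordinator-level read/write latencies
iostat -x 5 3                                              # disk utilization/await, to rule out slow disks
```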


On Thu, Jul 25, 2019 at 8:45 AM Ayub M  wrote:

> Hello, how do I read dropped mutation error messages - what's internal and
> cross-node? For mutations it fails on cross-node and for read_repair/read it
> fails on internal. What does it mean?
>
> INFO  [ScheduledTasks:1] 2019-07-21 11:44:46,150
> MessagingService.java:1281 - MUTATION messages were dropped in last 5000
> ms: 0 internal and 65 cross node. Mean internal dropped latency: 0 ms and
> Mean cross-node dropped latency: 4966 ms
> INFO  [ScheduledTasks:1] 2019-07-19 05:01:10,620
> MessagingService.java:1281 - READ_REPAIR messages were dropped in last 5000
> ms: 9 internal and 8 cross node. Mean internal dropped latency: 6013 ms and
> Mean cross-node dropped latency: 8164 ms
>
> --
>
> Regards,
> Ayub
>


Dropped mutations

2019-07-25 Thread Ayub M
Hello, how do I read dropped mutation error messages - what's internal and
cross-node? For mutations it fails on cross-node and for read_repair/read it
fails on internal. What does it mean?

INFO  [ScheduledTasks:1] 2019-07-21 11:44:46,150
MessagingService.java:1281 - MUTATION messages were dropped in last 5000
ms: 0 internal and 65 cross node. Mean internal dropped latency: 0 ms and
Mean cross-node dropped latency: 4966 ms
INFO  [ScheduledTasks:1] 2019-07-19 05:01:10,620
MessagingService.java:1281 - READ_REPAIR messages were dropped in last 5000
ms: 9 internal and 8 cross node. Mean internal dropped latency: 6013 ms and
Mean cross-node dropped latency: 8164 ms

-- 

Regards,
Ayub


Re: Performance impact with ALLOW FILTERING clause.

2019-07-25 Thread Jeff Jirsa
"unpredictable" is such a loaded word. It's quite predictable, but it's
often mispredicted by users.

"ALLOW FILTERING" basically tells the database you're going to do a query
that will require scanning a bunch of data to return some subset of it, and
you're not able to provide a WHERE clause that's sufficiently fine grained
to avoid the scan. It's a loose equivalent of doing a full table scan in
SQL databases - sometimes it's a valid use case, but it's expensive, you're
ignoring all of the indexes, and you're going to do a lot more work.

It's predictable, though - you're probably going to walk over some range of
data. Spark is grabbing all of the data to load into RDDs, and it probably
does it by slicing up the range, doing a bunch of range scans.

It's doing that so it can get ALL of the data and do the filtering /
joining / searching in-memory in spark, rather than relying on cassandra to
do the scanning/searching on disk.
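A small self-contained illustration of that difference, using a made-up table
against a test keyspace:

```
cqlsh <<'EOF'
CREATE KEYSPACE IF NOT EXISTS scan_demo
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
USE scan_demo;
CREATE TABLE IF NOT EXISTS orders (
    customer_id uuid, order_id timeuuid, status text,
    PRIMARY KEY (customer_id, order_id));

-- Restricted by partition key: touches a single partition, no scan.
SELECT * FROM orders
  WHERE customer_id = 11111111-1111-1111-1111-111111111111;

-- No partition key restriction: only runs if you accept a cluster-wide scan.
SELECT * FROM orders WHERE status = 'OPEN' ALLOW FILTERING;
EOF
```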

On Thu, Jul 25, 2019 at 6:49 AM ZAIDI, ASAD A  wrote:

> Hello Folks,
>
>
>
> I was going through the documentation and saw in many places that ALLOW
> FILTERING causes performance unpredictability.  Our developers say the ALLOW
> FILTERING clause is implicitly added to a bunch of queries by the
> spark-Cassandra connector and they cannot control it; at the same time we see
> unpredictability in application performance – just as the documentation says.
>
>
>
> I’m trying to understand why a connector would add a clause to a query when
> it can negatively impact database/application performance. Is it the data
> model that drives the connector to add ALLOW FILTERING to queries
> automatically, or are there other reasons this clause is added? I’m not a
> developer, but I want to know why developers don’t have any control over
> this.
>
>
>
> I’ll appreciate your guidance here.
>
>
>
> Thanks
>
> Asad
>
>
>
>
>


Re: Performance impact with ALLOW FILTERING clause.

2019-07-25 Thread Jacques-Henri Berthemet
Hi Asad,

That’s because of the way Spark works. Essentially, when you execute a Spark
job, it pulls the full content of the datastore (Cassandra in your case) into
its RDDs and works with it “in memory”. While Spark uses “data locality” to read
data from the nodes that have the required data on their local disks, it’s still
reading all data from the Cassandra tables. To do so it sends a ‘select * from
Table ALLOW FILTERING’ query to Cassandra.

From Spark you don’t have much control over the initial query that fills the
RDDs; sometimes you’ll read the whole table even if you only need one row.

Regards,
Jacques-Henri Berthemet

From: "ZAIDI, ASAD A" 
Reply to: "user@cassandra.apache.org" 
Date: Thursday 25 July 2019 at 15:49
To: "user@cassandra.apache.org" 
Subject: Performance impact with ALLOW FILTERING clause.

Hello Folks,

I was going through the documentation and saw in many places that ALLOW FILTERING
causes performance unpredictability.  Our developers say the ALLOW FILTERING
clause is implicitly added to a bunch of queries by the spark-Cassandra connector
and they cannot control it; at the same time we see unpredictability in
application performance – just as the documentation says.

I’m trying to understand why a connector would add a clause to a query when it
can negatively impact database/application performance. Is it the data model
that drives the connector to add ALLOW FILTERING to queries automatically, or
are there other reasons this clause is added? I’m not a developer, but I want to
know why developers don’t have any control over this.

I’ll appreciate your guidance here.

Thanks
Asad




Performance impact with ALLOW FILTERING clause.

2019-07-25 Thread ZAIDI, ASAD A
Hello Folks,

I was going through the documentation and saw in many places that ALLOW FILTERING
causes performance unpredictability.  Our developers say the ALLOW FILTERING
clause is implicitly added to a bunch of queries by the spark-Cassandra connector
and they cannot control it; at the same time we see unpredictability in
application performance – just as the documentation says.

I’m trying to understand why a connector would add a clause to a query when it
can negatively impact database/application performance. Is it the data model
that drives the connector to add ALLOW FILTERING to queries automatically, or
are there other reasons this clause is added? I’m not a developer, but I want to
know why developers don’t have any control over this.

I’ll appreciate your guidance here.

Thanks
Asad




Cassandra OutOfMemoryError

2019-07-25 Thread raman gugnani
Hi

I am using Apache Cassandra version:

[cqlsh 5.0.1 | Cassandra 3.11.2 | CQL spec 3.4.4 | Native protocol v4]


I am running a 5-node cluster and recently added one node to the cluster.
The cluster is running with the G1 garbage collector and a 16 GB -Xmx.
The cluster also has one materialized view.

On the newly added node I got an OutOfMemoryError.

Heap dump analysis shows the error below:

BatchlogTasks:1
  at java.lang.OutOfMemoryError.()V (OutOfMemoryError.java:48)
  at java.util.HashMap.resize()[Ljava/util/HashMap$Node; (HashMap.java:704)
  at 
java.util.HashMap.putVal(ILjava/lang/Object;Ljava/lang/Object;ZZ)Ljava/lang/Object;
(HashMap.java:663)
  at 
java.util.HashMap.put(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
(HashMap.java:612)
  at java.util.HashSet.add(Ljava/lang/Object;)Z (HashSet.java:220)
  at 
org.apache.cassandra.batchlog.BatchlogManager.finishAndClearBatches(Ljava/util/ArrayList;Ljava/util/Set;Ljava/util/Set;)V
(BatchlogManager.java:281)
  at 
org.apache.cassandra.batchlog.BatchlogManager.processBatchlogEntries(Lorg/apache/cassandra/cql3/UntypedResultSet;ILcom/google/common/util/concurrent/RateLimiter;)V
(BatchlogManager.java:261)
  at org.apache.cassandra.batchlog.BatchlogManager.replayFailedBatches()V
(BatchlogManager.java:210)
  at org.apache.cassandra.batchlog.BatchlogManager$$Lambda$269.run()V
(Unknown Source)
  at 
org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run()V
(DebuggableScheduledThreadPoolExecutor.java:118)
  at java.util.concurrent.Executors$RunnableAdapter.call()Ljava/lang/Object;
(Executors.java:511)
  at java.util.concurrent.FutureTask.runAndReset()Z (FutureTask.java:308)
  at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Ljava/util/concurrent/ScheduledThreadPoolExecutor$ScheduledFutureTask;)Z
(ScheduledThreadPoolExecutor.java:180)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run()V
(ScheduledThreadPoolExecutor.java:294)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V
(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run()V
(ThreadPoolExecutor.java:624)
  at 
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(Ljava/lang/Runnable;)V
(NamedThreadFactory.java:81)
  at org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$4.run()V
(Unknown Source)
  at java.lang.Thread.run()V (Thread.java:748)


I have found that the system.batches table has a huge amount of data on this node.

nodetool -u cassandra -pw cassandra tablestats system.batches -H
Total number of tables: 65

Keyspace : system
Read Count: 3990928
Read Latency: 0.07400208372589032 ms
Write Count: 4898771
Write Latency: 0.012194797838069997 ms
Pending Flushes: 0
Table: batches
SSTable count: 5
Space used (live): 50.89 GiB
Space used (total): 50.89 GiB
Space used by snapshots (total): 0 bytes
Off heap memory used (total): 1.05 GiB
SSTable Compression Ratio: 0.38778672943000886
Number of partitions (estimate): 727971046
Memtable cell count: 12
Memtable data size: 918 bytes
Memtable off heap memory used: 0 bytes
Memtable switch count: 10
Local read count: 0
Local read latency: NaN ms
Local write count: 618894
Local write latency: 0.010 ms
Pending flushes: 0
Percent repaired: 0.0
Bloom filter false positives: 0
Bloom filter false ratio: 0.0
Bloom filter space used: 906.25 MiB
Bloom filter off heap memory used: 906.25 MiB
Index summary off heap memory used: 155.86 MiB
Compression metadata off heap memory used: 10.6 MiB
Compacted partition minimum bytes: 30
Compacted partition maximum bytes: 258
Compacted partition mean bytes: 136
Average live cells per slice (last five minutes): 149.0
Maximum live cells per slice (last five minutes): 149
Average tombstones per slice (last five minutes): 1.0
Maximum tombstones per slice (last five minutes): 1
Dropped Mutations: 0 bytes


*Can someone please help - what could be the issue?*
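Two quick checks that may help narrow this down (paths and setting names below
assume a stock 3.11 package install; note that materialized view writes also go
through the batchlog, which may be relevant here):

```
nodetool tablestats system.batches                                 # size of the local batchlog (as above)
grep -n 'batchlog_replay_throttle' /etc/cassandra/cassandra.yaml   # replay throttle, default 1024 KB/s total
```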

-- 
Raman Gugnani


Re: high write latency on a single table

2019-07-25 Thread mehmet bursali
Awesome! So we can investigate further by using the Cassandra exporter at this
link: https://github.com/criteo/cassandra_exporter
This exporter gives detailed information on read/write operations for each
table (column family) by using the metrics below:
org:apache:cassandra:metrics:columnfamily:.*  (read from the table metrics in
Cassandra,
https://cassandra.apache.org/doc/latest/operating/metrics.html#table-metrics )
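For example (assuming the exporter's default HTTP port of 8080 and the table
name from earlier in this thread):

```
# pull the raw metrics once and keep only the latency series for the suspect
# table; exact metric names depend on the exporter configuration
curl -s http://localhost:8080/metrics | grep -i 'message_history_state' | grep -i 'latency'
```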








 

On Wednesday, July 24, 2019, 11:51:28 PM GMT+3, CPC  
wrote:  
 
 Hi Mehmet,
Yes prometheus and opscenter

On Wed, 24 Jul 2019 at 17:09, mehmet bursali  
wrote:

Hi, do you use any performance monitoring tool like Prometheus?


 

On Monday, July 22, 2019, 1:16:58 PM GMT+3, CPC  wrote:  
 
 Hi everybody,
The state column contains "R" or "D" values - just a single character. As
Rajsekhar said, the only difference is that the table can have a high cell
count. In the meantime we ran a major compaction and the data per node was 5-6 GB.
On Mon, Jul 22, 2019, 10:56 AM Rajsekhar Mallick  
wrote:

Hello Team,
The difference in write latencies between the two tables, though significant,
still leaves the higher latency (11.353 ms) at an acceptable level.
Writes overall are not an issue, but the higher write latency for this
particular table does point towards the data being written to it. One thing I
noticed is that the cell count column in the nodetool tablehistograms output
for the message_history_state table is scattered. The partition size histogram
for the tables is consistent, but the column count histogram for the impacted
table isn't uniform. Maybe we can start thinking along these lines.
I would also wait for some expert advice here.
Thanks

On Mon, 22 Jul 2019, 12:31 PM Ben Slater,  
wrote:

Is the size of the data in your “state” column variable? The higher write 
latencies at the 95%+ could line up with large volumes of data for particular 
rows in that column (the one column not in both tables)?
Cheers,
Ben

---
Ben Slater
Chief Product Officer, Instaclustr


On Mon, 22 Jul 2019 at 16:46, CPC  wrote:

Hi guys,
Any idea? I thought it might be a bug but could not find anything related on 
jira.
On Fri, Jul 19, 2019, 12:45 PM CPC  wrote:

Hi Rajsekhar,
Here are the details:
1)[cassadm@bipcas00 ~]$ nodetool tablestats tims.MESSAGE_HISTORY
Total number of tables: 259

Keyspace : tims
Read Count: 208256144
Read Latency: 7.655146714749506 ms
Write Count: 2218205275
Write Latency: 1.7826005103175133 ms
Pending Flushes: 0
Table: MESSAGE_HISTORY
SSTable count: 41
Space used (live): 976964101899
Space used (total): 976964101899
Space used by snapshots (total): 3070598526780
Off heap memory used (total): 185828820
SSTable Compression Ratio: 0.8219217809913125
Number of partitions (estimate): 8175715
Memtable cell count: 73124
Memtable data size: 26543733
Memtable off heap memory used: 27829672
Memtable switch count: 1607
Local read count: 7871917
Local read latency: 1.187 ms
Local write count: 172220954
Local write latency: 0.021 ms
Pending flushes: 0
Percent repaired: 0.0
Bloom filter false positives: 130
Bloom filter false ratio: 0.0
Bloom filter space used: 10898488
Bloom filter off heap memory used: 10898160
Index summary off heap memory used: 2480140
Compression metadata off heap memory used: 144620848
Compacted partition minimum bytes: 36
Compacted partition maximum bytes: 557074610
Compacted partition mean bytes: 155311
Average live cells per slice (last five minutes): 
25.56639344262295
Maximum live cells per slice (last five minutes): 5722
Average tombstones per slice (last five minutes): 
1.8681948424068768
Maximum tombstones per slice (last five minutes): 770
Dropped Mutations: 97812


[cassadm@bipcas00 ~]$ nodetool tablestats tims.MESSAGE_HISTORY_STATE
Total number of tables: 259

Keyspace : tims
Read Count: 208257486
Read Latency: 7.655137315414438 ms
Write Count: 2218218966
Write Latency: 1.7825896304427324 ms

Re: Materialized View's additional PrimaryKey column

2019-07-25 Thread mehmet bursali
Hi Jon, thanks for your suggestion (or warning :) ).
Yes, I've read something about your point, and I know that there really are
several issues open in JIRA on bootstrapping, compaction and incremental repair
caused purely by using MVs. But after reading almost all of the JIRA tickets
(with comments and history) related to MVs, as far as I understand, all of those
issues come from either losing synchronization between the base table and the MV
(by deleting columns or row values on the base table) or from having a huge
system with a large and dynamic number of nodes/data/workloads. We use version
3.11.3 and most of the critical issues were fixed in 3.10, but of course I might
be missing something, so I'll be glad if you point me to a specific JIRA ticket.
We have a certain use case that requires updates on filtering (clustering)
columns. Our motivation for using an MV was to avoid updates (delete + create)
on primary key columns, because we assume the Cassandra developers can manage
this less-preferred operation better than we can. I'm really confused now.
 

On Wednesday, July 24, 2019, 11:30:15 PM GMT+3, Jon Haddad 
 wrote:  
 
 I really, really advise against using MVs.  I've had to help a number of teams 
move off them.  Not sure what list of bugs you read, but if the list didn't 
include "will destabilize your cluster to the point of constant downtime" then 
the list was incomplete.
Jon
On Wed, Jul 24, 2019 at 6:32 AM mehmet bursali  
wrote:

+ additional info: our production environment is a multi-DC cluster that
consists of 6 nodes in 2 data centers


 

On Wednesday, July 24, 2019, 3:35:11 PM GMT+3, mehmet bursali 
 wrote:  
 
Hi Cassandra folks,
I'm planning to use Materialized Views (MV) in production for some specific
cases.  I've read a lot of blogs and technical documents about the risks of
using them, and everything seems OK for our use case.
My question is about the consistency (and durability) evaluation of MV usage
with an additional primary key column.  In one of our cases, we select a UDT
column of the base table as the additional primary key column of the MV. (The
UDT's possible values are non-nullable and restricted to a domain.) After
inserting a record into the base table, this additional column (the MV's
primary key column) will also be updated once or twice. So in our case, for
each update operation that occurs on the base table there will be delete and
create operations inside the MV.
Does it matter, from a consistency (and durability) perspective, whether the
additional primary key column is used as a partition column or as a clustering
column?