Re: sstablescrub fails with OOM

2017-11-03 Thread kurt greaves
Try running nodetool refresh, or restarting Cassandra, after removing the
corrupted file.

On 4 Nov. 2017 03:54, "Shashi Yachavaram"  wrote:

> When I tried to simulate this in the lab, I moved files (mv
> KS-CF-ka-10143-* /tmp/files).
>
> I ran repair but it fails during snapshot creation. Where does it get the
> list of files, and how do we update this list/index so we can get rid of
> the corrupted files and move on with an offline scrub/repair?
>
> Thanks
>
> shashi
>
>
> On Thu, Nov 2, 2017 at 10:33 PM, Jeff Jirsa  wrote:
>
>> This is not guaranteed to be safe
>>
>> If the corrupted sstable has a tombstone past gc grace, and another
>> sstable has shadowed deleted data, removing the corrupt sstable will cause
>> the data to come back to life, and repair will spread it around the ring
>>
>> If that’s problematic to you, you should consider the entire node failed,
>> run repair among the surviving replicas and then replace the down server
>>
>> If you don’t do deletes, and write with consistency higher than ONE,
>> there’s a bit less risk to removing a single sstable
>>
>>
>> --
>> Jeff Jirsa
>>
>>
>> On Nov 2, 2017, at 7:58 PM, sai krishnam raju potturi <
>> pskraj...@gmail.com> wrote:
>>
>> Yes. Move the corrupt sstable, and run a repair on this node, so that it
>> gets in sync with its peers.
>>
>> On Thu, Nov 2, 2017 at 6:12 PM, Shashi Yachavaram 
>> wrote:
>>
>>> We are on Cassandra 2.0.17 and have corrupted sstables. We ran offline
>>> sstablescrub but it fails with OOM; increasing MAX_HEAP_SIZE to 8G did
>>> not help.
>>>
>>> Can we move the corrupted sstable file and rerun sstablescrub, followed
>>> by repair?
>>>
>>> -shashi..
>>>
>>
>>
>
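Jeff's resurrection warning can be illustrated with a toy merge model (purely illustrative, not Cassandra internals): reads merge sstables by timestamp, a tombstone shadows older data, and removing the sstable that happens to hold the tombstone brings the deleted value back.

```python
# Toy model of sstable merge-on-read (illustrative only, not Cassandra code).
# Each sstable maps key -> (timestamp, value); value None represents a tombstone.

def read(sstables, key):
    """Return the live value for key after merging all sstables by timestamp."""
    cells = [t[key] for t in sstables if key in t]
    if not cells:
        return None
    _, value = max(cells, key=lambda c: c[0])  # newest timestamp wins
    return value                               # None means the row is deleted

old_data  = {"k": (100, "stale-value")}   # older sstable with the original row
tombstone = {"k": (200, None)}            # newer sstable deleting the row

# With the tombstone present, the row is (correctly) gone:
assert read([old_data, tombstone], "k") is None

# Remove the "corrupt" sstable that happens to hold the tombstone,
# and the shadowed data comes back to life:
assert read([old_data], "k") == "stale-value"
```

Repair would then spread the resurrected value to the other replicas, which is why treating the whole node as failed is the safer option when deletes are in play.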


Unexpected rows in MV after upgrading to 3.0.15

2017-11-03 Thread Tom van der Woerdt
Hello,

While testing 3.0.15, we noticed that some materialized views started
showing rows that shouldn't exist, as multiple rows in the view map to a
single row in the base table.

I've pasted the table structure below, but essentially there's a base table
"((pk1,pk2,pk3),ck1),col1" and MV "((pk1,pk2,pk3),col1,ck1)". This means
that if col1 changes, we expect a delete and insert on the MV. And yet this
happens:

> select col1, ck1, dateof(col1) FROM view_1 where pk1='abc' and pk2='123' and pk3='def';

 col1                                 | ck1                                  | system.dateof(col1)
--------------------------------------+--------------------------------------+------------------------------
 7bd437d9-bccc-11e7-9748-40749b41c1e0 | 295eae9b-d544-4064-8dbc-0c56772759f3 | 2017-10-29 17:13:29.494000+
 df39e364-bed3-11e7-8a3d-953c29bf01ff | 295eae9b-d544-4064-8dbc-0c56772759f3 | 2017-11-01 07:11:25.057000+
 928980ae-bed5-11e7-8b41-6e709b16923d | 295eae9b-d544-4064-8dbc-0c56772759f3 | 2017-11-01 07:23:35.388000+

 # Only relevant rows are shown

> select col1, writetime(col1), dateof(col1) from table_1 where pk1='abc' and pk2='123' and pk3='def' and ck1='295eae9b-d544-4064-8dbc-0c56772759f3';

 col1                                 | writetime(col1)  | system.dateof(col1)
--------------------------------------+------------------+------------------------------
 928980ae-bed5-11e7-8b41-6e709b16923d | 1509728864328000 | 2017-11-01 07:23:35.388000+

It's not supposed to be possible, and yet there are three rows that all map
onto the same primary key in the base table.

The cluster was upgraded on 2017-10-31, so the first row could *maybe* be
explained by CASSANDRA-11500, but the second row can't. The third row is
the one we expect to be there.

Is this a new regression in 3.0.15? Is anyone else experiencing this, or
should I file a ticket?

Thanks,
Tom


--- Full structure: -

CREATE TABLE the_keyspace.table_1 (
pk1 ascii,
pk2 ascii,
pk3 ascii,
ck1 ascii,
col1 timeuuid,
PRIMARY KEY ((pk1, pk2, pk3), ck1)
) WITH CLUSTERING ORDER BY (ck1 ASC)
AND bloom_filter_fp_chance = 0.1
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class':
'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
AND compression = {'chunk_length_in_kb': '64', 'class':
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';

CREATE MATERIALIZED VIEW the_keyspace.view_1 AS
SELECT *
FROM the_keyspace.table_1
WHERE pk1 IS NOT NULL AND pk2 IS NOT NULL AND pk3 IS NOT NULL AND col1
IS NOT NULL AND ck1 IS NOT NULL
PRIMARY KEY ((pk1, pk2, pk3), col1, ck1)
WITH CLUSTERING ORDER BY (col1 ASC, ck1 ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class':
'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
AND compression = {'chunk_length_in_kb': '64', 'class':
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';


Tom van der Woerdt
Site Reliability Engineer

Booking.com B.V.
Vijzelstraat 66-80 Amsterdam 1017HL Netherlands
The world's #1 accommodation site
43 languages, 204+ offices worldwide, 118,000+ global destinations,
1,500,000+ room nights booked every day
No booking fees, best price always guaranteed
Subsidiary of the Priceline Group (NASDAQ: PCLN)
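The maintenance Tom expects (a change to col1 should delete the old view row and insert the new one) can be written as a toy invariant check. The model below is illustrative only, not Cassandra's view-update code:

```python
# Toy base-table + materialized-view maintenance (illustrative only).
# Base PK: (pk, ck) -> col1; view PK: (pk, col1, ck).

base = {}
view = set()

def upsert(pk, ck, col1):
    old = base.get((pk, ck))
    if old is not None:
        view.discard((pk, old, ck))   # delete the old view row...
    base[(pk, ck)] = col1
    view.add((pk, col1, ck))          # ...and insert the new one

upsert("abc", "ck-1", "uuid-1")
upsert("abc", "ck-1", "uuid-2")
upsert("abc", "ck-1", "uuid-3")

# Invariant: exactly one view row per base row. The bug report shows three
# view rows surviving for a single base row, violating this.
assert len(view) == len(base) == 1
assert ("abc", "uuid-3", "ck-1") in view
```

If the delete of the old view row is lost or applied with a losing timestamp, stale view rows accumulate, which matches the symptom described above.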


Re: Cassandra using a ton of native memory

2017-11-03 Thread DuyHai Doan
An 8GB heap is a recommended production setting for most workloads out
there. With only 16GB of RAM, and given that Cassandra relies heavily on
the system page cache, it should be no surprise that your 16GB is being
eaten up.

On Fri, Nov 3, 2017 at 5:40 PM, Austin Sharp  wrote:

> I’ve investigated further. It appears that the performance issues are
> because Cassandra’s memory-mapped files (*.db files) fill up the physical
> memory and start being swapped to disk. Is this related to recommendations
> to disable swapping on a machine where Cassandra is installed? Should I
> disable memory-mapped IO?
>
> I can see issues in JIRA related to Windows memory-mapped I/O but they all
> appear to be fixed prior to 3.11.
>
>
>
> *From:* Austin Sharp [mailto:austin.sh...@seeq.com]
> *Sent:* Thursday, November 2, 2017 17:51
> *To:* user@cassandra.apache.org
> *Subject:* Cassandra using a ton of native memory
>
>
>
> Hi,
>
>
>
> I have a problem with Cassandra 3.11.0 on Windows. I'm testing a workload
> with a lot of read-then-writes that had no significant problems on
> Cassandra 2.x. However, now when this workload continues for a while
> (perhaps an hour), Cassandra or its JVM effectively uses up all of the
> machine's 16GB of memory. Cassandra is started with -Xmx2147M, and JMX
> shows <2GB heap memory and <100MB of off-heap memory. However, when I use
> something like Process Explorer, I see that Cassandra has 10 to 11GB of
> memory in its working set, and Windows shows essentially no free memory
> at all. Once the system has no free memory, other processes suffer long
> sequences of unresponsiveness.
>
> I can't see anything terribly wrong from JMX metrics or log files - they
> never show more than 1GB of non-heap memory. Where should I look to
> investigate this further?
>
>
>
> Thanks,
>
> Austin
>
>
>


Re: sstablescrub fails with OOM

2017-11-03 Thread Shashi Yachavaram
When I tried to simulate this in the lab, I moved files (mv
KS-CF-ka-10143-* /tmp/files).

I ran repair but it fails during snapshot creation. Where does it get the
list of files, and how do we update this list/index so we can get rid of
the corrupted files and move on with an offline scrub/repair?

Thanks

shashi


On Thu, Nov 2, 2017 at 10:33 PM, Jeff Jirsa  wrote:

> This is not guaranteed to be safe
>
> If the corrupted sstable has a tombstone past gc grace, and another
> sstable has shadowed deleted data, removing the corrupt sstable will cause
> the data to come back to life, and repair will spread it around the ring
>
> If that’s problematic to you, you should consider the entire node failed,
> run repair among the surviving replicas and then replace the down server
>
> If you don’t do deletes, and write with consistency higher than ONE,
> there’s a bit less risk to removing a single sstable
>
>
> --
> Jeff Jirsa
>
>
> On Nov 2, 2017, at 7:58 PM, sai krishnam raju potturi 
> wrote:
>
> Yes. Move the corrupt sstable, and run a repair on this node, so that it
> gets in sync with its peers.
>
> On Thu, Nov 2, 2017 at 6:12 PM, Shashi Yachavaram 
> wrote:
>
>> We are on Cassandra 2.0.17 and have corrupted sstables. We ran offline
>> sstablescrub but it fails with OOM; increasing MAX_HEAP_SIZE to 8G did
>> not help.
>>
>> Can we move the corrupted sstable file and rerun sstablescrub, followed
>> by repair?
>>
>> -shashi..
>>
>
>


RE: Cassandra using a ton of native memory

2017-11-03 Thread Austin Sharp
I've investigated further. It appears that the performance issues are because 
Cassandra's memory-mapped files (*.db files) fill up the physical memory and 
start being swapped to disk. Is this related to recommendations to disable 
swapping on a machine where Cassandra is installed? Should I disable 
memory-mapped IO?

I can see issues in JIRA related to Windows memory-mapped I/O but they all 
appear to be fixed prior to 3.11.

From: Austin Sharp [mailto:austin.sh...@seeq.com]
Sent: Thursday, November 2, 2017 17:51
To: user@cassandra.apache.org
Subject: Cassandra using a ton of native memory


Hi,



I have a problem with Cassandra 3.11.0 on Windows. I'm testing a workload
with a lot of read-then-writes that had no significant problems on
Cassandra 2.x. However, now when this workload continues for a while
(perhaps an hour), Cassandra or its JVM effectively uses up all of the
machine's 16GB of memory. Cassandra is started with -Xmx2147M, and JMX
shows <2GB heap memory and <100MB of off-heap memory. However, when I use
something like Process Explorer, I see that Cassandra has 10 to 11GB of
memory in its working set, and Windows shows essentially no free memory at
all. Once the system has no free memory, other processes suffer long
sequences of unresponsiveness.



I can't see anything terribly wrong from JMX metrics or log files - they
never show more than 1GB of non-heap memory. Where should I look to
investigate this further?



Thanks,

Austin



Re: system_auth permissions issue C* 2.0.14

2017-11-03 Thread pabbireddy avinash
Hi
We are seeing this issue on some nodes: even when we provide the correct
credentials we get an incorrect username/password exception, and when we
try again with the same credentials we are able to log in.

[hostname ~ ]$ ./cqlsh -u  -p 

Traceback (most recent call last):
  File "/opt/xcal/apps/cassandra/bin/cqlsh", line 2094, in 
main(*read_options(sys.argv[1:], os.environ))
  File "/opt/xcal/apps/cassandra/bin/cqlsh", line 2077, in main
single_statement=options.execute)
  File "/opt/xcal/apps/cassandra/bin/cqlsh", line 492, in __init__
password=password, cql_version=cqlver, transport=transport)
  File
"/opt/xcal/apps/apache-cassandra-2.0.14/bin/../lib/cql-internal-only-1.4.1.zip/cql-1.4.1/cql/connection.py",
line 143, in connect
  File
"/opt/xcal/apps/apache-cassandra-2.0.14/bin/../lib/cql-internal-only-1.4.1.zip/cql-1.4.1/cql/connection.py",
line 59, in __init__
  File
"/opt/xcal/apps/apache-cassandra-2.0.14/bin/../lib/cql-internal-only-1.4.1.zip/cql-1.4.1/cql/thrifteries.py",
line 157, in establish_connection
  File
"/opt/xcal/apps/apache-cassandra-2.0.14/bin/../lib/cql-internal-only-1.4.1.zip/cql-1.4.1/cql/cassandra/Cassandra.py",
line 465, in login
  File
"/opt/xcal/apps/apache-cassandra-2.0.14/bin/../lib/cql-internal-only-1.4.1.zip/cql-1.4.1/cql/cassandra/Cassandra.py",
line 486, in recv_login
cql.cassandra.ttypes.AuthenticationException:
AuthenticationException(why='Username and/or password are incorrect')
[hostname ~ ]$ ./cqlsh -u  -p 
Connected to Cluster at host:9160.
[cqlsh 4.1.1 | Cassandra 2.0.14 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
Use HELP for help.
cqlsh> exit;

Regards,
Avinash.


On Fri, Nov 3, 2017 at 11:05 AM, pabbireddy avinash <
pabbireddyavin...@gmail.com> wrote:

> Hi,
> We are seeing system_auth related exceptions from application side on
> cassandra 2.0.14 .
>
>
> at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
> [jersey-common-2.14.jar:na]
> at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
> [jersey-common-2.14.jar:na]
> at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
> [jersey-common-2.14.jar:na]
> ... 33 lines omitted ...
> Caused by: com.datastax.driver.core.exceptions.UnauthorizedException: User
> has no MODIFY permission on  parents
> at 
> com.datastax.driver.core.Responses$Error.asException(Responses.java:101)
> ~[cassandra-driver-core-2.1.7.jar:na]
>
> When we check permissions on all the hosts we did not find any issues; all
> the nodes have MODIFY and SELECT permissions for the user. We repaired
> system_auth on all the nodes but we are still seeing this issue from time
> to time. We have RF= so that all nodes will have
> system_auth data.
>
>
> Please help me understand this issue.
>
> Regards,
> Avinash.
>
>
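For what it's worth, intermittent failures with correct credentials are consistent with a stale system_auth replica: auth lookups for non-default users are typically read at a low consistency level, so which replica answers determines the outcome. A simplified sketch of that hypothesis (not actual driver or server code):

```python
# Toy model of system_auth reads at consistency ONE (illustrative, simplified).
# Three replicas hold the credentials table; one replica is missing the row,
# e.g. because it was never repaired after the user was created.
replicas = [
    {"user1": "hash"},   # up to date
    {"user1": "hash"},   # up to date
    {},                  # stale replica missing the credential row
]

def login(user, pw_hash, replica):
    # An auth read at CL ONE consults a single replica.
    return replica.get(user) == pw_hash

outcomes = [login("user1", "hash", r) for r in replicas]
# Same credentials, different outcomes depending on which replica answers:
assert outcomes == [True, True, False]
```

Under this hypothesis, repairing system_auth only helps if all replicas were actually up during the repair; a node that was down keeps serving the stale view afterwards.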


Re: What is OneMinuteRate in Write Latency?

2017-11-03 Thread Chris Lohfink
It's from the Metrics library Meter object, which tracks the exponentially
weighted moving average of events.

Chris

On Thu, Nov 2, 2017 at 12:10 PM, AI Rumman  wrote:

> Hi,
>
> I am trying to calculate the Read/second and Write/Second in my Cassandra
> 2.1 cluster. After searching and reading, I came to know about JMX bean
> "org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Latency".
> Here I can see OneMinuteRate. I have started a brand new cluster and
> started collecting these metrics from 0.
> When I started my first record, I can see
>
> Count = 1
>> OneMinuteRate = 0.01599111...
>
>
> Does it mean that my write/s is 0.0159911? Or does it mean that based on 1
> minute data, my write latency is 0.01599 where Write Latency refers to the
> response time for writing a record?
>
> Please help me understand the value.
>
> Thanks.
>
>
>
>
>
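A minimal sketch of how such an exponentially weighted moving average is maintained (simplified from the Metrics library; the 5-second tick and one-minute decay constant are the usual defaults, assumed here). The rate is events per second, not a latency:

```python
import math

INTERVAL = 5.0                           # seconds between ticks (assumed default)
ALPHA = 1 - math.exp(-INTERVAL / 60.0)   # one-minute decay factor

class OneMinuteEWMA:
    def __init__(self):
        self.rate = None      # events per second; None until the first tick
        self.uncounted = 0
    def mark(self, n=1):      # called on each event (e.g. each write)
        self.uncounted += n
    def tick(self):           # called every 5 seconds
        instant = self.uncounted / INTERVAL
        self.uncounted = 0
        if self.rate is None:
            self.rate = instant                    # first tick seeds the rate
        else:
            self.rate += ALPHA * (instant - self.rate)

m = OneMinuteEWMA()
m.mark(10)                # 10 writes in the first 5-second window
m.tick()
assert abs(m.rate - 2.0) < 1e-9   # 10 events / 5 s = 2 writes/s

m.tick()                  # a quiet window decays the rate toward zero
assert 0 < m.rate < 2.0
```

So an OneMinuteRate of 0.01599 after a single write means roughly 0.016 writes/s smoothed over the last minute; it says nothing about response time.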


system_auth permissions issue C* 2.0.14

2017-11-03 Thread pabbireddy avinash
Hi,
We are seeing system_auth-related exceptions from the application side on
Cassandra 2.0.14.


at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
[jersey-common-2.14.jar:na]
at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
[jersey-common-2.14.jar:na]
at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
[jersey-common-2.14.jar:na]
... 33 lines omitted ...
Caused by: com.datastax.driver.core.exceptions.UnauthorizedException: User
has no MODIFY permission on  parents
at com.datastax.driver.core.Responses$Error.asException(Responses.java:101)
~[cassandra-driver-core-2.1.7.jar:na]

When we check permissions on all the hosts we did not find any issues; all
the nodes have MODIFY and SELECT permissions for the user. We repaired
system_auth on all the nodes but we are still seeing this issue from time
to time. We have RF= so that all nodes will have
system_auth data.


Please help me understand this issue.

Regards,
Avinash.


Questions about maintaining secondary indexes.

2017-11-03 Thread Razi Khaja
Hello,

I have a few questions about secondary indexes.


1st Question:

Quoting this FAQ: https://wiki.apache.org/cassandra/SecondaryIndexes

Q: When you write a new row, when/how does the index get updated? What I
> would like to know is the atomicity of the operation--is the "index write"
> part of the "row write"?
>
> A: The row and index updates are one, atomic operation.


Suppose that I have created a table and a secondary index. Since a delete
is considered a write, if I delete rows in my table, is it correct that the
index would automatically be changed as well?

-
2nd Question:

Quoting this documentation:
https://docs.datastax.com/en/cql/3.1/cql/ddl/ddl_primary_index_c.html

The index indexes column values in a separate, hidden table from the one
> that contains the values being indexed.


I'm wondering: under what situations is it necessary to run *nodetool
rebuild_index*?

Does it need to be run in order to evict tombstones in the hidden table? I
would think that if the secondary index is stored in a Cassandra table,
whether hidden or not, *repair* should handle evicting tombstones. My
guess is that *rebuild_index*, like *repair*, needs to be run in the event
of prolonged network outages or downed nodes, but should also be run
regularly. This is confusing, though: if *repair* were to write data to a
node, wouldn't the "*row and index updates occur in one, atomic
operation*"? I need clarification on the reasons to run *rebuild_index*.

Thank you and best regards,
-Razi
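On the first question, the atomic row+index update can be modeled with a toy table whose index is maintained inside the same write operation; a delete is just another write, so the index follows it. Illustrative only - the real hidden index table works differently:

```python
# Toy table with a secondary index on "status", updated in the same
# operation as the row write (illustrative only).

rows = {}     # row_key -> {"status": ...}
index = {}    # status value -> set of row keys

def delete(key):
    old = rows.pop(key, None)
    if old is not None:
        index[old["status"]].discard(key)  # index changes with the row delete

def write(key, status):
    delete(key)                            # clear any previous index entry
    rows[key] = {"status": status}
    index.setdefault(status, set()).add(key)

write("r1", "active")
write("r2", "active")
delete("r1")                               # a delete is a write: index follows
assert index["active"] == {"r2"}
```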


Re: What is OneMinuteRate in Write Latency?

2017-11-03 Thread Nicolas Guyomar
Hi,

OneMinuteRate is the mean rate of writes/s over a one-minute bucket of
data, AFAIK.

You can find latencies on every attribute whose name does not end with
"Rate".



On 2 November 2017 at 18:10, AI Rumman  wrote:

> Hi,
>
> I am trying to calculate the Read/second and Write/Second in my Cassandra
> 2.1 cluster. After searching and reading, I came to know about JMX bean
> "org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Latency".
> Here I can see OneMinuteRate. I have started a brand new cluster and
> started collecting these metrics from 0.
> When I started my first record, I can see
>
> Count = 1
>> OneMinuteRate = 0.01599111...
>
>
> Does it mean that my write/s is 0.0159911? Or does it mean that based on 1
> minute data, my write latency is 0.01599 where Write Latency refers to the
> response time for writing a record?
>
> Please help me understand the value.
>
> Thanks.
>
>
>
>
>


cassandra.yaml configuration for large machines (scale up vs. scale out)

2017-11-03 Thread Steinmaurer, Thomas
Hello,

I know that Cassandra is built for scale out on commodity hardware, but I 
wonder if anyone can share some experience when running Cassandra on rather 
capable machines.

Let's say we have a 3-node cluster with 128G RAM, 32 physical cores (16 per
CPU socket), and a large RAID array of spinning disks (so somewhere beyond
2000 IOPS).

What are some recommended cassandra.yaml / JVM settings? We have been using
something like the following as a first baseline:

* 31G heap, G1, -XX:MaxGCPauseMillis=2000

* concurrent_compactors: 8

* compaction_throughput_mb_per_sec: 128

* key_cache_size_in_mb: 2048

* concurrent_reads: 256

* concurrent_writes: 256

* native_transport_max_threads: 256

Anything else we should add to our first baseline of settings?

E.g. although we have a key cache of 2G, nodetool info gives me only 0.451 as 
hit rate:

Key Cache  : entries 2919619, size 1.99 GB, capacity 2 GB, 71493172 
hits, 158411217 requests, 0.451 recent hit rate, 14400 save period in seconds


Thanks,
Thomas

The contents of this e-mail are intended for the named addressee only. It 
contains information that may be confidential. Unless you are the named 
addressee or an authorized designee, you may not copy or use it, or disclose it 
to anyone else. If you received it in error please notify us immediately and 
then destroy it. Dynatrace Austria GmbH (registration number FN 91482h) is a 
company registered in Linz whose registered office is at 4040 Linz, Austria, 
Freistädterstraße 313
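On the key cache observation above: the hit rate nodetool prints is hits divided by requests, and here the lifetime counters shown give the same 0.451 as the reported recent rate. A quick check of the arithmetic:

```python
# Key cache counters from the nodetool info output quoted above.
hits = 71_493_172
requests = 158_411_217

hit_rate = hits / requests
assert round(hit_rate, 3) == 0.451   # matches the "0.451 recent hit rate"
```

A sub-50% hit rate with the cache already full (1.99 GB of 2 GB) suggests the set of hot keys is simply larger than the cache, so raising key_cache_size_in_mb further may or may not help depending on the access distribution.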


Re: Cassandra 3.10 - Hints & Dropped messages logs Vs Cass 2.x version

2017-11-03 Thread Anumod Mullachery
Okay ..

Thanks, will check.

Anumod
PA,USA

Sent from my iPhone

> On Nov 2, 2017, at 5:40 PM, kurt greaves  wrote:
> 
> Well, pretty sure they still are - at least the mutation one is. But you
> should really use the dedicated metrics for this.
> 
>> On 3 Nov. 2017 01:38, "Anumod Mullachery"  
>> wrote:
>> thanks ..
>> 
>> So the dropped hints & messages are not captured in the Cassandra logs
>> post 3.x, unlike in 2.x.
>> 
>> -Anumod
>> 
>> 
>> 
>>> On Wed, Nov 1, 2017 at 4:50 PM, kurt greaves  wrote:
>>> You can get dropped message statistics over JMX. For example, nodetool
>>> tpstats has a counter for dropped hints since startup. That would be the
>>> preferred method for tracking this info, rather than parsing logs.
>>> 
>>> On 2 Nov. 2017 6:24 am, "Anumod Mullachery"  
>>> wrote:
>>> 
>>> Hi All,
>>> 
>>> In Cassandra v2.1.15, I'm able to pull the dropped hints and dropped
>>> messages from cassandra.log as below:
>>> 
>>> 
>>> dropped hints-->
>>> 
>>> "/opt/xcal/apps/cassandra/logs/cassandra.log--> 
>>> HintedHandoffMetrics.java:79 - /96.115.91.69 has 5 dropped hints, because 
>>> node is down past configured hint window."
>>> 
>>> Dropped messages -->
>>> 
>>> "/opt/xcal/apps/cassandra/logs/cassandra.log--> INFO
>>> MessagingService 1 MUTATION messages dropped in last 5000ms"
>>> 
>>> 
>>> But in Cassandra v3.10, I'm not able to get the same log messages as in
>>> v2.1.15.
>>> 
>>> I'm not able to get any info from the Cassandra logs about dropped hints
>>> or dropped messages. (I was testing this on a test cluster by stopping
>>> the Cassandra services on one of the nodes, where the hint window is set
>>> to 5 minutes, and kept the node down for 1 hour while monitoring the
>>> Cassandra logs, but nothing was generated about dropped hints on any
>>> other node in the cluster.)
>>> 
>>> The only message I found, on the down node, is "Paused hints dispatch".
>>> 
>>> Can someone shed some light on this issue in 3.10 - is there any change
>>> in the hints & dropped messages logs or process?
>>> 
>>> thanks in advance,
>>> 
>>> regards,
>>> 
>>> Anumod.
>>> 
>>> 
>> 
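Kurt's suggestion can be sketched as follows: `nodetool tpstats` ends with a per-message-type dropped counter that can be scraped instead of grepping logs. A minimal parser over sample output (the format below is assumed from 3.x output and trimmed; real output has more rows and columns elsewhere):

```python
# Parse the "Message type / Dropped" section of nodetool tpstats output.
# The sample text is illustrative; adapt the parsing to your actual version.
sample = """\
Message type           Dropped
READ                         0
MUTATION                    12
HINT                         3
REQUEST_RESPONSE             0
"""

def dropped_counts(text):
    counts = {}
    in_section = False
    for line in text.splitlines():
        if line.startswith("Message type"):
            in_section = True
            continue
        if in_section and line.strip():
            name, value = line.split()
            counts[name] = int(value)
    return counts

counts = dropped_counts(sample)
assert counts["MUTATION"] == 12
assert counts["HINT"] == 3
```

Polling this (or the equivalent JMX DroppedMessage metrics) on an interval gives a counter you can diff, rather than relying on log lines that changed between versions.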


Cassandra | MODIFY keyword

2017-11-03 Thread Chandan Goel

Hi,

I would appreciate your help with the questions below.

I have two questions on the MODIFY keyword with respect to Cassandra (a
NoSQL database).
1) Whenever we grant MODIFY permissions to Cassandra users, for example:
GRANT MODIFY ON table1 TO User1;

Then all four permissions - Insert, Update, Delete and Truncate - are
granted automatically. But out of the four I do not want to give the
Truncate permission, because I do not want a user to mistakenly truncate a
table. I tried granting individual Insert, Update and Delete permissions,
but that does not work; only the MODIFY keyword works. Is there a way I can
grant only Insert, Update and Delete to users in Cassandra?

2) I also have another question regarding MODIFY. Why is it mandatory to
give MODIFY permission explicitly on a Cassandra materialized view even
when the related table already has MODIFY permission? Whenever a table in
Cassandra is modified (data updated/deleted/inserted), the related
materialized view (MV) is updated automatically. Why is it necessary to
give explicit MODIFY permission on the MV as well in order to execute an
Insert/Update/Delete on the related table? Furthermore, the contradictory
thing here is that a modify command (Insert/Update/Delete) cannot be
executed directly on MVs at all, so the following CQL will fail even if the
user has MODIFY access on the MV: "delete from materializedView1 where id
= 10". Another issue is that whenever a new MV is created on the same
table, modifications to the table will break if MODIFY is missed on the
new MV.
Thanks
Chandan
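On the first question: in Cassandra's permission model, MODIFY is a single permission gating every data-changing statement, which is why individual INSERT/UPDATE/DELETE grants don't exist. A toy authorization check of what a MODIFY grant admits (illustrative; the statement-to-permission mapping is an assumption of the sketch):

```python
# Toy authorization check: MODIFY is one permission covering every
# data-changing statement, so TRUNCATE cannot be excluded (illustrative).
STATEMENT_PERMISSION = {
    "INSERT": "MODIFY",
    "UPDATE": "MODIFY",
    "DELETE": "MODIFY",
    "TRUNCATE": "MODIFY",
    "SELECT": "SELECT",
}

def authorized(granted, statement):
    return STATEMENT_PERMISSION[statement] in granted

user1 = {"MODIFY"}   # e.g. GRANT MODIFY ON table1 TO user1
assert all(authorized(user1, s) for s in ("INSERT", "UPDATE", "DELETE"))
assert authorized(user1, "TRUNCATE")      # comes along whether wanted or not
assert not authorized(user1, "SELECT")
```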