Bootstrap resume streamed all data again, and a 2nd bootstrap ID appears in netstats

2020-06-04 Thread Surbhi Gupta
Hi,

We are on 3.11.5.
We are trying to add a node to a DC. After all the streaming was done and
nodetool netstats showed no active streams, the node just sat there
for 1 hour doing nothing.
We thought it might be hung, so we tried
nodetool bootstrap resume

But bootstrap resume started streaming all the data again, and after all
that streaming was done it showed the same behavior as the normal bootstrap:
doing nothing and apparently stuck. Now disk usage is twice what it should be
and the node has 500+ pending compactions.

Bootstrap resume created a new session ID, and nodetool netstats now
shows the mode as NORMAL (which came from the first bootstrap; it
eventually finished, after a long wait, while bootstrap resume was
still running).

The current state is that the node is UN as seen from all the other nodes and
has started accepting traffic.

However, bootstrap resume is still going on. What happens in this scenario?

[root@abcdef ~]# nta netstats |grep -v "100%"
Mode: NORMAL
Bootstrap b940a710-a6a8-11ea-b467-3d5e11ea164a
    /10.abc
        Receiving 1414 files, 74428129981 bytes total. Already received 46 files, 3425181474 bytes total
    /10.def
        Receiving 1392 files, 61042286685 bytes total. Already received 44 files, 4198620698 bytes total
    /10.ijk
        Receiving 1449 files, 70624858458 bytes total. Already received 45 files, 6266730847 bytes total
    /10.lmn
        Receiving 1399 files, 59352202847 bytes total. Already received 45 files, 4518550733 bytes total
    /10.xyz
        Receiving 1463 files, 74140648517 bytes total. Already received 45 files, 3231112921 bytes total

Read Repair Statistics:
Attempted: 31108
Mismatch (Blocking): 0
Mismatch (Background): 67
Pool Name                    Active   Pending      Completed   Dropped
Large messages                  n/a         0        2068193         0
Small messages                  n/a        30      501343037         0
Gossip messages                 n/a         0         101098         0
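
For anyone wanting to track how far the resumed stream has progressed, here is a minimal Python sketch that sums the "Receiving ... Already received ..." lines from netstats output like the one above. The function name streaming_progress and the sample text are illustrative only; the line format is assumed to match what is pasted in this message and may differ in other Cassandra versions.

```python
import re

# Matches the per-peer progress line as it appears in the output above.
# \s+ is used between words so that lines wrapped by a mail client still match.
RECEIVING = re.compile(
    r"Receiving\s+(\d+)\s+files,\s+(\d+)\s+bytes\s+total\.\s+"
    r"Already\s+received\s+(\d+)\s+files,\s+(\d+)\s+bytes\s+total"
)

def streaming_progress(netstats_text):
    """Sum progress over all peers.

    Returns (files_done, files_total, bytes_done, bytes_total).
    """
    files_done = files_total = bytes_done = bytes_total = 0
    for match in RECEIVING.finditer(netstats_text):
        f_total, b_total, f_done, b_done = map(int, match.groups())
        files_total += f_total
        bytes_total += b_total
        files_done += f_done
        bytes_done += b_done
    return files_done, files_total, bytes_done, bytes_total

# Two peers copied from the output in this message (hostnames are redacted there).
sample = """
/10.abc
Receiving 1414 files, 74428129981 bytes total. Already
received 46 files, 3425181474 bytes total
/10.def
Receiving 1392 files, 61042286685 bytes total. Already
received 44 files, 4198620698 bytes total
"""

files_done, files_total, bytes_done, bytes_total = streaming_progress(sample)
print(f"{files_done}/{files_total} files, "
      f"{100.0 * bytes_done / bytes_total:.1f}% of bytes received")
```

Running this periodically against fresh netstats output would show whether the second session is still moving or has stalled.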

Thanks

Surbhi


data model for TWCS+TTL

2020-06-04 Thread Arvinder Dhillon
Hi everyone,

In our use case, we need to insert 200 million rows per day.

By default we need to retain data for 10 days, unless a certain condition is
matched from the client within the same day (in that case we need to update ONE
column and set the TTL to 1 day). In 98% of cases we will find that match, and
the remaining 2% of the data will stay for 10 days.

We decided to use TWCS with a 1-day bucket and a default TTL of 10 days. GC grace
is 3 days.

We have 2 options:
1. When the match is found, update one non-primary-key column with the new
value and rewrite the remaining non-primary-key columns with the same data,
USING TTL 1 day (to set the TTL for all the columns). So the data will be
purged after 4 days.
OR
2. As soon as we find a match, delete that row and insert a new row with a
1-day TTL (one extra delete and lots of tombstones). The 'deleted' row will be
purged after the 3rd day and the TTLed data will be purged after the 4th day, I
believe.

Which approach is better for TWCS and TTL here?
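
The arithmetic behind the "3 days" and "4 days" figures can be written out as a small Python sketch. It assumes the simple rule that expired or deleted data becomes eligible for removal once its TTL (if any) plus GC grace has elapsed; it does not model when TWCS actually drops an SSTable, which also depends on everything in that SSTable having expired. All names below are illustrative.

```python
from datetime import timedelta

# Figures taken from the message above.
ROWS_PER_DAY = 200_000_000
MATCH_PERCENT = 98
DEFAULT_TTL = timedelta(days=10)
SHORT_TTL = timedelta(days=1)
GC_GRACE = timedelta(days=3)

def earliest_purge(ttl, gc_grace=GC_GRACE):
    """Earliest age at which TTLed data may be purged: TTL plus GC grace."""
    return ttl + gc_grace

# Daily volumes for the two populations.
matched_rows = ROWS_PER_DAY * MATCH_PERCENT // 100
unmatched_rows = ROWS_PER_DAY - matched_rows

# Option 1: rewrite all columns USING TTL 1 day.
option1_rewritten = earliest_purge(SHORT_TTL)

# Option 2: explicit delete (tombstone) plus a fresh insert with TTL 1 day.
option2_tombstone = GC_GRACE
option2_new_row = earliest_purge(SHORT_TTL)

# Rows that never match keep the default TTL.
unmatched = earliest_purge(DEFAULT_TTL)

print(f"matched rows/day:   {matched_rows:,}")
print(f"unmatched rows/day: {unmatched_rows:,}")
print(f"option 1 rewritten cells purgeable after {option1_rewritten.days} days")
print(f"option 2 tombstone purgeable after {option2_tombstone.days} days, "
      f"new row after {option2_new_row.days} days")
print(f"unmatched rows purgeable after {unmatched.days} days")
```

Note that in both options the originally inserted cells still carry the 10-day default TTL in whatever SSTable they were first written to, which this sketch does not account for.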

Thanks.


Unable to connect with Cassandra Docker image from outside

2020-06-04 Thread Manu Chadha
Hi,

I want to run the Cassandra Docker image and connect to it from my application
running in another container. I am on Windows 10, and both containers run
on the same Windows 10 Home machine. I also have Cassandra installed on the
machine as a standalone application (without Docker).

I thought I would use host.docker.internal as the hostname in both applications.
But when I start the Cassandra image, I get this error:

org.apache.cassandra.exceptions.ConfigurationException: Unable to bind to 
address host.docker.internal/192.168.65.2:7000. Set listen_address in 
cassandra.yaml to an interface you can bind to, e.g., your private IP address 
on EC2



Question 1) Why is host.docker.internal resolving to 192.168.65.2 (port 7000)?
Shouldn't it be 192.168.1.12, as that is what is configured in my etc/hosts file
on Windows 10?

C:\Users\manuc>ping host.docker.internal



Pinging host.docker.internal [192.168.1.12] with 32 bytes of data:

Reply from 192.168.1.12: bytes=32 time<1ms TTL=128

Reply from 192.168.1.12: bytes=32 time=1ms TTL=128

Reply from 192.168.1.12: bytes=32 time<1ms TTL=128

Reply from 192.168.1.12: bytes=32 time<1ms TTL=128



I also tried explicitly specifying 192.168.1.12 when starting the Cassandra image,
but I get a similar error.

The only way I am able to start the container is by running docker run
ca795bbd8fd7, but in this case Cassandra listens on address 0.0.0.0 for CQL clients:



Starting listening for CQL clients on /0.0.0.0:9042 (unencrypted)..



But then neither my other Docker application nor the cqlsh of the standalone
Cassandra installation (running on the same Windows machine) can connect to it:



C:\Users\manuc>cqlsh host.docker.internal 9042

Connection error: ('Unable to connect to any servers', {'192.168.1.12': 
error(10061, "Tried connecting to [('192.168.1.12', 9042)]. Last error: No 
connection could be made because the target machine actively refused it")})

C:\Users\manuc>cqlsh 0.0.0.0 9042

Connection error: ('Unable to connect to any servers', {'0.0.0.0': error(10049, 
"Tried connecting to [('0.0.0.0', 9042)]. Last error: The requested address is 
not valid in its context")})



C:\Users\manuc>docker run .. my_other_docker_application

[trace] s.d.c.CassandraConnectionManagementService - creating session with uri 
CassandraConnectionUri(cassandra://host.docker.internal:9042/myKeyspace) and 
cluster name Test Cluster

[trace] s.d.c.CassandraConnectionManagementService - exception in connecting 
with database com.datastax.driver.core.exceptions.NoHostAvailableException: All 
host(s) tried for query failed (tried: host.docker.internal/192.168.65.2:9042 
(com.datastax.driver.core.exceptions.TransportException: 
[host.docker.internal/192.168.65.2:9042] Cannot connect))

Oops, cannot start the server.

play.api.libs.json.JsResult$Exception: {"obj":[{"msg":["Unable to connect with 
database"],"args":[]}]}


Question 2) How can I make the Cassandra container reachable from my other
application container, or from the cqlsh of my standalone Cassandra installation?
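
To narrow down where the connection fails, a plain TCP probe of port 9042 can be run from the Windows host and from inside the application container. The Python sketch below uses only the standard library; the helper name can_connect is illustrative. It only shows whether something accepts connections at a given address; it does not speak the CQL protocol. A refusal from the host side would be consistent with the container's port not being published (for example with docker run -p 9042:9042), though that is an assumption to verify rather than a diagnosis.

```python
import socket

def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False

# Example usage (uncomment to probe the addresses mentioned in this message):
# for host in ("host.docker.internal", "192.168.1.12", "127.0.0.1"):
#     print(host, can_connect(host, 9042))
```

Comparing the results from the host and from inside the other container shows which side cannot reach the CQL port.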

Thanks




Re: Cassandra and Docker

2020-06-04 Thread amit sehas
What if I were to deploy this in AWS? Is there a straightforward way to
allocate resources in AWS and tell Cassandra about them?

thanks

On Thursday, June 4, 2020, 03:10:11 AM PDT, Cédrick Lunven wrote:

Hello,

Having Cassandra in Docker is nice because you don't have anything to install.

Cassandra can be installed in multiple ways, but the usual way is a tarball,
which is not meant for Windows.

The Docker Hub page is very detailed about which options you can use and
which ports to open (as stated by Rhys):
https://hub.docker.com/_/cassandra/#!

I would suggest you go with docker-compose, using a file where everything is
already defined for you.

I have attached 2 files. One for a single node and one for 1 dc and 3 nodes for 
Cassandra 3.

docker-compose -f cassandra3-1dc-1node.yaml up -d


docker exec -it `docker ps | grep cassandra-node1 | cut -b 1-12` cqlsh

Cheers

On Wed, Jun 3, 2020 at 8:53 AM Erick Ramirez  wrote:
> Personally, I'd recommend learning Docker on its own or Cassandra on its own. 
> I wouldn't try to do it at the same time if you're new to both technologies. 
> It's hard enough as it is for experienced users. If you're using both and you 
> run into issues, you will find it difficult to know whether the problem is 
> Docker, Cassandra, or both. As always, YMMV. Good luck. Cheers!


-- 

Cedrick Lunven

e. cedrick.lun...@datastax.com
w. www.datastax.com



-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org




Re: Impact of enabling authentication on performance

2020-06-04 Thread Sam Tunnicliffe
Passwords are hashed using bcrypt, which performs a configurable number of
encryption rounds on the input. The more rounds, the more computationally
expensive the hashing, and so the more effort is required to defeat it by brute
force. By default, Cassandra uses 2^10 rounds, but this can be set anywhere
between 2^4 and 2^31. The trade-off is that a lower number of rounds is
technically less secure but puts less strain on the servers, particularly if
you have a lot of short-lived client connections and/or thundering herd issues.

To override the default, use a system property, which can be added to
jvm-server.options (jvm.options on 3.x), e.g.:

-Dcassandra.auth_bcrypt_gensalt_log2_rounds=4

Bcrypt encodes the number of rounds used to generate a hash in the hash itself,
so existing passwords will continue to work; they just won't benefit from the
reduced cost. See https://issues.apache.org/jira/browse/CASSANDRA-8085 for
(slightly) more info.
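
To put numbers on the trade-off, here is a small Python sketch of the arithmetic. It assumes only that bcrypt work grows as 2 to the power of the configured log2 rounds, and that the cost is stored as the second "$"-separated field of the hash (e.g. $2a$10$...). The function names and the placeholder hash strings are illustrative, not real credentials or Cassandra APIs.

```python
DEFAULT_LOG2_ROUNDS = 10  # Cassandra's default, per the explanation above

def relative_cost(log2_rounds, baseline=DEFAULT_LOG2_ROUNDS):
    """Hashing work relative to the baseline; work scales as 2**log2_rounds."""
    return 2 ** log2_rounds / 2 ** baseline

def log2_rounds_from_hash(bcrypt_hash):
    """Read the cost factor embedded in a bcrypt hash string.

    A bcrypt hash has the form $<version>$<cost>$<salt and digest>.
    """
    return int(bcrypt_hash.split("$")[2])

# Placeholder hashes: the payload after the cost field is filler, not a real digest.
old_hash = "$2a$10$" + "x" * 53
new_hash = "$2a$04$" + "x" * 53

print("cost at 2^4 relative to 2^10:", relative_cost(4))
print("existing hash cost:", log2_rounds_from_hash(old_hash))
print("newly generated hash cost:", log2_rounds_from_hash(new_hash))
```

Because the cost travels with each hash, a password only picks up the lower cost when it is set again after the property has been changed.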


> On 4 Jun 2020, at 07:39, Gil Ganz  wrote:
> 
> Great advice guys, will check it out.
> Jeff, what do you mean exactly by dropping bcrypt rounds?
> 
> 
> On Wed, Jun 3, 2020 at 10:22 AM Alex Ott wrote:
> You can shorten the time it takes to pick up a change by using lower values
> for credentials_update_interval_in_ms, roles_update_interval_in_ms &
> permissions_update_interval_in_ms
> 
> Durity, Sean R  at "Tue, 2 Jun 2020 14:48:28 +" wrote:
>  DSR> To flesh this out a bit, I set roles_validity_in_ms and 
> permissions_validity_in_ms to
>  DSR> 360 (10 minutes). The default of 2000 is far too often for my use 
> cases. Usually I set
>  DSR> the RF for system_auth to 3 per DC. On a larger, busier cluster I have 
> set it to 6 per
>  DSR> DC. NOTE: if you set the validity higher, it may take that amount of 
> time before a change
>  DSR> in password or table permissions is picked up (usually less).
> 
> 
>  DSR> Sean Durity
> 
>  DSR> -Original Message-
>  DSR> From: Jeff Jirsa <jji...@gmail.com>
>  DSR> Sent: Tuesday, June 2, 2020 2:39 AM
>  DSR> To: user@cassandra.apache.org 
>  DSR> Subject: [EXTERNAL] Re: Impact of enabling authentication on performance
> 
>  DSR> Set the Auth cache to a long validity
> 
>  DSR> Don’t go crazy with RF of system auth
> 
>  DSR> Drop bcrypt rounds if you see massive cpu spikes on reconnect storms
> 
> 
>  >> On Jun 1, 2020, at 11:26 PM, Gil Ganz wrote:
>  >>
>  >> 
>  >> Hi
>  >> I have a production 3.11.6 cluster in which I might want to enable
> authentication. I'm trying to understand what the performance
> impact will be, if any.
>  >> I understand each use case might be different; I'm trying to understand if
> there is a common % performance hit people usually see, or if someone
> has looked into this.
>  >> Gil
> 
>  DSR> 
> 
>  DSR> The information in this Internet Email is confidential and may be 
> legally privileged. It is intended solely for the addressee. Access to this 
> Email by anyone else is unauthorized. If you are not the intended recipient, 
> any disclosure, copying, distribution or any action taken or omitted to be 
> taken in reliance on it, is prohibited and may be unlawful. When addressed to 
> our clients any opinions or advice contained in this Email are subject to the 
> terms and conditions expressed in any applicable governing The Home Depot 
> terms of business or client engagement letter. The Home Depot disclaims all 
> responsibility and liability for the accuracy and content of this attachment 
> and for any damages or losses arising from any inaccuracies, errors, viruses, 
> e.g., worms, trojan horses, etc., or other items of a destructive nature, 
> which may be contained in this attachment and shall not be liable for direct, 
> indirect, consequential or special damages in connection with this e-mail 
> message or its attachment.
> 
> -- 
> With best wishes,Alex Ott
> Principal Architect, DataStax
> http://datastax.com/ 
> 

Re: Cassandra and Docker

2020-06-04 Thread Cédrick Lunven
Hello,

Having Cassandra in Docker is nice because you don't have anything to
install.

Cassandra can be installed in multiple ways, but the usual way is a tarball,
which is not meant for Windows.

The Docker Hub page is very detailed about which options you can use and
which ports to open (as stated by Rhys):
https://hub.docker.com/_/cassandra/#!

I would suggest you go with docker-compose, using a file where everything is
already defined for you.

I have attached 2 files. One for a single node and one for 1 dc and 3 nodes
for Cassandra 3.

docker-compose -f cassandra3-1dc-1node.yaml up -d

docker exec -it `docker ps | grep cassandra-node1 | cut -b 1-12` cqlsh

Cheers

On Wed, Jun 3, 2020 at 8:53 AM Erick Ramirez wrote:

> Personally, I'd recommend learning Docker on its own or Cassandra on its
> own. I wouldn't try to do it at the same time if you're new to both
> technologies. It's hard enough as it is for experienced users. If you're
> using both and you run into issues, you will find it difficult to know
> whether the problem is Docker, Cassandra, or both. As always, YMMV. Good
> luck. Cheers!

-- 
Cedrick Lunven
e. cedrick.lun...@datastax.com
w. www.datastax.com


cassandra3-1dc-3nodes.yaml
Description: application/yaml


cassandra3-1dc-1node.yaml
Description: application/yaml


Re: Impact of enabling authentication on performance

2020-06-04 Thread Gil Ganz
Great advice guys, will check it out.
Jeff, what do you mean exactly by dropping bcrypt rounds?


On Wed, Jun 3, 2020 at 10:22 AM Alex Ott  wrote:

> You can shorten the time it takes to pick up a change by using lower values
> for credentials_update_interval_in_ms, roles_update_interval_in_ms &
> permissions_update_interval_in_ms
>
> Durity, Sean R  at "Tue, 2 Jun 2020 14:48:28 +" wrote:
>  DSR> To flesh this out a bit, I set roles_validity_in_ms and
> permissions_validity_in_ms to
>  DSR> 360 (10 minutes). The default of 2000 is far too often for my
> use cases. Usually I set
>  DSR> the RF for system_auth to 3 per DC. On a larger, busier cluster I
> have set it to 6 per
>  DSR> DC. NOTE: if you set the validity higher, it may take that amount of
> time before a change
>  DSR> in password or table permissions is picked up (usually less).
>
>
>  DSR> Sean Durity
>
>  DSR> -Original Message-
>  DSR> From: Jeff Jirsa 
>  DSR> Sent: Tuesday, June 2, 2020 2:39 AM
>  DSR> To: user@cassandra.apache.org
>  DSR> Subject: [EXTERNAL] Re: Impact of enabling authentication on
> performance
>
>  DSR> Set the Auth cache to a long validity
>
>  DSR> Don’t go crazy with RF of system auth
>
>  DSR> Drop bcrypt rounds if you see massive cpu spikes on reconnect storms
>
>
>  >> On Jun 1, 2020, at 11:26 PM, Gil Ganz  wrote:
>  >>
>  >> 
>  >> Hi
>  >> I have a production 3.11.6 cluster in which I might want to enable
> authentication. I'm trying to understand what the performance
> impact will be, if any.
>  >> I understand each use case might be different; I'm trying to understand if
> there is a common % performance hit people usually see, or if someone
> has looked into this.
>  >> Gil
>
> --
> With best wishes,Alex Ott
> Principal Architect, DataStax
> http://datastax.com/
>