Cassandra Python Driver - User Defined Functions

2017-08-30 Thread Earl Lapus
Hi,

I'm learning to use the Python Driver API as documented here:
https://datastax.github.io/python-driver/api/index.html

The database that I am working on has a user-defined function, similar to
what is described here:
http://batey.info/cassandra-aggregates-min-max-avg-group.html

Is there a way to use a user-defined function inside a ModelQuerySet?
(The API docs seem to be silent on this topic.)

If not, what are my alternatives? Should I define an equivalent
function in my Python program that emulates what the user-defined
function does?

Cheers,
Earl


Re: system_auth replication factor in Cassandra 2.1

2017-08-30 Thread Nate McCall
Regardless, if you are not modifying users frequently (with five users you
most likely are not), make sure to turn the permissions cache way up.

In 2.1 that is just: permissions_validity_in_ms (the default is 2000, i.e. 2
seconds). Feel free to set it to 1 day or some such. The corresponding
async update parameter (permissions_update_interval_in_ms) can be set to a
slightly smaller value. If you really need to, you can drop the cache via
the "invalidate" operation on the
"org.apache.cassandra.auth:type=PermissionsCache" mbean (on each node) to
revoke a user, for example.

In later versions, you would have to do the same with:
- roles_validity_in_ms
- credentials_validity_in_ms
and their corresponding 'interval' parameters.
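
For reference, a minimal cassandra.yaml sketch of the 2.1 settings above (the
values are illustrative, not recommendations):

    # cache auth results for 1 day
    permissions_validity_in_ms: 86400000
    # refresh entries asynchronously slightly before they expire
    permissions_update_interval_in_ms: 82800000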


Re: Cassandra All host(s) tried for query failed (no host was tried)

2017-08-30 Thread Nate McCall
If these app instances sit idle for a while, they might just be timing out
their sockets. You can tweak socket settings on the driver as described
here:
https://github.com/datastax/java-driver/tree/3.x/manual/socket_options

Perhaps start with explicitly setting keepAlive to true, as it may or may
not be set depending on whether the driver is using the native epoll
extension or NIO directly (more details on the page above).
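
For example, a minimal sketch with the 3.x Java driver (the contact point and
timeout value are illustrative, not recommendations):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.SocketOptions;

    // Build a cluster with TCP keepalive explicitly enabled so idle
    // connections are not silently dropped along the way.
    Cluster cluster = Cluster.builder()
        .addContactPoint("10.0.0.1")
        .withSocketOptions(new SocketOptions()
            .setKeepAlive(true)
            .setReadTimeoutMillis(12000))
        .build();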

On Thu, Aug 31, 2017 at 3:10 AM, Ivan Iliev 
wrote:

> Hello everyone,
>
> We are using Cassandra 3.9 for storing quite a lot of data produced from
> our tester machines.
>
> Occasionally, we are seeing issues with apps not being able to communicate
> with Cassandra nodes, returning the following errors (captured in
> servicemix logs):
>
>> Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All
>> host(s) tried for query failed (no host was tried)
>> at com.datastax.driver.core.RequestHandler.reportNoMoreHosts(
>> RequestHandler.java:218)
>> at com.datastax.driver.core.RequestHandler.access$1000(
>> RequestHandler.java:43)
>> at com.datastax.driver.core.RequestHandler$SpeculativeExecution.
>> sendRequest(RequestHandler.java:284)
>> at com.datastax.driver.core.RequestHandler.startNewExecution(
>> RequestHandler.java:115)
>> at com.datastax.driver.core.RequestHandler.sendRequest(
>> RequestHandler.java:91)
>> at com.datastax.driver.core.SessionManager.executeAsync(
>> SessionManager.java:132)
>> ... 107 more
>
>
> As a result, apps that try to send data to Cassandra crash because they
> run out of memory, and we have to restart the containers in which they
> run.
>
> So far I have not been able to identify the cause, as nothing shows up (at
> least nothing relevant at those timestamps) in the Cassandra debug and
> system logs.
>
> Could you share some insight on this? What should I check, and where
> should I start, to troubleshoot this?
>
> Thanks !
> Ivan
>



-- 
-
Nate McCall
Wellington, NZ
@zznate

CTO
Apache Cassandra Consulting
http://www.thelastpickle.com


Re: Cassandra 3.7 repair error messages

2017-08-30 Thread Erick Ramirez
No, it isn't normal for sessions to fail, and you will need to investigate.
You need to review the logs on node .204 to determine why the session
failed. For example, did it time out because of a very large sstable? Or did
the connection get truncated after a while?

You will need to address the cause of those failures. It could be external
to the nodes, e.g. a firewall closing the socket, so you might need to
configure TCP keep_alive. 33 hours sounds like a really long time. Have you
successfully run a repair on this cluster before?
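
If a firewall does turn out to be dropping idle connections, a minimal sketch
of OS-level keepalive tuning on Linux (the values are illustrative and should
be shorter than the firewall's idle timeout):

    sysctl -w net.ipv4.tcp_keepalive_time=60
    sysctl -w net.ipv4.tcp_keepalive_intvl=10
    sysctl -w net.ipv4.tcp_keepalive_probes=6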

On Thu, Aug 31, 2017 at 11:39 AM, Paul Pollack 
wrote:

> Hi,
>
> I'm trying to run a repair on a node in my Cassandra cluster, version 3.7,
> and was hoping someone may be able to shed light on an error message that
> keeps cropping up.
>
> I started the repair on a node after discovering that it somehow became
> partitioned from the rest of the cluster, e.g. nodetool status on all other
> nodes showed it as DN, and on the node itself showed all other nodes as DN.
> After restarting the Cassandra daemon the node seemed to re-join the
> cluster just fine, so I began a repair.
>
> The repair has been running for about 33 hours (first incremental repair
> on this cluster), and every so often I'll see a line like this:
>
> [2017-08-31 00:18:16,300] Repair session f7ae4e71-8ce3-11e7-b466-79eba0383e4f
> for range [(-5606588017314999649,-5604469721630340065],
> (9047587767449433379,9047652965163017217]] failed with error Endpoint /
> 20.0.122.204 died (progress: 9%)
>
> Every one of these lines refers to the same node, 20.0.122.204.
>
> I'm mostly looking for guidance here. Do these errors indicate that the
> entire repair will be worthless, or just for token ranges shared by these
> two nodes? Is it normal to see error messages of this nature and for a
> repair not to terminate?
>
> Thanks,
> Paul
>


Re: Cassandra snapshot restore with VNODES missing some data

2017-08-30 Thread Erick Ramirez
For your method to work, you have to restore like-for-like, i.e. you need
to mirror the source nodes by using the exact same tokens in system.local.

For example, if source node A has tokens 567, 678 and 789, then you need to
set up the equivalent target node with exactly those tokens. Otherwise, the
new nodes will own different token ranges and the data in the sstables
will not necessarily match. Cheers!
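
As a sketch of what like-for-like means in practice (the token values are
illustrative): first capture each source node's tokens via cqlsh on that node,

    SELECT tokens FROM system.local;

then, before the matching target node first starts, list every one of those
tokens (all 256 of them per node if the source used vnodes) comma-separated
in its cassandra.yaml:

    initial_token: 567,678,789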

On Thu, Aug 31, 2017 at 9:06 AM, Jai Bheemsen Rao Dhanwada <
jaibheem...@gmail.com> wrote:

> Hello All,
>
> I am trying to restore a cluster with VNODE(s) to a new cluster using the
> snapshot.
> After the restore, when I query data through cqlsh, I see some random data
> is missing.
>
> I used the below steps to restore
>
> 1. Snapshot on the source cluster
> 2. Setup new cluster(VNODEs) with the same schema as source cluster.
> 3. Restore the snapshot to the specific CF directory
> 4. Restart the node.
> 5. Now when I query through CQL I see some data is missing.
>
> I looked at the procedure documented in:
> http://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_snapshot_restore_new_cluster.html
>
> When I try to follow the steps in the above doc, especially replacing
> num_tokens with initial_token: values from the source cluster, I get the
> below error.
>
> ERROR [main] 2017-08-30 17:17:20,878 CassandraDaemon.java:395 - Fatal configuration error
> org.apache.cassandra.exceptions.ConfigurationException: Cannot change the number of tokens from 256 to 1
>         at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:972) ~[apache-cassandra-2.1.16.jar:2.1.16]
>         at org.apache.cassandra.service.StorageService.initServer(StorageService.java:740) ~[apache-cassandra-2.1.16.jar:2.1.16]
>         at org.apache.cassandra.service.StorageService.initServer(StorageService.java:617) ~[apache-cassandra-2.1.16.jar:2.1.16]
>         at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:391) [apache-cassandra-2.1.16.jar:2.1.16]
>         at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:566) [apache-cassandra-2.1.16.jar:2.1.16]
>         at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:655) [apache-cassandra-2.1.16.jar:2.1.16]
> INFO [StorageServiceShutdownHook] 2017-08-30 17:17:20,898 Gossiper.java:1454 - Announcing shutdown
> INFO [StorageServiceShutdownHook] 2017-08-30 17:17:22,902 MessagingService.java:734 - Waiting for messaging service to quiesce
> INFO [ACCEPT-/x.x.x.x] 2017-08-30 17:17:22,903 MessagingService.java:1020 - MessagingService has terminated the accept() thread
>
> any pointers on this?
>


Re: system_auth replication factor in Cassandra 2.1

2017-08-30 Thread Erick Ramirez
It looks like nodes .113 and .116 have a problem. Repairing system_auth
which only contains 5 users should not take that long. Run with just nodetool
repair system_auth (without the -pr flag).

But first investigate why those 2 nodes are slow to respond. Cheers!

On Thu, Aug 31, 2017 at 3:00 AM, Chuck Reynolds 
wrote:

> select * from users;
>
>
>
> OK here’s the trace.  The times are super long.
>
> name  | super
>
> ---+---
>
>   user1 | False
>
>   user2 | True
>
>   user3 | True
>
>   user4 | False
>
>   user5 | True
>
> (5 rows)
>
> Tracing session: 55a4aa50-8da3-11e7-adbb-e7bbc3a8a72e
>
> activity | timestamp | source | source_elapsed
> ---------+-----------+--------+---------------
> Execute CQL3 query | 2017-08-30 10:50:31.413000 | xx.xx.xx.113 | 0
> READ message received from /xx.xx.xx.113 [MessagingService-Incoming-/xx.xx.xx.113] | 2017-08-30 10:50:31.398000 | xx.xx.xx.107 | 66
> Executing single-partition query on users [SharedPool-Worker-1] | 2017-08-30 10:50:31.399000 | xx.xx.xx.107 | 110
> Acquiring sstable references [SharedPool-Worker-1] | 2017-08-30 10:50:31.399000 | xx.xx.xx.107 | 114
> Merging memtable tombstones [SharedPool-Worker-1] | 2017-08-30 10:50:31.400000 | xx.xx.xx.107 | 121
> Key cache hit for sstable 288 [SharedPool-Worker-1] | 2017-08-30 10:50:31.400000 | xx.xx.xx.107 | 129
> Seeking to partition beginning in data file [SharedPool-Worker-1] | 2017-08-30 10:50:31.400000 | xx.xx.xx.107 | 130
> Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-1] | 2017-08-30 10:50:31.401000 | xx.xx.xx.107 | 209
> Merging data from memtables and 1 sstables [SharedPool-Worker-1] | 2017-08-30 10:50:31.401000 | xx.xx.xx.107 | 211
> Read 1 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-08-30 10:50:31.402000 | xx.xx.xx.107 | 226
> Enqueuing response to /xx.xx.xx.113 [SharedPool-Worker-1] | 2017-08-30 10:50:31.402000 | xx.xx.xx.107 | 321
> Sending REQUEST_RESPONSE message to /xx.xx.xx.113 [MessagingService-Outgoing-/xx.xx.xx.113] | 2017-08-30 10:50:31.402000 | xx.xx.xx.107 | 417
> Parsing select * from users; [SharedPool-Worker-2] | 2017-08-30 10:50:31.414000 | xx.xx.xx.113 | 22
> Preparing statement [SharedPool-Worker-2] | 2017-08-30 10:50:31.415000 | xx.xx.xx.113 | 58
> reading data from /xx.xx.xx.107 [SharedPool-Worker-2] | 2017-08-30 10:50:31.415000 | xx.xx.xx.113 | 950
> Sending READ message to /xx.xx.xx.107 [MessagingService-Outgoing-/xx.xx.xx.107] | 2017-08-30 10:50:31.415000 | xx.xx.xx.113 | 1017
> REQUEST_RESPONSE message received from /xx.xx.xx.107 [MessagingService-Incoming-/xx.xx.xx.107] | 2017-08-30 10:50:31.415000 | xx.xx.xx.113 | 1744
> Processing response from /xx.xx.xx.107 [SharedPool-Worker-1] | 2017-08-30 10:50:31.416000 | xx.xx.xx.113 | 1805
> Computing ranges to query [SharedPool-Worker-2] | 2017-08-30 10:50:31.416000 | xx.xx.xx.113 | 1853
> Submitting range requests on 63681 ranges with a concurrency of 10056 (0.009944752 rows per range expected) [SharedPool-Worker-2] | 2017-08-30 10:50:31.424000 | xx.xx.xx.113 | 11427
> PAGED_RANGE message received from /xx.xx.xx.113 [MessagingService-Incoming-/xx.xx.xx.113] | 2017-08-30 10:51:25.002000 | xx.xx.xx.116 | 28
> Executing seq scan across 1 sstables for [min(-9223372036854775808), min(-9223372036854775808)] [SharedPool-Worker-1] | 2017-08-30 10:51:25.002000 | xx.xx.xx.116 | 82
> Read 1 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-08-30 10:51:25.003000 | xx.xx.xx.116 | 178
> Read 1 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-08-30 10:51:25.003000 | xx.xx.xx.116 | 186
> Read 1 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-08-30 10:51:25.003000 | xx.xx.xx.116 | 191
> Read 1 live and 0 tombstone cells

Re: Cassandra All host(s) tried for query failed (no host was tried)

2017-08-30 Thread Erick Ramirez
No host was tried because nodes were unresponsive and the driver marked
them as down. When too many nodes get marked as down, the driver eventually
runs out of nodes and ends up throwing NoHostAvailableException.

Nodes become unresponsive because they are overloaded. You either throttle
back the app throughput or increase the capacity of your cluster. Cheers!

On Thu, Aug 31, 2017 at 1:10 AM, Ivan Iliev 
wrote:

> Hello everyone,
>
> We are using Cassandra 3.9 for storing quite a lot of data produced from
> our tester machines.
>
> Occasionally, we are seeing issues with apps not being able to communicate
> with Cassandra nodes, returning the following errors (captured in
> servicemix logs):
>
>> Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All
>> host(s) tried for query failed (no host was tried)
>> at com.datastax.driver.core.RequestHandler.reportNoMoreHosts(
>> RequestHandler.java:218)
>> at com.datastax.driver.core.RequestHandler.access$1000(
>> RequestHandler.java:43)
>> at com.datastax.driver.core.RequestHandler$SpeculativeExecution.
>> sendRequest(RequestHandler.java:284)
>> at com.datastax.driver.core.RequestHandler.startNewExecution(
>> RequestHandler.java:115)
>> at com.datastax.driver.core.RequestHandler.sendRequest(
>> RequestHandler.java:91)
>> at com.datastax.driver.core.SessionManager.executeAsync(
>> SessionManager.java:132)
>> ... 107 more
>
>
> As a result, apps that try to send data to Cassandra crash because they
> run out of memory, and we have to restart the containers in which they
> run.
>
> So far I have not been able to identify the cause, as nothing shows up (at
> least nothing relevant at those timestamps) in the Cassandra debug and
> system logs.
>
> Could you share some insight on this? What should I check, and where
> should I start, to troubleshoot this?
>
> Thanks !
> Ivan
>


Cassandra 3.7 repair error messages

2017-08-30 Thread Paul Pollack
Hi,

I'm trying to run a repair on a node in my Cassandra cluster, version 3.7, and
was hoping someone may be able to shed light on an error message that keeps
cropping up.

I started the repair on a node after discovering that it somehow became
partitioned from the rest of the cluster, e.g. nodetool status on all other
nodes showed it as DN, and on the node itself showed all other nodes as DN.
After restarting the Cassandra daemon the node seemed to re-join the
cluster just fine, so I began a repair.

The repair has been running for about 33 hours (first incremental repair on
this cluster), and every so often I'll see a line like this:

[2017-08-31 00:18:16,300] Repair session
f7ae4e71-8ce3-11e7-b466-79eba0383e4f for range
[(-5606588017314999649,-5604469721630340065],
(9047587767449433379,9047652965163017217]] failed with error Endpoint /
20.0.122.204 died (progress: 9%)

Every one of these lines refers to the same node, 20.0.122.204.

I'm mostly looking for guidance here. Do these errors indicate that the
entire repair will be worthless, or just for token ranges shared by these
two nodes? Is it normal to see error messages of this nature and for a
repair not to terminate?

Thanks,
Paul


Re: Hints replay incompatible between 2.x and 3.x

2017-08-30 Thread Erick Ramirez
1TB of hints suggests you don't have enough capacity in your cluster. The
only way around that is to add more nodes.

On Thu, Aug 31, 2017 at 3:05 AM, Jason Brown  wrote:

> Hi Andrew,
>
> This question is best for the user@ list, included here.
>
> Thanks,
>
> -Jason
>
> On Wed, Aug 30, 2017 at 10:00 AM, Andrew Whang 
> wrote:
>
>> In evaluating 3.x, we found that hints are unable to be replayed between
>> 2.x and 3.x nodes. This introduces a risk during the upgrade path for some
>> of our write-heavy clusters - nodes will accumulate upwards of 1TB of
>> hints
>> if a node goes/remains down for <1hr.
>>
>> Any suggestions to mitigate this issue?
>>
>
>


Re: Invalid Gossip generation

2017-08-30 Thread Erick Ramirez
Unfortunately, the only available workaround is a rolling restart of the
cluster until you get the fix in C* 2.1.13 (CASSANDRA-10969).

On Thu, Aug 31, 2017 at 5:52 AM, Mark Furlong  wrote:

> I have a 2.1.12 cluster and have experienced an invalid gossip generation
> error on one of the nodes. We have tried altering the local generation
> value without achieving the desired result. A rolling restart of this
> production cluster of 136 nodes is a last-chance option. The next step we
> know is to upgrade this cluster to a new version of 2.1. In the meantime, is
> there any other way than those mentioned above to get this node communicating
> with the cluster?
>
>
>
> *Mark Furlong*
>
> Sr. Database Administrator
>
> *mfurl...@ancestry.com *
> M: 801-859-7427 <(801)%20859-7427>
>
> O: 801-705-7115 <(801)%20705-7115>
>
> 1300 W Traverse Pkwy
>
> Lehi, UT 84043
>


Re: Cassandra snapshot restore with VNODES missing some data

2017-08-30 Thread Jai Bheemsen Rao Dhanwada
Also,

When I use the same DC and rack, I am still getting the below error when I
specify the 256 tokens from the source cluster and restart it.


ERROR [main] 2017-08-30 17:17:20,878 CassandraDaemon.java:395 - Fatal configuration error
org.apache.cassandra.exceptions.ConfigurationException: Cannot change the number of tokens from 256 to 1
        at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:972) ~[apache-cassandra-2.1.16.jar:2.1.16]
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:740) ~[apache-cassandra-2.1.16.jar:2.1.16]
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:617) ~[apache-cassandra-2.1.16.jar:2.1.16]
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:391) [apache-cassandra-2.1.16.jar:2.1.16]
        at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:566) [apache-cassandra-2.1.16.jar:2.1.16]
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:655) [apache-cassandra-2.1.16.jar:2.1.16]
INFO [StorageServiceShutdownHook] 2017-08-30 17:17:20,898 Gossiper.java:1454 - Announcing shutdown
INFO [StorageServiceShutdownHook] 2017-08-30 17:17:22,902 MessagingService.java:734 - Waiting for messaging service to quiesce
INFO [ACCEPT-/x.x.x.x] 2017-08-30 17:17:22,903 MessagingService.java:1020 - MessagingService has terminated the accept() thread

On Wed, Aug 30, 2017 at 5:57 PM, Jai Bheemsen Rao Dhanwada <
jaibheem...@gmail.com> wrote:

> Yes, the source uses vnodes.
>
> I am restoring to a different cluster in a different datacenter, so the rack
> and DC change. Does that matter?
>
> On Wed, Aug 30, 2017 at 5:55 PM, kurt greaves 
> wrote:
>
>> Does the source cluster also use vnodes? You will need to ensure you use
>> the same tokens for each node as the snapshots used in the source (and also
>> ensure same tokens apply to same racks).
>>
>
>


Re: Cassandra snapshot restore with VNODES missing some data

2017-08-30 Thread Jai Bheemsen Rao Dhanwada
Yes, the source uses vnodes.

I am restoring to a different cluster in a different datacenter, so the rack
and DC change. Does that matter?

On Wed, Aug 30, 2017 at 5:55 PM, kurt greaves  wrote:

> Does the source cluster also use vnodes? You will need to ensure you use
> the same tokens for each node as the snapshots used in the source (and also
> ensure same tokens apply to same racks).
>


Re: Cassandra snapshot restore with VNODES missing some data

2017-08-30 Thread kurt greaves
Does the source cluster also use vnodes? You will need to ensure you use
the same tokens for each node as the snapshots used in the source (and also
ensure same tokens apply to same racks).


Re: system_auth replication factor in Cassandra 2.1

2017-08-30 Thread kurt greaves
For that many nodes mixed with vnodes you probably want a lower RF than N
per datacenter. 5 or 7 would be reasonable. The only down side is that auth
queries may take slightly longer as they will often have to go to other
nodes to be resolved, but in practice this is likely not a big deal as the
data will be cached anyway.
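
As a sketch of the change (assuming NetworkTopologyStrategy and hypothetical
datacenter names dc1/dc2):

    ALTER KEYSPACE system_auth
      WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 5, 'dc2': 5};

followed by a repair of the keyspace on each node:

    nodetool repair system_auth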


Re: Unsuccessful back-up and restore with differing counts

2017-08-30 Thread Jai Bheemsen Rao Dhanwada
This link gives me a 404; can you please give me the correct link?

On Sat, May 13, 2017 at 10:51 AM, Surbhi Gupta 
wrote:

> The below link has the method you are looking for:
> http://datascale.io/cloning-cassandra-clusters-fast-way/
>
> On Sat, May 13, 2017 at 9:49 AM srinivasarao daruna <
> sree.srin...@gmail.com> wrote:
>
>> I am using vnodes. Is there documentation you can suggest for understanding
>> how to assign the same tokens in the new cluster? I will try it again.
>>
>>
>> On May 13, 2017 12:32 PM, "Nitan Kainth"  wrote:
>>
>> As Jonathan mentioned, if you are using vnodes, you should back up the
>> nodetool output and assign the same tokens to the nodes, and then copy the
>> corresponding sstables.
>>
>> If using initial_token, then assign the same value for the corresponding
>> node.
>>
>> sstableloader should work independently; not sure why you are getting
>> wrong counts with it.
>>
>> Sent from my iPhone
>>
>> On May 13, 2017, at 9:34 AM, Jonathan Haddad  wrote:
>>
>> Did you create the nodes with the same tokens?
>>
>> On Sat, May 13, 2017 at 8:44 AM srinivasarao daruna <
>> sree.srin...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> We have a cassandra cluster built on Apache Cassandra 3.9 with 6 nodes
>>> and RF = 3. As part of re-building the cluster, we are testing the backup
>>> and restore strategy.
>>>
>>> We took the snapshot and uploaded the files to S3; the data has been
>>> saved with folder names (backup_folder1 - 6 for nodes 1 - 6).
>>> Created a new cluster with the same number of nodes, and copied the data
>>> from S3 and created the schema.
>>>
>>> *Strategy 1: (using nodetool refresh)*
>>> 1) Copied back the data from S3 into one machine each based on the
>>> folders created (backup_folder1  - 6 to 6 nodes)
>>> 2) and performed nodetool refresh on the cluster.
>>>
>>> Ran the count:
>>>
>>> Count on previous cluster: 12125800
>>> Count on new cluster: 10504780
>>>
>>> *Strategy 2: using sstableloader*
>>>
>>> 1) Copied back the data from S3 into one machine each based on the
>>> folders created (backup_folder1  - 6 to 6 nodes)
>>> 2) and performed sstableloader on each node.
>>>
>>> Ran the count:
>>>
>>> Count on previous cluster: 12125800
>>> Count on new cluster: 11705084
>>>
>>>
>>> Looking at the results, I am a bit disappointed that neither approach
>>> resulted in a 100% restore for me.
>>> If there were an error in taking the backup, it should not have given
>>> different counts.
>>>
>>> Any ideas on successful backup and restore strategies? And what could
>>> have gone wrong in my process?
>>>
>>> Thank You,
>>> Regards,
>>> Srini
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>>


Cassandra snapshot restore with VNODES missing some data

2017-08-30 Thread Jai Bheemsen Rao Dhanwada
Hello All,

I am trying to restore a cluster with VNODE(s) to a new cluster using the
snapshot.
After the restore, when I query data through cqlsh, I see some random data is
missing.

I used the below steps to restore

1. Snapshot on the source cluster
2. Setup new cluster(VNODEs) with the same schema as source cluster.
3. Restore the snapshot to the specific CF directory
4. Restart the node.
5. Now when I query through CQL I see some data is missing.

I looked at the procedure documented in:
http://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_snapshot_restore_new_cluster.html

When I try to follow the steps in the above doc, especially replacing
num_tokens with initial_token: values from the source cluster, I get the
below error.

ERROR [main] 2017-08-30 17:17:20,878 CassandraDaemon.java:395 - Fatal configuration error
org.apache.cassandra.exceptions.ConfigurationException: Cannot change the number of tokens from 256 to 1
        at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:972) ~[apache-cassandra-2.1.16.jar:2.1.16]
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:740) ~[apache-cassandra-2.1.16.jar:2.1.16]
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:617) ~[apache-cassandra-2.1.16.jar:2.1.16]
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:391) [apache-cassandra-2.1.16.jar:2.1.16]
        at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:566) [apache-cassandra-2.1.16.jar:2.1.16]
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:655) [apache-cassandra-2.1.16.jar:2.1.16]
INFO [StorageServiceShutdownHook] 2017-08-30 17:17:20,898 Gossiper.java:1454 - Announcing shutdown
INFO [StorageServiceShutdownHook] 2017-08-30 17:17:22,902 MessagingService.java:734 - Waiting for messaging service to quiesce
INFO [ACCEPT-/x.x.x.x] 2017-08-30 17:17:22,903 MessagingService.java:1020 - MessagingService has terminated the accept() thread

any pointers on this?


Invalid Gossip generation

2017-08-30 Thread Mark Furlong
I have a 2.1.12 cluster and have experienced an invalid gossip generation error
on one of the nodes. We have tried altering the local generation value without
achieving the desired result. A rolling restart of this production cluster of
136 nodes is a last-chance option. The next step we know of is to upgrade this
cluster to a new version of 2.1. In the meantime, is there any other way than
those mentioned above to get this node communicating with the cluster?

Mark Furlong

Sr. Database Administrator

mfurl...@ancestry.com
M: 801-859-7427
O: 801-705-7115
1300 W Traverse Pkwy
Lehi, UT 84043


Re: Hints replay incompatible between 2.x and 3.x

2017-08-30 Thread Jason Brown
Hi Andrew,

This question is best for the user@ list, included here.

Thanks,

-Jason

On Wed, Aug 30, 2017 at 10:00 AM, Andrew Whang 
wrote:

> In evaluating 3.x, we found that hints are unable to be replayed between
> 2.x and 3.x nodes. This introduces a risk during the upgrade path for some
> of our write-heavy clusters - nodes will accumulate upwards of 1TB of hints
> if a node goes/remains down for <1hr.
>
> Any suggestions to mitigate this issue?
>


Re: system_auth replication factor in Cassandra 2.1

2017-08-30 Thread Chuck Reynolds
select * from users;

OK here’s the trace.  The times are super long.
name  | super
---+---
  user1 | False
  user2 | True
  user3 | True
  user4 | False
  user5 | True
(5 rows)
Tracing session: 55a4aa50-8da3-11e7-adbb-e7bbc3a8a72e
activity | timestamp | source | source_elapsed
---------+-----------+--------+---------------
Execute CQL3 query | 2017-08-30 10:50:31.413000 | xx.xx.xx.113 | 0
READ message received from /xx.xx.xx.113 [MessagingService-Incoming-/xx.xx.xx.113] | 2017-08-30 10:50:31.398000 | xx.xx.xx.107 | 66
Executing single-partition query on users [SharedPool-Worker-1] | 2017-08-30 10:50:31.399000 | xx.xx.xx.107 | 110
Acquiring sstable references [SharedPool-Worker-1] | 2017-08-30 10:50:31.399000 | xx.xx.xx.107 | 114
Merging memtable tombstones [SharedPool-Worker-1] | 2017-08-30 10:50:31.400000 | xx.xx.xx.107 | 121
Key cache hit for sstable 288 [SharedPool-Worker-1] | 2017-08-30 10:50:31.400000 | xx.xx.xx.107 | 129
Seeking to partition beginning in data file [SharedPool-Worker-1] | 2017-08-30 10:50:31.400000 | xx.xx.xx.107 | 130
Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-1] | 2017-08-30 10:50:31.401000 | xx.xx.xx.107 | 209
Merging data from memtables and 1 sstables [SharedPool-Worker-1] | 2017-08-30 10:50:31.401000 | xx.xx.xx.107 | 211
Read 1 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-08-30 10:50:31.402000 | xx.xx.xx.107 | 226
Enqueuing response to /xx.xx.xx.113 [SharedPool-Worker-1] | 2017-08-30 10:50:31.402000 | xx.xx.xx.107 | 321
Sending REQUEST_RESPONSE message to /xx.xx.xx.113 [MessagingService-Outgoing-/xx.xx.xx.113] | 2017-08-30 10:50:31.402000 | xx.xx.xx.107 | 417
Parsing select * from users; [SharedPool-Worker-2] | 2017-08-30 10:50:31.414000 | xx.xx.xx.113 | 22
Preparing statement [SharedPool-Worker-2] | 2017-08-30 10:50:31.415000 | xx.xx.xx.113 | 58
reading data from /xx.xx.xx.107 [SharedPool-Worker-2] | 2017-08-30 10:50:31.415000 | xx.xx.xx.113 | 950
Sending READ message to /xx.xx.xx.107 [MessagingService-Outgoing-/xx.xx.xx.107] | 2017-08-30 10:50:31.415000 | xx.xx.xx.113 | 1017
REQUEST_RESPONSE message received from /xx.xx.xx.107 [MessagingService-Incoming-/xx.xx.xx.107] | 2017-08-30 10:50:31.415000 | xx.xx.xx.113 | 1744
Processing response from /xx.xx.xx.107 [SharedPool-Worker-1] | 2017-08-30 10:50:31.416000 | xx.xx.xx.113 | 1805
Computing ranges to query [SharedPool-Worker-2] | 2017-08-30 10:50:31.416000 | xx.xx.xx.113 | 1853
Submitting range requests on 63681 ranges with a concurrency of 10056 (0.009944752 rows per range expected) [SharedPool-Worker-2] | 2017-08-30 10:50:31.424000 | xx.xx.xx.113 | 11427
PAGED_RANGE message received from /xx.xx.xx.113 [MessagingService-Incoming-/xx.xx.xx.113] | 2017-08-30 10:51:25.002000 | xx.xx.xx.116 | 28
Executing seq scan across 1 sstables for [min(-9223372036854775808), min(-9223372036854775808)] [SharedPool-Worker-1] | 2017-08-30 10:51:25.002000 | xx.xx.xx.116 | 82
Read 1 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-08-30 10:51:25.003000 | xx.xx.xx.116 | 178
Read 1 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-08-30 10:51:25.003000 | xx.xx.xx.116 | 186
Read 1 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-08-30 10:51:25.003000 | xx.xx.xx.116 | 191
Read 1 live and 0 tombstone cells

Re: system_auth replication factor in Cassandra 2.1

2017-08-30 Thread Oleksandr Shulgin
On Wed, Aug 30, 2017 at 6:40 PM, Chuck Reynolds 
wrote:

> How many users do you have (or expect to be found in system_auth.users)?
>
>   5 users.
>
> What are the current RF for system_auth and consistency level you are
> using in cqlsh?
>
>  135 in one DC and 227 in the other DC.  Consistency level one
>

Still very surprising...

> Did you try to obtain a trace of a timing-out query (with TRACING ON)?
>
> Tracing timed out even though I increased it to 120 seconds.
>

Even if cqlsh doesn't print the trace because of the timeout, you should
still be able to find something in system_traces.
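
For example, a quick sketch of digging the trace out by hand (the session id
below is taken from the trace earlier in this thread):

    SELECT * FROM system_traces.sessions LIMIT 10;
    SELECT * FROM system_traces.events
     WHERE session_id = 55a4aa50-8da3-11e7-adbb-e7bbc3a8a72e;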

--
Alex


Re: system_auth replication factor in Cassandra 2.1

2017-08-30 Thread Chuck Reynolds
How many users do you have (or expect to be found in system_auth.users)?
  5 users.
What are the current RF for system_auth and consistency level you are using in 
cqlsh?
 135 in one DC and 227 in the other DC.  Consistency level one
Did you try to obtain a trace of a timing-out query (with TRACING ON)?
Tracing timed out even though I increased it to 120 seconds.

From: Oleksandr Shulgin 
Reply-To: "user@cassandra.apache.org" 
Date: Wednesday, August 30, 2017 at 10:19 AM
To: User 
Subject: Re: system_auth replication factor in Cassandra 2.1

On Wed, Aug 30, 2017 at 5:50 PM, Chuck Reynolds 
> wrote:
So I’ve read that if you’re using authentication in Cassandra 2.1, your
replication factor should match the number of nodes in your datacenter.

Is that true?

I have two datacenter cluster, 135 nodes in datacenter 1 & 227 nodes in an AWS 
datacenter.

Why do I want to replicate the system_auth table that many times?

What are the benefits and disadvantages of matching the number of nodes as 
opposed to the standard replication factor of 3?


The reason I’m asking the question is because it seems like I’m getting a lot 
of authentication errors now and they seem to happen more under load.

Also, querying the system_auth table from cqlsh to get the users now seems to
time out.

This is surprising.

How many users do you have (or expect to be found in system_auth.users)?   What 
are the current RF for system_auth and consistency level you are using in 
cqlsh?  Did you try to obtain a trace of a timing-out query (with TRACING ON)?

Regards,
--
Oleksandr "Alex" Shulgin | Database Engineer | Zalando SE | Tel: +49 176 
127-59-707



Re: system_auth replication factor in Cassandra 2.1

2017-08-30 Thread Oleksandr Shulgin
On Wed, Aug 30, 2017 at 6:20 PM, Chuck Reynolds 
wrote:

> So I tried to run a repair with the following on one of the server.
>
> nodetool repair system_auth -pr -local
>
>
>
> After two hours it hadn’t finished.  I had to kill the repair because of
> another issue and haven’t tried again.
>
>
>
> *Why would such a small table take so long to repair?*
>

It could be the overhead of that many nodes having to communicate with each
other (times the number of vnodes).  Even on a small cluster (3-5 nodes) I
think it takes a few minutes to run a repair on a small/empty keyspace.

*Also what would happen if I set the RF back to a lower number like 5?*
>

You should still run a repair afterwards, but I would expect it to finish
in a reasonable time.

--
Alex


Re: system_auth replication factor in Cassandra 2.1

2017-08-30 Thread Chuck Reynolds
So I tried to run a repair with the following on one of the server.
nodetool repair system_auth -pr -local

After two hours it hadn’t finished.  I had to kill the repair because of 
another issue and haven’t tried again.

Why would such a small table take so long to repair?

Also what would happen if I set the RF back to a lower number like 5?


Thanks
From:  on behalf of Sam Tunnicliffe 
Reply-To: "user@cassandra.apache.org" 
Date: Wednesday, August 30, 2017 at 10:10 AM
To: "user@cassandra.apache.org" 
Subject: Re: system_auth replication factor in Cassandra 2.1

It's a better rule of thumb to use an RF of 3 to 5 per DC and this is what the 
docs now suggest: 
http://cassandra.apache.org/doc/latest/operating/security.html#authentication
Out of the box, the system_auth keyspace is set up with SimpleStrategy and RF=1
so that it works on any new system, including dev & test clusters, but obviously
that's no use for a production system.

Regarding the increased rate of authentication errors: did you run repair after 
changing the RF? Auth queries are done at CL.LOCAL_ONE, so if you haven't 
repaired, the data for the user logging in will probably not be where it should 
be. The exception to this is the default "cassandra" user, queries for that 
user are done at CL.QUORUM, which will indeed lead to timeouts and 
authentication errors with a very high RF. It's recommended to only use that 
default user to bootstrap the setup of your own users & superusers, the link 
above also has info on this.

Thanks,
Sam


On 30 August 2017 at 16:50, Chuck Reynolds 
> wrote:
So I’ve read that if you’re using authentication in Cassandra 2.1, your
replication factor should match the number of nodes in your datacenter.

Is that true?

I have two datacenter cluster, 135 nodes in datacenter 1 & 227 nodes in an AWS 
datacenter.

Why do I want to replicate the system_auth table that many times?

What are the benefits and disadvantages of matching the number of nodes as 
opposed to the standard replication factor of 3?


The reason I’m asking the question is because it seems like I’m getting a lot 
of authentication errors now and they seem to happen more under load.

Also, querying the system_auth table from cqlsh to get the users now seems to
time out.


Any help would be greatly appreciated.

Thanks



Re: system_auth replication factor in Cassandra 2.1

2017-08-30 Thread Oleksandr Shulgin
On Wed, Aug 30, 2017 at 5:50 PM, Chuck Reynolds 
wrote:

> So I’ve read that if you’re using authentication in Cassandra 2.1, your
> replication factor should match the number of nodes in your datacenter.
>
>
>
> *Is that true?*
>
>
>
> I have two datacenter cluster, 135 nodes in datacenter 1 & 227 nodes in an
> AWS datacenter.
>
>
>
> *Why do I want to replicate the system_auth table that many times?*
>
>
>
> *What are the benefits and disadvantages of matching the number of nodes
> as opposed to the standard replication factor of 3? *
>
>
>
>
>
> The reason I’m asking the question is because it seems like I’m getting a
> lot of authentication errors now and they seem to happen more under load.
>
>
>
> Also, querying the system_auth table from cqlsh to get the users now seems
> to time out.
>

This is surprising.

How many users do you have (or expect to be found in system_auth.users)?
What are the current RF for system_auth and consistency level you are using
in cqlsh?  Did you try to obtain a trace of a timing-out query (with
TRACING ON)?

Regards,
-- 
Oleksandr "Alex" Shulgin | Database Engineer | Zalando SE | Tel: +49 176
127-59-707


Re: system_auth replication factor in Cassandra 2.1

2017-08-30 Thread Sam Tunnicliffe
It's a better rule of thumb to use an RF of 3 to 5 per DC and this is what
the docs now suggest:
http://cassandra.apache.org/doc/latest/operating/security.html#authentication

Out of the box, the system_auth keyspace is set up with SimpleStrategy and
RF=1 so that it works on any new system, including dev & test clusters, but
obviously that's no use for a production system.

Regarding the increased rate of authentication errors: did you run repair
after changing the RF? Auth queries are done at CL.LOCAL_ONE, so if you
haven't repaired, the data for the user logging in will probably not be
where it should be. The exception to this is the default "cassandra" user,
queries for that user are done at CL.QUORUM, which will indeed lead to
timeouts and authentication errors with a very high RF. It's recommended to
only use that default user to bootstrap the setup of your own users &
superusers, the link above also has info on this.

Thanks,
Sam


On 30 August 2017 at 16:50, Chuck Reynolds  wrote:

> So I’ve read that if you’re using authentication in Cassandra 2.1, your
> replication factor should match the number of nodes in your datacenter.
>
>
>
> *Is that true?*
>
>
>
> I have two datacenter cluster, 135 nodes in datacenter 1 & 227 nodes in an
> AWS datacenter.
>
>
>
> *Why do I want to replicate the system_auth table that many times?*
>
>
>
> *What are the benefits and disadvantages of matching the number of nodes
> as opposed to the standard replication factor of 3? *
>
>
>
>
>
> The reason I’m asking the question is because it seems like I’m getting a
> lot of authentication errors now and they seem to happen more under load.
>
>
>
> Also, querying the system_auth table from cqlsh to get the users now seems
> to time out.
>
>
>
>
>
> Any help would be greatly appreciated.
>
>
>
> Thanks
>


RE: system_auth replication factor in Cassandra 2.1

2017-08-30 Thread Jonathan Baynes

I recently came across an issue whereby my user keyspace was replicated at RF 3
(I have 3 nodes) but my system_auth defaulted to 1; we also use authentication.
I then lost 2 of my nodes, and because the auth data wasn't replicated, I
couldn't log in.

Once I resolved the issue and got the nodes back up, I could log back in. I
too asked the community what was going on, and I was pointed to this:

http://docs.datastax.com/en/datastax_enterprise/4.8/datastax_enterprise/sec/secConfSysAuthKeyspRepl.html

It clearly states the following:

Attention: To prevent a potential problem logging into a secure cluster, set 
the replication factor of the system_auth and dse_security keyspaces to a value 
that is greater than 1. In a multi-node cluster, using the default of 1 
prevents logging into any node when the node that stores the user data is down.



From: Chuck Reynolds [mailto:creyno...@ancestry.com]
Sent: 30 August 2017 16:51
To: user@cassandra.apache.org
Subject: system_auth replication factor in Cassandra 2.1

So I’ve read that if you’re using authentication in Cassandra 2.1, your
replication factor should match the number of nodes in your datacenter.

Is that true?

I have two datacenter cluster, 135 nodes in datacenter 1 & 227 nodes in an AWS 
datacenter.

Why do I want to replicate the system_auth table that many times?

What are the benefits and disadvantages of matching the number of nodes as 
opposed to the standard replication factor of 3?


The reason I’m asking the question is because it seems like I’m getting a lot 
of authentication errors now and they seem to happen more under load.

Also, querying the system_auth table from cqlsh to get the users now seems to
time out.


Any help would be greatly appreciated.

Thanks



This e-mail may contain confidential and/or privileged information. If you are 
not the intended recipient (or have received this e-mail in error) please 
notify the sender immediately and destroy it. Any unauthorized copying, 
disclosure or distribution of the material in this e-mail is strictly 
forbidden. Tradeweb reserves the right to monitor all e-mail communications 
through its networks. If you do not wish to receive marketing emails about our 
products / services, please let us know by contacting us, either by email at 
contac...@tradeweb.com or by writing to us at the registered office of Tradeweb 
in the UK, which is: Tradeweb Europe Limited (company number 3912826), 1 Fore 
Street Avenue London EC2Y 9DT. To see our privacy policy, visit our website @ 
www.tradeweb.com.


Re: Cassandra All host(s) tried for query failed (no host was tried)

2017-08-30 Thread Oleksandr Shulgin
On Wed, Aug 30, 2017 at 5:10 PM, Ivan Iliev 
wrote:

> Hello everyone,
>
> We are using Cassandra 3.9 for storing quite a lot of data produced from
> our tester machines.
>
> Occasionally, we are seeing issues with apps not being able to communicate
> with Cassandra nodes, returning the following errors (captured in
> servicemix logs):
>
>> Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All
>> host(s) tried for query failed (no host was tried)
>> at com.datastax.driver.core.RequestHandler.reportNoMoreHosts(Re
>> questHandler.java:218)
>> at com.datastax.driver.core.RequestHandler.access$1000(RequestH
>> andler.java:43)
>> at com.datastax.driver.core.RequestHandler$SpeculativeExecution
>> .sendRequest(RequestHandler.java:284)
>> at com.datastax.driver.core.RequestHandler.startNewExecution(Re
>> questHandler.java:115)
>> at com.datastax.driver.core.RequestHandler.sendRequest(RequestH
>> andler.java:91)
>> at com.datastax.driver.core.SessionManager.executeAsync(Session
>> Manager.java:132)
>> ... 107 more
>
>
> As a result, apps that try to send data to Cassandra crash because they
> run out of memory, and we have to restart the containers in which they
> run.
>
> So far I have not been able to identify the cause, as nothing shows up (at
> least nothing relevant at those timestamps) in the Cassandra debug and
> system logs.
>
> Could you share some insight on this? What should I check, and where
> should I start, to troubleshoot this?
>

We've seen such an error once on AWS EC2, when Cassandra was configured
with the EC2MultiRegionSnitch but the application code didn't use the
EC2MultiRegionAddressTranslator [1,2].

What happened to us: whenever the node the client connected to first was
unavailable, the driver wouldn't even try to contact the other nodes, since
it somehow could figure out that it wouldn't be able to reach them.  I don't
recall all the details now, but after studying the driver code [3] we could
see that configuring address translation would fix the problem, which it
did for us.
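
For reference, a minimal sketch of wiring up the translator with the 3.x Java
driver (the contact point is illustrative):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.policies.EC2MultiRegionAddressTranslator;

    // Translate the public broadcast addresses in the cluster metadata back
    // to private ones when the client runs in the same EC2 region.
    Cluster cluster = Cluster.builder()
        .addContactPoint("ec2-203-0-113-10.compute-1.amazonaws.com")
        .withAddressTranslator(new EC2MultiRegionAddressTranslator())
        .build();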

I guess you might be hitting this very issue or a similar one.

Hope this helps,
-- 
Oleksandr "Alex" Shulgin | Database Engineer | Zalando SE | Tel: +49 176
127-59-707 <+49%20176%2012759707>

[1] http://docs.datastax.com/en/drivers/java/3.2/com/datastax/driver/core/policies/EC2MultiRegionAddressTranslator.html
[2] https://docs.datastax.com/en/developer/java-driver/3.3/manual/address_resolution/
[3] https://github.com/datastax/java-driver/


system_auth replication factor in Cassandra 2.1

2017-08-30 Thread Chuck Reynolds
So I’ve read that if you’re using authentication in Cassandra 2.1, your
replication factor should match the number of nodes in your datacenter.

Is that true?

I have two datacenter cluster, 135 nodes in datacenter 1 & 227 nodes in an AWS 
datacenter.

Why do I want to replicate the system_auth table that many times?

What are the benefits and disadvantages of matching the number of nodes as 
opposed to the standard replication factor of 3?


The reason I’m asking the question is because it seems like I’m getting a lot 
of authentication errors now and they seem to happen more under load.

Also, querying the system_auth table from cqlsh to get the users now seems to
time out.


Any help would be greatly appreciated.

Thanks


Cassandra All host(s) tried for query failed (no host was tried)

2017-08-30 Thread Ivan Iliev
Hello everyone,

We are using Cassandra 3.9 for storing quite a lot of data produced from
our tester machines.

Occasionally, we are seeing issues with apps not being able to communicate
with Cassandra nodes, returning the following errors (captured in
servicemix logs):

> Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All
> host(s) tried for query failed (no host was tried)
> at
> com.datastax.driver.core.RequestHandler.reportNoMoreHosts(RequestHandler.java:218)
> at
> com.datastax.driver.core.RequestHandler.access$1000(RequestHandler.java:43)
> at
> com.datastax.driver.core.RequestHandler$SpeculativeExecution.sendRequest(RequestHandler.java:284)
> at
> com.datastax.driver.core.RequestHandler.startNewExecution(RequestHandler.java:115)
> at
> com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:91)
> at
> com.datastax.driver.core.SessionManager.executeAsync(SessionManager.java:132)
> ... 107 more


As a result, apps that try to send data to Cassandra crash because they run
out of memory, and we have to restart the containers in which they run.

So far I have not been able to identify the cause, as nothing shows up (at
least nothing relevant at those timestamps) in the Cassandra debug and
system logs.

Could you share some insight on this? What should I check, and where should
I start, to troubleshoot this?

Thanks !
Ivan


[ANNOUNCE] Gocqlx - Go productivity toolkit

2017-08-30 Thread Michał Matczuk
At Scylla we are open-sourcing a library we use on top of the Go driver
(gocql) to communicate with ScyllaDB and Apache Cassandra.

Gocqlx provides:

* Builders for SELECT, INSERT, UPDATE, DELETE and BATCH
* Queries with named parameters (:identifier) support
* Binding parameters from a struct or map
* Scanning results into structs and slices
* Automatic query releasing

and it's fast.

You can find it on our GitHub https://github.com/scylladb/gocqlx

You may also take a look at the blog post about it:
http://www.scylladb.com/2017/08/25/gocqlx-productivity-toolkit/

Best regards
-- Michał Matczuk


Re: Working With Prepared Statements

2017-08-30 Thread Shalom Sagges
Thanks guys for all the info!




Shalom Sagges
DBA
T: +972-74-700-4035
We Create Meaningful Connections



On Wed, Aug 30, 2017 at 10:54 AM, Oleksandr Shulgin <
oleksandr.shul...@zalando.de> wrote:

> On Tue, Aug 29, 2017 at 12:33 PM, Shalom Sagges 
> wrote:
>
>> Insights, anyone?
>>
>
> There were reports of Cassandra failing to start due to trying to load the
> prepared statements from a cached table.  This can only affect you if you
> have a lot (tens of thousands, IIRC) of prepared statements.  A fix seems
> to have made it into 3.11.1:
>
> https://issues.apache.org/jira/browse/CASSANDRA-13641
>
> Search the users list for "prepared statement preload" and you'll find
> some recent threads.
>
> Cheers,
> --
> Alex
>
>

-- 
This message may contain confidential and/or privileged information. 
If you are not the addressee or authorized to receive this on behalf of the 
addressee you must not use, copy, disclose or take action based on this 
message or any information herein. 
If you have received this message in error, please advise the sender 
immediately by reply email and delete this message. Thank you.


Re: Cassandra and OpenJDK

2017-08-30 Thread Myrle Krantz
Thank you Kurt and Eric.

Myrle

On Mon, Aug 28, 2017 at 11:53 PM, kurt greaves  wrote:
> OpenJDK is fine.




Re: Working With Prepared Statements

2017-08-30 Thread Oleksandr Shulgin
On Tue, Aug 29, 2017 at 12:33 PM, Shalom Sagges 
wrote:

> Insights, anyone?
>

There were reports of Cassandra failing to start due to trying to load the
prepared statements from a cached table.  This can only affect you if you
have a lot (tens of thousands, IIRC) of prepared statements.  A fix seems
to have made it into 3.11.1:

https://issues.apache.org/jira/browse/CASSANDRA-13641

Search the users list for "prepared statement preload" and you'll find
some recent threads.

Cheers,
--
Alex


Re: why the cluster does not work well after adding two new nodes

2017-08-30 Thread 赵豫峰
Hi, Nandan, thanks for your replay.


1)  Yes the config is wrong. My doubt is why I can get the result when one node 
is alive, but can't when two or three nodes are alive;
2)  I guess your mean the auto_bootstrap configure parameter? It is not set in 
cassandra.yaml;

3) Dose it must  to rebalance when adding new node? I remeber that it’s ok when 
just add a new node without any operation before。
 
Thanks for your advice!






------ Original message ------
From: "@Nandan@"
Date: Wed, Aug 30, 2017 01:04 PM
To: "zhaoyf"; "user"

Subject: Re: why the cluster does not work well after adding two new nodes

 
Hi, here is what went wrong from the start:
1) You had 2 node servers but created the keyspace with RF 3. [Always make
sure RF <= total number of nodes.]
2) While adding new nodes, make sure auto_bootstrap is enabled.
3) Once you have added the 2 new nodes, you should rebalance.
There are 2 different ways you can rebalance:
A) Use OpsCenter and select Rebalance Cluster.
B) Run nodetool cleanup on each node afterward to clean up data no longer
belonging to that node.



Best Regards, 
Nandan




On Wed, Aug 30, 2017 at 12:14 PM, 赵豫峰  wrote:
Hi, I have a cluster with two node servers (I know this is set up wrong, but it
was built by another colleague who has left), and its keyspace is set like:


CREATE KEYSPACE my_keyspace WITH replication = {'class': 'SimpleStrategy', 
'replication_factor': '3'}  AND durable_writes = true;


One day my boss said one node had been down for a long time while the other
worked normally, and told me to restart the cluster.


First, I made a snapshot from the working node;
then I checked the data count with a select count(*) CQL statement; the result
was more than 17.
Next, I added two new nodes. After the new nodes were up, I used select
count(*) to check the data several times, but now I got inconsistent results,
each less than 1. I checked node status with ./nodetool status, and every node
was UN, but the load of the two new nodes was far less than the normal node's.
I stopped the two new nodes, ran “select count(*)” again, and got the right
result.


I built a new cluster in a sandbox environment from the snapshot files and got
the same result as above. I ran "./nodetool repair", and then the cluster
worked well, but I don't know why.


I guess it is because two nodes with "replication = {'class': 'SimpleStrategy',
'replication_factor': '3'}" can cause a split-brain so the data won't be
consistent, or the data files are broken, but I am not sure. Why did it happen,
why did I have to use the "./nodetool repair" command, and when should I use it?


Thanks!





--


赵豫峰



Easemob Instant Messaging Cloud / R&D
