Re: Bootstrap streaming issues

2018-08-29 Thread Jai Bheemsen Rao Dhanwada
Jeff, any idea if this is somehow related to https://issues.apache.org/jira/browse/CASSANDRA-11840? Does increasing streaming_socket_timeout_in_ms to a higher value help? On Wed, Aug 29, 2018 at 10:52 PM Jai Bheemsen Rao Dhanwada < jaibheem...@gmail.com> wrote: > I have 72 nodes

Re: Bootstrap streaming issues

2018-08-29 Thread Jai Bheemsen Rao Dhanwada
I have 72 nodes in the cluster, across 8 datacenters. The moment I try to increase the node count above 84 or so, the issue starts. I am still using the CMS collector, and I assume that increasing the heap size beyond the recommended 8G would do more harm than good. On Wed, Aug 29, 2018 at 6:53 PM Jeff Jirsa wrote: >

Re: Bootstrap streaming issues

2018-08-29 Thread Jeff Jirsa
Given the size of your schema, you’re probably getting flooded with a bunch of huge schema mutations as it hops into gossip and tries to pull the schema from every host it sees. You say 8 DCs but you don’t say how many nodes - I’m guessing it’s a lot? This is something that’s incrementally

Re: Bootstrap streaming issues

2018-08-29 Thread Jai Bheemsen Rao Dhanwada
It fails before bootstrap. Streaming throughput on the nodes is set to 400 Mb/s. On Wednesday, August 29, 2018, Jeff Jirsa wrote: > Is the bootstrap plan succeeding (does streaming start or does it crash > before it logs messages about streaming starting)? > > Have you capped the stream

Re: Bootstrap streaming issues

2018-08-29 Thread Jeff Jirsa
Is the bootstrap plan succeeding (does streaming start or does it crash before it logs messages about streaming starting)? Have you capped the stream throughput on the existing hosts? -- Jeff Jirsa > On Aug 29, 2018, at 5:02 PM, Jai Bheemsen Rao Dhanwada > wrote: > > Hello All, > > We

Bootstrap streaming issues

2018-08-29 Thread Jai Bheemsen Rao Dhanwada
Hello All, We are seeing an issue when we add more nodes to the cluster: the new node's bootstrap is not able to stream the entire metadata and fails. Finally the process dies with OOM (java.lang.OutOfMemoryError: Java heap space). But if I remove a few nodes from the cluster we

Re: A blog about Cassandra in the IoT arena

2018-08-29 Thread Rahul Singh
Understood. Deep problems to consider. Partition size. I’ve been looking at how Yugabyte is using “tablets” of data which have data. It’s an interesting proposition. It all comes down to the token-based addressing, which is optimized as a single-dimension array, and I think this is part of

RE: [EXTERNAL] Re: Re: bigger data density with Cassandra 4.0?

2018-08-29 Thread Rahul Singh
YugaByte is also another new dancer in the Cassandra dance. The data store is based on RocksDB, and it’s written in C++. Although they are wire compliant with C*, I’m pretty sure everything under the hood is NOT a port like Scylla was initially. Rahul Singh Chief Executive Officer m 202.905.2818

Re: Recommended num_tokens setting for small cluster

2018-08-29 Thread kurt greaves
For 10 nodes you probably want to use between 32 and 64. Make sure you use the token allocation algorithm by specifying allocate_tokens_for_keyspace On Thu., 30 Aug. 2018, 04:40 Jeff Jirsa, wrote: > 3.0 has a (optional?) feature to guarantee better distribution, and the > blog focuses on 2.2. >

Re: Recommended num_tokens setting for small cluster

2018-08-29 Thread Jeff Jirsa
3.0 has a (optional?) feature to guarantee better distribution, and the blog focuses on 2.2. Using fewer will minimize your risk of unavailability if any two hosts fail. -- Jeff Jirsa > On Aug 29, 2018, at 11:18 AM, Max C. wrote: > > Hello Everyone, > > Datastax recommends num_tokens =

Recommended num_tokens setting for small cluster

2018-08-29 Thread Max C.
Hello Everyone, Datastax recommends num_tokens = 8 as a sensible default, rather than num_tokens = 256: https://docs.datastax.com/en/dse/5.1/dse-dev/datastax_enterprise/config/configVnodes.html … but

Re: bigger data density with Cassandra 4.0?

2018-08-29 Thread Ariel Weisberg
Hi, It depends on compaction strategy to an extent. Leveled compaction is partitioning sstables on token range so there is a wider variety of scenarios where it works. I haven't done the napkin math at 10 terabytes to figure what % of sstables will be leveled to the point they work with 256

RE: [EXTERNAL] Re: Re: bigger data density with Cassandra 4.0?

2018-08-29 Thread Durity, Sean R
If you are going to compare vs commercial offerings like Scylla and CosmosDB, you should be looking at DataStax Enterprise. They are moving more quickly than open source (IMO) on adding features and tools that enterprises really need. I think they have some emerging tech for large/dense nodes,

RE: [EXTERNAL] Re: Nodetool refresh v/s sstableloader

2018-08-29 Thread Durity, Sean R
Sstableloader, though, could require a lot more disk space – until compaction can reduce. For example, if your RF=3, you will essentially be loading 3 copies of the data. Then it will get replicated 3 more times as it is being loaded. Thus, you could need up to 9x disk space. Sean Durity
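The 9x figure above is simple multiplication, which can be sketched as follows. This is only an illustration of the worst case described in the message; the function name and parameters are hypothetical, and real disk usage depends on how quickly compaction merges the duplicates.

```python
def sstableloader_disk_multiplier(source_copies: int, replication_factor: int) -> int:
    """Worst-case number of copies of the data on disk before compaction.

    source_copies: how many replicas of the data are fed to sstableloader
                   (for example, SSTables taken from every replica of an RF=3 cluster).
    replication_factor: RF of the target keyspace; each loaded copy is
                        streamed to this many replicas.
    """
    return source_copies * replication_factor


# RF=3 on both sides: 3 loaded copies, each replicated 3 times.
print(sstableloader_disk_multiplier(source_copies=3, replication_factor=3))  # 9
```

Loading the SSTables from only one replica of each range would bring the multiplier back down to the replication factor, at the cost of missing any data that only the other replicas held.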

Re: URGENT: disable reads from node

2018-08-29 Thread Vlad
Hi, >You'll need to disable the native transport Well, this is what I did already; it seems repair is running. I'm not sure whether repair will finish within 3 hours, but I can run it again (as it's incremental repair by default, right?). I'm not sure about RF=3 and QUORUM reads because of

Re: URGENT: disable reads from node

2018-08-29 Thread Alexander Dejanovski
Kurt is right. So here are the options I can think of : - use the join_ring false technique and rely on hints. You'll need to disable the native transport on the node as well to prevent direct connections to be made to it. Hopefully, you can run repair in less than 3 hours which is the hint

Re: Repairs are slow after upgrade to 3.11.3

2018-08-29 Thread Maxim Parkachov
Hi Alex, I'm using Cassandra Reaper as well. Could be https://issues.apache.org/jira/browse/CASSANDRA-14332 as it was committed in both versions. Regards, Maxim. On Wed, Aug 29, 2018 at 2:14 PM Oleksandr Shulgin < oleksandr.shul...@zalando.de> wrote: > On Wed, Aug 29, 2018 at 3:06 AM Maxim

Re: Repairs are slow after upgrade to 3.11.3

2018-08-29 Thread Maxim Parkachov
Hi, I wanted to get rid of https://issues.apache.org/jira/browse/CASSANDRA-14332 and https://issues.apache.org/jira/browse/CASSANDRA-14470. I haven't seen these errors yet, but it is too early to say after a couple of days of operation. Regards, Maxim. On Wed, Aug 29, 2018 at 10:27 AM Jean Carlo

Re: URGENT: disable reads from node

2018-08-29 Thread Vlad
Also, after restart with join_ring=false C* is still accepting connections on port 9042 (and obviously returning no data), so I ran nodetool drain. Is that OK? I ran nodetool repair on this node. The command hasn't returned yet, but I see in the log INFO  [Thread-6] 2018-08-29 12:16:03,954

Re: Repairs are slow after upgrade to 3.11.3

2018-08-29 Thread Oleksandr Shulgin
On Wed, Aug 29, 2018 at 3:06 AM Maxim Parkachov wrote: > couple of days ago I have upgraded Cassandra from 3.11.2 to 3.11.3 and I > see that repair time is practically doubled. Does someone else experience > the same regression ? > We have upgraded from 3.0.16 to 3.0.17 two days ago and we see

Re: URGENT: disable reads from node

2018-08-29 Thread Vlad
I restarted with cassandra.join_ring=false. nodetool status on other nodes shows this node as DN, while it sees itself as UN. >I'd say best to just query at QUORUM until you can finish repairs. We have RF 2, so I guess QUORUM queries will fail. Also different application should be changed for
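The concern about QUORUM with a replication factor of 2 follows from how a quorum is sized: floor(RF / 2) + 1 replicas must respond. A minimal sketch for a single datacenter, with hypothetical helper names:

```python
def quorum_size(replication_factor: int) -> int:
    """Number of replicas that must respond for a QUORUM read or write."""
    return replication_factor // 2 + 1


def quorum_available(replication_factor: int, replicas_down: int) -> bool:
    """True if enough replicas remain up to satisfy QUORUM."""
    return replication_factor - replicas_down >= quorum_size(replication_factor)


# With RF=2 a quorum is both replicas, so one unavailable replica breaks QUORUM.
print(quorum_size(2), quorum_available(2, replicas_down=1))
# With RF=3 a quorum is two replicas, so one replica can be down.
print(quorum_size(3), quorum_available(3, replicas_down=1))
```

With RF=2, QUORUM behaves like ALL: it gives consistent reads only while both replicas are reachable.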

Re: Re: bigger data density with Cassandra 4.0?

2018-08-29 Thread onmstester onmstester
Could you please explain more about (you mean slower performance compared to Cassandra?) ---HBase tends to be quite average for transactional data and about: ScyllaDB IDK, I'd assume they just sorted out streaming by learning from C*'s mistakes. While ScyllaDB is a much younger project

Re: URGENT: disable reads from node

2018-08-29 Thread kurt greaves
Note that you'll miss incoming writes if you do that, so you'll be inconsistent even after the repair. I'd say best to just query at QUORUM until you can finish repairs. On 29 August 2018 at 21:22, Alexander Dejanovski wrote: > Hi Vlad, you must restart the node but first disable joining the

Re: Nodetool refresh v/s sstableloader

2018-08-29 Thread kurt greaves
Removing dev... Nodetool refresh only picks up new SSTables that have been placed in the table's directory. It doesn't account for actual ownership of the data like SSTableloader does. Refresh will only work properly if the SSTables you are copying in are completely covered by that node's tokens. It
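The "completely covered by that node's tokens" condition can be sketched as a check over partition tokens. This is a simplified illustration, not how Cassandra implements ownership: the function names are hypothetical, ranges are treated as (start, end] pairs, and wrap-around ranges on the ring are assumed away.

```python
from typing import Iterable, List, Tuple

# Assumption: a token range is (start, end], start exclusive and end inclusive,
# and no range wraps around the end of the ring.
TokenRange = Tuple[int, int]


def owns(owned_ranges: List[TokenRange], token: int) -> bool:
    """True if the token falls inside at least one range owned by the node."""
    return any(start < token <= end for start, end in owned_ranges)


def refresh_is_safe(owned_ranges: List[TokenRange], sstable_tokens: Iterable[int]) -> bool:
    """True only if every partition token in the SSTable is owned by the node."""
    return all(owns(owned_ranges, token) for token in sstable_tokens)


node_ranges = [(0, 100), (200, 300)]
print(refresh_is_safe(node_ranges, [10, 250]))  # every token is owned
print(refresh_is_safe(node_ranges, [10, 150]))  # token 150 belongs to another node
```

When the check fails, the SSTable holds data the node does not own, which is the case where the message points to SSTableloader instead.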

Re: URGENT: disable reads from node

2018-08-29 Thread Alexander Dejanovski
Hi Vlad, you must restart the node but first disable joining the cluster, as described in the second part of this blog post : http://thelastpickle.com/blog/2018/08/02/Re-Bootstrapping-Without-Bootstrapping.html Once repaired, you'll have to run "nodetool join" to start serving reads. Le mer. 29

Re: Re: bigger data density with Cassandra 4.0?

2018-08-29 Thread kurt greaves
Most of the issues around big nodes are related to streaming, which is currently quite slow (should be a bit better in 4.0). HBase is built on top of Hadoop, which is much better at large files/very dense nodes, and tends to be quite average for transactional data. ScyllaDB IDK, I'd assume they

Re: URGENT: disable reads from node

2018-08-29 Thread Vlad
Will it help to set read_repair_chance to 1 (compaction is SizeTieredCompactionStrategy)? On Wednesday, August 29, 2018 1:34 PM, Vlad wrote: Hi, quite urgent questions: due to disk and C* start problems we were forced to delete commit logs from one of the nodes. Now repair is running, but

URGENT: disable reads from node

2018-08-29 Thread Vlad
Hi, quite urgent questions: due to disk and C* start problems we were forced to delete commit logs from one of the nodes. Now repair is running, but meanwhile some reads bring no data (RF=2). Can this node be excluded from read queries, so that all reads will be redirected to other node in the

Fwd: Re: bigger data density with Cassandra 4.0?

2018-08-29 Thread onmstester onmstester
Thanks Kurt, Actually my cluster has > 10 nodes, so there is a tiny chance to stream a complete SSTable. Logically, any columnar NoSQL DB like Cassandra always needs to re-sort grouped data for fast later reads, and having nodes with a large amount of data (> 2 TB) would be annoying for this

Re: Repairs are slow after upgrade to 3.11.3

2018-08-29 Thread Jean Carlo
Hello, Can I ask why you upgraded from 3.11.2? Did you experience some Java heap problems? Unfortunately I cannot answer your question :( I am on 2.1 and about to upgrade to 3.11. Best greetings, Jean Carlo "The best way to predict the future is to invent it" Alan Kay On Wed,

Re: bigger data density with Cassandra 4.0?

2018-08-29 Thread kurt greaves
My reasoning was if you have a small cluster with vnodes you're more likely to have enough overlap between nodes that whole SSTables will be streamed on major ops. As N gets >RF you'll have less common ranges and thus less likely to be streaming complete SSTables. Correct me if I've