Re: Bootstrap streaming issues

2018-08-29 Thread Jai Bheemsen Rao Dhanwada
Jeff,

Any idea if this is somehow related to
https://issues.apache.org/jira/browse/CASSANDRA-11840?
Does increasing the value of streaming_socket_timeout_in_ms to a higher
value help?
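
(For reference, a minimal sketch of what raising it would look like on a 2.1
node; the one-hour value below is only illustrative, not a recommendation:)

    # cassandra.yaml, requires a node restart to take effect
    streaming_socket_timeout_in_ms: 3600000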

On Wed, Aug 29, 2018 at 10:52 PM Jai Bheemsen Rao Dhanwada <
jaibheem...@gmail.com> wrote:

> I have 72 nodes in the cluster, across 8 datacenters. The moment I try to
> increase the node count above 84 or so, the issue starts.
>
> I am still using the CMS heap, assuming it will do more harm than good if I
> increase the heap size beyond the recommended 8 GB.
>
> On Wed, Aug 29, 2018 at 6:53 PM Jeff Jirsa  wrote:
>
>> Given the size of your schema, you’re probably getting flooded with a
>> bunch of huge schema mutations as it hops into gossip and tries to pull the
>> schema from every host it sees. You say 8 DCs but you don’t say how many
>> nodes - I’m guessing it’s  a lot?
>>
>> This is something that’s incrementally better in 3.0, but a real proper
>> fix has been talked about a few times  -
>> https://issues.apache.org/jira/browse/CASSANDRA-11748 and
>> https://issues.apache.org/jira/browse/CASSANDRA-13569 for example
>>
>> In the short term, you may be able to work around this by increasing your
>> heap size. If that doesn’t work, there’s an ugly ugly hack that’ll work on
>> 2.1:  limiting the number of schema blobs you can get at a time - in this
>> case, that means firewall off all but a few nodes in your cluster for 10-30
>> seconds, make sure it gets the schema (watch the logs or file system for
>> the tables to be created), then remove the firewall so it can start the
>> bootstrap process (it needs the schema to setup the streaming plan, and it
>> needs all the hosts up in gossip to stream successfully, so this is an ugly
>> hack to give you time to get the schema and then heal the cluster so it can
>> bootstrap).
>>
>> Yea that’s awful. Hopefully either of the two above JIRAs lands to make
>> this less awful.
>>
>>
>>
>> --
>> Jeff Jirsa
>>
>>
>> On Aug 29, 2018, at 6:29 PM, Jai Bheemsen Rao Dhanwada <
>> jaibheem...@gmail.com> wrote:
>>
>> It fails before bootstrap
>>
>> Streaming throughput on the nodes is set to 400 Mbps.
>>
>> On Wednesday, August 29, 2018, Jeff Jirsa  wrote:
>>
>>> Is the bootstrap plan succeeding (does streaming start or does it crash
>>> before it logs messages about streaming starting)?
>>>
>>> Have you capped the stream throughput on the existing hosts?
>>>
>>> --
>>> Jeff Jirsa
>>>
>>>
>>> On Aug 29, 2018, at 5:02 PM, Jai Bheemsen Rao Dhanwada <
>>> jaibheem...@gmail.com> wrote:
>>>
>>> Hello All,
>>>
>>> We are seeing an issue when we add more nodes to the cluster: the new
>>> node's bootstrap is not able to stream the entire metadata and fails to
>>> bootstrap. Eventually the process dies with an OOM (java.lang.OutOfMemoryError:
>>> Java heap space).
>>>
>>> But if I remove a few nodes from the cluster we don't see this issue.
>>>
>>> Cassandra Version: 2.1.16
>>> # of KS and CF: 100, 3000 (approx)
>>> # of DCs: 8
>>> # of vnodes per node: 256
>>>
>>> Not sure what is causing this behavior; has anyone come across this
>>> scenario?
>>> Thanks in advance.
>>>
>>>


Re: Bootstrap streaming issues

2018-08-29 Thread Jai Bheemsen Rao Dhanwada
I have 72 nodes in the cluster, across 8 datacenters. The moment I try to
increase the node count above 84 or so, the issue starts.

I am still using the CMS heap, assuming it will do more harm than good if I
increase the heap size beyond the recommended 8 GB.

On Wed, Aug 29, 2018 at 6:53 PM Jeff Jirsa  wrote:

> Given the size of your schema, you’re probably getting flooded with a
> bunch of huge schema mutations as it hops into gossip and tries to pull the
> schema from every host it sees. You say 8 DCs but you don’t say how many
> nodes - I’m guessing it’s  a lot?
>
> This is something that’s incrementally better in 3.0, but a real proper
> fix has been talked about a few times  -
> https://issues.apache.org/jira/browse/CASSANDRA-11748 and
> https://issues.apache.org/jira/browse/CASSANDRA-13569 for example
>
> In the short term, you may be able to work around this by increasing your
> heap size. If that doesn’t work, there’s an ugly ugly hack that’ll work on
> 2.1:  limiting the number of schema blobs you can get at a time - in this
> case, that means firewall off all but a few nodes in your cluster for 10-30
> seconds, make sure it gets the schema (watch the logs or file system for
> the tables to be created), then remove the firewall so it can start the
> bootstrap process (it needs the schema to setup the streaming plan, and it
> needs all the hosts up in gossip to stream successfully, so this is an ugly
> hack to give you time to get the schema and then heal the cluster so it can
> bootstrap).
>
> Yea that’s awful. Hopefully either of the two above JIRAs lands to make
> this less awful.
>
>
>
> --
> Jeff Jirsa
>
>
> On Aug 29, 2018, at 6:29 PM, Jai Bheemsen Rao Dhanwada <
> jaibheem...@gmail.com> wrote:
>
> It fails before bootstrap
>
> Streaming throughput on the nodes is set to 400 Mbps.
>
> On Wednesday, August 29, 2018, Jeff Jirsa  wrote:
>
>> Is the bootstrap plan succeeding (does streaming start or does it crash
>> before it logs messages about streaming starting)?
>>
>> Have you capped the stream throughput on the existing hosts?
>>
>> --
>> Jeff Jirsa
>>
>>
>> On Aug 29, 2018, at 5:02 PM, Jai Bheemsen Rao Dhanwada <
>> jaibheem...@gmail.com> wrote:
>>
>> Hello All,
>>
>> We are seeing an issue when we add more nodes to the cluster: the new
>> node's bootstrap is not able to stream the entire metadata and fails to
>> bootstrap. Eventually the process dies with an OOM (java.lang.OutOfMemoryError:
>> Java heap space).
>>
>> But if I remove a few nodes from the cluster we don't see this issue.
>>
>> Cassandra Version: 2.1.16
>> # of KS and CF: 100, 3000 (approx)
>> # of DCs: 8
>> # of vnodes per node: 256
>>
>> Not sure what is causing this behavior; has anyone come across this
>> scenario?
>> Thanks in advance.
>>
>>


Re: Bootstrap streaming issues

2018-08-29 Thread Jeff Jirsa
Given the size of your schema, you’re probably getting flooded with a bunch of 
huge schema mutations as it hops into gossip and tries to pull the schema from 
every host it sees. You say 8 DCs but you don’t say how many nodes - I’m 
guessing it’s  a lot? 

This is something that’s incrementally better in 3.0, but a real proper fix has 
been talked about a few times  - 
https://issues.apache.org/jira/browse/CASSANDRA-11748 and 
https://issues.apache.org/jira/browse/CASSANDRA-13569 for example 

In the short term, you may be able to work around this by increasing your heap 
size. If that doesn’t work, there’s an ugly, ugly hack that’ll work on 2.1: 
limiting the number of schema blobs you can get at a time - in this case, that 
means firewalling off all but a few nodes in your cluster for 10-30 seconds, making 
sure it gets the schema (watch the logs or file system for the tables to be 
created), then removing the firewall so it can start the bootstrap process (it 
needs the schema to set up the streaming plan, and it needs all the hosts up in 
gossip to stream successfully, so this is an ugly hack to give you time to get 
the schema and then heal the cluster so it can bootstrap).
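
(For illustration, a rough sketch of that firewall hack on a Linux host, assuming
iptables and the default inter-node port 7000; the peer address and paths below
are placeholders for whichever couple of nodes you leave reachable:)

    # temporarily drop gossip/streaming traffic from every peer except 10.0.0.1
    iptables -A INPUT -p tcp --dport 7000 ! -s 10.0.0.1 -j DROP
    # start the new node and wait until the schema has arrived
    # (watch system.log, or the data directories for the tables being created)
    ls /var/lib/cassandra/data/
    # then remove the rule so all hosts are visible again and bootstrap can stream
    iptables -D INPUT -p tcp --dport 7000 ! -s 10.0.0.1 -j DROP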

Yea that’s awful. Hopefully either of the two above JIRAs lands to make this 
less awful. 



-- 
Jeff Jirsa


> On Aug 29, 2018, at 6:29 PM, Jai Bheemsen Rao Dhanwada 
>  wrote:
> 
> It fails before bootstrap
> 
> Streaming throughput on the nodes is set to 400 Mbps.
> 
>> On Wednesday, August 29, 2018, Jeff Jirsa  wrote:
>> Is the bootstrap plan succeeding (does streaming start or does it crash 
>> before it logs messages about streaming starting)?
>> 
>> Have you capped the stream throughput on the existing hosts? 
>> 
>> -- 
>> Jeff Jirsa
>> 
>> 
>>> On Aug 29, 2018, at 5:02 PM, Jai Bheemsen Rao Dhanwada 
>>>  wrote:
>>> 
>>> Hello All,
>>> 
>>> We are seeing an issue when we add more nodes to the cluster: the new
>>> node's bootstrap is not able to stream the entire metadata and fails to
>>> bootstrap. Eventually the process dies with an OOM (java.lang.OutOfMemoryError:
>>> Java heap space).
>>>
>>> But if I remove a few nodes from the cluster we don't see this issue.
>>>
>>> Cassandra Version: 2.1.16
>>> # of KS and CF: 100, 3000 (approx)
>>> # of DCs: 8
>>> # of vnodes per node: 256
>>>
>>> Not sure what is causing this behavior; has anyone come across this
>>> scenario?
>>> Thanks in advance.


Re: Bootstrap streaming issues

2018-08-29 Thread Jai Bheemsen Rao Dhanwada
It fails before bootstrap

Streaming throughput on the nodes is set to 400 Mbps.

On Wednesday, August 29, 2018, Jeff Jirsa  wrote:

> Is the bootstrap plan succeeding (does streaming start or does it crash
> before it logs messages about streaming starting)?
>
> Have you capped the stream throughput on the existing hosts?
>
> --
> Jeff Jirsa
>
>
> On Aug 29, 2018, at 5:02 PM, Jai Bheemsen Rao Dhanwada <
> jaibheem...@gmail.com> wrote:
>
> Hello All,
>
> We are seeing an issue when we add more nodes to the cluster: the new
> node's bootstrap is not able to stream the entire metadata and fails to
> bootstrap. Eventually the process dies with an OOM (java.lang.OutOfMemoryError:
> Java heap space).
>
> But if I remove a few nodes from the cluster we don't see this issue.
>
> Cassandra Version: 2.1.16
> # of KS and CF: 100, 3000 (approx)
> # of DCs: 8
> # of vnodes per node: 256
>
> Not sure what is causing this behavior; has anyone come across this
> scenario?
> Thanks in advance.
>
>


Re: Bootstrap streaming issues

2018-08-29 Thread Jeff Jirsa
Is the bootstrap plan succeeding (does streaming start or does it crash before 
it logs messages about streaming starting)?

Have you capped the stream throughput on the existing hosts? 
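
(For reference, a sketch of checking and capping it with nodetool on each existing
host; the 200 Mbps figure is only an example value:)

    nodetool getstreamthroughput      # current outbound streaming cap, in Mbps
    nodetool setstreamthroughput 200  # temporarily lower the cap during the join
    nodetool netstats                 # watch the streams while the new node bootstraps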

-- 
Jeff Jirsa


> On Aug 29, 2018, at 5:02 PM, Jai Bheemsen Rao Dhanwada 
>  wrote:
> 
> Hello All,
> 
> We are seeing an issue when we add more nodes to the cluster: the new
> node's bootstrap is not able to stream the entire metadata and fails to
> bootstrap. Eventually the process dies with an OOM (java.lang.OutOfMemoryError:
> Java heap space).
>
> But if I remove a few nodes from the cluster we don't see this issue.
>
> Cassandra Version: 2.1.16
> # of KS and CF: 100, 3000 (approx)
> # of DCs: 8
> # of vnodes per node: 256
>
> Not sure what is causing this behavior; has anyone come across this
> scenario?
> Thanks in advance.


Bootstrap streaming issues

2018-08-29 Thread Jai Bheemsen Rao Dhanwada
Hello All,

We are seeing an issue when we add more nodes to the cluster: the new
node's bootstrap is not able to stream the entire metadata and fails to
bootstrap. Eventually the process dies with an OOM (java.lang.OutOfMemoryError:
Java heap space).

But if I remove a few nodes from the cluster we don't see this issue.

Cassandra Version: 2.1.16
# of KS and CF: 100, 3000 (approx)
# of DCs: 8
# of vnodes per node: 256

Not sure what is causing this behavior; has anyone come across this
scenario?
Thanks in advance.


Re: A blog about Cassandra in the IoT arena

2018-08-29 Thread Rahul Singh
Understood. Deep problems to consider.

Partition size.
I’ve been looking at how YugaByte is using “tablets” of data. It’s an interesting
proposition... it all comes down to the token-based addressing - which is optimized
as a single-dimension array, and I think this is part of the limitation.


The sorting problem is one of the oldest in the industry. Maybe we need to look at 
Kafka and Lucene. Between the two, there are some interesting patterns for 
referencing the location of data and storing those references. The compaction 
process wouldn’t need to “sort” if there is an optimized index which orders the 
vectors and the locations. Compacting files should be a “dumb” operation if the 
“smart” index is ready as the task table. The major reason Cassandra is fast is 
the partitioner, which effectively “indexes” the data into a node and into a 
token. We need to go one level deeper. Maybe it’s another compaction strategy 
that evenly distributes data by either a size threshold or by maintaining a 
certain number of SSTables.

Don’t have any ideas yet on anything better than Merkle trees. Will get back to 
you with ideas or code.

Good stuff.

Rahul
On Aug 24, 2018, 12:06 PM -0400, DuyHai Doan , wrote:
> No what I meant by infinite partition is not auto sub-partitioning, even at 
> server-side. Ideally Cassandra should be able to support infinite partition 
> size and make compaction, repair and streaming of such partitions manageable:
>
> - compaction: find a way to iterate super efficiently through the whole 
> partition and merge-sort all sstables containing data of the same partition.
>
>  - repair: find another approach than Merkle tree because its resolution is 
> not granular enough. Ideally repair resolution should be at the clustering 
> level or every xxx clustering values
>
>  - streaming: same idea as repair, in case of error/disconnection the stream 
> should be resumed at the latest clustering level checkpoint, or at least 
> should we checkpoint every xxx clustering values
>
>  - partition index: find a way to index efficiently the huge partition. Right 
> now huge partition has a dramatic impact on partition index. The work of 
> Michael Kjellman on birch indices is going into the right direction 
> (CASSANDRA-9754)
>
> About tombstone, there is recently a research paper about Dotted DB and an 
> attempt to make delete without using tombstones: 
> http://haslab.uminho.pt/tome/files/dotteddb_srds.pdf
>
>
>
> > On Fri, Aug 24, 2018 at 12:38 AM, Rahul Singh 
> >  wrote:
> > > Agreed. One of the ideas I had on partition size is to automatically 
> > > synthetically shard based on some basic patterns seen in the data.
> > >
> > > It could be implemented as a tool that would create a new table with an 
> > > additional part of the key that is an automatic created shard, or it 
> > > would use an existing key and then migrate the data.
> > >
> > > The internal automatic shard would adjust as needed and keep 
> > > “Subpartitons” or “rowsets” but return the full partition given some 
> > > special CQL
> > >
> > > This is done today at the Data Access layer and he data model design but 
> > > it’s pretty much a step by step process that could be algorithmically 
> > > done.
> > >
> > > Regarding the tombstone — maybe we have another thread dedicated to 
> > > cleaning tombstones - separate from compaction. Depending on the amount 
> > > of tombstones and a threshold, it would be dedicated to deletion. It may 
> > > be an edge case , but people face issues with tombstones all the time 
> > > because they don’t know better.
> > >
> > > Rahul
> > > On Aug 23, 2018, 11:50 AM -0500, DuyHai Doan , 
> > > wrote:
> > > > As I used to tell some people, the day we make :
> > > >
> > > > 1. partition size unlimited, or at least huge partition easily 
> > > > manageable (compaction, repair, streaming, partition index file)
> > > > 2. tombstone a non-issue
> > > >
> > > > that day, Cassandra will dominate any other IoT technology out there
> > > >
> > > > Until then ...
> > > >
> > > > > On Thu, Aug 23, 2018 at 4:54 PM, Rahul Singh 
> > > > >  wrote:
> > > > > > Good analysis of how the different key structures affect use cases 
> > > > > > and performance. I think you could extend this article with 
> > > > > > potential evaluation of FiloDB which specifically tries to solve 
> > > > > > the OLAP issue with arbitrary queries.
> > > > > >
> > > > > > Another option is leveraging Elassandra (index in Elasticsearch 
> > > > > > collocates with C*) or DataStax (index in Solr collocated with C*)
> > > > > >
> > > > > > I personally haven’t used SnappyData but that’s another Spark based 
> > > > > > DB that could be leveraged for performance real-time queries on the 
> > > > > > OLTP side.
> > > > > >
> > > > > > Rahul
> > > > > > On Aug 23, 2018, 2:48 AM -0500, Affan Syed , wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > we wrote a blog about some of the results that engineers from 
> > > > > > 

RE: [EXTERNAL] Re: Re: bigger data density with Cassandra 4.0?

2018-08-29 Thread Rahul Singh
YugaByte is also another new dancer in the Cassandra dance. The data store is 
based on RocksDB and it is written in C++. Although they are wire compatible 
with C*, I’m pretty sure everything under the hood is NOT a port like Scylla was 
initially.

Rahul Singh
Chief Executive Officer
m 202.905.2818

Anant Corporation
1010 Wisconsin Ave NW, Suite 250
Washington, D.C. 20007

We build and manage digital business technology platforms.
On Aug 29, 2018, 10:05 AM -0400, Durity, Sean R , 
wrote:
> If you are going to compare vs commercial offerings like Scylla and CosmosDB, 
> you should be looking at DataStax Enterprise. They are moving more quickly 
> than open source (IMO) on adding features and tools that enterprises really 
> need. I think they have some emerging tech for large/dense nodes, in 
> particular. The ability to handle different data model types (Graph and 
> Search) and embedded analytics sets it apart from plain Cassandra. Plus, they 
> have replaced Cassandra’s SEDA architecture to give it a significant boost in 
> performance. As a customer, I see the value in what they are doing.
>
>
> Sean Durity
> From: onmstester onmstester 
> Sent: Wednesday, August 29, 2018 7:43 AM
> To: user 
> Subject: [EXTERNAL] Re: Re: bigger data density with Cassandra 4.0?
>
> Could you please explain more about (you mean slower performance in compare 
> to Cassandra?)
> ---Hbase tends to be quite average for transactional data
>
> and about:
> ScyllaDB IDK, I'd assume they just sorted out streaming by learning from 
> C*'s mistakes.
> While ScyllaDB is a much younger project than Cassandra with so much less 
> usage and attention, Currently I encounter a dilemma on launching new 
> clusters which is: should i wait for Cassandra community to apply all 
> enhancement's and bug fixes that applied by their main competitors (Scylla DB 
> or Cosmos DB) or just switch to competitors (afraid of the new world!)?
> For example right now is there a motivation to handle more dense nodes in 
> near future?
>
> Again, Thank you for your time
>
> Sent using Zoho Mail
>
>
>  On Wed, 29 Aug 2018 15:16:40 +0430 kurt greaves  
> wrote 
>
> > Most of the issues around big nodes is related to streaming, which is 
> > currently quite slow (should be a bit better in 4.0). HBase is built on top 
> > of hadoop, which is much better at large files/very dense nodes, and tends 
> > to be quite average for transactional data. ScyllaDB IDK, I'd assume they 
> > just sorted out streaming by learning from C*'s mistakes.
> >
> > On 29 August 2018 at 19:43, onmstester onmstester  
> > wrote:
> >
> > >
> > > Thanks Kurt,
> > > Actually my cluster has > 10 nodes, so there is a tiny chance to stream a 
> > > complete SSTable.
> > > While logically any Columnar noSql db like Cassandra, needs always to 
> > > re-sort grouped data for later-fast-reads and having nodes with big 
> > > amount of data (> 2 TB) would be annoying for this background process, 
> > > How is it possible that some of these databases like HBase and Scylla db 
> > > does not emphasis on small nodes (like Cassandra do)?
> > >
> > > Sent using Zoho Mail
> > >
> > >
> > >  Forwarded message 
> > > From : kurt greaves 
> > > To : "User"
> > > Date : Wed, 29 Aug 2018 12:03:47 +0430
> > > Subject : Re: bigger data density with Cassandra 4.0?
> > >  Forwarded message 
> > >
> > > > My reasoning was if you have a small cluster with vnodes you're more 
> > > > likely to have enough overlap between nodes that whole SSTables will be 
> > > > streamed on major ops. As  N gets >RF you'll have less common ranges 
> > > > and thus less likely to be streaming complete SSTables. Correct me if 
> > > > I've misunderstood.
> > >
>
>
>
>


Re: Recommended num_tokens setting for small cluster

2018-08-29 Thread kurt greaves
For 10 nodes you probably want to use between 32 and 64. Make sure you use
the token allocation algorithm by specifying allocate_tokens_for_keyspace.
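
(As a rough sketch, the relevant cassandra.yaml lines on each new node before
its first start; the keyspace name is only a placeholder for the keyspace whose
replication should drive the token allocation:)

    num_tokens: 32
    allocate_tokens_for_keyspace: my_keyspace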

On Thu., 30 Aug. 2018, 04:40 Jeff Jirsa,  wrote:

> 3.0 has an (optional?) feature to guarantee better distribution, and the
> blog focuses on 2.2.
>
> Using fewer will minimize your risk of unavailability if any two hosts
> fail.
>
> --
> Jeff Jirsa
>
>
> On Aug 29, 2018, at 11:18 AM, Max C.  wrote:
>
> Hello Everyone,
>
> Datastax recommends num_tokens = 8 as a sensible default, rather than
> num_tokens = 256:
>
>
> https://docs.datastax.com/en/dse/5.1/dse-dev/datastax_enterprise/config/configVnodes.html
>
> … but then I see stories like this (unbalanced cluster when using
> num_tokens=12), which are very concerning:
>
>
> https://danielparker.me/cassandra/vnodes/tokens/increasing-vnodes-cassandra/
>
> We’re currently running 3.0.x, 3 nodes, RF=3, num_tokens=256, spinning
> disks, soon to be 2 DCs.   My guess is that our cluster will probably not
> grow beyond 10 nodes (10 TB?)
>
> I’d like to minimize the chance of hitting a roadblock down the road due
> to having num_tokens set inappropriately.   We can change this right now
> pretty easily (our dataset is small but growing).  Should we switch from
> 256 to 8?  32?
>
> Has anyone had num_tokens = 8 (or similarly small number) and experienced
> growing pains?  What do you think the recommended setting should be?
>
> Thanks for the advice.  :-)
>
> - Max
>
>


Re: Recommended num_tokens setting for small cluster

2018-08-29 Thread Jeff Jirsa
3.0 has an (optional?) feature to guarantee better distribution, and the blog 
focuses on 2.2. 

Using fewer will minimize your risk of unavailability if any two hosts fail. 

-- 
Jeff Jirsa


> On Aug 29, 2018, at 11:18 AM, Max C.  wrote:
> 
> Hello Everyone,
> 
> Datastax recommends num_tokens = 8 as a sensible default, rather than 
> num_tokens = 256:
> 
> https://docs.datastax.com/en/dse/5.1/dse-dev/datastax_enterprise/config/configVnodes.html
> 
> … but then I see stories like this (unbalanced cluster when using 
> num_tokens=12), which are very concerning:
> 
> https://danielparker.me/cassandra/vnodes/tokens/increasing-vnodes-cassandra/
> 
> We’re currently running 3.0.x, 3 nodes, RF=3, num_tokens=256, spinning disks, 
> soon to be 2 DCs.   My guess is that our cluster will probably not grow 
> beyond 10 nodes (10 TB?)
> 
> I’d like to minimize the chance of hitting a roadblock down the road due to 
> having num_tokens set inappropriately.   We can change this right now pretty 
> easily (our dataset is small but growing).  Should we switch from 256 to 8?  
> 32?  
> 
> Has anyone had num_tokens = 8 (or similarly small number) and experienced 
> growing pains?  What do you think the recommended setting should be?
> 
> Thanks for the advice.  :-)
> 
> - Max
> 


Recommended num_tokens setting for small cluster

2018-08-29 Thread Max C.
Hello Everyone,

Datastax recommends num_tokens = 8 as a sensible default, rather than 
num_tokens = 256:

https://docs.datastax.com/en/dse/5.1/dse-dev/datastax_enterprise/config/configVnodes.html
 


… but then I see stories like this (unbalanced cluster when using 
num_tokens=12), which are very concerning:

https://danielparker.me/cassandra/vnodes/tokens/increasing-vnodes-cassandra/ 


We’re currently running 3.0.x, 3 nodes, RF=3, num_tokens=256, spinning disks, 
soon to be 2 DCs.   My guess is that our cluster will probably not grow beyond 
10 nodes (10 TB?)

I’d like to minimize the chance of hitting a roadblock down the road due to 
having num_tokens set inappropriately.   We can change this right now pretty 
easily (our dataset is small but growing).  Should we switch from 256 to 8?  
32?  

Has anyone had num_tokens = 8 (or similarly small number) and experienced 
growing pains?  What do you think the recommended setting should be?

Thanks for the advice.  :-)

- Max



Re: bigger data density with Cassandra 4.0?

2018-08-29 Thread Ariel Weisberg
Hi,

It depends on compaction strategy to an extent. Leveled compaction is
partitioning sstables on token range so there is a wider variety of
scenarios where it works. I haven't done the napkin math at 10 terabytes
to figure what % of sstables will be leveled to the point they work with
256 vnodes.
It's also probably possible without vnodes to use other compaction
strategies by specifying multiple data directories so that they are
partitioned by token range to match replication. I don't know what the
operational process is for that; maybe Dinesh does.
Ariel


On Wed, Aug 29, 2018, at 3:33 AM, kurt greaves wrote:
> My reasoning was if you have a small cluster with vnodes you're more
> likely to have enough overlap between nodes that whole SSTables will
> be streamed on major ops. As N gets >RF you'll have less common
> ranges and thus less likely to be streaming complete SSTables. Correct
> me if I've misunderstood.
>
> On 28 August 2018 at 01:37, Dinesh Joshi wrote:
>> Although the extent of benefits depend on the specific use case, the
>> cluster size is definitely not a limiting factor.
>>
>> Dinesh
>>
>> On Aug 27, 2018, at 5:05 AM, kurt greaves wrote:
>>> I believe there are caveats that it will only really help if you're
>>> not using vnodes, or you have a very small cluster, and also
>>> internode encryption is not enabled. Alternatively if you're using
>>> JBOD vnodes will be marginally better, but JBOD is not a great idea
>>> (and doesn't guarantee a massive improvement).
>>>
>>> On 27 August 2018 at 15:46, dinesh.jo...@yahoo.com.INVALID wrote:
>>>> Yes, this feature will help with operating nodes with higher data
>>>> density.
>>>>
>>>> Dinesh
>>>>
>>>> On Saturday, August 25, 2018, 9:01:27 PM PDT, onmstester onmstester
>>>> wrote:
>>>>
>>>> I've noticed this new feature of 4.0:
>>>> Streaming optimizations
>>>> (https://cassandra.apache.org/blog/2018/08/07/faster_streaming_in_cassandra.html)
>>>> Is this mean that we could have much more data density with
>>>> Cassandra 4.0 (less problems than 3.X)? I mean > 10 TB of data on
>>>> each node without worrying about node join/remove? This is something
>>>> needed for Write-Heavy applications that do not read a lot. When you
>>>> have like 2 TB of data per day and need to keep it for 6 month, it
>>>> would be waste of money to purchase 180 servers (even Commodity or
>>>> Cloud). IMHO, even if 4.0 fix problem with streaming/joining a new
>>>> node, still Compaction is another evil for a big node, but we could
>>>> tolerate that somehow.
>>>>
>>>> Sent using Zoho Mail (https://www.zoho.com/mail/)


RE: [EXTERNAL] Re: Re: bigger data density with Cassandra 4.0?

2018-08-29 Thread Durity, Sean R
If you are going to compare vs commercial offerings like Scylla and CosmosDB, 
you should be looking at DataStax Enterprise. They are moving more quickly than 
open source (IMO) on adding features and tools that enterprises really need. I 
think they have some emerging tech for large/dense nodes, in particular. The 
ability to handle different data model types (Graph and Search) and embedded 
analytics sets it apart from plain Cassandra. Plus, they have replaced 
Cassandra’s SEDA architecture to give it a significant boost in performance. As 
a customer, I see the value in what they are doing.


Sean Durity
From: onmstester onmstester 
Sent: Wednesday, August 29, 2018 7:43 AM
To: user 
Subject: [EXTERNAL] Re: Re: bigger data density with Cassandra 4.0?

Could you please explain more about (you mean slower performance in comparison to 
Cassandra?):
--- HBase tends to be quite average for transactional data

and about:
--- ScyllaDB IDK, I'd assume they just sorted out streaming by learning from 
C*'s mistakes.

While ScyllaDB is a much younger project than Cassandra with so much less usage 
and attention, I currently face a dilemma when launching new clusters, which 
is: should I wait for the Cassandra community to apply all the enhancements and 
bug fixes applied by their main competitors (Scylla DB or Cosmos DB), or just 
switch to a competitor (afraid of the new world!)?
For example, right now is there a motivation to handle more dense nodes in the 
near future?

Again, thank you for your time.


Sent using Zoho Mail


 On Wed, 29 Aug 2018 15:16:40 +0430 kurt greaves
<k...@instaclustr.com> wrote:

Most of the issues around big nodes is related to streaming, which is currently 
quite slow (should be a bit better in 4.0). HBase is built on top of hadoop, 
which is much better at large files/very dense nodes, and tends to be quite 
average for transactional data. ScyllaDB IDK, I'd assume they just sorted out 
streaming by learning from C*'s mistakes.

On 29 August 2018 at 19:43, onmstester onmstester
<onmstes...@zoho.com> wrote:


Thanks Kurt,
Actually my cluster has > 10 nodes, so there is a tiny chance to stream a 
complete SSTable.
While logically any Columnar noSql db like Cassandra, needs always to re-sort 
grouped data for later-fast-reads and having nodes with big amount of data (> 2 
TB) would be annoying for this background process, How is it possible that some 
of these databases like HBase and Scylla db does not emphasis on small nodes 
(like Cassandra do)?


Sent using Zoho Mail


 Forwarded message 
From : kurt greaves <k...@instaclustr.com>
To : "User" <user@cassandra.apache.org>
Date : Wed, 29 Aug 2018 12:03:47 +0430
Subject : Re: bigger data density with Cassandra 4.0?
 Forwarded message 

My reasoning was if you have a small cluster with vnodes you're more likely to 
have enough overlap between nodes that whole SSTables will be streamed on major 
ops. As  N gets >RF you'll have less common ranges and thus less likely to be 
streaming complete SSTables. Correct me if I've misunderstood.








RE: [EXTERNAL] Re: Nodetool refresh v/s sstableloader

2018-08-29 Thread Durity, Sean R
Sstableloader, though, could require a lot more disk space - until compaction 
can reduce it. For example, if your RF=3, you will essentially be loading 3 copies 
of the data. Then each copy will get replicated 3 more times as it is being loaded. 
Thus, you could need up to 9x the disk space.


Sean Durity
From: kurt greaves 
Sent: Wednesday, August 29, 2018 7:26 AM
To: User 
Subject: [EXTERNAL] Re: Nodetool refresh v/s sstableloader

Removing dev...
Nodetool refresh only picks up new SSTables that have been placed in the tables 
directory. It doesn't account for actual ownership of the data like 
SSTableloader does. Refresh will only work properly if the SSTables you are 
copying in are completely covered by that nodes tokens. It doesn't work if 
there's a change in topology, replication and token ownership will have to be 
more or less the same.

SSTableloader will break up the SSTables and send the relevant bits to 
whichever node needs it, so no need for you to worry about tokens and copying 
data to the right places, it will do that for you.

On 28 August 2018 at 11:27, Rajath Subramanyam 
mailto:rajat...@gmail.com>> wrote:
Hi Cassandra users, Cassandra dev,

When recovering using SSTables from a snapshot, I want to know what are the key 
differences between using:
1. Nodetool refresh and,
2. SSTableloader

Does nodetool refresh have restrictions that need to be met? Does nodetool 
refresh work even if there is a change in the topology between the source 
cluster and the destination cluster? Does it work if the token ranges don't 
match between the source cluster and the destination cluster? Does it work when 
an old SSTable in the snapshot has a dropped column that is not part of the 
current schema?

I appreciate any help in advance.

Thanks,
Rajath

Rajath Subramanyam







Re: URGENT: disable reads from node

2018-08-29 Thread Vlad
Hi,
> You'll need to disable the native transport
Well, this is what I did already; it seems repair is running.

I'm not sure whether repair will finish within 3 hours, but I can run it again
(as it's incremental repair by default, right?)


I'm not sure about RF=3 and QUORUM reads because of the load/disk space
constraints we have, but we'll definitely consider this.

Thanks to all for help!
 

On Wednesday, August 29, 2018 4:13 PM, Alexander Dejanovski 
 wrote:
 

 Kurt is right.

So here are the options I can think of:

- use the join_ring false technique and rely on hints. You'll need to disable
the native transport on the node as well to prevent direct connections to be
made to it. Hopefully, you can run repair in less than 3 hours, which is the
hint window (hints will be collected while the node hasn't joined the ring).
Otherwise you'll have more consistency issues after the node joins the ring
again. Maybe incremental repair could help fix this quickly afterwards if
you've been running full repairs that involved anticompaction (if you're
running at least Cassandra 2.2).
- Fully re-bootstrap the node by replacing itself, using the
replace_address_first_boot technique (but since you have RF=2, that would most
probably mean some data loss since you read/write at ONE).
- Try to cheat the dynamic snitch to take the node out of reads. You would then
have the node join the ring normally, disable native transport and raise
Severity (in org.apache.cassandra.db:type=DynamicEndpointSnitch) to something
like 50 so the node won't be selected by the dynamic snitch. I guess the value
will reset itself over time so you may need to set it to 50 on a regular basis
while repair is happening.

I would then strongly consider moving to RF=3 because RF=2 will lead you to
this type of situation again in the future and does not allow quorum reads with
fault tolerance.

Good luck,

On Wed, Aug 29, 2018 at 1:56 PM Vlad  wrote:

I restarted with cassandra.join_ring=false.
nodetool status on other nodes shows this node as DN, while it sees itself as UN.

> I'd say best to just query at QUORUM until you can finish repairs.
We have RF 2, so I guess QUORUM queries will fail. Also different applications
would have to be changed for this.

On Wednesday, August 29, 2018 2:41 PM, kurt greaves  wrote:

Note that you'll miss incoming writes if you do that, so you'll be
inconsistent even after the repair. I'd say best to just query at QUORUM until
you can finish repairs.

On 29 August 2018 at 21:22, Alexander Dejanovski  wrote:

Hi Vlad, you must restart the node but first disable joining the cluster, as
described in the second part of this blog post:
http://thelastpickle.com/blog/2018/08/02/Re-Bootstrapping-Without-Bootstrapping.html

Once repaired, you'll have to run "nodetool join" to start serving reads.

Le mer. 29 août 2018 à 12:40, Vlad  a écrit :

Will it help to set read_repair_chance to 1 (compaction is
SizeTieredCompactionStrategy)?

On Wednesday, August 29, 2018 1:34 PM, Vlad  wrote:

Hi,

quite urgent questions: due to disk and C* start problems we were forced to
delete commit logs from one of the nodes.

Now repair is running, but meanwhile some reads bring no data (RF=2).

Can this node be excluded from read queries, so that all reads will be
redirected to the other node in the ring?

Thanks to all for help.

--
-
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

   

Re: URGENT: disable reads from node

2018-08-29 Thread Alexander Dejanovski
Kurt is right.

So here are the options I can think of :
- use the join_ring false technique and rely on hints. You'll need to
disable the native transport on the node as well to prevent direct
connections to be made to it. Hopefully, you can run repair in less than 3
hours which is the hint window (hints will be collected while the node
hasn't joined the ring). Otherwise you'll have more consistency issues
after the node joins the ring again. Maybe incremental repair could help
fixing this quickly afterwards if you've been running full repairs that
involved anticompaction (if you're running at least Cassandra 2.2).
- Fully re-bootstrap the node by replacing itself, using the
replace_address_first_boot technique (but since you have RF=2, that would
most probably mean some data loss since you read/write at ONE)
- Try to cheat the dynamic snitch to take the node out of reads. You would
then have the node join the ring normally, disable native transport and
raise Severity (in org.apache.cassandra.db:type=DynamicEndpointSnitch) to
something like 50 so the node won't be selected by the dynamic snitch. I
guess the value will reset itself over time so you may need to set it to 50
on a regular basis while repair is happening.
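
(For the first option, a minimal sketch of the sequence on the affected node,
assuming a package install where JVM flags go through cassandra-env.sh; adapt
it to however you start Cassandra:)

    # add to cassandra-env.sh (or jvm.options on newer versions), then restart:
    #   JVM_OPTS="$JVM_OPTS -Dcassandra.join_ring=false"
    sudo service cassandra restart
    nodetool disablebinary   # keep native-protocol clients off this node
    nodetool repair          # repair while the node is not serving reads
    nodetool join            # rejoin the ring once repair has finished
    nodetool enablebinary    # and let clients connect again
    # remember to remove the join_ring flag before the next restart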

I would then strongly consider moving to RF=3 because RF=2 will lead you to
this type of situation again in the future and does not allow quorum reads
with fault tolerance.

Good luck,

On Wed, Aug 29, 2018 at 1:56 PM Vlad  wrote:

> I restarted with cassandra.join_ring=false
> nodetool status on other nodes shows this node as DN, while it see itself
> as UN.
>
>
> >I'd say best to just query at QUORUM until you can finish repairs.
> We have RH 2, so I guess QUORUM queries will fail. Also different
> application should be changed for this.
>
>
> On Wednesday, August 29, 2018 2:41 PM, kurt greaves 
> wrote:
>
>
> Note that you'll miss incoming writes if you do that, so you'll be
> inconsistent even after the repair. I'd say best to just query at QUORUM
> until you can finish repairs.
>
> On 29 August 2018 at 21:22, Alexander Dejanovski 
> wrote:
>
> Hi Vlad, you must restart the node but first disable joining the cluster,
> as described in the second part of this blog post :
> http://thelastpickle.com/blog/2018/08/02/Re-Bootstrapping-Without-Bootstrapping.html
> 
>
> Once repaired, you'll have to run "nodetool join" to start serving reads.
>
>
> Le mer. 29 août 2018 à 12:40, Vlad  a écrit :
>
> Will it help to set read_repair_chance to 1 (compaction is
> SizeTieredCompactionStrategy)?
>
>
> On Wednesday, August 29, 2018 1:34 PM, Vlad 
> wrote:
>
>
> Hi,
>
> quite urgent questions:
> due to disk and C* start problem we were forced to delete commit logs from
> one of nodes.
>
> Now repair is running, but meanwhile some reads bring no data (RF=2)
>
> Can this node be excluded from reads queries? And that  all reads will be
> redirected to other node in the ring?
>
>
> Thanks to All for help.
>
>
> --
> -
> Alexander Dejanovski
> France
> @alexanderdeja
>
> Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>
>
>
>
> --
-
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


Re: Repairs are slow after upgrade to 3.11.3

2018-08-29 Thread Maxim Parkachov
Hi Alex,

I'm using Cassandra reaper as well. Could be
https://issues.apache.org/jira/browse/CASSANDRA-14332, as it was committed
in both versions.

Regards,
Maxim.

On Wed, Aug 29, 2018 at 2:14 PM Oleksandr Shulgin <
oleksandr.shul...@zalando.de> wrote:

> On Wed, Aug 29, 2018 at 3:06 AM Maxim Parkachov 
> wrote:
>
>> couple of days ago I have upgraded Cassandra from 3.11.2 to 3.11.3 and I
>> see that repair time is practically doubled. Does someone else experience
>> the same regression ?
>>
>
> We have upgraded from 3.0.16 to 3.0.17 two days ago and we see the same
> symptom.  We are using Cassandra reaper and average time to repair one
> segment increased from 5-6 to 10-12 min.
>
> --
> Alex
>
>


Re: Repairs are slow after upgrade to 3.11.3

2018-08-29 Thread Maxim Parkachov
Hi,

I wanted to get rid of https://issues.apache.org/jira/browse/CASSANDRA-14332
and https://issues.apache.org/jira/browse/CASSANDRA-14470. I haven't seen
these errors yet, but it is early to say after a couple of days of operation.

Regards,
Maxim.

On Wed, Aug 29, 2018 at 10:27 AM Jean Carlo 
wrote:

> Hello,
>
> Can I ask you why did you upgrade from 3.11.2 ? did you experience some
> java heap problems ?
>
> Unfortunately I cannot answer your question :( I am in the 2.1 and about
> to upgrade to 3.11
>
> Best greatings
>
>
>
> Jean Carlo
>
> "The best way to predict the future is to invent it" Alan Kay
>
>
> On Wed, Aug 29, 2018 at 3:06 AM Maxim Parkachov 
> wrote:
>
>> Hi everyone,
>>
>> couple of days ago I have upgraded Cassandra from 3.11.2 to 3.11.3 and I
>> see that repair time is practically doubled. Does someone else experience
>> the same regression ?
>>
>> Regards,
>> Maxim.
>>
>


Re: URGENT: disable reads from node

2018-08-29 Thread Vlad
Also, after restart with join_ring=false C* is still accepting connections on
port 9042 (and obviously returning no data), so I ran nodetool drain. Is that good?

I ran nodetool repair on this node. Meanwhile the command didn't return, but I see
in the log:
INFO  [Thread-6] 2018-08-29 12:16:03,954 RepairRunnable.java:125 - Starting 
repair command #1, repairing keyspace scanrepo with repair options 
(parallelism: parallel, primary range: false, incremental: true, job threads: 
1, ColumnFamilies: [], dataCenters: [], hosts: [], # of ranges: 530)
ERROR [Thread-6] 2018-08-29 12:16:14,363 SystemDistributedKeyspace.java:306 - 
Error executing query INSERT INTO system_distributed.parent_repair_history 
(parent_id, keyspace_name, columnfamily_names, requested_ranges, started_at,
  options) VALUES (...) 
org.apache.cassandra.exceptions.WriteTimeoutException: Operation timed out - 
received only 0 responses.

and  nodetool compactionstats shows 
pending tasks: 9
- system_schema.tables: 1
- system_schema.keyspaces: 1
- ks1.tb1: 4
- ks1.tb2: 3


 

On Wednesday, August 29, 2018 2:57 PM, Vlad  
wrote:
 

 I restarted with cassandra.join_ring=falsenodetool status on other nodes shows 
this node as DN, while it see itself as UN.


>I'd say best to just query at QUORUM until you can finish repairs.We have RH 
>2, so I guess QUORUM queries will fail. Also different application should be 
>changed for this. 

On Wednesday, August 29, 2018 2:41 PM, kurt greaves  
wrote:
 

 Note that you'll miss incoming writes if you do that, so you'll be 
inconsistent even after the repair. I'd say best to just query at QUORUM until 
you can finish repairs.
On 29 August 2018 at 21:22, Alexander Dejanovski  wrote:

Hi Vlad, you must restart the node but first disable joining the cluster, as 
described in the second part of this blog post : 
http://thelastpickle.com/blog/2018/08/02/Re-Bootstrapping-Without-Bootstrapping.html
Once repaired, you'll have to run "nodetool join" to start serving reads.

Le mer. 29 août 2018 à 12:40, Vlad  a écrit :

Will it help to set read_repair_chance to 1 (compaction is 
SizeTieredCompactionStrategy)? 

On Wednesday, August 29, 2018 1:34 PM, Vlad  
wrote:
 

 Hi,
quite urgent questions:due to disk and C* start problem we were forced to 
delete commit logs from one of nodes.
Now repair is running, but meanwhile some reads bring no data (RF=2)

Can this node be excluded from reads queries? And that  all reads will be 
redirected to other node in the ring?

Thanks to All for help.


   
-- 
-Alexander DejanovskiFrance@alexanderdeja
ConsultantApache Cassandra Consultinghttp://www.thelastpickle.com



   

   

Re: Repairs are slow after upgrade to 3.11.3

2018-08-29 Thread Oleksandr Shulgin
On Wed, Aug 29, 2018 at 3:06 AM Maxim Parkachov 
wrote:

> couple of days ago I have upgraded Cassandra from 3.11.2 to 3.11.3 and I
> see that repair time is practically doubled. Does someone else experience
> the same regression ?
>

We have upgraded from 3.0.16 to 3.0.17 two days ago and we see the same
symptom.  We are using Cassandra reaper and average time to repair one
segment increased from 5-6 to 10-12 min.

--
Alex


Re: URGENT: disable reads from node

2018-08-29 Thread Vlad
I restarted with cassandra.join_ring=false.
nodetool status on other nodes shows this node as DN, while it sees itself as UN.


> I'd say best to just query at QUORUM until you can finish repairs.
We have RF 2, so I guess QUORUM queries will fail. Also different applications
would have to be changed for this.

On Wednesday, August 29, 2018 2:41 PM, kurt greaves  
wrote:
 

 Note that you'll miss incoming writes if you do that, so you'll be 
inconsistent even after the repair. I'd say best to just query at QUORUM until 
you can finish repairs.
On 29 August 2018 at 21:22, Alexander Dejanovski  wrote:

Hi Vlad, you must restart the node but first disable joining the cluster, as 
described in the second part of this blog post : 
http://thelastpickle.com/blog/2018/08/02/Re-Bootstrapping-Without-Bootstrapping.html
Once repaired, you'll have to run "nodetool join" to start serving reads.

Le mer. 29 août 2018 à 12:40, Vlad  a écrit :

Will it help to set read_repair_chance to 1 (compaction is 
SizeTieredCompactionStrategy)? 

On Wednesday, August 29, 2018 1:34 PM, Vlad  
wrote:
 

 Hi,
quite urgent questions:due to disk and C* start problem we were forced to 
delete commit logs from one of nodes.
Now repair is running, but meanwhile some reads bring no data (RF=2)

Can this node be excluded from reads queries? And that  all reads will be 
redirected to other node in the ring?

Thanks to All for help.


   
-- 
-Alexander DejanovskiFrance@alexanderdeja
ConsultantApache Cassandra Consultinghttp://www.thelastpickle.com



   

Re: Re: bigger data density with Cassandra 4.0?

2018-08-29 Thread onmstester onmstester
Could you please explain more about (you mean slower performance in comparison
to Cassandra?):
--- HBase tends to be quite average for transactional data

and about:
--- ScyllaDB IDK, I'd assume they just sorted out streaming by learning from
C*'s mistakes.

While ScyllaDB is a much younger project than Cassandra with so much less usage
and attention, I currently face a dilemma when launching new clusters, which is:
should I wait for the Cassandra community to apply all the enhancements and bug
fixes applied by their main competitors (Scylla DB or Cosmos DB), or just switch
to a competitor (afraid of the new world!)?
For example, right now is there a motivation to handle more dense nodes in the
near future?

Again, thank you for your time.

Sent using Zoho Mail

 On Wed, 29 Aug 2018 15:16:40 +0430 kurt greaves  wrote 

Most of the issues around big nodes is related to streaming, which is currently
quite slow (should be a bit better in 4.0). HBase is built on top of hadoop,
which is much better at large files/very dense nodes, and tends to be quite
average for transactional data. ScyllaDB IDK, I'd assume they just sorted out
streaming by learning from C*'s mistakes.

On 29 August 2018 at 19:43, onmstester onmstester  wrote:

Thanks Kurt,
Actually my cluster has > 10 nodes, so there is a tiny chance to stream a
complete SSTable.
While logically any columnar NoSQL db like Cassandra always needs to re-sort
grouped data for later fast reads, and having nodes with a big amount of data
(> 2 TB) would be annoying for this background process, how is it possible that
some of these databases like HBase and Scylla db do not emphasize small nodes
(like Cassandra does)?

Sent using Zoho Mail

 Forwarded message 
From : kurt greaves 
To : "User"
Date : Wed, 29 Aug 2018 12:03:47 +0430
Subject : Re: bigger data density with Cassandra 4.0?
 Forwarded message 

My reasoning was if you have a small cluster with vnodes you're more likely to
have enough overlap between nodes that whole SSTables will be streamed on major
ops. As N gets >RF you'll have less common ranges and thus less likely to be
streaming complete SSTables. Correct me if I've misunderstood.

Re: URGENT: disable reads from node

2018-08-29 Thread kurt greaves
Note that you'll miss incoming writes if you do that, so you'll be
inconsistent even after the repair. I'd say best to just query at QUORUM
until you can finish repairs.
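
(For instance, a quick spot check from cqlsh; the keyspace/table names are
placeholders:)

    cqlsh
    cqlsh> CONSISTENCY QUORUM;
    cqlsh> SELECT * FROM my_ks.my_table LIMIT 10;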

On 29 August 2018 at 21:22, Alexander Dejanovski 
wrote:

> Hi Vlad, you must restart the node but first disable joining the cluster,
> as described in the second part of this blog post :
> http://thelastpickle.com/blog/2018/08/02/Re-Bootstrapping-Without-Bootstrapping.html
>
> Once repaired, you'll have to run "nodetool join" to start serving reads.
>
>
> Le mer. 29 août 2018 à 12:40, Vlad  a écrit :
>
>> Will it help to set read_repair_chance to 1 (compaction is
>> SizeTieredCompactionStrategy)?
>>
>>
>> On Wednesday, August 29, 2018 1:34 PM, Vlad 
>> wrote:
>>
>>
>> Hi,
>>
>> quite urgent questions:
>> due to disk and C* start problem we were forced to delete commit logs
>> from one of nodes.
>>
>> Now repair is running, but meanwhile some reads bring no data (RF=2)
>>
>> Can this node be excluded from reads queries? And that  all reads will be
>> redirected to other node in the ring?
>>
>>
>> Thanks to All for help.
>>
>>
>> --
> -
> Alexander Dejanovski
> France
> @alexanderdeja
>
> Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>


Re: Nodetool refresh v/s sstableloader

2018-08-29 Thread kurt greaves
Removing dev...
Nodetool refresh only picks up new SSTables that have been placed in the
table's directory. It doesn't account for actual ownership of the data like
SSTableloader does. Refresh will only work properly if the SSTables you are
copying in are completely covered by that node's tokens. It doesn't work if
there's a change in topology; replication and token ownership will have to
be more or less the same.

SSTableloader will break up the SSTables and send the relevant bits to
whichever node needs it, so no need for you to worry about tokens and
copying data to the right places, it will do that for you.
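
(To make the difference concrete, a sketch of both approaches; the keyspace,
table and paths below are placeholders:)

    # nodetool refresh: copy the SSTables into the table's data directory on a
    # node that already owns those tokens, then ask Cassandra to pick them up
    cp /backups/snapshot/my_ks/my_table/*.db /var/lib/cassandra/data/my_ks/my_table-<table_id>/
    nodetool refresh my_ks my_table

    # sstableloader: stream the same SSTables to whichever nodes own the data,
    # regardless of topology or token changes
    sstableloader -d 10.0.0.1,10.0.0.2 /backups/snapshot/my_ks/my_table/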

On 28 August 2018 at 11:27, Rajath Subramanyam  wrote:

> Hi Cassandra users, Cassandra dev,
>
> When recovering using SSTables from a snapshot, I want to know what are
> the key differences between using:
> 1. Nodetool refresh and,
> 2. SSTableloader
>
> Does nodetool refresh have restrictions that need to be met?
> Does nodetool refresh work even if there is a change in the topology
> between the source cluster and the destination cluster? Does it work if the
> token ranges don't match between the source cluster and the destination
> cluster? Does it work when an old SSTable in the snapshot has a dropped
> column that is not part of the current schema?
>
> I appreciate any help in advance.
>
> Thanks,
> Rajath
> 
> Rajath Subramanyam
>
>


Re: URGENT: disable reads from node

2018-08-29 Thread Alexander Dejanovski
Hi Vlad, you must restart the node but first disable joining the cluster,
as described in the second part of this blog post :
http://thelastpickle.com/blog/2018/08/02/Re-Bootstrapping-Without-Bootstrapping.html

Once repaired, you'll have to run "nodetool join" to start serving reads.


Le mer. 29 août 2018 à 12:40, Vlad  a écrit :

> Will it help to set read_repair_chance to 1 (compaction is
> SizeTieredCompactionStrategy)?
>
>
> On Wednesday, August 29, 2018 1:34 PM, Vlad 
> wrote:
>
>
> Hi,
>
> quite urgent questions:
> due to disk and C* start problem we were forced to delete commit logs from
> one of nodes.
>
> Now repair is running, but meanwhile some reads bring no data (RF=2)
>
> Can this node be excluded from reads queries? And that  all reads will be
> redirected to other node in the ring?
>
>
> Thanks to All for help.
>
>
> --
-
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


Re: Re: bigger data density with Cassandra 4.0?

2018-08-29 Thread kurt greaves
Most of the issues around big nodes are related to streaming, which is
currently quite slow (should be a bit better in 4.0). HBase is built on top
of hadoop, which is much better at large files/very dense nodes, and tends
to be quite average for transactional data. ScyllaDB IDK, I'd assume they
just sorted out streaming by learning from C*'s mistakes.

On 29 August 2018 at 19:43, onmstester onmstester 
wrote:

> Thanks Kurt,
> Actually my cluster has > 10 nodes, so there is a tiny chance to stream a
> complete SSTable.
> While logically any Columnar noSql db like Cassandra, needs always to
> re-sort grouped data for later-fast-reads and having nodes with big amount
> of data (> 2 TB) would be annoying for this background process, How is it
> possible that some of these databases like HBase and Scylla db does not
> emphasis on small nodes (like Cassandra do)?
>
> Sent using Zoho Mail 
>
>
>  Forwarded message 
> From : kurt greaves 
> To : "User"
> Date : Wed, 29 Aug 2018 12:03:47 +0430
> Subject : Re: bigger data density with Cassandra 4.0?
>  Forwarded message 
>
> My reasoning was if you have a small cluster with vnodes you're more
> likely to have enough overlap between nodes that whole SSTables will be
> streamed on major ops. As  N gets >RF you'll have less common ranges and
> thus less likely to be streaming complete SSTables. Correct me if I've
> misunderstood.
>
>
>
>


Re: URGENT: disable reads from node

2018-08-29 Thread Vlad
Will it help to set read_repair_chance to 1 (compaction is 
SizeTieredCompactionStrategy)? 

On Wednesday, August 29, 2018 1:34 PM, Vlad  
wrote:
 

 Hi,
quite urgent questions: due to disk and C* start problems we were forced to
delete commit logs from one of the nodes.
Now repair is running, but meanwhile some reads bring no data (RF=2).

Can this node be excluded from read queries, so that all reads will be
redirected to the other node in the ring?

Thanks to All for help.


   

URGENT: disable reads from node

2018-08-29 Thread Vlad
Hi,
quite urgent questions: due to disk and C* start problems we were forced to
delete commit logs from one of the nodes.
Now repair is running, but meanwhile some reads bring no data (RF=2).

Can this node be excluded from read queries, so that all reads will be
redirected to the other node in the ring?

Thanks to All for help.


Fwd: Re: bigger data density with Cassandra 4.0?

2018-08-29 Thread onmstester onmstester
Thanks Kurt,

Actually my cluster has > 10 nodes, so there is a tiny chance to stream a
complete SSTable.

While logically any columnar NoSQL db like Cassandra always needs to re-sort
grouped data for later fast reads, and having nodes with a big amount of data
(> 2 TB) would be annoying for this background process, how is it possible that
some of these databases like HBase and Scylla db do not emphasize small nodes
(like Cassandra does)?

Sent using Zoho Mail

 Forwarded message 
From : kurt greaves 
To : "User"
Date : Wed, 29 Aug 2018 12:03:47 +0430
Subject : Re: bigger data density with Cassandra 4.0?
 Forwarded message 

My reasoning was if you have a small cluster with vnodes you're more likely to
have enough overlap between nodes that whole SSTables will be streamed on major
ops. As N gets >RF you'll have less common ranges and thus less likely to be
streaming complete SSTables. Correct me if I've misunderstood.

Unsubscribe

2018-08-29 Thread Raj Bakhru



Re: Repairs are slow after upgrade to 3.11.3

2018-08-29 Thread Jean Carlo
Hello,

Can I ask why you upgraded from 3.11.2? Did you experience some
Java heap problems?

Unfortunately I cannot answer your question :( I am on 2.1 and about to
upgrade to 3.11.

Best greetings



Jean Carlo

"The best way to predict the future is to invent it" Alan Kay


On Wed, Aug 29, 2018 at 3:06 AM Maxim Parkachov 
wrote:

> Hi everyone,
>
> couple of days ago I have upgraded Cassandra from 3.11.2 to 3.11.3 and I
> see that repair time is practically doubled. Does someone else experience
> the same regression ?
>
> Regards,
> Maxim.
>


Re: bigger data density with Cassandra 4.0?

2018-08-29 Thread kurt greaves
My reasoning was if you have a small cluster with vnodes you're more likely
to have enough overlap between nodes that whole SSTables will be streamed
on major ops. As  N gets >RF you'll have less common ranges and thus less
likely to be streaming complete SSTables. Correct me if I've misunderstood.

On 28 August 2018 at 01:37, Dinesh Joshi 
wrote:

> Although the extent of benefits depend on the specific use case, the
> cluster size is definitely not a limiting factor.
>
> Dinesh
>
> On Aug 27, 2018, at 5:05 AM, kurt greaves  wrote:
>
> I believe there are caveats that it will only really help if you're not
> using vnodes, or you have a very small cluster, and also internode
> encryption is not enabled. Alternatively if you're using JBOD vnodes will
> be marginally better, but JBOD is not a great idea (and doesn't guarantee a
> massive improvement).
>
> On 27 August 2018 at 15:46, dinesh.jo...@yahoo.com.INVALID <
> dinesh.jo...@yahoo.com.invalid> wrote:
>
>> Yes, this feature will help with operating nodes with higher data density.
>>
>> Dinesh
>>
>>
>> On Saturday, August 25, 2018, 9:01:27 PM PDT, onmstester onmstester <
>> onmstes...@zoho.com> wrote:
>>
>>
>> I've noticed this new feature of 4.0:
>> Streaming optimizations (https://cassandra.apache.org/
>> blog/2018/08/07/faster_streaming_in_cassandra.html)
>> Is this mean that we could have much more data density with Cassandra 4.0
>> (less problems than 3.X)? I mean > 10 TB of data on each node without
>> worrying about node join/remove?
>> This is something needed for Write-Heavy applications that do not read a
>> lot. When you have like 2 TB of data per day and need to keep it for 6
>> month, it would be waste of money to purchase 180 servers (even Commodity
>> or Cloud).
>> IMHO, even if 4.0 fix problem with streaming/joining a new node, still
>> Compaction is another evil for a big node, but we could tolerate that
>> somehow
>>
>> Sent using Zoho Mail 
>>
>>
>>
>