Re: Optimal backup strategy

2019-12-02 Thread Adarsh Kumar
Thanks Hossein,

Just one more question: is there any special SOP or consideration we have to
take into account for multi-site backups?

Please share any helpful links, blogs, or documented steps.

Regards,
Adarsh Kumar

On Sun, Dec 1, 2019 at 10:40 PM Hossein Ghiyasi Mehr 
wrote:

> 1. It's recommended to use the commit log only after a single node failure. Cassandra
> has other options, such as the replication factor, as a substitute solution.
> 2. Yes, right.
>
> *VafaTech.com - A Total Solution for Data Gathering & Analysis*
>
>
> On Fri, Nov 29, 2019 at 9:33 AM Adarsh Kumar  wrote:
>
>> Thanks Guo and Hossein,
>>
>> So my understanding is:
>>
>>1. Commit log backup is not documented for Apache Cassandra, hence it is
>>not standard. It can be used for a restore on the same machine (by taking the
>>backup from commit_log_dir). If used on other machine(s), they have to be in
>>the same topology. Can it be used for a replacement node?
>>2. For periodic backups, snapshot + incremental backup is the best option.
>>
>>
>> Thanks,
>> Adarsh Kumar
>>
>> On Fri, Nov 29, 2019 at 7:28 AM guo Maxwell  wrote:
>>
>>> Hossein is right. But in our case, we restore to the same Cassandra
>>> topology, so it is usable for doing a replay. When restoring to the
>>> same machine it is also usable.
>>> Using sstableloader costs too much time and more storage (though this
>>> will shrink again after the restore).
>>>
>>> Hossein Ghiyasi Mehr  wrote on Thu, Nov 28, 2019 at 7:40 PM:
>>>
 A commitlog backup isn't usable on another machine.
 The backup solution depends on what you want to do: a periodic backup, or a
 backup to restore on another machine?
 A periodic backup is a combination of snapshot and incremental backups; remove
 the incremental backups after each new snapshot.
 For a backup to restore on another machine, you can use a snapshot (after
 flushing the memtables) or use sstableloader.
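A minimal sketch of the two approaches described above, assuming the default data
directory layout and an illustrative keyspace/table name (ks1.table1):

  # periodic backup: flush memtables, take a named snapshot, keep incremental backups
  nodetool flush ks1
  nodetool snapshot -t backup_2019_12_02 ks1    # snapshot tag is illustrative
  # after the next snapshot succeeds, delete the old files under
  # <data_dir>/ks1/<table>-<id>/backups/ (the incremental backups)

  # restore onto another machine/cluster: stream a snapshot's sstables with sstableloader
  # (copy the snapshot files into a ks1/table1/ directory structure first)
  sstableloader -d 10.0.0.1,10.0.0.2 /tmp/restore/ks1/table1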


 
 VafaTech.com - A Total Solution for Data Gathering & Analysis

 On Thu, Nov 28, 2019 at 6:05 AM guo Maxwell 
 wrote:

> In the Cassandra and DataStax documentation, commitlog backup is not
> mentioned; only snapshot and incremental backups are described for doing backups.
>
> Although commitlog archiving per keyspace/table is not supported, commitlog
> replay (though you must put the logs into commitlog_dir and restart the
> process) does support filtering the replay per keyspace/table (using
> -Dcassandra.replayList with the keyspace1.table1,keyspace1.table2 format
> to replay only the specified keyspaces/tables)
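For illustration, a hedged sketch of how such a filtered replay could be invoked;
the keyspace/table names are placeholders, and the property is typically appended
to the JVM options (e.g. in cassandra-env.sh) before restarting the node:

  # copy the archived commitlog segments back into commitlog_directory, then
  # restart the node so that only the listed tables are replayed
  JVM_OPTS="$JVM_OPTS -Dcassandra.replayList=keyspace1.table1,keyspace1.table2"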
>
> Snapshots do affect storage. For us, we take a snapshot once a week
> during the low business peak, and snapshot creation is throttled; you may
> want to see this issue (https://issues.apache.org/jira/browse/CASSANDRA-13019).
>
>
>
> Adarsh Kumar  wrote on Thu, Nov 28, 2019 at 1:00 AM:
>
>> Thanks Guo and Eric for replying,
>>
>> I have some confusion about commit log backups:
>>
>>1. The commit log archival technique (
>> https://support.datastax.com/hc/en-us/articles/115001593706-Manual-Backup-and-Restore-with-Point-in-time-and-table-level-restore-
>>) is as good as an incremental backup, as it also captures commit logs
>>after the memtable flush.
>>2. If we go for "Snapshot + incremental backup + commit log", we
>>have to take the commit logs from the commit log directory (is there any SOP
>>for this? a rough sketch of the archiving configuration follows this list).
>>As commit logs are not per table or keyspace, we will have a challenge in
>>restoring selective tables.
>>3. Snapshot-based backups are easy to manage and operate due to
>>their simplicity, but they are heavy on storage. Any views on this?
>>4. Please share any successful strategy that someone is using for
>>production. We are still in the design phase and want to implement 
>> the best
>>solution.
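Regarding item 2, a rough sketch of what the archiving configuration in
conf/commitlog_archiving.properties looks like; the paths and timestamp are
placeholders, see the DataStax article linked in item 1 for the full procedure:

  # conf/commitlog_archiving.properties
  archive_command=/bin/cp %path /backup/commitlog/%name
  restore_command=/bin/cp -f %from %to
  restore_directories=/backup/commitlog
  restore_point_in_time=2019:12:02 00:00:00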
>>
>> Thanks Eric for sharing the link to Medusa.
>>
>> Regards,
>> Adarsh Kumar
>>
>> On Wed, Nov 27, 2019 at 5:16 PM guo Maxwell 
>> wrote:
>>
>>> For me, I think the last one:
>>>  snapshot + incremental + commitlog
>>> is the most meaningful way to do backup and restore, when you back the
>>> data up to somewhere else like AWS S3.
>>>
>>>- Snapshot-based backup // incremental data will not be
>>>backed up, and you may lose data when restoring to a point in time later
>>>than the snapshot time;
>>>- Incremental backups // better than a snapshot backup, but
>>>with insufficient data accuracy: data remaining in the memtable will be
>>>lost;
>>>- Snapshot + incremental
>>>- Snapshot + commitlog archival // better data precision than
>>>an incremental backup, but data in non-archived commitlogs (not yet
>>>archived and the commitlog segment not closed) will not be restored and
>>>will be lost. Also, when there are too many logs, the log replay 

Re: "Maximum memory usage reached (512.000MiB), cannot allocate chunk of 1.000MiB"

2019-12-02 Thread Jeff Jirsa
This would be true except that the pretty print for the log message is done
before the logging rate limiter is applied, so if you see MiB instead of a
raw byte count, you're PROBABLY spending a ton of time in string formatting
within the read path.

This is fixed in 3.11.3 (
https://issues.apache.org/jira/browse/CASSANDRA-14416 )


On Mon, Dec 2, 2019 at 8:07 AM Reid Pinchback 
wrote:

> Rahul, if my memory of this is correct, that particular logging message is
> noisy, the cache is pretty much always used to its limit (and why not, it’s
> a cache, no point in using less than you have).
>
>
>
> No matter what value you set, you’ll just change the “reached (….)” part
> of it.  I think what would help you more is to work with the team(s) that
> have apps depending upon C* and decide what your performance SLA is with
> them.  If you are meeting your SLA, you don’t care about noisy messages.
> If you aren’t meeting your SLA, then the noisy messages become sources of
> ideas to look at.
>
>
>
> One thing you’ll find out pretty quickly.  There are a lot of knobs you
> can turn with C*, too many to allow for easy answers on what you should
> do.  Figure out what your throughput and latency SLAs are, and you’ll know
> when to stop tuning.  Otherwise you’ll discover that it’s a rabbit hole you
> can dive into and not come out of for weeks.
>
>
>
>
>
> *From: *Hossein Ghiyasi Mehr 
> *Reply-To: *"user@cassandra.apache.org" 
> *Date: *Monday, December 2, 2019 at 10:35 AM
> *To: *"user@cassandra.apache.org" 
> *Subject: *Re: "Maximum memory usage reached (512.000MiB), cannot
> allocate chunk of 1.000MiB"
>
>
>
> *Message from External Sender*
>
> It may be helpful:
> https://thelastpickle.com/blog/2018/08/08/compression_performance.html
> 
>
> It's complex. A simple explanation: Cassandra keeps parts of sstables (chunks)
> in memory, based on the chunk size. It manages loading new sstables into memory
> based on the requests on different sstables, so you shouldn't need to worry
> about it (sstables loaded in memory)
>
>
> *VafaTech.com - A Total Solution for Data Gathering & Analysis*
>
>
>
>
>
> On Mon, Dec 2, 2019 at 6:18 PM Rahul Reddy 
> wrote:
>
> Thanks Hossein,
>
>
>
> How are chunks moved out of memory (LRU?) when it wants to make room
> for new requests to get chunks? If there is a mechanism to clear chunks from
> the cache, what causes the "cannot allocate chunk" message? Can you point me
> to any documentation?
>
>
>
> On Sun, Dec 1, 2019, 12:03 PM Hossein Ghiyasi Mehr 
> wrote:
>
> Chunks are part of sstables. When there is enough space in memory to cache
> them, read performance will increase if application requests it again.
>
>
>
> Your real answer is application dependent. For example write heavy
> applications are different than read heavy or read-write heavy. Real time
> applications are different than time series data environments and ... .
>
>
>
>
>
>
>
> On Sun, Dec 1, 2019 at 7:09 PM Rahul Reddy 
> wrote:
>
> Hello,
>
>
>
> We are seeing "memory usage reached 512 MiB, cannot allocate 1 MiB".  I see
> this because file_cache_size_in_mb is set to 512 MB by default.
>
>
>
> The DataStax documentation recommends increasing the file_cache_size.
>
>
>
> We have 32 GB of overall memory, with 16 GB allocated to Cassandra. What is the
> recommended value in my case? Also, when this memory gets filled up
> frequently, does a nodetool flush help in avoiding these INFO messages?
>
>


Re: "Maximum memory usage reached (512.000MiB), cannot allocate chunk of 1.000MiB"

2019-12-02 Thread Reid Pinchback
Rahul, if my memory of this is correct, that particular logging message is 
noisy, the cache is pretty much always used to its limit (and why not, it’s a 
cache, no point in using less than you have).

No matter what value you set, you’ll just change the “reached (….)” part of it. 
 I think what would help you more is to work with the team(s) that have apps 
depending upon C* and decide what your performance SLA is with them.  If you 
are meeting your SLA, you don’t care about noisy messages.  If you aren’t 
meeting your SLA, then the noisy messages become sources of ideas to look at.

One thing you’ll find out pretty quickly.  There are a lot of knobs you can 
turn with C*, too many to allow for easy answers on what you should do.  Figure 
out what your throughput and latency SLAs are, and you’ll know when to stop 
tuning.  Otherwise you’ll discover that it’s a rabbit hole you can dive into 
and not come out of for weeks.


From: Hossein Ghiyasi Mehr 
Reply-To: "user@cassandra.apache.org" 
Date: Monday, December 2, 2019 at 10:35 AM
To: "user@cassandra.apache.org" 
Subject: Re: "Maximum memory usage reached (512.000MiB), cannot allocate chunk 
of 1.000MiB"

Message from External Sender
It may be helpful: 
https://thelastpickle.com/blog/2018/08/08/compression_performance.html
It's complex. A simple explanation: Cassandra keeps parts of sstables (chunks) in
memory, based on the chunk size. It manages loading new sstables into memory based
on the requests on different sstables, so you shouldn't need to worry about it
(sstables loaded in memory)

VafaTech.com - A Total Solution for Data Gathering & Analysis


On Mon, Dec 2, 2019 at 6:18 PM Rahul Reddy  wrote:
Thanks Hossein,

How are chunks moved out of memory (LRU?) when it wants to make room for
new requests to get chunks? If there is a mechanism to clear chunks from the cache,
what causes the "cannot allocate chunk" message? Can you point me to any documentation?

On Sun, Dec 1, 2019, 12:03 PM Hossein Ghiyasi Mehr  wrote:
Chunks are part of sstables. When there is enough space in memory to cache 
them, read performance will increase if application requests it again.

Your real answer is application dependent. For example write heavy applications 
are different than read heavy or read-write heavy. Real time applications are 
different than time series data environments and ... .



On Sun, Dec 1, 2019 at 7:09 PM Rahul Reddy  wrote:
Hello,

We are seeing "memory usage reached 512 MiB, cannot allocate 1 MiB".  I see this
because file_cache_size_in_mb is set to 512 MB by default.

The DataStax documentation recommends increasing the file_cache_size.

We have 32 GB of overall memory, with 16 GB allocated to Cassandra. What is the
recommended value in my case? Also, when this memory gets filled up frequently,
does a nodetool flush help in avoiding these INFO messages?


Re: "Maximum memory usage reached (512.000MiB), cannot allocate chunk of 1.000MiB"

2019-12-02 Thread Hossein Ghiyasi Mehr
It may be helpful:
https://thelastpickle.com/blog/2018/08/08/compression_performance.html
It's complex. A simple explanation: Cassandra keeps parts of sstables (chunks)
in memory, based on the chunk size. It manages loading new sstables into memory
based on the requests on different sstables, so you shouldn't need to worry
about it (sstables loaded in memory)
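If it helps, a hedged sketch of the two knobs involved here; the values and the
table name are only examples, not recommendations:

  # cassandra.yaml: size of the off-heap chunk cache (default 512)
  file_cache_size_in_mb: 1024

  # per-table compression chunk size, set via CQL
  cqlsh -e "ALTER TABLE ks1.table1 WITH compression = {'class': 'LZ4Compressor', 'chunk_length_in_kb': 16};"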

*VafaTech.com - A Total Solution for Data Gathering & Analysis*


On Mon, Dec 2, 2019 at 6:18 PM Rahul Reddy  wrote:

> Thanks Hossein,
>
> How are chunks moved out of memory (LRU?) when it wants to make room
> for new requests to get chunks? If there is a mechanism to clear chunks from
> the cache, what causes the "cannot allocate chunk" message? Can you point me
> to any documentation?
>
> On Sun, Dec 1, 2019, 12:03 PM Hossein Ghiyasi Mehr 
> wrote:
>
>> Chunks are part of sstables. When there is enough space in memory to
>> cache them, read performance will increase if application requests it again.
>>
>> Your real answer is application dependent. For example write heavy
>> applications are different than read heavy or read-write heavy. Real time
>> applications are different than time series data environments and ... .
>>
>>
>>
>> On Sun, Dec 1, 2019 at 7:09 PM Rahul Reddy 
>> wrote:
>>
>>> Hello,
>>>
>>> We are seeing "memory usage reached 512 MiB, cannot allocate 1 MiB".  I
>>> see this because file_cache_size_in_mb is set to 512 MB by default.
>>>
>>> The DataStax documentation recommends increasing the file_cache_size.
>>>
>>> We have 32 GB of overall memory, with 16 GB allocated to Cassandra. What is
>>> the recommended value in my case? Also, when this memory gets filled up
>>> frequently, does a nodetool flush help in avoiding these INFO messages?
>>>
>>


RE: [EXTERNAL] Migration a Keyspace from 3.0.X to 3.11.2 Cluster which already have keyspaces

2019-12-02 Thread Durity, Sean R
The size of the data matters here. Copy to/from is ok if the data is a few 
million rows per table, but not billions. It is also relatively slow (but with 
small data or a decent outage window, it could be fine). If the data is large 
and the outage time matters, you may need custom code to read from one cluster 
and write to another. If this is DataStax, the dsbulk utility would be ideal.
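For what it's worth, a rough sketch of the two options under discussion; the
cluster addresses, keyspace and table names are placeholders:

  # small tables: export/import through CSV with cqlsh COPY
  cqlsh cluster-a-node -e "COPY mykeyspace.mytable TO '/tmp/mytable.csv' WITH HEADER = true;"
  cqlsh cluster-b-node -e "COPY mykeyspace.mytable FROM '/tmp/mytable.csv' WITH HEADER = true;"

  # DataStax only: dsbulk handles large tables much better
  dsbulk unload -h cluster-a-node -k mykeyspace -t mytable -url /tmp/mytable_dump
  dsbulk load   -h cluster-b-node -k mykeyspace -t mytable -url /tmp/mytable_dump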


Sean Durity
-Original Message-
From: slmnjobs - 
Sent: Sunday, December 1, 2019 4:41 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Migration a Keyspace from 3.0.X to 3.11.2 Cluster which 
already have keyspaces

Hi everyone,

I have a question about migrating a keyspace to another cluster. The main problem
for us is that our new cluster already has 2 keyspaces and we are using them in
production. Because we are not sure how the token ranges will change, we would
like to share our migration plan here and get your comments.
We have two Cassandra clusters:

CLUSTER-A :
- Cassandra version 3.0.10
- describe keyspace:
CREATE KEYSPACE mykeyspace WITH replication = {'class': 
'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '1'}  AND 
durable_writes = true;
- DC1 : 6 nodes
- DC2 : 6 nodes
- DC3 : 1 node (backup node, have all data)

CLUSTER-B :
- Cassandra version 3.11.2
- DC1 : 5 nodes
- DC2 : 5 nodes
- DC3 : 1 node
- Already have 2 keyspaces and write/read traffic

We want to migrate a keyspace from CLUSTER-A to CLUSTER-B. There are some
solutions for restoring or migrating a keyspace onto a new cluster, but I haven't
seen any safe way to migrate a keyspace onto an existing cluster which already
has keyspaces.

Replication Factor won't change.
We are thinking about two ways: one using sstableloader and the other using the
COPY TO/COPY FROM commands.

Our migration plan is:

- export of keyspace schema structure with DESC keyspace on CLUSTER-A
- create keyspace schema on CLUSTER-B
- disable writing traffic on application layer
- load data from CLUSTER-A, DC3 backup node (which has all the data) to CLUSTER-B,
DC1 with sstableloader or the COPY command (each table will be copied one by one).
- update cluster IP addresses in application configuration
- enable writing traffic on application layer
So, do you see any risks, or do you have any suggestions for this plan? Thanks a lot.







Re: "Maximum memory usage reached (512.000MiB), cannot allocate chunk of 1.000MiB"

2019-12-02 Thread Rajsekhar Mallick
Hello Rahul,

I would request Hossein to correct me if I am wrong. Below is how it works

How does an application/database read something from disk?
A request comes in for a read -> the application code internally invokes
system calls -> these kernel-level system calls schedule a job with the
I/O scheduler -> the data is then read and returned by the device drivers ->
the data fetched from disk is accumulated in a memory location (file buffer)
until the entire read operation is complete -> then, I guess, the data is
uncompressed -> processed inside the JVM as Java objects -> handed over to
the application logic to transmit it over the network interface.

This is my understanding of file_cache_size_in_mb: basically it caches disk
data, much like the file system cache.
The alert you are getting is an INFO-level log.
I would recommend first trying to understand why this cache is filling up so
fast. Increasing the cache size is a solution, but as I remember there is
some impact if it is increased. I faced a similar issue and increased the
cache size; eventually the increased size started falling short as well.

You are asking the right question about how the cache is recycled. If you find
an answer, do post it. But that is something Cassandra doesn't have
control over (that is what I understand).
Investigating your reads, i.e. whether a lot of data is being read to satisfy
a few queries, might be another way to start troubleshooting.

Thanks,
Rajsekhar








On Mon, 2 Dec, 2019, 8:18 PM Rahul Reddy,  wrote:

> Thanks Hossein,
>
> How are chunks moved out of memory (LRU?) when it wants to make room
> for new requests to get chunks? If there is a mechanism to clear chunks from
> the cache, what causes the "cannot allocate chunk" message? Can you point me
> to any documentation?
>
> On Sun, Dec 1, 2019, 12:03 PM Hossein Ghiyasi Mehr 
> wrote:
>
>> Chunks are part of sstables. When there is enough space in memory to
>> cache them, read performance will increase if application requests it again.
>>
>> Your real answer is application dependent. For example write heavy
>> applications are different than read heavy or read-write heavy. Real time
>> applications are different than time series data environments and ... .
>>
>>
>>
>> On Sun, Dec 1, 2019 at 7:09 PM Rahul Reddy 
>> wrote:
>>
>>> Hello,
>>>
>>> We are seeing "memory usage reached 512 MiB, cannot allocate 1 MiB".  I
>>> see this because file_cache_size_in_mb is set to 512 MB by default.
>>>
>>> The DataStax documentation recommends increasing the file_cache_size.
>>>
>>> We have 32 GB of overall memory, with 16 GB allocated to Cassandra. What is
>>> the recommended value in my case? Also, when this memory gets filled up
>>> frequently, does a nodetool flush help in avoiding these INFO messages?
>>>
>>


RE: [EXTERNAL] Re: Upgrade strategy for high number of nodes

2019-12-02 Thread Durity, Sean R
All my upgrades are without downtime for the application. Yes, do the binary 
upgrade one node at a time. Then run upgradesstables on as many nodes as your 
app load can handle (maybe you can point the app to a different DC, while 
another DC is doing upgradesstables). Upgradesstables doesn’t cause downtime – 
it just increases the IO load on the nodes executing the upgradesstables. I try 
to get it done as quickly as possible, because I suspend streaming operations 
(repairs, etc.) until the sstable rewrites are completed.
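A small sketch of that sequence on one node; the -j flag, where available in
your version, limits how many sstables are rewritten in parallel, so adjust it
to what the node's IO can absorb:

  # after the binary upgrade, rewrite sstables to the new on-disk format
  nodetool upgradesstables -j 2    # -j limits concurrent rewrites
  # keep streaming operations (e.g. scheduled repairs) suspended until
  # upgradesstables has finished on all nodes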

Sean Durity

From: Shishir Kumar 
Sent: Saturday, November 30, 2019 1:00 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Upgrade strategy for high number of nodes

Thanks for the pointer. We haven't changed the data model much in a long time,
so before applying workarounds (scrub) it is worth understanding the root cause
of the problem.
This might be the reason why running upgradesstables in parallel was not
recommended.
-Shishir
On Sat, 30 Nov 2019, 10:37 Jeff Jirsa,  wrote:
Scrub really shouldn’t be required here.

If there’s ever a step that reports corruption, it’s either a very very old 
table where you dropped columns previously or did something “wrong” in the past 
or a software bug. The old dropped column really should be obvious in the stack 
trace - anything else deserves a bug report.

It’s unfortunate that people jump to just scrubbing the unreadable data - would 
appreciate an anonymized JIRA if possible. Alternatively work with your vendor 
to make sure they don’t have bugs in their readers somehow.





On Nov 29, 2019, at 8:58 PM, Shishir Kumar  wrote:

Some more background. We are planning (and have tested) the binary upgrade across
all nodes without downtime. The next step is running upgradesstables, since the
C* file format and version change (from format big, version mc, to format bti,
version aa; refer to
https://docs.datastax.com/en/dse/6.0/dse-admin/datastax_enterprise/tools/toolsSStables/ToolsSSTableupgrade.html
- upgrade from DSE 5.1 to 6.x). The underlying changes explain why it takes so
much time to upgrade.
Running upgradesstables in parallel across racks - this is where I am not sure
about the impact of running in parallel (the documentation recommends running
one node at a time). During upgradesstables there are scenarios where it reports
file corruption, hence requiring a corrective step, i.e. scrub. Due to file
corruption, at times nodes go down because of sstable corruption, or end up with
high CPU usage (~100%). Performing the above in parallel without downtime might
result in more inconsistency across nodes. We have not tested this scenario, so
we will need the group's help in case they have done a similar upgrade in the
past (i.e. the scenarios/complexity which need to be considered, and why the
guideline recommends running upgradesstables one node at a time).
-Shishir

On Fri, Nov 29, 2019 at 11:52 PM Josh Snyder  wrote:
Hello Shishir,

It shouldn't be necessary to take downtime to perform upgrades of a Cassandra 
cluster. It sounds like the biggest issue you're facing is the upgradesstables 
step. upgradesstables is not strictly necessary before a Cassandra node 
re-enters the cluster to serve traffic; in my experience it is purely for 
optimizing the performance of the database once the software upgrade is 
complete. I recommend trying out an upgrade in a test environment without using 
upgradesstables, which should bring the 5 hours per node down to just a few 
minutes.

If you're running NetworkTopologyStrategy and you want to optimize further, you 
could consider performing the upgrade on multiple nodes within the same rack in 
parallel. When correctly configured, NetworkTopologyStrategy can protect your 
database from an outage of an entire rack. So performing an upgrade on a few 
nodes at a time within a rack is the same as a partial rack outage, from the 
database's perspective.

Have a nice upgrade!

Josh

On Fri, Nov 29, 2019 at 7:22 AM Shishir Kumar  wrote:
Hi,

Need input on a Cassandra upgrade strategy for the following:
1. We have datacenters across 4 geographies (multiple isolated deployments in
each DC).
2. The number of Cassandra nodes in each deployment is between 6 and 24.
3. Data volume on each node is between 150 and 400 GB.
4. All production environments have DR set up.
5. During the upgrade we do not want downtime.

We are planning to go for a stack upgrade, but upgradesstables is taking approx.
5 hours per node (if the data volume is approx. 200 GB).
Options-
No downtime - As per the recommendation (DataStax documentation), if we plan to
upgrade one node at a time, i.e. in sequence, the upgrade cycle for one
environment will take weeks, which is a DevOps concern.
Read Only (No downtime) - Route read-only load to the DR system. We have 

RE: [EXTERNAL] performance

2019-12-02 Thread Durity, Sean R
I’m not sure this is the fully correct question to ask. The size of the data 
will matter. The importance of high availability matters. Performance can be 
tuned by taking advantage of Cassandra’s design strengths. In general, you 
should not be doing queries with a where clause on non-key columns. Secondary 
indexes are not what you would expect from a relational background (and should 
normally be avoided).

In short, choose Cassandra if you need high-availability and low latency on 
KNOWN access patterns (on which you base your table design).
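As a hypothetical illustration of designing the table around the known access
pattern instead of relying on a secondary index (the keyspace, table and column
names are made up):

  # instead of: SELECT ... FROM orders WHERE customer_email = ?  (secondary index on a non-key column)
  # model the query as its own table, keyed by what you look up with:
  cqlsh -e "CREATE TABLE shop.orders_by_customer_email (
              customer_email text,
              order_id timeuuid,
              status text,
              PRIMARY KEY (customer_email, order_id));"
  # reads then hit a single partition:
  cqlsh -e "SELECT order_id, status FROM shop.orders_by_customer_email WHERE customer_email = 'a@example.com';"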

If you want an opinion – I would never put data over a few hundred GB that I 
care about into mysql. I don’t like the engine, the history, the company, or 
anything about it. But that’s just my opinion. I know many companies have 
successfully used mysql.


Sean Durity

From: hahaha sc 
Sent: Friday, November 29, 2019 3:27 AM
To: cassandra-user 
Subject: [EXTERNAL] performance

Query based on a non-primary-key field with a secondary index, and then
update based on the primary key. Can it be more efficient than MySQL?





Re: "Maximum memory usage reached (512.000MiB), cannot allocate chunk of 1.000MiB"

2019-12-02 Thread Rahul Reddy
Thanks Hossein,

How are chunks moved out of memory (LRU?) when it wants to make room
for new requests to get chunks? If there is a mechanism to clear chunks from
the cache, what causes the "cannot allocate chunk" message? Can you point me
to any documentation?

On Sun, Dec 1, 2019, 12:03 PM Hossein Ghiyasi Mehr 
wrote:

> Chunks are part of sstables. When there is enough space in memory to cache
> them, read performance will increase if application requests it again.
>
> Your real answer is application dependent. For example write heavy
> applications are different than read heavy or read-write heavy. Real time
> applications are different than time series data environments and ... .
>
>
>
> On Sun, Dec 1, 2019 at 7:09 PM Rahul Reddy 
> wrote:
>
>> Hello,
>>
>> We are seeing "memory usage reached 512 MiB, cannot allocate 1 MiB".  I see
>> this because file_cache_size_in_mb is set to 512 MB by default.
>>
>> The DataStax documentation recommends increasing the file_cache_size.
>>
>> We have 32 GB of overall memory, with 16 GB allocated to Cassandra. What is
>> the recommended value in my case? Also, when this memory gets filled up
>> frequently, does a nodetool flush help in avoiding these INFO messages?
>>
>


Re: Uneven token distribution with allocate_tokens_for_keyspace

2019-12-02 Thread Enrico Cavallin
Hi Anthony,
thank you for your hints, now the new DC is well balanced within 2%.
I did read your article, but I thought it was needed only for new
"clusters", not also for new "DCs"; but RF is per DC so it makes sense.

You TLP guys are doing a great job for the Cassandra community.

Thank you,
Enrico


On Fri, 29 Nov 2019 at 05:09, Anthony Grasso 
wrote:

> Hi Enrico,
>
> This is a classic chicken and egg problem with the
> allocate_tokens_for_keyspace setting.
>
> The allocate_tokens_for_keyspace setting uses the replication factor of a
> DC keyspace to calculate the token allocation when a node is added to the
> cluster for the first time.
>
> Nodes need to be added to the new DC before we can replicate the keyspace
> over to it. Herein lies the problem. We are unable to use
> allocate_tokens_for_keyspace unless the keyspace is replicated to the new
> DC. In addition, as soon as you change the keyspace replication to the new
> DC, new data will start to be written to it. To work around this issue you
> will need to do the following.
>
>1. Decommission all the nodes in the *dcNew*, one at a time.
>2. Once all the *dcNew* nodes are decommissioned, wipe the contents in
>the *commitlog*, *data*, *saved_caches*, and *hints* directories of
>these nodes.
>3. Make the first node to add into the *dcNew* a seed node. Set the
>seed list of the first node with its IP address and the IP addresses of the
>other seed nodes in the cluster.
>    4. Set the *initial_token* setting for the first node. You can
>calculate the values using the algorithm in my blog post:
>
> https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html.
>For convenience I have calculated them:
>*-9223372036854775808,-4611686018427387904,0,4611686018427387904*.
>Note, remove the *allocate_tokens_for_keyspace* setting from the
>*cassandra.yaml* file for this (seed) node (see the condensed
>cassandra.yaml sketch after this list).
>5. Check to make sure that no other node in the cluster is assigned
>any of the four tokens specified above. If there is another node in the
>cluster that is assigned one of the above tokens, increment the conflicting
>token by values of one until no other node in the cluster is assigned that
>token value. The idea is to make sure that these four tokens are unique to
>the node.
>6. Add the seed node to cluster. Make sure it is listed in *dcNew *by
>checking nodetool status.
>7. Create a dummy keyspace in *dcNew* that has a replication factor of
>2.
>8. Set the *allocate_tokens_for_keyspace* value to be the name of the
>dummy keyspace for the other two nodes you want to add to *dcNew*.
>Note remove the *initial_token* setting for these other nodes.
>9. Set *auto_bootstrap* to *false* for the other two nodes you want to
>add to *dcNew*.
>10. Add the other two nodes to the cluster, one at a time.
>11. If you are happy with the distribution, copy the data to *dcNew*
>by running a rebuild.
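A condensed sketch of the relevant cassandra.yaml settings from the steps above;
the dummy keyspace name is a placeholder:

  # first (seed) node in dcNew, steps 3-4:
  num_tokens: 4
  initial_token: -9223372036854775808,-4611686018427387904,0,4611686018427387904
  # also list this node's own IP among the seeds (step 3)
  # (no allocate_tokens_for_keyspace on this node)

  # remaining dcNew nodes, steps 8-10:
  num_tokens: 4
  allocate_tokens_for_keyspace: dummy_keyspace_name   # the RF=2 dummy keyspace from step 7
  auto_bootstrap: false
  # (no initial_token on these nodes)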
>
>
> Hope this helps.
>
> Regards,
> Anthony
>
> On Fri, 29 Nov 2019 at 02:08, Enrico Cavallin 
> wrote:
>
>> Hi all,
>> I have an old datacenter with 4 nodes and 256 tokens each.
>> I am now starting a new datacenter with 3 nodes, with num_tokens=4
>> and allocate_tokens_for_keyspace=myBiggestKeyspace in each node.
>> Both DCs run Cassandra 3.11.x.
>>
>> myBiggestKeyspace has RF=3 in dcOld and RF=2 in dcNew. Now dcNew is very
>> unbalanced.
>> Also keyspaces with RF=2 in both DCs have the same problem.
>> Did I miss something, or do I have strong limitations with a low num_tokens
>> even with allocate_tokens_for_keyspace?
>> Any suggestions on how to mitigate it?
>>
>> # nodetool status myBiggestKeyspace
>> Datacenter: dcOld
>> ===
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address   Load   Tokens   Owns (effective)  Host ID
>> Rack
>> UN  x.x.x.x  515.83 GiB  256  76.2%
>> fc462eb2-752f-4d26-aae3-84cb9c977b8a  rack1
>> UN  x.x.x.x  504.09 GiB  256  72.7%
>> d7af8685-ba95-4854-a220-bc52dc242e9c  rack1
>> UN  x.x.x.x  507.50 GiB  256  74.6%
>> b3a4d3d1-e87d-468b-a7d9-3c104e219536  rack1
>> UN  x.x.x.x  490.81 GiB  256  76.5%
>> 41e80c5b-e4e3-46f6-a16f-c784c0132dbc  rack1
>>
>> Datacenter: dcNew
>> ==
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  AddressLoad   Tokens   Owns (effective)  Host ID
>>Rack
>> UN  x.x.x.x   145.47 KiB  456.3%
>> 7d089351-077f-4c36-a2f5-007682f9c215  rack1
>> UN  x.x.x.x   122.51 KiB  455.5%
>> 625dafcb-0822-4c8b-8551-5350c528907a  rack1
>> UN  x.x.x.x   127.53 KiB  488.2%
>> c64c0ce4-2f85-4323-b0ba-71d70b8e6fbf  rack1
>>
>> Thanks,
>> -- ec
>>
>