Re: "Maximum memory usage reached (512.000MiB), cannot allocate chunk of 1.000MiB"

2019-12-03 Thread Reid Pinchback
John, anything I’ll say will be as a collective ‘we’ since it has been a team 
effort here at Trip, and I’ve just been the hired gun to help out a bit. I’m 
more of a Postgres and Java guy so filter my answers accordingly.

I can’t say we saw as much benefit from tuning the chunk cache size as we did 
from doing everything possible to migrate things off-heap.  I haven’t worked 
with 2.x so I don’t know how much these options have changed, but in 3.11.x at 
least you definitely can migrate a fair bit off-heap.  Our first use case was 
sensitive to 3 9’s latency, which turns out to be a rough go for C*, 
particularly if the data model is a bit askew from C*’s sweet spot, as was true 
for us.  The deeper merkle trees introduced somewhere in the 3.0.x series, I 
think, were the bane of our existence; we back-patched the 4.0 work to tune the 
tree height so that we weren’t OOMing nodes during Reaper repair runs.
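For reference, the off-heap knobs in question live in cassandra.yaml; a minimal 
3.11.x sketch (illustrative values only, not our production settings):

    # cassandra.yaml (3.11.x) -- move memtable storage off-heap
    memtable_allocation_type: offheap_objects   # default is heap_buffers
    memtable_offheap_space_in_mb: 2048          # cap on off-heap memtable memory; size to your box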

As to Shishir’s notion of using swap: because latency mattered to us, we kept 
RAM headroom on the boxes.  We couldn’t use it all without pushing on something 
that hurt us on 3 9’s.  C* is an over-constrained problem space when it comes 
to tuning; poking in one place results in a twitch somewhere else, and we had 
to see which twitches worked out in our favour.  If, like us, you have RAM 
headroom, you’re unlikely to care about swap for obvious reasons.  All you 
really need is enough room for the O/S file buffer cache.
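If you want to sanity-check that headroom, the usual Linux view of it looks 
something like this (generic OS commands, nothing Cassandra-specific):

    # how much RAM is free vs. sitting in the OS file buffer cache
    free -h
    # if swap stays enabled, at least keep the kernel from swapping eagerly
    sysctl vm.swappiness
    sudo sysctl -w vm.swappiness=1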

Tuning related to I/O and file buffer cache mattered a fair bit.  As did GC 
tuning obviously.  Personally, if I were to look at swap as helpful, I’d be 
debating with myself if the sstables should just remain uncompressed in the 
first place.  After all, swap space is disk space so holding 
compressed+uncompressed at the same time would only make sense if the storage 
footprint was large but the hot data in use was routinely much smaller… yet 
stuck around long enough in a cold state that the kernel would target it to 
swap out.  That’s a lot of if’s to line up to your benefit.  When it comes to a 
system running based on garbage collection, I get skeptical of how effectively 
the O/S will determine what is good to swap. Most of the JVM memory in C* 
churns at a rate that you wouldn’t want swap i/o to combine with if you cared 
about latency.  Not everybody cares about tight variance on latency though, so 
there can be other rationales for tuning that would result in different 
conclusions from ours.
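(If uncompressed sstables did turn out to be the right trade-off, that is a 
per-table toggle; a hedged sketch against a hypothetical table:)

    -- hypothetical keyspace/table names, for illustration only
    ALTER TABLE my_ks.my_table WITH compression = {'enabled': 'false'};

    -- then rewrite existing sstables so it takes effect on disk (from the shell):
    --   nodetool upgradesstables -a my_ks my_table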

I might have more definitive statements to make in the upcoming months; I’m in 
the midst of putting together my own test cluster for more controlled analysis 
of C* and Kafka tuning. I’ve found that tuning live environments makes it hard 
to control the variables to my satisfaction. It can feel like a game of 
empirical whack-a-mole.


From: Shishir Kumar 
Reply-To: "user@cassandra.apache.org" 
Date: Tuesday, December 3, 2019 at 9:23 AM
To: "user@cassandra.apache.org" 
Subject: Re: "Maximum memory usage reached (512.000MiB), cannot allocate chunk 
of 1.000MiB"

Options, assuming the data model and configuration are good and the data size 
per node is less than 1 TB (though there is no such benchmark):

1. Scale the infrastructure for memory.
2. Try changing disk_access_mode to mmap_index_only.
In this case you should not have any in-memory DB tables.
3. Although DataStax does not recommend this and recommends horizontal scaling 
instead, an alternate old-fashioned option, depending on your requirements, is 
to add swap space.

-Shishir

On Tue, 3 Dec 2019, 15:52 John Belliveau <belliveau.j...@gmail.com> wrote:
Reid,

I've only been working with Cassandra for 2 years, and this echoes my 
experience as well.

Regarding the cache use, I know every use case is different, but have you 
experimented and found any performance benefit to increasing its size?

Thanks,
John Belliveau

On Mon, Dec 2, 2019, 11:07 AM Reid Pinchback <rpinchb...@tripadvisor.com> wrote:
Rahul, if my memory of this is correct, that particular logging message is 
noisy, the cache is pretty much always used to its limit (and why not, it’s a 
cache, no point in using less than you have).

No matter what value you set, you’ll just change the “reached (….)” part of it. 
 I think what would help you more is to work with the team(s) that have apps 
depending upon C* and decide what your performance SLA is with them.  If you 
are meeting your SLA, you don’t care about noisy messages.  If you aren’t 
meeting your SLA, then the noisy messages become sources of ideas to look at.

One thing you’ll find out pretty quickly.  There are a lot of knobs you can 
turn with C*, too many to allow for easy answers on what you should do.  Figure 
out what your throughput and latency SLAs are, and you’ll know when to stop 
tuning.  Otherwise you’ll discover that it’s a rabbit hole you can dive into 
and not come out of for weeks.
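(For context, the cap in that log line comes straight from cassandra.yaml; a 
sketch of the relevant 3.11.x settings with their stock defaults, shown only to 
make the message's origin concrete:)

    # cassandra.yaml -- the limit behind "Maximum memory usage reached (512.000MiB)"
    file_cache_size_in_mb: 512
    # when the chunk cache is exhausted, 3.x falls back to transient on-heap buffers
    buffer_pool_use_heap_if_exhausted: true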



TTL on UDT

2019-12-03 Thread Mark Furlong
When I run the command 'select ttl(udt_field) from table;' I get the error 
'InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot 
use selection function ttl on collections"'. How can I get the TTL from a UDT 
field?
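For what it's worth, ttl() is rejected on multi-cell columns such as non-frozen 
collections and UDTs; a hedged sketch of the usual workarounds, using a 
hypothetical schema:

    -- hypothetical keyspace/type/table, for illustration only
    CREATE TYPE demo.address (street text, city text);
    CREATE TABLE demo.users (
        id uuid PRIMARY KEY,
        addr frozen<address>,     -- frozen UDT is a single cell, so ttl() is accepted
        updated_at timestamp
    );

    INSERT INTO demo.users (id, addr, updated_at)
    VALUES (uuid(), {street: '1 Main St', city: 'Springfield'}, toTimestamp(now()))
    USING TTL 86400;

    SELECT ttl(addr) FROM demo.users;         -- works on the frozen UDT column
    -- for a non-frozen UDT, read the TTL of a sibling scalar column
    -- written in the same statement instead:
    SELECT ttl(updated_at) FROM demo.users;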

Mark Furlong


We empower journeys of personal discovery to enrich lives




Re: "Maximum memory usage reached (512.000MiB), cannot allocate chunk of 1.000MiB"

2019-12-03 Thread Shishir Kumar
Options, assuming the data model and configuration are good and the data size
per node is less than 1 TB (though there is no such benchmark):

1. Scale the infrastructure for memory.
2. Try changing disk_access_mode to mmap_index_only (see the sketch below).
In this case you should not have any in-memory DB tables.
3. Although DataStax does not recommend this and recommends horizontal scaling
instead, an alternate old-fashioned option, depending on your requirements, is
to add swap space.
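A sketch of option 2 (valid values are auto, mmap, mmap_index_only and 
standard; the setting may not appear in your stock cassandra.yaml, so treat 
this as something to verify for your version):

    # cassandra.yaml -- mmap only the index files; read data files through
    # the buffer cache instead of mapping them
    disk_access_mode: mmap_index_only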

-Shishir

On Tue, 3 Dec 2019, 15:52 John Belliveau,  wrote:

> Reid,
>
> I've only been working with Cassandra for 2 years, and this echoes my
> experience as well.
>
> Regarding the cache use, I know every use case is different, but have you
> experimented and found any performance benefit to increasing its size?
>
> Thanks,
> John Belliveau
>
>
> On Mon, Dec 2, 2019, 11:07 AM Reid Pinchback 
> wrote:
>
>> Rahul, if my memory of this is correct, that particular logging message
>> is noisy, the cache is pretty much always used to its limit (and why not,
>> it’s a cache, no point in using less than you have).
>>
>>
>>
>> No matter what value you set, you’ll just change the “reached (….)” part
>> of it.  I think what would help you more is to work with the team(s) that
>> have apps depending upon C* and decide what your performance SLA is with
>> them.  If you are meeting your SLA, you don’t care about noisy messages.
>> If you aren’t meeting your SLA, then the noisy messages become sources of
>> ideas to look at.
>>
>>
>>
>> One thing you’ll find out pretty quickly.  There are a lot of knobs you
>> can turn with C*, too many to allow for easy answers on what you should
>> do.  Figure out what your throughput and latency SLAs are, and you’ll know
>> when to stop tuning.  Otherwise you’ll discover that it’s a rabbit hole you
>> can dive into and not come out of for weeks.
>>
>>
>>
>>
>>
>> *From: *Hossein Ghiyasi Mehr 
>> *Reply-To: *"user@cassandra.apache.org" 
>> *Date: *Monday, December 2, 2019 at 10:35 AM
>> *To: *"user@cassandra.apache.org" 
>> *Subject: *Re: "Maximum memory usage reached (512.000MiB), cannot
>> allocate chunk of 1.000MiB"
>>
>>
>>
>>
>> It may be helpful:
>> https://thelastpickle.com/blog/2018/08/08/compression_performance.html
>> 
>>
>> It's complex. The simple explanation is that Cassandra keeps sstable data in
>> memory based on the chunk size and the sstable parts being read. It manages
>> loading new sstables into memory based on the requests hitting the different
>> sstables correctly, so you shouldn't need to worry about which sstables are
>> loaded in memory.
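(The chunk size referred to above is a per-table compression option; a hedged 
sketch against a hypothetical table -- smaller chunks mean less memory per 
cached chunk, at some cost in compression ratio:)

    -- hypothetical keyspace/table names, for illustration only
    ALTER TABLE my_ks.my_table
        WITH compression = {'class': 'LZ4Compressor', 'chunk_length_in_kb': 16};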
>>
>>
>> *VafaTech.com - A Total Solution for Data Gathering & Analysis*
>>
>>
>>
>>
>>
>> On Mon, Dec 2, 2019 at 6:18 PM Rahul Reddy 
>> wrote:
>>
>> Thanks Hossein,
>>
>>
>>
>> How are the chunks moved out of memory (LRU?) when it wants to make room
>> for new chunk requests? If it has a mechanism to clear chunks from the
>> cache, what causes the "cannot allocate chunk" message? Can you point me to
>> any documentation?
>>
>>
>>
>> On Sun, Dec 1, 2019, 12:03 PM Hossein Ghiyasi Mehr 
>> wrote:
>>
>> Chunks are part of sstables. When there is enough space in memory to cache
>> them, read performance will increase if the application requests them
>> again.
>>
>>
>>
>> The real answer is application dependent. For example, write-heavy
>> applications are different from read-heavy or mixed read-write ones, and
>> real-time applications are different from time-series data environments, and
>> so on.
>>
>>
>>
>>
>>
>>
>>
>> On Sun, Dec 1, 2019 at 7:09 PM Rahul Reddy 
>> wrote:
>>
>> Hello,
>>
>>
>>
>> We are seeing "memory usage reached 512 MiB and cannot allocate 1 MiB".  I
>> see this because file_cache_size_mb is set to 512 MB by default.
>>
>>
>>
>> The Datastax documentation recommends increasing the file_cache_size.
>>
>>
>>
>> We have 32 GB of memory overall and have allocated 16 GB to Cassandra. What
>> is the recommended value in my case? Also, when this memory fills up
>> frequently, does nodetool flush help in avoiding these INFO messages?
>>
>>


Re: "Maximum memory usage reached (512.000MiB), cannot allocate chunk of 1.000MiB"

2019-12-03 Thread John Belliveau
Reid,

I've only been working with Cassandra for 2 years, and this echoes my
experience as well.

Regarding the cache use, I know every use case is different, but have you
experimented and found any performance benefit to increasing its size?

Thanks,
John Belliveau




Re: Optimal backup strategy

2019-12-03 Thread Hossein Ghiyasi Mehr
I am sorry! That is true. I forgot the "*not*"!
1. It's *not* recommended to rely on the commit log after a node failure.
Cassandra has other options, such as the replication factor, to serve as the
substitute solution.

*VafaTech.com - A Total Solution for Data Gathering & Analysis*


On Tue, Dec 3, 2019 at 10:42 AM Adarsh Kumar  wrote:

> Thanks Hossein,
>
> Just one more question: is there any special SOP or consideration we have
> to take into account for multi-site backups?
>
> Please share any helpful link, blog or steps documented.
>
> Regards,
> Adarsh Kumar
>
> On Sun, Dec 1, 2019 at 10:40 PM Hossein Ghiyasi Mehr <
> ghiyasim...@gmail.com> wrote:
>
>> 1. It's recommended to use commit log after one node failure. Cassandra
>> has many options such as replication factor as substitute solution.
>> 2. Yes, right.
>>
>> *VafaTech.com - A Total Solution for Data Gathering & Analysis*
>>
>>
>> On Fri, Nov 29, 2019 at 9:33 AM Adarsh Kumar 
>> wrote:
>>
>>> Thanks Ahu and Hossein,
>>>
>>> So my understanding is:
>>>
>>>    1. Commit log backup is not documented for Apache Cassandra, hence it is
>>>    not standard. But it can be used for a restore on the same machine (taking
>>>    the backup from commit_log_dir). If used on other machine(s), they have to
>>>    be in the same topology. Can it be used for a replacement node?
>>>    2. For periodic backups, snapshot + incremental backup is the best option
>>>
>>>
>>> Thanks,
>>> Adarsh Kumar
>>>
>>> On Fri, Nov 29, 2019 at 7:28 AM guo Maxwell 
>>> wrote:
>>>
 Hossein is right.  But for us, we restore to the same Cassandra
 topology, so it is usable for doing replay.  When restoring to the
 same machine it is also usable.
 Using sstableloader costs too much time and more storage (though the extra
 storage will shrink after the restore).

 Hossein Ghiyasi Mehr wrote on Thu, Nov 28, 2019 at 7:40 PM:

> A commitlog backup isn't usable on another machine.
> The backup solution depends on what you want to do: a periodic backup, or a
> backup to restore on another machine?
> A periodic backup is a combination of snapshots and incremental backups;
> remove the incremental backups after each new snapshot.
> For a backup to restore on another machine: you can use a snapshot taken
> after flushing the memtables, or use sstableloader.
>
>
> 
> VafaTech.com - A Total Solution for Data Gathering & Analysis
>
> On Thu, Nov 28, 2019 at 6:05 AM guo Maxwell 
> wrote:
>
>> In the Cassandra and DataStax documentation, commitlog backup is not
>> mentioned; only snapshots and incremental backups are described for doing
>> backups.
>>
>> Commitlog archiving cannot be scoped to a keyspace/table, but commitlog
>> replay (though you must put the logs into commitlog_dir and restart the
>> process) supports a keyspace/table replay filter (use
>> -Dcassandra.replayList with the keyspace1.table1,keyspace1.table2 format to
>> replay only the specified keyspaces/tables).
>>
>> Snapshots do affect storage. For us, we take a snapshot once a week during
>> the low business peak, and making the snapshot is throttled; you may want
>> to see the issue (https://issues.apache.org/jira/browse/CASSANDRA-13019)
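(For reference, the archive/restore hooks live in commitlog_archiving.properties 
and the replay filter is the JVM flag mentioned above; a minimal sketch with 
placeholder paths and dates:)

    # conf/commitlog_archiving.properties
    archive_command=/bin/cp %path /backup/commitlog/%name
    restore_command=/bin/cp -f %from %to
    restore_directories=/backup/commitlog
    restore_point_in_time=2019:12:03 00:00:00

    # replay only selected tables on restart (e.g. via cassandra-env.sh)
    JVM_OPTS="$JVM_OPTS -Dcassandra.replayList=keyspace1.table1,keyspace1.table2"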
>>
>>
>>
>> Adarsh Kumar wrote on Thu, Nov 28, 2019 at 1:00 AM:
>>
>>> Thanks Guo and Eric for replying,
>>>
>>> I have some confusion about commit log backup:
>>>
>>>    1. The commit log archival technique (
>>> https://support.datastax.com/hc/en-us/articles/115001593706-Manual-Backup-and-Restore-with-Point-in-time-and-table-level-restore-
>>>    ) is as good as an incremental backup, as it also captures commit logs
>>>    written after a memtable flush.
>>>    2. If we go for "Snapshot + Incremental backup + Commit log", we have to
>>>    take the commit logs from the commit log directory (is there any SOP for
>>>    this?). As commit logs are not per table or keyspace, we will have a
>>>    challenge in restoring selective tables.
>>>    3. Snapshot-based backups are easy to manage and operate due to their
>>>    simplicity, but they are heavy on storage. Any views on this? (A sketch
>>>    of the snapshot + incremental mechanics follows after this list.)
>>>    4. Please share any successful strategy that someone is using in
>>>    production. We are still in the design phase and want to implement the
>>>    best solution.
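A rough sketch of the snapshot + incremental mechanics referenced in point 3 
(the commands and the yaml flag are the standard ones; the keyspace name and 
retention tags are placeholders):

    # cassandra.yaml -- hard-link each flushed sstable into a backups/ directory
    incremental_backups: true

    # periodic full snapshot (hard links under each table's snapshots/ directory)
    nodetool snapshot -t weekly_2019-12-01 my_keyspace
    # once a new snapshot exists, prune the older one (and the incrementals it covers)
    nodetool clearsnapshot -t weekly_2019-11-24 my_keyspace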
>>>
>>> Thanks Eric for sharing the link for Medusa.
>>>
>>> Regards,
>>> Adarsh Kumar
>>>
>>> On Wed, Nov 27, 2019 at 5:16 PM guo Maxwell 
>>> wrote:
>>>
 For me, I think the last one:
  Snapshot + Incremental + commitlog
 is the most meaningful way to do backup and restore, when you copy the backup
 data somewhere else like AWS S3.

- Snapshot-based backup // incremental data will not be backed up, and you may
lose data when restoring to a point in time later than the snapshot time;
- Incremental backups // better