>
> Does this mean RAR(s) must be created and added to the cache
> (FileCacheService) whenever a SSTable is opened even if in case of
> compaction? I think random access read doesn't need to read the data from a
> SSTable in case of compaction because the SSTable had sorted by their keys,
> then only need sequential read to merge.



RandomAccessReader is the name of the class (an implementation of
FileDataInput), but we use them for sequential scanning as well. We don't
necessarily put them in the cache but most of the times we do. On
compaction, if there is a rate limiter on a reader (compaction throughput >
0) then we don't store them in the cache. Disk access mode mmap (the
default on 64 bit machines) is also treated differently, as we cache the
memory mapped segments, not the readers, in 2.2.


On Thu, Mar 17, 2016 at 8:40 PM, Satoshi Hikida <sahik...@gmail.com> wrote:

> Hi, Stefania
>
> Thank you for your advice, again!!
>
> Could I ask you for another question?
>
> > Each sstable has one or more random access readers (one per segment for
> example) and FileCacheService is a cache for such readers
> Does this mean RAR(s) must be created and added to the cache
> (FileCacheService) whenever a SSTable is opened even if in case of
> compaction? I think random access read doesn't need to read the data from a
> SSTable in case of compaction because the SSTable had sorted by their keys,
> then only need sequential read to merge.
>
> If I have something wrong, I'm glad if you could correct.
>
>
> Regards,
> Satoshi
>
>
> On Thu, Mar 17, 2016 at 5:19 PM, Stefania Alborghetti <
> stefania.alborghe...@datastax.com> wrote:
>
>> Q1. Readers are created as needed, there is no fixed number. For example,
>> we may have 2 threads scanning sstables at the same time due to 2 different
>> CQL SELECT statements.
>>
>> Q2. There is no correlation between sstable size and JVM HEAP size. We
>> don't load entire sstables in memory.
>>
>> Q3. It's difficult to say what caused the invalidation messages,
>> basically anything that removed sstables from memory, such as dropping the
>> table, snapshots, compactions, streaming, there may me other operations I'm
>> not familiar with.
>>
>> Q4. Correct, these are temporary files. Once again, in 3.0 things are
>> different and the temporary files have been replaced by transaction logs
>> (CASSANDRA-7066).
>>
>>
>> On Thu, Mar 17, 2016 at 3:40 PM, Satoshi Hikida <sato...@imagine-orb.com>
>> wrote:
>>
>>> Sorry there is a mistake in my previous post. I would correct it.
>>>
>>> In Q3, I mentioned there are a lot of invalidating messages in the
>>> debug.log. It is true but cassandra configurations were wrong. In that
>>> case, the cassandra.yaml configurations are as follows:
>>>
>>> - cassandra.yaml
>>> - compaction_throughput_mb_per_sec: 0 (not 8 or default)
>>> - concurrent_compactors: 1
>>> - sstable_preemptive_open_interval_in_mb: 0  (not 8 or default)
>>> - memtable_flush_writers: 1
>>>
>>> And More precisely, in that case, Cassandra keep on outputting
>>> invalidating messages for a while(a few hours). However CPU usage is almost
>>> 0.0% in top command like below.
>>>
>>>     $ top -bu cassandra -n 1
>>>     ...
>>>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+
>>> COMMAND
>>>     2631 cassand+  20   0  0.250t 1.969g 703916 S   0.0 57.0   8459:35
>>> java
>>>
>>> I want to know what was actually happening at that time.
>>>
>>>
>>> Regards,
>>> Satoshi
>>>
>>>
>>> On Thu, Mar 17, 2016 at 3:56 PM, Satoshi Hikida <sahik...@gmail.com>
>>> wrote:
>>>
>>>> Thank you for your very useful advice!
>>>>
>>>>
>>>> Definitely, I'm using Cassandra V2.2.5 not 3.x. And basically I've
>>>> understood what does these logs mean. But I have more a few questions. So I
>>>> would very much appreciate If I get some explanations about these 
>>>> questions.
>>>>
>>>> * Q1.
>>>> In my understand, when open a SSTable, a lot of
>>>> RandomAccessReaders(RARs) are created. A number of RARs is equal to a
>>>> number of segments of SSTable. Is a number of segments(=RARs) equal to
>>>> follows?
>>>>
>>>> a number of segments = size of SSTable / size of segments
>>>>
>>>> * Q2.
>>>> What is happen if the Cassandra open a SSTable file which bigger than
>>>> JVM heap (or memory)?
>>>>
>>>> * Q3.
>>>> In my case, there are a lot of invalidating messages for the same
>>>> SSTable file (e.g. at least 11 records for tmplink-la-8348-big-Data.db in
>>>> my previous post). In some cases, there are more than 600 invalidating
>>>> messages for the same file and these messages logged for a few hours. Would
>>>> that closing a big SSTable is the cause?
>>>>
>>>> * Q4.
>>>> I saw "tmplink-xxx" or "tmp-xxx" files in the logs and also data
>>>> directories. Are these files temporary in compaction process?
>>>>
>>>>
>>>> Here is my experimental configurations.
>>>>
>>>> - Cassandra node: An aws EC2 instance(t2.medium. 4GBRAM, 2vCPU)
>>>> - Cassandra version: 2.2.5
>>>> - inserted data size: about 100GB
>>>> - cassandra-env.sh: default
>>>> - cassandra.yaml
>>>> - compaction_throughput_mb_per_sec: 8 (or default)
>>>> - concurrent_compactors: 1
>>>> - sstable_preemptive_open_interval_in_mb: 25 (or default)
>>>> - memtable_flush_writers: 1
>>>>
>>>>
>>>> Regards,
>>>> Satoshi
>>>>
>>>>
>>>> On Wed, Mar 16, 2016 at 5:47 PM, Stefania Alborghetti <
>>>> stefania.alborghe...@datastax.com> wrote:
>>>>
>>>>> Each sstable has one or more random access readers (one per segment
>>>>> for example) and FileCacheService is a cache for such readers. When an
>>>>> sstable is closed, the cache is invalidated. If no single reader of an
>>>>> sstable is used for at least 512 milliseconds, all readers are evicted. If
>>>>> the sstable is opened again, new reader(s) will be created and added to 
>>>>> the
>>>>> cache again.
>>>>>
>>>>> FileCacheService was removed in cassandra 3.0 in favour of a pool of
>>>>> page-aligned buffers, and sharing the NIO file channels amongst the 
>>>>> readers
>>>>> of an sstable, refer to CASSANDRA-8897
>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-8897> and
>>>>> CASSANDRA-8893 <https://issues.apache.org/jira/browse/CASSANDRA-8893>
>>>>> for more details.
>>>>>
>>>>> On Wed, Mar 16, 2016 at 3:30 PM, satoshi hikida <sahik...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have been working on some experiments for Cassandra and found some
>>>>>> log messages as follows in debug.log.
>>>>>> I am not sure what it exactly is, so I would appreciate if someone
>>>>>> gives me some explanations about it.
>>>>>>
>>>>>> In my verification, a Cassandra node runs as a stand-alone server on
>>>>>> Amazon EC2 instance(t2.medium). And I insert 1 Billion records (about 
>>>>>> 100GB
>>>>>> data size) to a table from a client application (which runs on another
>>>>>> instance separated from Cassandra node). After insertion, Cassandra
>>>>>> continues it's I/O activities for (probably) compaction and keep logging
>>>>>> the messages as follows:
>>>>>>
>>>>>> ---
>>>>>> ...
>>>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:25,170
>>>>>> FileCacheService.java:102 - Evicting cold readers for
>>>>>> /var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/la-6-big-Data.db
>>>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:31,780
>>>>>> FileCacheService.java:177 - Invalidating cache for
>>>>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:36,899
>>>>>> FileCacheService.java:177 - Invalidating cache for
>>>>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:42,187
>>>>>> FileCacheService.java:177 - Invalidating cache for
>>>>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>>>>> DEBUG [NonPeriodicTasks:1] 2016-03-16 09:59:47,308
>>>>>> FileCacheService.java:177 - Invalidating cache for
>>>>>> /var/lib/cassandra/data/test/user-3d988520e9e011e59d830f00df8833fa/tmplink-la-8348-big-Data.db
>>>>>> ...
>>>>>> ---
>>>>>>
>>>>>> I guess these messages are related to the compaction process and
>>>>>> FileCacheService was invalidating cache which associated with a SSTable
>>>>>> file. But I'm not sure what it does actually mean. When the cache is
>>>>>> invalidated? And What happens is after cache invalidation?
>>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Satoshi
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>>
>>>>> [image: datastax_logo.png] <http://www.datastax.com/>
>>>>>
>>>>> Stefania Alborghetti
>>>>>
>>>>> Apache Cassandra Software Engineer
>>>>>
>>>>> |+852 6114 9265| stefania.alborghe...@datastax.com
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>
>> --
>>
>>
>> [image: datastax_logo.png] <http://www.datastax.com/>
>>
>> Stefania Alborghetti
>>
>> Apache Cassandra Software Engineer
>>
>> |+852 6114 9265| stefania.alborghe...@datastax.com
>>
>>
>>
>>
>


-- 


[image: datastax_logo.png] <http://www.datastax.com/>

Stefania Alborghetti

Apache Cassandra Software Engineer

|+852 6114 9265| stefania.alborghe...@datastax.com

Reply via email to