[ https://issues.apache.org/jira/browse/CASSANDRA-13049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15759893#comment-15759893 ]

Stefania commented on CASSANDRA-13049:
--------------------------------------

bq. 1. We use memory maps when reading data from sstables. Every time we 
create a new memory map, there is one more open file descriptor. Memory 
mapping improves IO performance when dealing with large files; do we want to 
set a file size threshold for doing this?

At the moment the default disk access mode is 
[{{DiskAccessMode.Auto}}|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/config/Config.java#L74];
 the other options are listed 
[here|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/config/Config.java#L365].
 With auto, mmap is chosen on 64-bit architectures, see 
[here|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/config/DatabaseDescriptor.java#L319].
 Introducing a new size-based dynamic disk access mode could be an interesting 
idea, but I would first test the performance difference of using disk access 
mode standard for smaller files. I imagine the trickiest part is determining a 
suitable size threshold. With disk access mode standard, it is possible to 
control the readers' buffer size via the [disk optimization 
strategy|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/config/Config.java#L255]
 and the two parameters just below it.
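
To make the idea concrete, here is a minimal sketch of a size-based choice 
(the class, threshold value and method are hypothetical, not existing 
Cassandra code):

{code:java}
// Hypothetical sketch of a size-based dynamic disk access mode: small
// sstables use buffered (standard) reads, so each one does not pin an extra
// mapping, while large files still get mmap.
import java.io.File;

public final class SizeBasedDiskAccess
{
    // Choosing this threshold is the tricky part; 64 MiB is an arbitrary
    // guess that would need benchmarking.
    private static final long MMAP_THRESHOLD_BYTES = 64L * 1024 * 1024;

    public enum Mode { MMAP, STANDARD }

    public static Mode modeFor(File dataFile)
    {
        return dataFile.length() >= MMAP_THRESHOLD_BYTES ? Mode.MMAP : Mode.STANDARD;
    }
}
{code}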

bq. 2. Whenever we finish receiving a file from a peer, we create a 
SSTableReader/BigTableReader, which involves opening the data file and index 
file, and we keep them open until some time later (unpredictable). See 
StreamReceiveTask#L110, BigTableWriter#openFinal and 
SSTableReader#InstanceTidier. Is it better to lazily open the data/index files 
or close them more often to reclaim the file descriptors?

I haven't done much work on streaming and I only looked at the code very 
briefly, but it seems to me that the sstable readers are kept open until the 
sstables are compacted. [~pauloricardomg] or [~yukim] are probably better 
qualified to comment on this.
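
For what it's worth, here is a minimal sketch of what "lazily open / close 
more often" could mean at the file handle level (plain NIO, not the actual 
SSTableReader machinery; the class is hypothetical):

{code:java}
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only: a handle that opens its descriptor on
// first access and can release it early, instead of holding the data/index
// files open from the moment a streamed sstable is finalized.
final class LazyFileHandle implements AutoCloseable
{
    private final Path path;
    private FileChannel channel; // null while the descriptor is released

    LazyFileHandle(Path path) { this.path = path; }

    synchronized FileChannel channel() throws IOException
    {
        if (channel == null)
            channel = FileChannel.open(path, StandardOpenOption.READ);
        return channel;
    }

    @Override
    public synchronized void close() throws IOException
    {
        if (channel != null)
        {
            channel.close();
            channel = null; // descriptor reclaimed; reopened on next access
        }
    }
}
{code}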

> Too many open files during bootstrapping
> ----------------------------------------
>
>                 Key: CASSANDRA-13049
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13049
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Simon Zhou
>            Assignee: Simon Zhou
>
> We just upgraded from 2.2.5 to 3.0.10 and hit this issue during 
> bootstrapping, so it is likely something made worse by the IO performance 
> improvements in Cassandra 3.
> On our side, the issue is that we have lots of small sstables, so when 
> bootstrapping a new node, it receives lots of files during streaming and 
> Cassandra keeps all of them open for an unpredictable amount of time. 
> Eventually we hit the "Too many open files" error, and around that time I 
> could see ~1M open files through lsof, almost all of them *-Data.db and 
> *-Index.db. We should definitely use a better compaction strategy to reduce 
> the number of sstables, but I also see a few possible improvements in 
> Cassandra:
> 1. We use memory maps when reading data from sstables. Every time we create 
> a new memory map, there is one more open file descriptor. Memory mapping 
> improves IO performance when dealing with large files; do we want to set a 
> file size threshold for doing this?
> 2. Whenever we finish receiving a file from a peer, we create a 
> SSTableReader/BigTableReader, which involves opening the data file and index 
> file, and we keep them open until some time later (unpredictable). See 
> StreamReceiveTask#L110, BigTableWriter#openFinal and 
> SSTableReader#InstanceTidier. Is it better to lazily open the data/index 
> files or close them more often to reclaim the file descriptors?
> I searched all known issues in JIRA and it looks like this is a new issue in 
> Cassandra 3. cc [~Stefania] for comments.
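
As a side note on the lsof observation in the quoted description: on a 
HotSpot/OpenJDK runtime (an assumption, since the interface below is 
JDK-specific) the open descriptor count can also be watched from inside the 
JVM:

{code:java}
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class FdCount
{
    public static void main(String[] args)
    {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        // JDK-specific interface, available on Unix-like platforms.
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean)
        {
            com.sun.management.UnixOperatingSystemMXBean unix =
                (com.sun.management.UnixOperatingSystemMXBean) os;
            System.out.println("open fds: " + unix.getOpenFileDescriptorCount()
                               + " / max: " + unix.getMaxFileDescriptorCount());
        }
    }
}
{code}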



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
