[jira] [Commented] (CASSANDRA-14653) The performance of "NonPeriodicTasks" pools defined in class ScheduledExecutors is low

2018-08-20 Thread Peter Xie (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585843#comment-16585843
 ] 

Peter Xie commented on CASSANDRA-14653:
---

@[~jjirsa]:  The release version is 3.11.2

I increased sstable_size from 160M (the default size) to 1024M, and the 
compression chunk size to 1M. These changes avoid the disk space issue, 

because a larger sstable size reduces the number of compaction tasks, which 
in turn reduces the number of stale sstables.

But I think it would be better to let the thread pool grow dynamically when 
the cleanup load is heavy. 
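
To illustrate the single-worker behaviour described in the issue, here is a minimal sketch using a plain JDK ScheduledThreadPoolExecutor (not Cassandra's DebuggableScheduledThreadPoolExecutor). The class name SingleThreadPoolDemo and the helper distinctWorkers are hypothetical names made up for this example. Per the JDK documentation, ScheduledThreadPoolExecutor acts as a fixed-size pool of corePoolSize threads with an unbounded queue, so a backlog of tasks never causes extra threads to be created; raising the core size is the setting that adds workers.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of Cassandra.
class SingleThreadPoolDemo {
    // Submits `tasks` short jobs to a ScheduledThreadPoolExecutor with the given
    // core size and returns how many distinct worker threads executed them.
    static int distinctWorkers(int corePoolSize, int tasks) {
        ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(corePoolSize);
        Set<String> workers = ConcurrentHashMap.newKeySet();
        CountDownLatch done = new CountDownLatch(tasks);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                workers.add(Thread.currentThread().getName());
                try {
                    Thread.sleep(5); // stand-in for a slow cleanup task
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }
        try {
            // Wait for the backlog to drain (bounded so the demo cannot hang).
            done.await(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return workers.size();
    }

    public static void main(String[] args) {
        // With core size 1 the whole backlog is processed by a single thread.
        System.out.println("core=1 -> workers used: " + distinctWorkers(1, 50));
        // With core size 4 the backlog is shared by up to four threads.
        System.out.println("core=4 -> workers used: " + distinctWorkers(4, 50));
    }
}
```

With a core size of 1 the queued tasks are always handled by the same thread, however long the queue gets. This is only an illustration of the executor's sizing behaviour, not a proposed patch.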

> The performance of "NonPeriodicTasks" pools defined in class 
> ScheduledExecutors is low
> --
>
> Key: CASSANDRA-14653
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14653
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
> Environment: Cassandra nodes :
> 3 nodes, 330G physical memory per node, and four data directories (SSD) per 
> node.
>Reporter: Peter Xie
>Priority: Major
>
> We use Cassandra as the backend storage for JanusGraph. When loading a huge 
> data set (~2 billion vertices, ~10 billion edges), we ran into some problems.
>  
> At first we used STCS as the compaction strategy, but hit the exception 
> below. We checked that "max memory lock" is unlimited and "file map count" is 
> 1 million; these values should be enough for loading the data. Eventually we 
> found that the problem was caused by Cassandra consuming all of the virtual 
> memory, so no additional virtual memory was available to the compaction task 
> and the exception below was thrown. 
> {quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 
> JVMStabilityInspector.java:74 - OutOfMemory error letting the JVM handle the 
> error:
>  java.lang.OutOfMemoryError: Map failed
> {quote}
> So we changed the compaction strategy to LCS, which seems to resolve the 
> virtual memory problem. But we found another problem: many sstables that 
> have already been compacted are still retained on disk. These old sstables 
> consume so much disk space that there is not enough left for the real data. 
> We also found that many files like "mc_txn_compaction_xxx.log" are created 
> under the data directory. 
> After some investigation, we found that this problem is caused by the 
> "NonPeriodicTasks" thread pool. This pool always uses only one thread to 
> process the cleanup tasks after compaction. The pool is an instance of 
> class DebuggableScheduledThreadPoolExecutor, 
> and DebuggableScheduledThreadPoolExecutor inherits from class 
> ScheduledThreadPoolExecutor.
> Reading the code of class DebuggableScheduledThreadPoolExecutor, we found 
> that it uses an unbounded task queue and a core pool size of 1. I think 
> using an unbounded queue here is wrong. With an unbounded queue, the thread 
> pool never adds threads even when many tasks are waiting in the queue, 
> because an unbounded queue is never full. I think a bounded queue should be 
> used here, so that more threads are created to process cleanup tasks when 
> the load is heavy. 
> {quote}public DebuggableScheduledThreadPoolExecutor(int corePoolSize, String 
> threadPoolName, int priority)
> {
>     super(corePoolSize, new NamedThreadFactory(threadPoolName, priority));
>     setRejectedExecutionHandler(rejectedExecutionHandler);
> }
>   
> public ScheduledThreadPoolExecutor(int corePoolSize,
>                                    ThreadFactory threadFactory)
> {
>     super(corePoolSize, Integer.MAX_VALUE, 0, NANOSECONDS,
>           new DelayedWorkQueue(), threadFactory);
> }
> {quote}
>  Below is an example of a cleanup task after compaction. There was a delay 
> of roughly 3 hours before the file "mc-56525" was removed. 
> {quote} 
> TRACE [CompactionExecutor:81] 2018-08-16 21:22:29,664 
> LifecycleTransaction.java:363 - Staging for obsolescence 
> BigTableReader(path='/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big-Data.db')
>  ..
>  TRACE [CompactionExecutor:81] 2018-08-16 21:22:41,162 Tracker.java:165 - 
> removing 
> /sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big from 
> list of files tracked for test_2.edgestore
>  
>  TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,179 SSTableReader.java:2175 - 
> Async instance tidier for 
> /sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
> before barrier
>  TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,180 SSTableReader.java:2181 - 
> Async instance tidier for 
> /sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
> after barrier
>  TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,182 SSTableReader.java:2196 - 
> Async instance tidier for 
> /sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
> completed
> {quote}
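
For reference, the delay in the log excerpt above can be computed from the two timestamps: the Tracker.java line that removes mc-56525-big from the tracked files, and the first SSTableReader.java tidier line on NonPeriodicTasks:1. The sketch below uses only java.time; the class name TidyDelay and the method between are hypothetical names chosen for this example.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper; not part of Cassandra.
class TidyDelay {
    // Timestamp layout used in the log lines quoted above.
    private static final DateTimeFormatter LOG_TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    // Returns the elapsed time between two log timestamps.
    static Duration between(String from, String to) {
        return Duration.between(LocalDateTime.parse(from, LOG_TS),
                                LocalDateTime.parse(to, LOG_TS));
    }

    public static void main(String[] args) {
        // Removal from the tracker vs. start of the async tidier.
        Duration delay = between("2018-08-16 21:22:41,162", "2018-08-17 00:28:47,179");
        System.out.println(delay); // prints PT3H6M6.017S
    }
}
```

So the tidier for this sstable started about three hours and six minutes after the sstable was removed from the tracker.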

[jira] [Commented] (CASSANDRA-14653) The performance of "NonPeriodicTasks" pools defined in class ScheduledExecutors is low

2018-08-17 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584498#comment-16584498
 ] 

Jeff Jirsa commented on CASSANDRA-14653:


On which version was this observed?




