[jira] [Commented] (CASSANDRA-14646) built_views entries are not removed after dropping keyspace

2018-08-17 Thread ZhaoYang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584637#comment-16584637
 ] 

ZhaoYang commented on CASSANDRA-14646:
--

Thanks for reviewing

> built_views entries are not removed after dropping keyspace
> ---
>
> Key: CASSANDRA-14646
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14646
> Project: Cassandra
>  Issue Type: Bug
>  Components: Distributed Metadata, Materialized Views
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Major
> Fix For: 4.0
>
>
> If we restore the view schema after dropping the keyspace, the view build 
> won't be triggered, because it was marked as SUCCESS in the {{built_views}} 
> table.
> | patch | CI |
> | [trunk|https://github.com/jasonstack/cassandra/commits/mv_drop_ks] | [utest|https://circleci.com/gh/jasonstack/cassandra/739] |
> | [dtest|https://github.com/apache/cassandra-dtest/pull/36] | |



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14631) Add RSS support for Cassandra blog

2018-08-17 Thread Dinesh Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584535#comment-16584535
 ] 

Dinesh Joshi commented on CASSANDRA-14631:
--

The patch generally seems to work (thanks for the Ruby bundler!), but I have 
run into a couple of issues.

First, the RSS icon seems a bit misaligned in Safari and Chrome.

!Screen Shot 2018-08-17 at 5.32.08 PM.png|width=549,height=82!

Second, the RSS feed seems to contain stray HTML characters when viewed in the 
"RSS Follower" app on macOS.

!Screen Shot 2018-08-17 at 5.32.25 PM.png|width=492,height=336!

> Add RSS support for Cassandra blog
> --
>
> Key: CASSANDRA-14631
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14631
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation and Website
>Reporter: Jacques-Henri Berthemet
>Assignee: Jeff Beck
>Priority: Major
> Attachments: 14631-site.txt, Screen Shot 2018-08-17 at 5.32.08 
> PM.png, Screen Shot 2018-08-17 at 5.32.25 PM.png
>
>
> It would be convenient to add RSS support to the Cassandra blog:
> [http://cassandra.apache.org/blog/2018/08/07/faster_streaming_in_cassandra.html]
> It might also be useful for other resources such as new versions, but this 
> ticket is about the blog.
>  
> {quote}From: Scott Andreas
> Sent: Wednesday, August 08, 2018 6:53 PM
> To: [d...@cassandra.apache.org|mailto:d...@cassandra.apache.org]
> Subject: Re: Apache Cassandra Blog is now live
>  
> Please feel free to file a ticket (label: Documentation and Website).
>  
> It looks like Jekyll, the static site generator used to build the website, 
> has a plugin that generates Atom feeds if someone would like to work on 
> adding one: [https://github.com/jekyll/jekyll-feed]
> {quote}






[jira] [Updated] (CASSANDRA-14631) Add RSS support for Cassandra blog

2018-08-17 Thread Dinesh Joshi (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-14631:
-
Attachment: Screen Shot 2018-08-17 at 5.32.25 PM.png







[jira] [Updated] (CASSANDRA-14631) Add RSS support for Cassandra blog

2018-08-17 Thread Dinesh Joshi (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-14631:
-
Attachment: Screen Shot 2018-08-17 at 5.32.08 PM.png







[jira] [Commented] (CASSANDRA-14653) The performance of "NonPeriodicTasks" pools defined in class ScheduledExecutors is low

2018-08-17 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584498#comment-16584498
 ] 

Jeff Jirsa commented on CASSANDRA-14653:


On which version was this observed?


> The performance of "NonPeriodicTasks" pools defined in class 
> ScheduledExecutors is low
> --
>
> Key: CASSANDRA-14653
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14653
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
> Environment: Cassandra nodes :
> 3 nodes, 330G physical memory per node , and four data directory (ssd)  per 
> node.
>Reporter: Peter Xie
>Priority: Major
>
> We use Cassandra as the backend storage for JanusGraph. While loading a huge 
> amount of data (~2 billion vertices, ~10 billion edges), we ran into some 
> problems.
>  
> At first we used STCS as the compaction strategy, but hit the exception 
> below. We checked that the "max memory lock" value is unlimited and the 
> "file map count" is 1 million; these values should be enough for loading the 
> data. In the end we found the problem was that all of the virtual memory had 
> been consumed by Cassandra, so no additional virtual memory could be used by 
> the compaction task, and the exception below was thrown.
> {quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 
> JVMStabilityInspector.java:74 - OutOfMemory error letting the JVM handle the 
> error:
> java.lang.OutOfMemoryError: Map failed
> {quote}
> So we changed the compaction strategy to LCS, which seems to resolve the 
> virtual memory problem. But we found another problem: many sstables that had 
> already been compacted were still retained on disk. These old sstables 
> consume so much disk space that there is not enough disk left for the real 
> data, and we found many files like "mc_txn_compaction_xxx.log" created under 
> the data directories.
> After some investigation, we found this problem is caused by the 
> "NonPeriodicTasks" thread pool. This pool only ever uses one thread for 
> processing the cleanup tasks that run after compaction. It is instantiated 
> as a DebuggableScheduledThreadPoolExecutor, which inherits from 
> ScheduledThreadPoolExecutor.
> Reading the code of DebuggableScheduledThreadPoolExecutor, we found that it 
> uses an unbounded task queue with a core pool size of 1. I think using an 
> unbounded queue is wrong here: with an unbounded queue, the pool never grows 
> beyond the core size even when many tasks are blocked in the queue, because 
> the queue is never full. A bounded queue should be used instead, so that 
> when the cleanup load is heavy, more threads are created to process it.
> {quote}public DebuggableScheduledThreadPoolExecutor(int corePoolSize, String threadPoolName, int priority)
> {
>     super(corePoolSize, new NamedThreadFactory(threadPoolName, priority));
>     setRejectedExecutionHandler(rejectedExecutionHandler);
> }
>  
> public ScheduledThreadPoolExecutor(int corePoolSize, ThreadFactory threadFactory)
> {
>     super(corePoolSize, Integer.MAX_VALUE, 0, NANOSECONDS, new DelayedWorkQueue(), threadFactory);
> }
> {quote}
> Below is a case showing the cleanup task after compaction: there is nearly a 
> 3-hour delay before the files for "mc-56525" are removed.
> {quote} 
> TRACE [CompactionExecutor:81] 2018-08-16 21:22:29,664 
> LifecycleTransaction.java:363 - Staging for obsolescence 
> BigTableReader(path='/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big-Data.db')
>  ..
>  TRACE [CompactionExecutor:81] 2018-08-16 21:22:41,162 Tracker.java:165 - 
> removing 
> /sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big from 
> list of files tracked for test_2.edgestore
>  
>  TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,179 SSTableReader.java:2175 - 
> Async instance tidier for 
> /sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
> before barrier
>  TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,180 SSTableReader.java:2181 - 
> Async instance tidier for 
> /sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
> after barrier
>  TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,182 SSTableReader.java:2196 - 
> Async instance tidier for 
> /sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
> completed
> {quote}
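The single-worker behaviour described in the report can be reproduced in isolation. Below is a minimal sketch (not Cassandra code) showing that a `ScheduledThreadPoolExecutor` with `corePoolSize=1` never adds workers, because its internal `DelayedWorkQueue` is unbounded and simply absorbs the backlog:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class UnboundedQueueDemo {
    // Returns the pool size observed while nine tasks are still queued
    // behind a single blocked worker thread.
    public static int observedPoolSize() {
        ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(1);
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < 10; i++) {
            // Each task blocks until released, simulating a slow cleanup task.
            pool.submit(() -> {
                try { release.await(); }
                catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            });
        }
        try { Thread.sleep(200); } catch (InterruptedException e) { }
        int size = pool.getPoolSize(); // stays at 1: the unbounded queue never fills
        release.countDown();
        pool.shutdown();
        try { pool.awaitTermination(5, TimeUnit.SECONDS); } catch (InterruptedException e) { }
        return size;
    }

    public static void main(String[] args) {
        System.out.println("poolSize=" + observedPoolSize());
    }
}
```

Even calling setMaximumPoolSize would not help: ScheduledThreadPoolExecutor always runs at the core size. A plain ThreadPoolExecutor with a bounded queue, as the reporter suggests, would add workers up to its maximum once the queue fills.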






[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2018-08-17 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-14654:
---
Labels: Performance  (was: )

> Reduce heap pressure during compactions
> ---
>
> Key: CASSANDRA-14654
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Major
>  Labels: Performance
> Fix For: 4.x
>
>
> Small-partition compactions are painfully slow, with a lot of overhead per 
> partition. There also tends to be an excess of objects created (i.e. 
> 200-700 MB/s of allocations) per compaction thread.
> EncodingStats walks through all the partitions, and with mergeWith it 
> creates a new instance per partition as it walks potentially millions of 
> partitions. In a test scenario of ~600-byte partitions and a couple hundred 
> MB of data, this accounted for ~16% of the heap pressure. Changing this to 
> instead mutably track the minimum values and create a single instance in an 
> EncodingStats.Collector brought this down considerably (though not to zero, 
> since UnfilteredRowIterator.stats() still creates one per partition).
> KeyCacheKey makes a full copy of the underlying byte array via 
> ByteBufferUtil.getArray in its constructor. This is the dominant source of 
> heap pressure as the number of sstables grows. Changing it to keep the 
> original buffer completely eliminates the current allocation dominator of 
> compactions and also improves read performance.
> A minor tweak is also included for operators whose compactions fall behind 
> on low-read clusters: the preemptive-opening setting is made a hot property.






[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2018-08-17 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-14654:
---
Fix Version/s: 4.x







[jira] [Assigned] (CASSANDRA-14655) Upgrade C* to use latest guava (26.0)

2018-08-17 Thread Vinay Chella (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay Chella reassigned CASSANDRA-14655:


Assignee: Sumanth Pasupuleti

> Upgrade C* to use latest guava (26.0)
> -
>
> Key: CASSANDRA-14655
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14655
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Libraries
>Reporter: Sumanth Pasupuleti
>Assignee: Sumanth Pasupuleti
>Priority: Minor
> Fix For: 4.x
>
>
> C* currently uses Guava 23.3. This JIRA is about upgrading C* to the latest 
> Guava (26.0). It originated from a discussion on the mailing list.






[jira] [Updated] (CASSANDRA-14655) Upgrade C* to use latest guava (26.0)

2018-08-17 Thread Sumanth Pasupuleti (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sumanth Pasupuleti updated CASSANDRA-14655:
---
Fix Version/s: 4.x







[jira] [Commented] (CASSANDRA-14655) Upgrade C* to use latest guava (26.0)

2018-08-17 Thread Sumanth Pasupuleti (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584403#comment-16584403
 ] 

Sumanth Pasupuleti commented on CASSANDRA-14655:


Github Branch: 
https://github.com/sumanth-pasupuleti/cassandra/tree/guava_26_trunk
Failing Unit Tests: https://circleci.com/gh/sumanth-pasupuleti/cassandra/84

As confirmed by [~andrew.tolbert], the current version of the driver is 
incompatible with the latest Guava, which is what is failing the unit tests at 
the moment. I will resume work on the Guava upgrade once there is a new driver 
release that does not have the Guava compatibility issues.







[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2018-08-17 Thread Jeremy Hanna (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Hanna updated CASSANDRA-14654:
-
Component/s: Compaction







[jira] [Created] (CASSANDRA-14655) Upgrade C* to use latest guava (26.0)

2018-08-17 Thread Sumanth Pasupuleti (JIRA)
Sumanth Pasupuleti created CASSANDRA-14655:
--

 Summary: Upgrade C* to use latest guava (26.0)
 Key: CASSANDRA-14655
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14655
 Project: Cassandra
  Issue Type: Improvement
  Components: Libraries
Reporter: Sumanth Pasupuleti


C* currently uses Guava 23.3. This JIRA is about upgrading C* to the latest 
Guava (26.0). It originated from a discussion on the mailing list.






[jira] [Created] (CASSANDRA-14654) Reduce heap pressure during compactions

2018-08-17 Thread Chris Lohfink (JIRA)
Chris Lohfink created CASSANDRA-14654:
-

 Summary: Reduce heap pressure during compactions
 Key: CASSANDRA-14654
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
 Project: Cassandra
  Issue Type: Improvement
Reporter: Chris Lohfink
Assignee: Chris Lohfink


Small-partition compactions are painfully slow, with a lot of overhead per 
partition. There also tends to be an excess of objects created (i.e. 
200-700 MB/s of allocations) per compaction thread.

EncodingStats walks through all the partitions, and with mergeWith it creates 
a new instance per partition as it walks potentially millions of partitions. 
In a test scenario of ~600-byte partitions and a couple hundred MB of data, 
this accounted for ~16% of the heap pressure. Changing this to instead mutably 
track the minimum values and create a single instance in an 
EncodingStats.Collector brought this down considerably (though not to zero, 
since UnfilteredRowIterator.stats() still creates one per partition).

KeyCacheKey makes a full copy of the underlying byte array via 
ByteBufferUtil.getArray in its constructor. This is the dominant source of 
heap pressure as the number of sstables grows. Changing it to keep the 
original buffer completely eliminates the current allocation dominator of 
compactions and also improves read performance.

A minor tweak is also included for operators whose compactions fall behind on 
low-read clusters: the preemptive-opening setting is made a hot property.
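The mutable-collector idea can be illustrated with a toy sketch. The class and field names below are illustrative only and do not match Cassandra's actual EncodingStats API: instead of allocating a new immutable stats object per partition via mergeWith, a collector updates its minima in place and materializes a single result at the end.

```java
// Toy sketch of the allocation-free collector pattern; names are
// hypothetical, not Cassandra's real EncodingStats classes.
final class MinStatsCollector {
    private long minTimestamp = Long.MAX_VALUE;
    private int minLocalDeletionTime = Integer.MAX_VALUE;

    // Mutates in place: no per-partition object allocation, unlike an
    // immutable stats.mergeWith(other) call that returns a fresh instance.
    void update(long timestamp, int localDeletionTime) {
        if (timestamp < minTimestamp) minTimestamp = timestamp;
        if (localDeletionTime < minLocalDeletionTime) minLocalDeletionTime = localDeletionTime;
    }

    long minTimestamp() { return minTimestamp; }
    int minLocalDeletionTime() { return minLocalDeletionTime; }
}
```

The collector is updated once per partition and only one object lives for the whole compaction, which is what cuts the per-partition garbage described above.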






[jira] [Updated] (CASSANDRA-14631) Add RSS support for Cassandra blog

2018-08-17 Thread Dinesh Joshi (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-14631:
-
Reviewer: Dinesh Joshi







[jira] [Assigned] (CASSANDRA-14631) Add RSS support for Cassandra blog

2018-08-17 Thread Dinesh Joshi (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi reassigned CASSANDRA-14631:


Assignee: Jeff Beck







[jira] [Updated] (CASSANDRA-14631) Add RSS support for Cassandra blog

2018-08-17 Thread Jeff Beck (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Beck updated CASSANDRA-14631:
--
Attachment: 14631-site.txt
Status: Patch Available  (was: Open)

I added the feed plugin that enables RSS subscriptions.

Since we need additional gems to make this work, I set up Bundler and a 
Gemfile.lock to make sure the versions will always match up.

When I generated the site locally, I noticed that the date in the published 
blog post doesn't match what is generated; I'm not sure whether that is an 
artifact of how it was originally published.
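For reference, the standard jekyll-feed setup looks roughly like this (a sketch of the plugin's usual installation, not necessarily the exact contents of the attached patch):

```ruby
# Gemfile (sketch) -- pin the site generator and the feed plugin via Bundler
source 'https://rubygems.org'

gem 'jekyll'
gem 'jekyll-feed'
```

with the plugin enabled in `_config.yml` via `plugins: [jekyll-feed]`, after which Jekyll emits an Atom feed at `/feed.xml`.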







[jira] [Commented] (CASSANDRA-14651) No longer possible to specify cassandra_dir via pytest.ini on cassandra-dtest

2018-08-17 Thread Jordan West (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584170#comment-16584170
 ] 

Jordan West commented on CASSANDRA-14651:
-

On the code, the only comment I have (a very minor/optional nit) is that it 
would be nice to encapsulate the fetching of cassandra_dir in a method. 
{{.pytest_cache}} also seems like a good thing to add to {{.gitignore}}; it's 
in [Github's official one for 
Python|https://github.com/github/gitignore/commit/f651f0d3eef062a8592e017a194e703d93f3e5c9].

> No longer possible to specify cassandra_dir via pytest.ini on cassandra-dtest
> -
>
> Key: CASSANDRA-14651
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14651
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>Priority: Trivial
>
> It seems like the ability to specify {{cassandra_dir}} via {{pytest.ini}}, as 
> [stated in the 
> doc|https://github.com/apache/cassandra-dtest/blame/master/README.md#L79], was 
> lost after CASSANDRA-14449. We should either bring it back or remove it from 
> the doc.
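For context, the documented setup is a small ini fragment along these lines (a sketch based on the dtest README referenced above; the exact key name should be checked against that doc):

```ini
; pytest.ini (sketch) -- points the dtests at a local Cassandra checkout
[pytest]
cassandra_dir = /path/to/cassandra
```

The regression is that this value is no longer read after CASSANDRA-14449, so only the command-line/environment mechanisms work.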






[jira] [Commented] (CASSANDRA-14436) Add sampler for query time and expose with nodetool

2018-08-17 Thread Chris Lohfink (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584137#comment-16584137
 ] 

Chris Lohfink commented on CASSANDRA-14436:
---

Pushed the requested changes. I'm having some issues with CircleCI and the 
dtests, though, which I'll ask for some help with.

|[units|https://circleci.com/gh/clohfink/cassandra/298]|


> Add sampler for query time and expose with nodetool
> ---
>
> Key: CASSANDRA-14436
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14436
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Create a new {{nodetool profileload}} that functions just like toppartitions 
> but with more data, returning the slowest local reads and writes on the host 
> during a given duration, plus the highest-frequency touched partitions (same 
> as {{nodetool toppartitions}}). A refactor is included to extend use of the 
> sampler beyond top frequency (max instead of total sample values).
> Future work is to include top cpu and allocations by query, and possibly 
> tasks/cpu/allocations by stage, during the time window.






[jira] [Updated] (CASSANDRA-14652) Extend IAuthenticator to accept peer SSL certificates

2018-08-17 Thread Jeremy Hanna (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Hanna updated CASSANDRA-14652:
-
Labels: Security  (was: )

> Extend IAuthenticator to accept peer SSL certificates
> -
>
> Key: CASSANDRA-14652
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14652
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Auth
>Reporter: Dinesh Joshi
>Assignee: Dinesh Joshi
>Priority: Major
>  Labels: Security
> Fix For: 4.0
>
>
> This patch will extend the IAuthenticator interface to accept the peer's SSL 
> certificates. This will allow Authenticator implementations to perform 
> additional checks using the client's certificate chain, if so desired.
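The extension point described above can be sketched in miniature. The real interface is the Java code committed later in this thread; the names below are hypothetical Python stand-ins, showing only the pattern: a new factory overload accepts the peer's certificate chain and, by default, delegates to the legacy address-only factory, so existing implementations keep working unchanged.

```python
class Authenticator:
    """Hypothetical sketch of the extension pattern: the cert-aware overload
    defaults to the legacy address-only factory."""

    def new_sasl_negotiator(self, client_address):
        # Legacy factory: no certificate information available.
        return {"address": client_address, "cert_subject": None}

    def new_sasl_negotiator_with_certs(self, client_address, certificates=None):
        # Default behaviour ignores the certificates, preserving backward
        # compatibility for implementations that don't need them.
        return self.new_sasl_negotiator(client_address)

class CertCheckingAuthenticator(Authenticator):
    def new_sasl_negotiator_with_certs(self, client_address, certificates=None):
        negotiator = self.new_sasl_negotiator(client_address)
        if certificates:
            # A cert-aware implementation can inspect the chain here.
            negotiator["cert_subject"] = certificates[0]
        return negotiator
```

The base class never breaks: callers that pass certificates to a legacy implementation simply get the old behaviour.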






[jira] [Commented] (CASSANDRA-14346) Scheduled Repair in Cassandra

2018-08-17 Thread Stefan Podkowinski (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584032#comment-16584032
 ] 

Stefan Podkowinski commented on CASSANDRA-14346:


I'd add a license file per jar to keep things consistent. Adding the 
servlet-api (dual CDDL/GPL) needs some more careful handling compared to 
permissive licensed deps, but shouldn't be a blocker per se, if we decide we 
really want to get this committed.

> Scheduled Repair in Cassandra
> -
>
> Key: CASSANDRA-14346
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14346
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Repair
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Major
>  Labels: 4.0-feature-freeze-review-requested, 
> CommunityFeedbackRequested
> Fix For: 4.0
>
> Attachments: ScheduledRepairV1_20180327.pdf
>
>
> There have been many attempts to automate repair in Cassandra, which makes 
> sense given that it is necessary to give our users eventual consistency. Most 
> recently CASSANDRA-10070, CASSANDRA-8911 and CASSANDRA-13924 have all looked 
> for ways to solve this problem.
> At Netflix we've built a scheduled repair service within Priam (our sidecar), 
> which we spoke about last year at NGCC. Given the positive feedback at NGCC 
> we focussed on getting it production ready and have now been using it in 
> production to repair hundreds of clusters, tens of thousands of nodes, and 
> petabytes of data for the past six months. Also based on feedback at NGCC we 
> have invested effort in figuring out how to integrate this natively into 
> Cassandra rather than open sourcing it as an external service (e.g. in Priam).
> As such, [~vinaykumarcse] and I would like to re-work and merge our 
> implementation into Cassandra, and have created a [design 
> document|https://docs.google.com/document/d/1RV4rOrG1gwlD5IljmrIq_t45rz7H3xs9GbFSEyGzEtM/edit?usp=sharing]
>  showing how we plan to make it happen, including the user interface.
> As we work on the code migration from Priam to Cassandra, any feedback would 
> be greatly appreciated about the interface or v1 implementation features. I 
> have tried to call out in the document features which we explicitly consider 
> future work (as well as a path forward to implement them in the future) 
> because I would very much like to get this done before the 4.0 merge window 
> closes, and to do that I think aggressively pruning scope is going to be a 
> necessity.






[jira] [Updated] (CASSANDRA-11671) Remove check on gossip status from DynamicEndpointSnitch::updateScores

2018-08-17 Thread Jason Brown (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown updated CASSANDRA-11671:

Reviewer: Jason Brown

> Remove check on gossip status from DynamicEndpointSnitch::updateScores
> --
>
> Key: CASSANDRA-11671
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11671
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Sam Tunnicliffe
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It seems that historically there were initialization ordering issues which 
> affected DES and StorageService (CASSANDRA-1756) and so a condition was added 
> to DES::updateScores() to ensure that SS had finished setup. In fact, the 
> check was actually testing whether gossip was active or not. CASSANDRA-10134 
> preserved this behaviour, but it seems likely that the check can be removed 
> from DES completely now. If not, it can at least be switched to use 
> SS::isInitialized(), which post-CASSANDRA-10134 actually reports what its 
> name suggests.
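The guard under discussion can be sketched as follows. These are hypothetical names, not the actual DynamicEndpointSnitch code: score updates are simply dropped until the initialization flag is set, which is the behaviour an SS::isInitialized()-based check would give.

```python
class StorageServiceStub:
    """Hypothetical stand-in for StorageService's initialization flag."""
    def __init__(self):
        self.initialized = False

class DynamicSnitchSketch:
    """Hypothetical sketch: gate updateScores on initialization rather than
    on gossip being active (the check the ticket proposes replacing)."""

    def __init__(self, storage_service):
        self.ss = storage_service
        self.scores = {}

    def update_scores(self, samples):
        # Drop updates until setup has finished; no gossip check involved.
        if not self.ss.initialized:
            return
        self.scores.update(samples)
```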






[jira] [Updated] (CASSANDRA-11671) Remove check on gossip status from DynamicEndpointSnitch::updateScores

2018-08-17 Thread Artsiom Yudovin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artsiom Yudovin updated CASSANDRA-11671:

Status: Patch Available  (was: Awaiting Feedback)

> Remove check on gossip status from DynamicEndpointSnitch::updateScores
> --
>
> Key: CASSANDRA-11671
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11671
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Sam Tunnicliffe
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (CASSANDRA-11671) Remove check on gossip status from DynamicEndpointSnitch::updateScores

2018-08-17 Thread Artsiom Yudovin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artsiom Yudovin updated CASSANDRA-11671:

Status: Ready to Commit  (was: Patch Available)

> Remove check on gossip status from DynamicEndpointSnitch::updateScores
> --
>
> Key: CASSANDRA-11671
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11671
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Sam Tunnicliffe
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (CASSANDRA-11671) Remove check on gossip status from DynamicEndpointSnitch::updateScores

2018-08-17 Thread Artsiom Yudovin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artsiom Yudovin updated CASSANDRA-11671:

Status: Patch Available  (was: Open)

Patch available in this [pull request|https://github.com/apache/cassandra/pull/251].

> Remove check on gossip status from DynamicEndpointSnitch::updateScores
> --
>
> Key: CASSANDRA-11671
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11671
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Sam Tunnicliffe
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (CASSANDRA-11671) Remove check on gossip status from DynamicEndpointSnitch::updateScores

2018-08-17 Thread Artsiom Yudovin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artsiom Yudovin updated CASSANDRA-11671:

Status: Awaiting Feedback  (was: In Progress)

> Remove check on gossip status from DynamicEndpointSnitch::updateScores
> --
>
> Key: CASSANDRA-11671
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11671
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Sam Tunnicliffe
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (CASSANDRA-11671) Remove check on gossip status from DynamicEndpointSnitch::updateScores

2018-08-17 Thread Artsiom Yudovin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artsiom Yudovin updated CASSANDRA-11671:

Status: In Progress  (was: Ready to Commit)

> Remove check on gossip status from DynamicEndpointSnitch::updateScores
> --
>
> Key: CASSANDRA-11671
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11671
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Sam Tunnicliffe
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (CASSANDRA-11671) Remove check on gossip status from DynamicEndpointSnitch::updateScores

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CASSANDRA-11671:
---
Labels: pull-request-available  (was: )

> Remove check on gossip status from DynamicEndpointSnitch::updateScores
> --
>
> Key: CASSANDRA-11671
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11671
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Sam Tunnicliffe
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.x
>
>






[jira] [Updated] (CASSANDRA-14652) Extend IAuthenticator to accept peer SSL certificates

2018-08-17 Thread Jason Brown (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown updated CASSANDRA-14652:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

+1

Committed as sha {{ac1bb75867a9a878a86d9b659234f78772627287}}. Thanks!

> Extend IAuthenticator to accept peer SSL certificates
> -
>
> Key: CASSANDRA-14652
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14652
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Auth
>Reporter: Dinesh Joshi
>Assignee: Dinesh Joshi
>Priority: Major
> Fix For: 4.0
>
>
> This patch will extend the IAuthenticator interface to accept the peer's SSL 
> certificates. This will allow Authenticator implementations to perform 
> additional checks using the client's certificate chain, if so desired.






cassandra git commit: Extend IAuthenticator to accept peer SSL certificates

2018-08-17 Thread jasobrown
Repository: cassandra
Updated Branches:
  refs/heads/trunk 298416a74 -> ac1bb7586


Extend IAuthenticator to accept peer SSL certificates

patch by Dinesh Joshi; reviewed by jasobrown for CASSANDRA-14652


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ac1bb758
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ac1bb758
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ac1bb758

Branch: refs/heads/trunk
Commit: ac1bb75867a9a878a86d9b659234f78772627287
Parents: 298416a
Author: Dinesh A. Joshi 
Authored: Thu Aug 16 15:01:20 2018 -0700
Committer: Jason Brown 
Committed: Fri Aug 17 06:43:45 2018 -0700

--
 CHANGES.txt |  1 +
 .../apache/cassandra/auth/IAuthenticator.java   | 18 +++
 .../cassandra/transport/ServerConnection.java   | 33 +++-
 3 files changed, 51 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/ac1bb758/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 0e671b0..d906879 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Extend IAuthenticator to accept peer SSL certificates (CASSANDRA-14652)
  * Incomplete handling of exceptions when decoding incoming messages 
(CASSANDRA-14574)
  * Add diagnostic events for user audit logging (CASSANDRA-13668)
  * Allow retrieving diagnostic events via JMX (CASSANDRA-14435)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ac1bb758/src/java/org/apache/cassandra/auth/IAuthenticator.java
--
diff --git a/src/java/org/apache/cassandra/auth/IAuthenticator.java 
b/src/java/org/apache/cassandra/auth/IAuthenticator.java
index 9eb50a7..212e774 100644
--- a/src/java/org/apache/cassandra/auth/IAuthenticator.java
+++ b/src/java/org/apache/cassandra/auth/IAuthenticator.java
@@ -21,6 +21,8 @@ import java.net.InetAddress;
 import java.util.Map;
 import java.util.Set;
 
+import javax.security.cert.X509Certificate;
+
 import org.apache.cassandra.exceptions.AuthenticationException;
 import org.apache.cassandra.exceptions.ConfigurationException;
 
@@ -65,6 +67,22 @@ public interface IAuthenticator
 SaslNegotiator newSaslNegotiator(InetAddress clientAddress);
 
     /**
+     * Provide a SASL handler to perform authentication for a single connection. SASL
+     * is a stateful protocol, so a new instance must be used for each authentication
+     * attempt. This method accepts certificates as well. Authentication strategies can
+     * override this method to gain access to the client's certificate chain, if present.
+     * @param clientAddress the IP address of the client whom we wish to authenticate, or null
+     *                      if an internal client (one not connected over the remote transport).
+     * @param certificates the peer's X509 Certificate chain, if present.
+     * @return org.apache.cassandra.auth.IAuthenticator.SaslNegotiator implementation
+     *         (see {@link org.apache.cassandra.auth.PasswordAuthenticator.PlainTextSaslAuthenticator})
+     */
+    default SaslNegotiator newSaslNegotiator(InetAddress clientAddress, X509Certificate[] certificates)
+    {
+        return newSaslNegotiator(clientAddress);
+    }
+
+    /**
  * A legacy method that is still used by JMX authentication.
  *
  * You should implement this for having JMX authentication through your

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ac1bb758/src/java/org/apache/cassandra/transport/ServerConnection.java
--
diff --git a/src/java/org/apache/cassandra/transport/ServerConnection.java 
b/src/java/org/apache/cassandra/transport/ServerConnection.java
index d78b7c0..00e334c 100644
--- a/src/java/org/apache/cassandra/transport/ServerConnection.java
+++ b/src/java/org/apache/cassandra/transport/ServerConnection.java
@@ -20,8 +20,15 @@ package org.apache.cassandra.transport;
 import java.util.concurrent.ConcurrentHashMap;
 import java.util.concurrent.ConcurrentMap;
 
+import javax.net.ssl.SSLPeerUnverifiedException;
+import javax.security.cert.X509Certificate;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
 import io.netty.channel.Channel;
 import com.codahale.metrics.Counter;
+import io.netty.handler.ssl.SslHandler;
 import org.apache.cassandra.auth.IAuthenticator;
 import org.apache.cassandra.config.DatabaseDescriptor;
 import org.apache.cassandra.service.ClientState;
@@ -29,6 +36,7 @@ import org.apache.cassandra.service.QueryState;
 
 public class ServerConnection extends Connection
 {
+    private static Logger logger = LoggerFactory.getLogger(ServerConnection.class);
 
 private 

[jira] [Commented] (CASSANDRA-14647) Reading cardinality from Statistics.db failed

2018-08-17 Thread Romain Hardouin (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583927#comment-16583927
 ] 

Romain Hardouin commented on CASSANDRA-14647:
-

This is not due to STCS -> LCS. I have the same behavior on one cluster with 
LCS and heavy writes. STCS has never been configured on it.

> Reading cardinality from Statistics.db failed
> -
>
> Key: CASSANDRA-14647
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14647
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: Clients are doing only writes with Local One, cluster 
> consist of 3 regions with RF3.
> Storage is configured wth jbod/XFS on 10 x 1Tb disks
> IOPS limit for each disk 500 (total 5000 iops)
> Bandwith for each disk 60mb/s (600 total)
> OS is Debian linux.
>Reporter: Vitali Djatsuk
>Priority: Major
> Fix For: 3.0.x
>
> Attachments: cassandra_compaction_pending_tasks_7days.png
>
>
> There is an issue with the sstable metadata, visible in system.log; the 
> message says:
> {noformat}
> WARN  [Thread-6] 2018-07-25 07:12:47,928 SSTableReader.java:249 - Reading 
> cardinality from Statistics.db failed for 
> /opt/data/disk5/data/keyspace/table/mc-big-Data.db.{noformat}
> Although no such file exists.
> The message appeared after I changed the compaction strategy from 
> SizeTiered to Leveled. The compaction strategy was changed region by region 
> (3 regions total), and this coincided with a doubling of client write 
> traffic.
>  I have tried running nodetool scrub to rebuild the sstable, but that does 
> not fix the issue.
> The exact steps to reproduce are hard to define, but probably:
>  # run stress tool with write traffic
>  # under load, change the compaction strategy from SizeTiered to Leveled for a 
> bunch of hosts
>  # add more write traffic
> Reading the code, it says that if this metadata is broken, then "estimating 
> the keys will be done using index summary". 
>  
> [https://github.com/apache/cassandra/blob/cassandra-3.0.17/src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java#L247]
>   
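The fallback mentioned in the last paragraph can be sketched like this. The function names are hypothetical; the real logic lives in SSTableReader, linked above. The pattern is just: prefer the cardinality stored in Statistics.db, and on a read failure (the WARN path in the log above) fall back to an estimate derived from the index summary.

```python
def estimate_keys(read_cardinality, index_summary_estimate):
    """Hypothetical sketch of the fallback: use the Statistics.db
    cardinality if readable, otherwise estimate via the index summary."""
    try:
        return read_cardinality()
    except IOError:
        # Mirrors the WARN path: metadata unreadable, so estimate instead.
        return index_summary_estimate()
```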






[jira] [Updated] (CASSANDRA-14574) Incomplete handling of exceptions when decoding incoming messages

2018-08-17 Thread Jason Brown (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown updated CASSANDRA-14574:

Resolution: Fixed
Status: Resolved  (was: Ready to Commit)

> Incomplete handling of exceptions when decoding incoming messages 
> --
>
> Key: CASSANDRA-14574
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14574
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
>Reporter: Aleksey Yeschenko
>Assignee: Jason Brown
>Priority: Major
> Fix For: 4.0
>
>
> {{MessageInHandler.decode()}} occasionally reads the payload incorrectly, 
> passing the full message to {{MessageIn.read()}} instead of just the payload 
> bytes.
> You can see the stack trace in the logs from this [CI 
> run|https://circleci.com/gh/iamaleksey/cassandra/437#tests/containers/38].
> {code}
> Caused by: java.lang.AssertionError: null
>   at 
> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:351)
>   at 
> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:371)
>   at 
> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:335)
>   at org.apache.cassandra.net.MessageIn.read(MessageIn.java:158)
>   at 
> org.apache.cassandra.net.async.MessageInHandler.decode(MessageInHandler.java:132)
> {code}
> Reconstructed, truncated stream passed to {{MessageIn.read()}}:
> {{000b000743414c5f42414301002a01e1a5c9b089fd11e8b517436ee124300704005d10fc50ec}}
> You can clearly see parameters in there encoded before the payload:
> {{[43414c5f424143 - CAL_BAC] [01 - ONE_BYTE] [002a - 42, payload size] 01 e1 
> a5 c9 b0 89 fd 11 e8 b5 17 43 6e e1 24 30 07 04 00 00 00 1d 10 fc 50 ec}}
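The bug in the hex breakdown above can be shown with a small sketch. This uses a deliberately simplified frame layout ([params][u16 big-endian payload size][payload]), not the actual internode wire format, and all names are hypothetical: the point is that the decoder must slice exactly the advertised payload size, because passing the remainder of the buffer hands trailing bytes (potentially the next message) to the deserializer.

```python
import struct

def encode_frame(params: bytes, payload: bytes) -> bytes:
    # Simplified sketch framing: [params][u16 payload size][payload].
    return params + struct.pack(">H", len(payload)) + payload

def decode_payload(frame: bytes, params_len: int) -> bytes:
    (size,) = struct.unpack_from(">H", frame, params_len)
    start = params_len + 2
    # The fix: hand the deserializer exactly `size` payload bytes.
    # Returning frame[start:] instead would include bytes beyond the
    # payload, which is the failure mode described above.
    return frame[start:start + size]
```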






[jira] [Commented] (CASSANDRA-14574) Incomplete handling of exceptions when decoding incoming messages

2018-08-17 Thread Jason Brown (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583874#comment-16583874
 ] 

Jason Brown commented on CASSANDRA-14574:
-

Committed to c* as sha {{298416a7445aa50874caebc779ca3094b32f3e31}}, committed 
to dtest as sha {{6e80b1846c308bb13d0b700263c89f10caa17d28}}. Thanks, all!

> Incomplete handling of exceptions when decoding incoming messages 
> --
>
> Key: CASSANDRA-14574
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14574
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
>Reporter: Aleksey Yeschenko
>Assignee: Jason Brown
>Priority: Major
> Fix For: 4.0
>
>






cassandra-dtest git commit: Test corrupting an internode messaging connection, and ensure it reconnects.

2018-08-17 Thread jasobrown
Repository: cassandra-dtest
Updated Branches:
  refs/heads/master e426ce1da -> 6e80b1846


Test corrupting an internode messaging connection, and ensure it reconnects.

patch by jasobrown; reviewed by Dinesh Joshi for CASSANDRA-14574


Project: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/commit/6e80b184
Tree: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/tree/6e80b184
Diff: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/diff/6e80b184

Branch: refs/heads/master
Commit: 6e80b1846c308bb13d0b700263c89f10caa17d28
Parents: e426ce1
Author: Jason Brown 
Authored: Thu Aug 16 06:27:23 2018 -0700
Committer: Jason Brown 
Committed: Fri Aug 17 05:55:40 2018 -0700

--
 byteman/corrupt_internode_messages_gossip.btm | 17 
 internode_messaging_test.py   | 48 ++
 2 files changed, 65 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra-dtest/blob/6e80b184/byteman/corrupt_internode_messages_gossip.btm
--
diff --git a/byteman/corrupt_internode_messages_gossip.btm 
b/byteman/corrupt_internode_messages_gossip.btm
new file mode 100644
index 000..66e4fe2
--- /dev/null
+++ b/byteman/corrupt_internode_messages_gossip.btm
@@ -0,0 +1,17 @@
+#
+# corrupt the first gossip ACK message. we corrupt it on the way out,
+# in serialize(), so it fails on deserializing. However, we also need
+# to hack the serializedSize().
+#
+
+RULE corrupt the first gossip ACK message
+CLASS org.apache.cassandra.gms.GossipDigestAckSerializer
+METHOD serialize(org.apache.cassandra.gms.GossipDigestAck, 
org.apache.cassandra.io.util.DataOutputPlus, int)
+AT ENTRY
+# set flag to only run this rule once.
+IF NOT flagged("done")
+DO
+   flag("done");
+   $2.writeInt(-1);
+ENDRULE
+

http://git-wip-us.apache.org/repos/asf/cassandra-dtest/blob/6e80b184/internode_messaging_test.py
--
diff --git a/internode_messaging_test.py b/internode_messaging_test.py
new file mode 100644
index 000..d0d4d1f
--- /dev/null
+++ b/internode_messaging_test.py
@@ -0,0 +1,48 @@
+import pytest
+import logging
+import time
+
+from dtest import Tester
+
+since = pytest.mark.since
+logger = logging.getLogger(__name__)
+
+_LOG_ERR_ILLEGAL_CAPACITY = "Caused by: java.lang.IllegalArgumentException: Illegal Capacity: -1"
+
+
+@since('4.0')
+class TestInternodeMessaging(Tester):
+
+    @pytest.fixture(autouse=True)
+    def fixture_add_additional_log_patterns(self, fixture_dtest_setup):
+        fixture_dtest_setup.ignore_log_patterns = (
+            r'Illegal Capacity: -1',
+            r'reported message size'
+        )
+
+    def test_message_corruption(self):
+        """
+        @jira_ticket CASSANDRA-14574
+
+        Use byteman to corrupt an outgoing gossip ACK message, check that the recipient fails *once* on the message
+        but does not spin out of control trying to process the rest of the bytes in the buffer.
+        Then make sure normal messaging can occur after a reconnect (on a different socket, of course).
+        """
+        cluster = self.cluster
+        cluster.populate(2, install_byteman=True)
+        cluster.start(wait_other_notice=True)
+
+        node1, node2 = cluster.nodelist()
+        node1_log_mark = node1.mark_log()
+        node2_log_mark = node2.mark_log()
+
+        node2.byteman_submit(['./byteman/corrupt_internode_messages_gossip.btm'])
+
+        # wait for the deserialization error to happen on node1
+        time.sleep(10)
+        assert len(node1.grep_log(_LOG_ERR_ILLEGAL_CAPACITY, from_mark=node1_log_mark)) == 1
+
+        # now, make sure node2 reconnects (and continues gossiping).
+        # node.watch_log_for() will time out if it cannot find the log entry
+        assert node2.grep_log('successfully connected to 127.0.0.1:7000 \(GOSSIP\)',
+                              from_mark=node2_log_mark, filename='debug.log')





cassandra git commit: Incomplete handling of exceptions when decoding incoming messages

2018-08-17 Thread jasobrown
Repository: cassandra
Updated Branches:
  refs/heads/trunk d8c451923 -> 298416a74


Incomplete handling of exceptions when decoding incoming messages

patch by jasobrown; reviewed by Dinesh Joshi for CASSANDRA-14574


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/298416a7
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/298416a7
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/298416a7

Branch: refs/heads/trunk
Commit: 298416a7445aa50874caebc779ca3094b32f3e31
Parents: d8c4519
Author: Jason Brown 
Authored: Wed Jul 18 13:47:22 2018 -0700
Committer: Jason Brown 
Committed: Fri Aug 17 05:54:37 2018 -0700

--
 CHANGES.txt |   1 +
 .../net/async/BaseMessageInHandler.java |  61 -
 .../cassandra/net/async/MessageInHandler.java   | 121 +++-
 .../net/async/MessageInHandlerPre40.java| 137 +--
 .../test/microbench/MessageOutBench.java|   6 +-
 .../net/async/MessageInHandlerTest.java |  65 -
 6 files changed, 238 insertions(+), 153 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/298416a7/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index d2970a4..0e671b0 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Incomplete handling of exceptions when decoding incoming messages 
(CASSANDRA-14574)
  * Add diagnostic events for user audit logging (CASSANDRA-13668)
  * Allow retrieving diagnostic events via JMX (CASSANDRA-14435)
  * Add base classes for diagnostic events (CASSANDRA-13457)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/298416a7/src/java/org/apache/cassandra/net/async/BaseMessageInHandler.java
--
diff --git a/src/java/org/apache/cassandra/net/async/BaseMessageInHandler.java 
b/src/java/org/apache/cassandra/net/async/BaseMessageInHandler.java
index 7314999..2f2a973 100644
--- a/src/java/org/apache/cassandra/net/async/BaseMessageInHandler.java
+++ b/src/java/org/apache/cassandra/net/async/BaseMessageInHandler.java
@@ -26,7 +26,6 @@ import java.util.Map;
 import java.util.function.BiConsumer;
 
 import com.google.common.annotations.VisibleForTesting;
-
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
@@ -40,6 +39,14 @@ import org.apache.cassandra.net.MessageIn;
 import org.apache.cassandra.net.MessagingService;
 import org.apache.cassandra.net.ParameterType;
 
+/**
+ * Parses out individual messages from the incoming buffers. Each message, both header and payload, is incrementally built up
+ * from the available input data, then passed to the {@link #messageConsumer}.
+ *
+ * Note: this class derives from {@link ByteToMessageDecoder} to take advantage of the {@link ByteToMessageDecoder.Cumulator}
+ * behavior across {@link #decode(ChannelHandlerContext, ByteBuf, List)} invocations. That way we don't have to maintain
+ * the not-fully consumed {@link ByteBuf}s.
+ */
 public abstract class BaseMessageInHandler extends ByteToMessageDecoder
 {
 public static final Logger logger = LoggerFactory.getLogger(BaseMessageInHandler.class);
@@ -52,7 +59,8 @@ public abstract class BaseMessageInHandler extends ByteToMessageDecoder
 READ_PARAMETERS_SIZE,
 READ_PARAMETERS_DATA,
 READ_PAYLOAD_SIZE,
-READ_PAYLOAD
+READ_PAYLOAD,
+CLOSED
 }
 
 /**
@@ -77,6 +85,8 @@ public abstract class BaseMessageInHandler extends ByteToMessageDecoder
 final InetAddressAndPort peer;
 final int messagingVersion;
 
+protected State state;
+
+public BaseMessageInHandler(InetAddressAndPort peer, int messagingVersion, BiConsumer<MessageIn, Integer> messageConsumer)
 {
 this.peer = peer;
@@ -84,7 +94,36 @@ public abstract class BaseMessageInHandler extends ByteToMessageDecoder
 this.messageConsumer = messageConsumer;
 }
 
-public abstract void decode(ChannelHandlerContext ctx, ByteBuf in, List<Object> out);
+// redeclared here to make the method public (for testing)
+@VisibleForTesting
+public void decode(ChannelHandlerContext ctx, ByteBuf in, List<Object> out) throws Exception
+{
+if (state == State.CLOSED)
+{
+in.skipBytes(in.readableBytes());
+return;
+}
+
+try
+{
+handleDecode(ctx, in, out);
+}
+catch (Exception e)
+{
+// prevent any future attempts at reading messages from any inbound buffers, as we're already in a bad state
+state = State.CLOSED;
+
+// force the buffer to appear to be consumed, thereby exiting the ByteToMessageDecoder.callDecode() loop,
+// and other 

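The decode() override in the patch above follows a simple but important pattern: the first decode failure flips the handler into a terminal CLOSED state, after which all further inbound bytes are drained instead of parsed, since the stream position can no longer be trusted. Below is a minimal, dependency-free sketch of that state machine; the names and the 4-byte length-prefixed framing are illustrative only, and Cassandra's real handler sits on Netty's ByteToMessageDecoder rather than a plain ByteBuffer.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Sketch of the CLOSED-state pattern from BaseMessageInHandler: once a decode
// attempt throws, the handler flips to CLOSED and silently drains all further
// input instead of re-attempting to parse a corrupt stream.
class SketchDecoder
{
    enum State { READ_HEADER, READ_PAYLOAD, CLOSED }

    private State state = State.READ_HEADER;
    private int payloadLength = -1;

    /** Decodes as many complete messages as the buffer currently holds. */
    public void decode(ByteBuffer in, List<byte[]> out)
    {
        if (state == State.CLOSED)
        {
            in.position(in.limit()); // drop everything, like in.skipBytes(in.readableBytes())
            return;
        }
        try
        {
            handleDecode(in, out);
        }
        catch (RuntimeException e)
        {
            state = State.CLOSED;    // never try to parse this stream again
            in.position(in.limit()); // pretend the buffer was fully consumed
        }
    }

    private void handleDecode(ByteBuffer in, List<byte[]> out)
    {
        while (true)
        {
            switch (state)
            {
                case READ_HEADER:
                    if (in.remaining() < 4) return; // wait for more input
                    payloadLength = in.getInt();
                    if (payloadLength < 0)
                        throw new IllegalStateException("corrupt length: " + payloadLength);
                    state = State.READ_PAYLOAD;
                    break;
                case READ_PAYLOAD:
                    if (in.remaining() < payloadLength) return;
                    byte[] payload = new byte[payloadLength];
                    in.get(payload);
                    out.add(payload);
                    state = State.READ_HEADER;
                    break;
                default:
                    return;
            }
        }
    }

    public State state() { return state; }
}
```

The point of the terminal state is exactly what the commit comment describes: a corrupt frame poisons everything after it, so the only safe behavior is to stop decoding and let the connection be torn down.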
[jira] [Commented] (CASSANDRA-14346) Scheduled Repair in Cassandra

2018-08-17 Thread Joseph Lynch (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583869#comment-16583869
 ] 

Joseph Lynch commented on CASSANDRA-14346:
--

[~spo...@gmail.com] great catch, sorry we missed those! I'm working on adding them, but I just wanted to double-check whether we need a license per jar or whether we can group them (e.g. jetty, websocket)?

Regarding {{javax.servlet-api}}, I thought that project was dual-licensed under GPLv2 as well ([maven|https://mvnrepository.com/artifact/javax.servlet/javax.servlet-api], [source code|https://github.com/javaee/glassfish/blob/master/LICENSE], [website license page|https://javaee.github.io/glassfish/LICENSE])? It's also a pretty common Apache project dependency; for example, [hadoop|https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-common/3.1.1] has it as a compile and runtime dependency, I believe. Is it still an issue if we choose to use it under the GPLv2?

> Scheduled Repair in Cassandra
> -
>
> Key: CASSANDRA-14346
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14346
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Repair
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Major
>  Labels: 4.0-feature-freeze-review-requested, 
> CommunityFeedbackRequested
> Fix For: 4.0
>
> Attachments: ScheduledRepairV1_20180327.pdf
>
>
> There have been many attempts to automate repair in Cassandra, which makes 
> sense given that it is necessary to give our users eventual consistency. Most 
> recently CASSANDRA-10070, CASSANDRA-8911 and CASSANDRA-13924 have all looked 
> for ways to solve this problem.
> At Netflix we've built a scheduled repair service within Priam (our sidecar), 
> which we spoke about last year at NGCC. Given the positive feedback at NGCC 
> we focussed on getting it production ready and have now been using it in 
> production to repair hundreds of clusters, tens of thousands of nodes, and 
> petabytes of data for the past six months. Also based on feedback at NGCC we 
> have invested effort in figuring out how to integrate this natively into 
> Cassandra rather than open sourcing it as an external service (e.g. in Priam).
> As such, [~vinaykumarcse] and I would like to re-work and merge our 
> implementation into Cassandra, and have created a [design 
> document|https://docs.google.com/document/d/1RV4rOrG1gwlD5IljmrIq_t45rz7H3xs9GbFSEyGzEtM/edit?usp=sharing]
>  showing how we plan to make it happen, including the user interface.
> As we work on the code migration from Priam to Cassandra, any feedback would 
> be greatly appreciated about the interface or v1 implementation features. I 
> have tried to call out in the document features which we explicitly consider 
> future work (as well as a path forward to implement them in the future) 
> because I would very much like to get this done before the 4.0 merge window 
> closes, and to do that I think aggressively pruning scope is going to be a 
> necessity.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14435) Diag. Events: JMX events

2018-08-17 Thread Stefan Podkowinski (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski updated CASSANDRA-14435:
---
   Resolution: Fixed
Fix Version/s: (was: 4.x)
   4.0
   Status: Resolved  (was: Patch Available)

Committed as a79e5903b552e40f77c!

> Diag. Events: JMX events
> 
>
> Key: CASSANDRA-14435
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14435
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Observability
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>Priority: Major
> Fix For: 4.0
>
>
> Nodes currently use JMX events for progress reporting on bootstrap and 
> repairs. This might also be an option to expose diagnostic events to external 
> subscribers.






[jira] [Updated] (CASSANDRA-13668) Diag. events for user audit logging

2018-08-17 Thread Stefan Podkowinski (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski updated CASSANDRA-13668:
---
   Resolution: Fixed
Fix Version/s: (was: 4.x)
   4.0
   Status: Resolved  (was: Patch Available)

Committed as d8c45192318584!

> Diag. events for user audit logging
> ---
>
> Key: CASSANDRA-13668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13668
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>Priority: Major
> Fix For: 4.0
>
>
> With the availability of CASSANDRA-13459, any native transport enabled client 
> will be able to subscribe to internal Cassandra events. External tools can 
> take advantage by monitoring these events in various ways. Use-cases for this 
> can be e.g. auditing tools for compliance and security purposes.
> The scope of this ticket is to add diagnostic events that are raised around 
> authentication and CQL operations. These events can then be consumed and used 
> by external tools to implement a Cassandra user auditing solution.






[jira] [Updated] (CASSANDRA-13457) Diag. Events: Add base classes

2018-08-17 Thread Stefan Podkowinski (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski updated CASSANDRA-13457:
---
   Resolution: Fixed
Fix Version/s: 4.0
   Status: Resolved  (was: Patch Available)

Committed as 2846b22a70d48bae.

Thanks [~michaelsembwever] and [~jasobrown] for reviewing and sharing your 
feedback!

> Diag. Events: Add base classes
> --
>
> Key: CASSANDRA-13457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13457
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core, Observability
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>Priority: Major
> Fix For: 4.0
>
>
> Base ticket for adding classes that will allow you to implement and subscribe 
> to events.






[1/5] cassandra git commit: Add diagnostic events base classes

2018-08-17 Thread spod
Repository: cassandra
Updated Branches:
  refs/heads/trunk d3e6891ec -> d8c451923


http://git-wip-us.apache.org/repos/asf/cassandra/blob/2846b22a/src/java/org/apache/cassandra/service/PendingRangeCalculatorServiceDiagnostics.java
--
diff --git 
a/src/java/org/apache/cassandra/service/PendingRangeCalculatorServiceDiagnostics.java
 
b/src/java/org/apache/cassandra/service/PendingRangeCalculatorServiceDiagnostics.java
new file mode 100644
index 000..ec09e3f
--- /dev/null
+++ 
b/src/java/org/apache/cassandra/service/PendingRangeCalculatorServiceDiagnostics.java
@@ -0,0 +1,73 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.service;
+
+import java.util.concurrent.atomic.AtomicInteger;
+
+import org.apache.cassandra.diag.DiagnosticEventService;
+import 
org.apache.cassandra.service.PendingRangeCalculatorServiceEvent.PendingRangeCalculatorServiceEventType;
+
+/**
+ * Utility methods for diagnostic events related to {@link 
PendingRangeCalculatorService}.
+ */
+final class PendingRangeCalculatorServiceDiagnostics
+{
+private static final DiagnosticEventService service = 
DiagnosticEventService.instance();
+
+private PendingRangeCalculatorServiceDiagnostics()
+{
+}
+
+static void taskStarted(PendingRangeCalculatorService calculatorService, 
AtomicInteger taskCount)
+{
+if (isEnabled(PendingRangeCalculatorServiceEventType.TASK_STARTED))
+service.publish(new 
PendingRangeCalculatorServiceEvent(PendingRangeCalculatorServiceEventType.TASK_STARTED,
+   
calculatorService,
+   
taskCount.get()));
+}
+
+static void taskFinished(PendingRangeCalculatorService calculatorService, 
AtomicInteger taskCount)
+{
+if 
(isEnabled(PendingRangeCalculatorServiceEventType.TASK_FINISHED_SUCCESSFULLY))
+service.publish(new 
PendingRangeCalculatorServiceEvent(PendingRangeCalculatorServiceEventType.TASK_FINISHED_SUCCESSFULLY,
+   
calculatorService,
+   
taskCount.get()));
+}
+
+static void taskRejected(PendingRangeCalculatorService calculatorService, 
AtomicInteger taskCount)
+{
+if 
(isEnabled(PendingRangeCalculatorServiceEventType.TASK_EXECUTION_REJECTED))
+service.publish(new 
PendingRangeCalculatorServiceEvent(PendingRangeCalculatorServiceEventType.TASK_EXECUTION_REJECTED,
+   
calculatorService,
+   
taskCount.get()));
+}
+
+static void taskCountChanged(PendingRangeCalculatorService 
calculatorService, int taskCount)
+{
+if 
(isEnabled(PendingRangeCalculatorServiceEventType.TASK_COUNT_CHANGED))
+service.publish(new 
PendingRangeCalculatorServiceEvent(PendingRangeCalculatorServiceEventType.TASK_COUNT_CHANGED,
+   
calculatorService,
+   taskCount));
+}
+
+private static boolean isEnabled(PendingRangeCalculatorServiceEventType 
type)
+{
+return service.isEnabled(PendingRangeCalculatorServiceEvent.class, 
type);
+}
+}

http://git-wip-us.apache.org/repos/asf/cassandra/blob/2846b22a/src/java/org/apache/cassandra/service/PendingRangeCalculatorServiceEvent.java
--
diff --git 
a/src/java/org/apache/cassandra/service/PendingRangeCalculatorServiceEvent.java 
b/src/java/org/apache/cassandra/service/PendingRangeCalculatorServiceEvent.java
new file mode 100644
index 000..3024149
--- /dev/null
+++ 
b/src/java/org/apache/cassandra/service/PendingRangeCalculatorServiceEvent.java
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with 

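The *Diagnostics helper above (PendingRangeCalculatorServiceDiagnostics) shows the guard-then-publish pattern used throughout the patch: check isEnabled(type) first so that, when nobody subscribed to an event type, the hot path never constructs or publishes an event object. A minimal, dependency-free sketch of that pattern follows; the class and method names here are illustrative stand-ins, not Cassandra's actual DiagnosticEventService API.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Sketch of a guard-then-publish event service: an event is only constructed
// and published when at least one subscriber registered for its type, so the
// instrumented code pays nothing when diagnostics are unused.
final class MiniEventService
{
    private final Map<String, List<Consumer<Object>>> subscribers = new ConcurrentHashMap<>();

    boolean isEnabled(String eventType)
    {
        List<Consumer<Object>> subs = subscribers.get(eventType);
        return subs != null && !subs.isEmpty();
    }

    void subscribe(String eventType, Consumer<Object> consumer)
    {
        // CopyOnWriteArrayList keeps publish() safe against concurrent subscribe()
        subscribers.computeIfAbsent(eventType, t -> new CopyOnWriteArrayList<>()).add(consumer);
    }

    void publish(String eventType, Object event)
    {
        List<Consumer<Object>> subs = subscribers.get(eventType);
        if (subs != null)
            for (Consumer<Object> c : subs)
                c.accept(event);
    }
}
```

In the real patch the guard and the publish live in a static helper (one per instrumented component), which keeps the instrumented class itself down to a single method call per event site.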
[2/5] cassandra git commit: Add diagnostic events base classes

2018-08-17 Thread spod
http://git-wip-us.apache.org/repos/asf/cassandra/blob/2846b22a/src/java/org/apache/cassandra/gms/GossiperEvent.java
--
diff --git a/src/java/org/apache/cassandra/gms/GossiperEvent.java 
b/src/java/org/apache/cassandra/gms/GossiperEvent.java
new file mode 100644
index 000..2de88bc
--- /dev/null
+++ b/src/java/org/apache/cassandra/gms/GossiperEvent.java
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.gms;
+
+import java.io.Serializable;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import javax.annotation.Nullable;
+
+import org.apache.cassandra.diag.DiagnosticEvent;
+import org.apache.cassandra.locator.InetAddressAndPort;
+
+/**
+ * DiagnosticEvent implementation for {@link Gossiper} activities.
+ */
+final class GossiperEvent extends DiagnosticEvent
+{
+private final InetAddressAndPort endpoint;
+@Nullable
+private final Long quarantineExpiration;
+@Nullable
+private final EndpointState localState;
+
+private final Map<InetAddressAndPort, EndpointState> endpointStateMap;
+private final boolean inShadowRound;
+private final Map<InetAddressAndPort, Long> justRemovedEndpoints;
+private final long lastProcessedMessageAt;
+private final Set<InetAddressAndPort> liveEndpoints;
+private final List<String> seeds;
+private final Set<InetAddressAndPort> seedsInShadowRound;
+private final Map<InetAddressAndPort, Long> unreachableEndpoints;
+
+
+enum GossiperEventType
+{
+MARKED_AS_SHUTDOWN,
+CONVICTED,
+REPLACEMENT_QUARANTINE,
+REPLACED_ENDPOINT,
+EVICTED_FROM_MEMBERSHIP,
+REMOVED_ENDPOINT,
+QUARANTINED_ENDPOINT,
+MARKED_ALIVE,
+REAL_MARKED_ALIVE,
+MARKED_DEAD,
+MAJOR_STATE_CHANGE_HANDLED,
+SEND_GOSSIP_DIGEST_SYN
+}
+
+public GossiperEventType type;
+
+
+GossiperEvent(GossiperEventType type, Gossiper gossiper, 
InetAddressAndPort endpoint,
+  @Nullable Long quarantineExpiration, @Nullable EndpointState 
localState)
+{
+this.type = type;
+this.endpoint = endpoint;
+this.quarantineExpiration = quarantineExpiration;
+this.localState = localState;
+
+this.endpointStateMap = gossiper.getEndpointStateMap();
+this.inShadowRound = gossiper.isInShadowRound();
+this.justRemovedEndpoints = gossiper.getJustRemovedEndpoints();
+this.lastProcessedMessageAt = gossiper.getLastProcessedMessageAt();
+this.liveEndpoints = gossiper.getLiveMembers();
+this.seeds = gossiper.getSeeds();
+this.seedsInShadowRound = gossiper.getSeedsInShadowRound();
+this.unreachableEndpoints = gossiper.getUnreachableEndpoints();
+}
+
+public Enum<?> getType()
+{
+return type;
+}
+
+public HashMap<String, Serializable> toMap()
+{
+// be extra defensive against nulls and bugs
+HashMap<String, Serializable> ret = new HashMap<>();
+if (endpoint != null) ret.put("endpoint", endpoint.getHostAddress(true));
+ret.put("quarantineExpiration", quarantineExpiration);
+ret.put("localState", String.valueOf(localState));
+ret.put("endpointStateMap", String.valueOf(endpointStateMap));
+ret.put("inShadowRound", inShadowRound);
+ret.put("justRemovedEndpoints", String.valueOf(justRemovedEndpoints));
+ret.put("lastProcessedMessageAt", lastProcessedMessageAt);
+ret.put("liveEndpoints", String.valueOf(liveEndpoints));
+ret.put("seeds", String.valueOf(seeds));
+ret.put("seedsInShadowRound", String.valueOf(seedsInShadowRound));
+ret.put("unreachableEndpoints", String.valueOf(unreachableEndpoints));
+return ret;
+}
+}
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/cassandra/blob/2846b22a/src/java/org/apache/cassandra/hints/Hint.java
--
diff --git a/src/java/org/apache/cassandra/hints/Hint.java 
b/src/java/org/apache/cassandra/hints/Hint.java
index b0abd50..7e4618c 100644
--- a/src/java/org/apache/cassandra/hints/Hint.java
+++ b/src/java/org/apache/cassandra/hints/Hint.java
@@ -132,7 +132,7 @@ public final class Hint
   

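The GossiperEvent class in the [2/5] patch above snapshots the gossiper's mutable state at construction time and then, in toMap(), guards nullable fields and flattens collections via String.valueOf so the result is a plain serializable map. A small, dependency-free sketch of that defensive-snapshot pattern is below; the field names are illustrative and the types are simplified relative to the real class.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of a defensive toMap(): nullable fields are guarded, and mutable
// collections are flattened with String.valueOf so the resulting map holds
// only simple serializable values (e.g. safe to ship over JMX).
final class SnapshotEvent
{
    private final String endpoint;           // may be null
    private final Long quarantineExpiration; // may be null
    private final List<String> liveEndpoints;

    SnapshotEvent(String endpoint, Long quarantineExpiration, List<String> liveEndpoints)
    {
        this.endpoint = endpoint;
        this.quarantineExpiration = quarantineExpiration;
        this.liveEndpoints = liveEndpoints;
    }

    Map<String, Serializable> toMap()
    {
        HashMap<String, Serializable> ret = new HashMap<>();
        if (endpoint != null) ret.put("endpoint", endpoint);
        ret.put("quarantineExpiration", quarantineExpiration);
        ret.put("liveEndpoints", String.valueOf(liveEndpoints)); // null-safe: "null" if absent
        return ret;
    }
}
```

Stringifying at snapshot time also means consumers never hold references into the gossiper's live (and concurrently mutated) collections.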
[5/5] cassandra git commit: Add diagnostic events for user audit logging

2018-08-17 Thread spod
Add diagnostic events for user audit logging

patch by Stefan Podkowinski; reviewed by Mick Semb Wever for CASSANDRA-13668


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d8c45192
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d8c45192
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d8c45192

Branch: refs/heads/trunk
Commit: d8c451923185841ca28e8cb1177b71edafbfd988
Parents: a79e590
Author: Stefan Podkowinski 
Authored: Fri Apr 6 09:49:38 2018 +0200
Committer: Stefan Podkowinski 
Committed: Fri Aug 17 14:08:37 2018 +0200

--
 CHANGES.txt |   1 +
 .../org/apache/cassandra/audit/AuditEvent.java  |  75 ++
 .../audit/DiagnosticEventAuditLogger.java   |  39 +++
 .../cassandra/transport/CQLUserAuditTest.java   | 253 +++
 4 files changed, 368 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/d8c45192/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 097e7dd..d2970a4 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Add diagnostic events for user audit logging (CASSANDRA-13668)
  * Allow retrieving diagnostic events via JMX (CASSANDRA-14435)
  * Add base classes for diagnostic events (CASSANDRA-13457)
  * Clear view system metadata when dropping keyspace (CASSANDRA-14646)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d8c45192/src/java/org/apache/cassandra/audit/AuditEvent.java
--
diff --git a/src/java/org/apache/cassandra/audit/AuditEvent.java 
b/src/java/org/apache/cassandra/audit/AuditEvent.java
new file mode 100644
index 000..b21fe58
--- /dev/null
+++ b/src/java/org/apache/cassandra/audit/AuditEvent.java
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.audit;
+
+import java.io.Serializable;
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.cassandra.diag.DiagnosticEvent;
+import org.apache.cassandra.diag.DiagnosticEventService;
+
+/**
+ * {@link AuditLogEntry} wrapper to expose audit events as {@link DiagnosticEvent}s.
+ */
+public final class AuditEvent extends DiagnosticEvent
+{
+private final AuditLogEntry entry;
+
+private AuditEvent(AuditLogEntry entry)
+{
+this.entry = entry;
+}
+
+static void create(AuditLogEntry entry)
+{
+if (isEnabled(entry.getType()))
+DiagnosticEventService.instance().publish(new AuditEvent(entry));
+}
+
+private static boolean isEnabled(AuditLogEntryType type)
+{
+return DiagnosticEventService.instance().isEnabled(AuditEvent.class, 
type);
+}
+
+public Enum<?> getType()
+{
+return entry.getType();
+}
+
+public String getSource()
+{
+return entry.getSource().toString(true);
+}
+
+public AuditLogEntry getEntry()
+{
+return entry;
+}
+
+public Map<String, Serializable> toMap()
+{
+HashMap<String, Serializable> ret = new HashMap<>();
+if (entry.getKeyspace() != null) ret.put("keyspace", entry.getKeyspace());
+if (entry.getOperation() != null) ret.put("operation", entry.getOperation());
+if (entry.getScope() != null) ret.put("scope", entry.getScope());
+if (entry.getUser() != null) ret.put("user", entry.getUser());
+return ret;
+}
+}

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d8c45192/src/java/org/apache/cassandra/audit/DiagnosticEventAuditLogger.java
--
diff --git 
a/src/java/org/apache/cassandra/audit/DiagnosticEventAuditLogger.java 
b/src/java/org/apache/cassandra/audit/DiagnosticEventAuditLogger.java
new file mode 100644
index 000..9d586ba
--- /dev/null
+++ b/src/java/org/apache/cassandra/audit/DiagnosticEventAuditLogger.java
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor 

[4/5] cassandra git commit: Add JMX query support for diagnostic events

2018-08-17 Thread spod
Add JMX query support for diagnostic events

patch by Stefan Podkowinski; reviewed by Mick Semb Wever for CASSANDRA-14435


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a79e5903
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a79e5903
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a79e5903

Branch: refs/heads/trunk
Commit: a79e5903b552e40f77c151e23172f054ffb7f39e
Parents: 2846b22
Author: Stefan Podkowinski 
Authored: Wed May 2 13:03:10 2018 +0200
Committer: Stefan Podkowinski 
Committed: Fri Aug 17 14:07:45 2018 +0200

--
 CHANGES.txt |   1 +
 .../cassandra/config/DatabaseDescriptor.java|   1 -
 .../diag/DiagnosticEventPersistence.java| 151 
 .../cassandra/diag/DiagnosticEventService.java  |  65 ++-
 .../diag/DiagnosticEventServiceMBean.java   |  59 +++
 .../cassandra/diag/LastEventIdBroadcaster.java  | 150 
 .../diag/LastEventIdBroadcasterMBean.java   |  41 +
 .../diag/store/DiagnosticEventMemoryStore.java  |  97 +++
 .../diag/store/DiagnosticEventStore.java|  52 ++
 .../cassandra/service/StorageService.java   |   5 +-
 .../progress/jmx/JMXBroadcastExecutor.java  |  35 
 .../DiagnosticEventPersistenceBench.java|  73 
 .../microbench/DiagnosticEventServiceBench.java | 103 +++
 .../config/OverrideConfigurationLoader.java |  47 +
 .../diag/DiagnosticEventServiceTest.java|   6 +-
 .../store/DiagnosticEventMemoryStoreTest.java   | 170 +++
 16 files changed, 1047 insertions(+), 9 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/a79e5903/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index ceba843..097e7dd 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Allow retrieving diagnostic events via JMX (CASSANDRA-14435)
  * Add base classes for diagnostic events (CASSANDRA-13457)
  * Clear view system metadata when dropping keyspace (CASSANDRA-14646)
  * Allocate ReentrantLock on-demand in java11 AtomicBTreePartitionerBase 
(CASSANDRA-14637)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a79e5903/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
--
diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java 
b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
index 65a34f0..aa5ca92 100644
--- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
+++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
@@ -2541,7 +2541,6 @@ public class DatabaseDescriptor
 return conf.diagnostic_events_enabled;
 }
 
-@VisibleForTesting
 public static void setDiagnosticEventsEnabled(boolean enabled)
 {
 conf.diagnostic_events_enabled = enabled;

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a79e5903/src/java/org/apache/cassandra/diag/DiagnosticEventPersistence.java
--
diff --git a/src/java/org/apache/cassandra/diag/DiagnosticEventPersistence.java 
b/src/java/org/apache/cassandra/diag/DiagnosticEventPersistence.java
new file mode 100644
index 000..7da335c
--- /dev/null
+++ b/src/java/org/apache/cassandra/diag/DiagnosticEventPersistence.java
@@ -0,0 +1,151 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.diag;
+
+import java.io.InvalidClassException;
+import java.io.Serializable;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.NavigableMap;
+import java.util.SortedMap;
+import java.util.TreeMap;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.function.Consumer;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.cassandra.diag.store.DiagnosticEventMemoryStore;
+import 

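The [4/5] patch above adds DiagnosticEventPersistence, LastEventIdBroadcaster and DiagnosticEventMemoryStore so that past diagnostic events can be retrieved over JMX. A rough, dependency-free sketch of what such a bounded in-memory store can look like is below: events get monotonically increasing ids, old entries are evicted past a fixed capacity, and consumers poll for "everything after the last id I saw". All names are illustrative, not the actual implementation.

```java
import java.util.NavigableMap;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of a bounded in-memory event store: monotonically increasing ids,
// oldest-first eviction beyond a fixed capacity, and id-based scans so a
// JMX-style consumer can poll incrementally without missing or re-reading events.
final class MiniEventStore<T>
{
    private final NavigableMap<Long, T> events = new TreeMap<>();
    private final AtomicLong nextId = new AtomicLong(0);
    private final int capacity;

    MiniEventStore(int capacity) { this.capacity = capacity; }

    synchronized long store(T event)
    {
        long id = nextId.incrementAndGet();
        events.put(id, event);
        while (events.size() > capacity)
            events.pollFirstEntry(); // evict oldest
        return id;
    }

    /** Events strictly after the given id, oldest first. */
    synchronized SortedMap<Long, T> scan(long afterId)
    {
        return new TreeMap<>(events.tailMap(afterId, false));
    }

    synchronized long lastId() { return nextId.get(); }
}
```

Broadcasting the latest id (as LastEventIdBroadcaster does in the patch) lets clients notice cheaply that new events exist before paying for a scan.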
[3/5] cassandra git commit: Add diagnostic events base classes

2018-08-17 Thread spod
Add diagnostic events base classes

patch by Stefan Podkowinski; reviewed by Mick Semb Wever for CASSANDRA-13457


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2846b22a
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2846b22a
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2846b22a

Branch: refs/heads/trunk
Commit: 2846b22a70d48bae25203be945e02dd3b6cfda56
Parents: d3e6891
Author: Stefan Podkowinski 
Authored: Thu Mar 16 12:50:52 2017 +0100
Committer: Stefan Podkowinski 
Committed: Fri Aug 17 14:06:57 2018 +0200

--
 CHANGES.txt |   1 +
 conf/cassandra.yaml |   5 +
 .../org/apache/cassandra/config/Config.java |   3 +
 .../cassandra/config/DatabaseDescriptor.java|  11 +
 .../schema/AlterKeyspaceStatement.java  |   5 +
 .../statements/schema/AlterTableStatement.java  |   5 +
 .../statements/schema/AlterTypeStatement.java   |   5 +
 .../statements/schema/AlterViewStatement.java   |   5 +
 .../schema/CreateAggregateStatement.java|   5 +
 .../schema/CreateFunctionStatement.java |   5 +
 .../statements/schema/CreateIndexStatement.java |   5 +
 .../schema/CreateKeyspaceStatement.java |   5 +
 .../statements/schema/CreateTableStatement.java |   5 +
 .../schema/CreateTriggerStatement.java  |   5 +
 .../statements/schema/CreateTypeStatement.java  |   5 +
 .../statements/schema/CreateViewStatement.java  |   5 +
 .../schema/DropAggregateStatement.java  |   5 +
 .../schema/DropFunctionStatement.java   |   5 +
 .../statements/schema/DropIndexStatement.java   |   5 +
 .../schema/DropKeyspaceStatement.java   |   5 +
 .../statements/schema/DropTableStatement.java   |   5 +
 .../statements/schema/DropTriggerStatement.java |   5 +
 .../statements/schema/DropTypeStatement.java|   5 +
 .../statements/schema/DropViewStatement.java|   5 +
 .../org/apache/cassandra/dht/BootStrapper.java  |  14 +-
 .../cassandra/dht/BootstrapDiagnostics.java |  80 +
 .../apache/cassandra/dht/BootstrapEvent.java|  82 +
 .../NoReplicationTokenAllocator.java|   4 +
 .../ReplicationAwareTokenAllocator.java |   7 +-
 .../TokenAllocatorDiagnostics.java  | 195 
 .../dht/tokenallocator/TokenAllocatorEvent.java | 113 +++
 .../tokenallocator/TokenAllocatorFactory.java   |   8 +-
 .../apache/cassandra/diag/DiagnosticEvent.java  |  50 +++
 .../cassandra/diag/DiagnosticEventService.java  | 291 +
 src/java/org/apache/cassandra/gms/Gossiper.java |  46 ++-
 .../cassandra/gms/GossiperDiagnostics.java  | 113 +++
 .../org/apache/cassandra/gms/GossiperEvent.java | 111 +++
 src/java/org/apache/cassandra/hints/Hint.java   |   2 +-
 .../apache/cassandra/hints/HintDiagnostics.java |  85 +
 .../org/apache/cassandra/hints/HintEvent.java   | 102 ++
 .../cassandra/hints/HintsDispatchExecutor.java  |  10 +
 .../apache/cassandra/hints/HintsDispatcher.java |  52 +--
 .../apache/cassandra/hints/HintsService.java|  12 +-
 .../hints/HintsServiceDiagnostics.java  |  65 
 .../cassandra/hints/HintsServiceEvent.java  |  71 +
 .../apache/cassandra/locator/TokenMetadata.java |   2 +
 .../locator/TokenMetadataDiagnostics.java   |  46 +++
 .../cassandra/locator/TokenMetadataEvent.java   |  62 
 src/java/org/apache/cassandra/schema/Diff.java  |   5 +
 .../cassandra/schema/MigrationManager.java  |  33 +-
 .../apache/cassandra/schema/MigrationTask.java  |   5 +
 .../org/apache/cassandra/schema/Schema.java |  34 ++
 .../schema/SchemaAnnouncementDiagnostics.java   |  60 
 .../schema/SchemaAnnouncementEvent.java | 104 ++
 .../cassandra/schema/SchemaDiagnostics.java | 178 +++
 .../apache/cassandra/schema/SchemaEvent.java| 318 +++
 .../schema/SchemaMigrationDiagnostics.java  |  83 +
 .../cassandra/schema/SchemaMigrationEvent.java  | 114 +++
 .../cassandra/schema/SchemaPushVerbHandler.java |   1 +
 .../service/PendingRangeCalculatorService.java  |  23 +-
 ...endingRangeCalculatorServiceDiagnostics.java |  73 +
 .../PendingRangeCalculatorServiceEvent.java |  69 
 .../diag/DiagnosticEventServiceTest.java| 244 ++
 63 files changed, 3047 insertions(+), 40 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/2846b22a/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index d8aca56..ceba843 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Add base classes for diagnostic events (CASSANDRA-13457)
  * Clear view system metadata when dropping keyspace (CASSANDRA-14646)
  * Allocate 

[jira] [Comment Edited] (CASSANDRA-14346) Scheduled Repair in Cassandra

2018-08-17 Thread Stefan Podkowinski (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583811#comment-16583811
 ] 

Stefan Podkowinski edited comment on CASSANDRA-14346 at 8/17/18 11:52 AM:
--

Please include license files like we do in {{lib/licenses}}. Adding 
{{javax.servlet-api}} (assuming CDDL1.1) requires special handling ([category 
b|https://www.apache.org/legal/resolved.html#category-x]), so it would be 
preferable not having to use that dependency.




was (Author: spo...@gmail.com):
Please include license files like we do in {{lib/licenses}}. Adding 
{{javax.servlet-api}} (assuming CDDL1.1) requires special handling (category 
b), so it would be preferable not having to use that dependency.



> Scheduled Repair in Cassandra
> -
>
> Key: CASSANDRA-14346
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14346
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Repair
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Major
>  Labels: 4.0-feature-freeze-review-requested, 
> CommunityFeedbackRequested
> Fix For: 4.0
>
> Attachments: ScheduledRepairV1_20180327.pdf
>
>
> There have been many attempts to automate repair in Cassandra, which makes 
> sense given that it is necessary to give our users eventual consistency. Most 
> recently CASSANDRA-10070, CASSANDRA-8911 and CASSANDRA-13924 have all looked 
> for ways to solve this problem.
> At Netflix we've built a scheduled repair service within Priam (our sidecar), 
> which we spoke about last year at NGCC. Given the positive feedback at NGCC 
> we focussed on getting it production ready and have now been using it in 
> production to repair hundreds of clusters, tens of thousands of nodes, and 
> petabytes of data for the past six months. Also based on feedback at NGCC we 
> have invested effort in figuring out how to integrate this natively into 
> Cassandra rather than open sourcing it as an external service (e.g. in Priam).
> As such, [~vinaykumarcse] and I would like to re-work and merge our 
> implementation into Cassandra, and have created a [design 
> document|https://docs.google.com/document/d/1RV4rOrG1gwlD5IljmrIq_t45rz7H3xs9GbFSEyGzEtM/edit?usp=sharing]
>  showing how we plan to make it happen, including the user interface.
> As we work on the code migration from Priam to Cassandra, any feedback would 
> be greatly appreciated about the interface or v1 implementation features. I 
> have tried to call out in the document features which we explicitly consider 
> future work (as well as a path forward to implement them in the future) 
> because I would very much like to get this done before the 4.0 merge window 
> closes, and to do that I think aggressively pruning scope is going to be a 
> necessity.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14346) Scheduled Repair in Cassandra

2018-08-17 Thread Stefan Podkowinski (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583811#comment-16583811
 ] 

Stefan Podkowinski commented on CASSANDRA-14346:


Please include license files like we do in {{lib/licenses}}. Adding 
{{javax.servlet-api}} (assuming CDDL1.1) requires special handling (category 
b), so it would be preferable not having to use that dependency.



> Scheduled Repair in Cassandra
> -
>
> Key: CASSANDRA-14346
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14346
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Repair
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Major
>  Labels: 4.0-feature-freeze-review-requested, 
> CommunityFeedbackRequested
> Fix For: 4.0
>
> Attachments: ScheduledRepairV1_20180327.pdf
>
>
> There have been many attempts to automate repair in Cassandra, which makes 
> sense given that it is necessary to give our users eventual consistency. Most 
> recently CASSANDRA-10070, CASSANDRA-8911 and CASSANDRA-13924 have all looked 
> for ways to solve this problem.
> At Netflix we've built a scheduled repair service within Priam (our sidecar), 
> which we spoke about last year at NGCC. Given the positive feedback at NGCC 
> we focussed on getting it production ready and have now been using it in 
> production to repair hundreds of clusters, tens of thousands of nodes, and 
> petabytes of data for the past six months. Also based on feedback at NGCC we 
> have invested effort in figuring out how to integrate this natively into 
> Cassandra rather than open sourcing it as an external service (e.g. in Priam).
> As such, [~vinaykumarcse] and I would like to re-work and merge our 
> implementation into Cassandra, and have created a [design 
> document|https://docs.google.com/document/d/1RV4rOrG1gwlD5IljmrIq_t45rz7H3xs9GbFSEyGzEtM/edit?usp=sharing]
>  showing how we plan to make it happen, including the user interface.
> As we work on the code migration from Priam to Cassandra, any feedback would 
> be greatly appreciated about the interface or v1 implementation features. I 
> have tried to call out in the document features which we explicitly consider 
> future work (as well as a path forward to implement them in the future) 
> because I would very much like to get this done before the 4.0 merge window 
> closes, and to do that I think aggressively pruning scope is going to be a 
> necessity.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14653) The performance of "NonPeriodicTasks" pools defined in class ScheduledExecutors is low

2018-08-17 Thread Peter Xie (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Xie updated CASSANDRA-14653:
--
Description: 
We use Cassandra as backend storage for JanusGraph. When loading a huge data set (~2 billion vertices, ~10 billion edges), we ran into some problems.

At first we used STCS as the compaction strategy, but hit the exception below. We checked that "max memory lock" is unlimited and "file map count" is 1 million; these values should be sufficient for loading the data. Eventually we found that the problem was caused by Cassandra consuming all of the virtual memory, so no additional virtual memory was available for compaction tasks, and the exception below was thrown.
{quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 JVMStabilityInspector.java:74 - OutOfMemory error letting the JVM handle the error:
 java.lang.OutOfMemoryError: Map failed
{quote}
So we changed the compaction strategy to LCS, which seems to resolve the virtual memory problem. But we found another problem: many sstables that have already been compacted are still retained on disk. These old sstables consume so much disk space that there is not enough disk left for the real data, and many files like "mc_txn_compaction_xxx.log" are created under the data directory.

After some investigation, we found this problem is caused by the "NonPeriodicTasks" thread pool: the pool only ever uses one thread to process the cleanup tasks that run after compaction. The pool is an instance of DebuggableScheduledThreadPoolExecutor, which inherits from ScheduledThreadPoolExecutor.

Reading the code of DebuggableScheduledThreadPoolExecutor, we found that it uses an unbounded task queue with a core pool size of 1. I think using an unbounded queue here is wrong: with an unbounded queue the pool never grows beyond the core size, no matter how many tasks are backed up, because an unbounded queue is never full. I think a bounded queue should be used here, so that when the cleanup load is heavy, more threads are created to process it.
{quote}public DebuggableScheduledThreadPoolExecutor(int corePoolSize, String threadPoolName, int priority)
{
    super(corePoolSize, new NamedThreadFactory(threadPoolName, priority));
    setRejectedExecutionHandler(rejectedExecutionHandler);
}

public ScheduledThreadPoolExecutor(int corePoolSize, ThreadFactory threadFactory)
{
    super(corePoolSize, Integer.MAX_VALUE, 0, NANOSECONDS, new DelayedWorkQueue(), threadFactory);
}
{quote}
Below is an example of a cleanup task after compaction: there is a nearly 3-hour delay before the file "mc-56525" is removed.
{quote} 

TRACE [CompactionExecutor:81] 2018-08-16 21:22:29,664 
LifecycleTransaction.java:363 - Staging for obsolescence 
BigTableReader(path='/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big-Data.db')
 ..
 TRACE [CompactionExecutor:81] 2018-08-16 21:22:41,162 Tracker.java:165 - 
removing 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big from 
list of files tracked for test_2.edgestore
 
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,179 SSTableReader.java:2175 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
before barrier
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,180 SSTableReader.java:2181 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, after 
barrier
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,182 SSTableReader.java:2196 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
completed
{quote}
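The queue behavior described above is standard ThreadPoolExecutor semantics: with an unbounded work queue, offer() always succeeds, so the pool never grows past corePoolSize; only a bounded queue that fills up triggers creation of extra threads, up to maximumPoolSize. A minimal, self-contained sketch of that behavior, using a plain ThreadPoolExecutor rather than the Cassandra class (the class and method names here are illustrative, not from the Cassandra code base):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class QueueGrowthDemo {
    // Submit `tasks` blocking tasks and report how many worker threads the pool created.
    public static int poolSizeAfterSubmitting(BlockingQueue<Runnable> queue, int tasks) throws Exception {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 4, 60, TimeUnit.SECONDS, queue);
        CountDownLatch block = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++)
            pool.execute(() -> { try { block.await(); } catch (InterruptedException ignored) { } });
        int size = pool.getPoolSize(); // extra workers are created synchronously inside execute()
        block.countDown();             // release the blocked tasks so the pool can drain
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return size;
    }

    public static void main(String[] args) throws Exception {
        // Unbounded queue: offer() always succeeds, so the pool stays at corePoolSize = 1.
        System.out.println(poolSizeAfterSubmitting(new LinkedBlockingQueue<>(), 6)); // prints 1
        // Bounded queue (capacity 2): once it fills, further submissions spawn threads up to max = 4.
        System.out.println(poolSizeAfterSubmitting(new ArrayBlockingQueue<>(2), 6)); // prints 4
    }
}
```

Note that ScheduledThreadPoolExecutor hard-codes its internal DelayedWorkQueue (unbounded) and Integer.MAX_VALUE as the maximum pool size, so the only lever it exposes for more parallelism is a larger corePoolSize, not a bounded queue.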

  was:
We use cassandra as backend storage for Janusgraph. when we loading huge data 
(~2 billion vertex, ~10 billion edges), we met some problems.

 

At first, we use STCS as compaction strategy , but met below exception.  we 
checked the value of  "max memory lock" is unlimited and "file map count" is 1 
million, these values should enough for loading data. last we found this 
problem is caused by the virtual memory are all cosumed by cassandra.  So not 
additional virtual memory can be used by compaction task , and below exception 
is thrown out.   
{quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 
JVMStabilityInspector.javv
 a:74 - OutOfMemory error letting the JVM handle the error:
 java.lang.OutOfMemoryError: Map failed
{quote}
So, we change compaction strategy to LCS, this change seems can resolve the 
virtual memory problem. But we found another problem : Many sstables which has 
been compacted are still retained on disk,  these old sstables consume so many 
disk space, it's causing no enough disk for saving real data. and we found that 
many files like 

[jira] [Updated] (CASSANDRA-14653) The performance of "NonPeriodicTasks" pools defined in class ScheduledExecutors is low

2018-08-17 Thread Peter Xie (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Xie updated CASSANDRA-14653:
--
Description: 
We use Cassandra as backend storage for JanusGraph. When loading a huge data set (~2 billion vertices, ~10 billion edges), we ran into some problems.

At first we used STCS as the compaction strategy, but hit the exception below. We checked that "max memory lock" is unlimited and "file map count" is 1 million; these values should be sufficient for loading the data. Eventually we found that the problem was caused by Cassandra consuming all of the virtual memory, so no additional virtual memory was available for compaction tasks, and the exception below was thrown.
{quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 JVMStabilityInspector.java:74 - OutOfMemory error letting the JVM handle the error:
 java.lang.OutOfMemoryError: Map failed
{quote}
So we changed the compaction strategy to LCS, which seems to resolve the virtual memory problem. But we found another problem: many sstables that have already been compacted are still retained on disk. These old sstables consume so much disk space that there is not enough disk left for the real data, and many files like "mc_txn_compaction_xxx.log" are created under the data directory.

After some investigation, we found this problem is caused by the "NonPeriodicTasks" thread pool: the pool only ever uses one thread to process the cleanup tasks that run after compaction. The pool is an instance of DebuggableScheduledThreadPoolExecutor, which inherits from ScheduledThreadPoolExecutor.

Reading the code of DebuggableScheduledThreadPoolExecutor, we found that it uses an unbounded task queue with a core pool size of 1. I think using an unbounded queue here is wrong: with an unbounded queue the pool never grows beyond the core size, no matter how many tasks are backed up, because an unbounded queue is never full. I think a bounded queue should be used here, so that when the task load is heavy, more threads are created to process it.
{quote}public DebuggableScheduledThreadPoolExecutor(int corePoolSize, String threadPoolName, int priority)
{
    super(corePoolSize, new NamedThreadFactory(threadPoolName, priority));
    setRejectedExecutionHandler(rejectedExecutionHandler);
}

public ScheduledThreadPoolExecutor(int corePoolSize, ThreadFactory threadFactory)
{
    super(corePoolSize, Integer.MAX_VALUE, 0, NANOSECONDS, new DelayedWorkQueue(), threadFactory);
}
{quote}
Below is an example of a cleanup task after compaction: there is a nearly 3-hour delay before the file "mc-56525" is removed.
{quote} 

TRACE [CompactionExecutor:81] 2018-08-16 21:22:29,664 
LifecycleTransaction.java:363 - Staging for obsolescence 
BigTableReader(path='/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big-Data.db')
 ..
 TRACE [CompactionExecutor:81] 2018-08-16 21:22:41,162 Tracker.java:165 - 
removing 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big from 
list of files tracked for test_2.edgestore
 
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,179 SSTableReader.java:2175 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
before barrier
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,180 SSTableReader.java:2181 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, after 
barrier
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,182 SSTableReader.java:2196 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
completed
{quote}

  was:
We use cassandra as backend storage for Janusgraph. when we loading huge data 
(~2 billion vertex, ~10 billion edges), we met some problems.

 

At first, we use STCS as compaction strategy , but met below exception.  we 
checked the value of  "max memory lock" is unlimited and "file map count" is 1 
million, these values should enough for loading data. last we found this 
problem is caused by the virtual memory are all cosumed by cassandra.  So not 
additional virtual memory can be used by compaction task , and below exception 
is thrown out.   
{quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 
JVMStabilityInspector.javv
 a:74 - OutOfMemory error letting the JVM handle the error:
 java.lang.OutOfMemoryError: Map failed
{quote}
So, we change compaction strategy to LCS, this change seems can resolve the 
virtual memory problem. But we found another problem : Many sstables which has 
been compacted are still retained on disk,  these old sstables consume so many 
disk space, it's causing no enough disk for saving real data. and we found that 
many files like 

[jira] [Updated] (CASSANDRA-14653) The performance of "NonPeriodicTasks" pools defined in class ScheduledExecutors is low

2018-08-17 Thread Peter Xie (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Xie updated CASSANDRA-14653:
--
Description: 
We use Cassandra as backend storage for JanusGraph. When loading a huge data set (~2 billion vertices, ~10 billion edges), we ran into some problems.

At first we used STCS as the compaction strategy, but hit the exception below. We checked that "max memory lock" is unlimited and "file map count" is 1 million; these values should be sufficient for loading the data. Eventually we found that the problem was caused by Cassandra consuming all of the virtual memory, so no additional virtual memory was available for compaction tasks, and the exception below was thrown.
{quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 JVMStabilityInspector.java:74 - OutOfMemory error letting the JVM handle the error:
 java.lang.OutOfMemoryError: Map failed
{quote}
So we changed the compaction strategy to LCS, which seems to resolve the virtual memory problem. But we found another problem: many sstables that have already been compacted are still retained on disk. These old sstables consume so much disk space that there is not enough disk left for the real data, and many files like "mc_txn_compaction_xxx.log" are created under the data directory.

After some investigation, we found this problem is caused by the "NonPeriodicTasks" thread pool: the pool only ever uses one thread to process cleanup and compaction tasks. The pool is an instance of DebuggableScheduledThreadPoolExecutor, which inherits from ScheduledThreadPoolExecutor.

Reading the code of DebuggableScheduledThreadPoolExecutor, we found that it uses an unbounded task queue with a core pool size of 1. Why is an unbounded queue used here for queuing submitted tasks? With an unbounded queue the pool never grows beyond the core size, no matter how many tasks are backed up, because an unbounded queue is never full. I think a bounded queue should be used here, so that when the task load is heavy, more threads are created to process it.
{quote}public DebuggableScheduledThreadPoolExecutor(int corePoolSize, String threadPoolName, int priority)
{
    super(corePoolSize, new NamedThreadFactory(threadPoolName, priority));
    setRejectedExecutionHandler(rejectedExecutionHandler);
}

public ScheduledThreadPoolExecutor(int corePoolSize, ThreadFactory threadFactory)
{
    super(corePoolSize, Integer.MAX_VALUE, 0, NANOSECONDS, new DelayedWorkQueue(), threadFactory);
}
{quote}
Below is an example of a cleanup task after compaction: there is a nearly 3-hour delay before the file "mc-56525" is removed.
{quote} 

TRACE [CompactionExecutor:81] 2018-08-16 21:22:29,664 
LifecycleTransaction.java:363 - Staging for obsolescence 
BigTableReader(path='/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big-Data.db')
 ..
 TRACE [CompactionExecutor:81] 2018-08-16 21:22:41,162 Tracker.java:165 - 
removing 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big from 
list of files tracked for test_2.edgestore
 
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,179 SSTableReader.java:2175 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
before barrier
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,180 SSTableReader.java:2181 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, after 
barrier
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,182 SSTableReader.java:2196 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
completed
{quote}

  was:
We use cassandra as backend storage for Janusgraph. when we loading huge data 
(~2 billion vertex, ~10 billion edges), we met some problems.

 

At first, we use STCS as compaction strategy , but met below exception.  we 
checked the value of  "max memory lock" is unlimited and "file map count" is 1 
million, these values should enough for loading data. last we found this 
problem is caused by the virtual memory are all cosumed by cassandra.  So not 
additional virtual memory can be used by compaction task , and below exception 
is thrown out.   
{quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 
JVMStabilityInspector.javv
 a:74 - OutOfMemory error letting the JVM handle the error:
 java.lang.OutOfMemoryError: Map failed
{quote}
So, we change compaction strategy to LCS, this change seems can resolve the 
virtual memory problem. But we found another problem : Many sstables which has 
been compacted are still retained on disk,  these old sstables consume so many 
disk space, it's causing no enough disk for saving real data. and we found that 
many files like 

[jira] [Updated] (CASSANDRA-14653) The performance of "NonPeriodicTasks" pools defined in class ScheduledExecutors is low

2018-08-17 Thread Peter Xie (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Xie updated CASSANDRA-14653:
--
Description: 
We use Cassandra as backend storage for JanusGraph. When loading a huge data set (~2 billion vertices, ~10 billion edges), we ran into some problems.

At first we used STCS as the compaction strategy, but hit the exception below. We checked that "max memory lock" is unlimited and "file map count" is 1 million; these values should be sufficient for loading the data. Eventually we found that the problem was caused by Cassandra consuming all of the virtual memory, so no additional virtual memory was available for compaction tasks, and the exception below was thrown.
{quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 JVMStabilityInspector.java:74 - OutOfMemory error letting the JVM handle the error:
 java.lang.OutOfMemoryError: Map failed
{quote}
So we changed the compaction strategy to LCS, which seems to resolve the virtual memory problem. But we found another problem: many sstables that have already been compacted are still retained on disk. These old sstables consume so much disk space that there is not enough disk left for the real data, and many files like "mc_txn_compaction_xxx.log" are created under the data directory.

After some investigation, we found this problem is caused by the "NonPeriodicTasks" thread pool: the pool only ever uses one thread to process the cleanup tasks that run after compaction. The pool is an instance of DebuggableScheduledThreadPoolExecutor, which inherits from ScheduledThreadPoolExecutor.

Reading the code of DebuggableScheduledThreadPoolExecutor, we found that it uses an unbounded task queue with a core pool size of 1. Why is an unbounded queue used here for queuing submitted tasks? With an unbounded queue the pool never grows beyond the core size, no matter how many tasks are backed up, because an unbounded queue is never full. I think a bounded queue should be used here, so that when the task load is heavy, more threads are created to process it.
{quote}public DebuggableScheduledThreadPoolExecutor(int corePoolSize, String threadPoolName, int priority)
{
    super(corePoolSize, new NamedThreadFactory(threadPoolName, priority));
    setRejectedExecutionHandler(rejectedExecutionHandler);
}

public ScheduledThreadPoolExecutor(int corePoolSize, ThreadFactory threadFactory)
{
    super(corePoolSize, Integer.MAX_VALUE, 0, NANOSECONDS, new DelayedWorkQueue(), threadFactory);
}
{quote}
Below is an example of a cleanup task after compaction: there is a nearly 3-hour delay before the file "mc-56525" is removed.
{quote} 

TRACE [CompactionExecutor:81] 2018-08-16 21:22:29,664 
LifecycleTransaction.java:363 - Staging for obsolescence 
BigTableReader(path='/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big-Data.db')
 ..
 TRACE [CompactionExecutor:81] 2018-08-16 21:22:41,162 Tracker.java:165 - 
removing 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big from 
list of files tracked for test_2.edgestore
 
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,179 SSTableReader.java:2175 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
before barrier
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,180 SSTableReader.java:2181 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, after 
barrier
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,182 SSTableReader.java:2196 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
completed
{quote}

  was:
We use cassandra as backend storage for Janusgraph. when we loading huge data 
(~2 billion vertex, ~10 billion edges), we met some problems.

 

At first, we use STCS as compaction strategy , but met below exception.  we 
checked the value of  "max memory lock" is unlimited and "file map count" is 1 
million, these values should enough for loading data. last we found this 
problem is caused by the virtual memory are all cosumed by cassandra.  So not 
additional virtual memory can be used by compaction task , and below exception 
is thrown out.   
{quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 
JVMStabilityInspector.javv
 a:74 - OutOfMemory error letting the JVM handle the error:
 java.lang.OutOfMemoryError: Map failed
{quote}
So, we change compaction strategy to LCS, this change seems can resolve the 
virtual memory problem. But we found another problem : Many sstables which has 
been compacted are still retained on disk,  these old sstables consume so many 
disk space, it's causing no enough disk for saving real data. and we found that 
many files like 

[jira] [Updated] (CASSANDRA-14653) The performance of "NonPeriodicTasks" pools defined in class ScheduledExecutors is low

2018-08-17 Thread Peter Xie (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Xie updated CASSANDRA-14653:
--
Description: 
We use Cassandra as backend storage for JanusGraph. When loading a huge data set (~2 billion vertices, ~10 billion edges), we ran into some problems.

At first we used STCS as the compaction strategy, but hit the exception below. We checked that "max memory lock" is unlimited and "file map count" is 1 million; these values should be sufficient for loading the data. Eventually we found that the problem was caused by Cassandra consuming all of the virtual memory, so no additional virtual memory was available for compaction tasks, and the exception below was thrown.
{quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 JVMStabilityInspector.java:74 - OutOfMemory error letting the JVM handle the error:
 java.lang.OutOfMemoryError: Map failed
{quote}
So we changed the compaction strategy to LCS, which seems to resolve the virtual memory problem. But we found another problem: many sstables that have already been compacted are still retained on disk. These old sstables consume so much disk space that there is not enough disk left for the real data, and many files like "mc_txn_compaction_xxx.log" are created under the data directory.

After some investigation, we found that this problem is caused by the "NonPeriodicTasks" thread pool: the pool only ever uses one thread to process cleanup and compaction tasks. The pool is an instance of DebuggableScheduledThreadPoolExecutor, which inherits from ScheduledThreadPoolExecutor.

Reading the code of DebuggableScheduledThreadPoolExecutor, we found that it uses an unbounded task queue with a core pool size of 1. Why is an unbounded queue used here for queuing submitted tasks? With an unbounded queue the pool never grows beyond the core size, no matter how many tasks are backed up, because an unbounded queue is never full. I think a bounded queue should be used here, so that when the task load is heavy, more threads are created to process it.
{quote}public DebuggableScheduledThreadPoolExecutor(int corePoolSize, String threadPoolName, int priority)
{
    super(corePoolSize, new NamedThreadFactory(threadPoolName, priority));
    setRejectedExecutionHandler(rejectedExecutionHandler);
}

public ScheduledThreadPoolExecutor(int corePoolSize, ThreadFactory threadFactory)
{
    super(corePoolSize, Integer.MAX_VALUE, 0, NANOSECONDS, new DelayedWorkQueue(), threadFactory);
}
{quote}
Below is an example of a cleanup task after compaction: there is a nearly 3-hour delay before the file "mc-56525" is removed.
{quote} 

TRACE [CompactionExecutor:81] 2018-08-16 21:22:29,664 
LifecycleTransaction.java:363 - Staging for obsolescence 
BigTableReader(path='/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big-Data.db')
 ..
 TRACE [CompactionExecutor:81] 2018-08-16 21:22:41,162 Tracker.java:165 - 
removing 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big from 
list of files tracked for test_2.edgestore
 
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,179 SSTableReader.java:2175 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
before barrier
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,180 SSTableReader.java:2181 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, after 
barrier
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,182 SSTableReader.java:2196 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
completed
{quote}

  was:
We use cassandra as backend storage for Janusgraph. when we loading huge data 
(~2 billion vertex, ~10 billion edges), we met some problems.

 

At first, we use STCS as compaction strategy , but met below exception.  we 
checked the value of  "max memory lock" is unlimited and "file map count" is 1 
million, these values should enough for loading data. last we found this 
problem is caused by the virtual memory are all cosumed by cassandra.  So not 
additional virtual memory can be used by compaction task , and below exception 
is thrown out.   
{quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 
JVMStabilityInspector.javv
 a:74 - OutOfMemory error letting the JVM handle the error:
 java.lang.OutOfMemoryError: Map failed
{quote}
So, we change compaction strategy to LCS, this change seems can resolve the 
virtual memory problem. But we found another problem : Many sstables which has 
been compacted are still retained on disk,  these old sstable consume so many 
disk space, it's causing no enough disk for saving real data. we found that so 
many files 

[jira] [Updated] (CASSANDRA-14653) The performance of "NonPeriodicTasks" pools defined in class ScheduledExecutors is low

2018-08-17 Thread Peter Xie (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Xie updated CASSANDRA-14653:
--
Description: 
We use cassandra as backend storage for Janusgraph. when we loading huge data 
(~2 billion vertex, ~10 billion edges), we met some problems.

 

At first, we use STCS as compaction strategy , but met below exception.  we 
checked the value of  "max memory lock" is unlimited and "file map count" is 1 
million, these values should enough for loading data. last we found this 
problem is caused by the virtual memory are all cosumed by cassandra.  So not 
additional virtual memory can be used by compaction task , and below exception 
is thrown out.   
{quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 
JVMStabilityInspector.javv
 a:74 - OutOfMemory error letting the JVM handle the error:
 java.lang.OutOfMemoryError: Map failed
{quote}
So we changed the compaction strategy to LCS, which seems to resolve the 
virtual memory problem. But we found another problem: many SSTables that have 
already been compacted are still retained on disk. These old SSTables consume 
so much disk space that there is not enough disk left for the real data. We 
also found many files like "mc_txn_compaction_xxx.log" created under the data 
directory. 

After some investigation, we found this problem is caused by the 
"NonPeriodicTasks" thread pool. This pool always uses only one thread for 
processing the cleanup tasks that follow compaction. The pool is an instance 
of DebuggableScheduledThreadPoolExecutor, which inherits from 
ScheduledThreadPoolExecutor.

Reading the code of DebuggableScheduledThreadPoolExecutor, we found that it 
uses an unbounded task queue and a core pool size of 1. Why use an unbounded 
queue for the submitted tasks? With an unbounded queue, the pool never creates 
threads beyond the core size, no matter how many tasks are blocked in the 
queue, because an unbounded queue never reports that it is full. I think a 
bounded queue should be used here, so that under heavy load more threads are 
created to process the tasks. 
{quote}public DebuggableScheduledThreadPoolExecutor(int corePoolSize, String threadPoolName, int priority)
{
    super(corePoolSize, new NamedThreadFactory(threadPoolName, priority));
    setRejectedExecutionHandler(rejectedExecutionHandler);
}

public ScheduledThreadPoolExecutor(int corePoolSize,
                                   ThreadFactory threadFactory)
{
    super(corePoolSize, Integer.MAX_VALUE, 0, NANOSECONDS,
          new DelayedWorkQueue(), threadFactory);
}
{quote}
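
The queue behavior described above can be sketched with the plain JDK ThreadPoolExecutor (this is illustrative code, not Cassandra's; the class name UnboundedQueueDemo is made up for the demo): with an unbounded LinkedBlockingQueue the pool never starts more than corePoolSize threads, because maximumPoolSize is only consulted when the queue refuses an offer, and an unbounded queue never does.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative sketch, not Cassandra code: with an unbounded queue a
// ThreadPoolExecutor stays at corePoolSize no matter how many tasks pile up,
// because extra threads are only created when the queue rejects a task.
public class UnboundedQueueDemo {
    public static void main(String[] args) throws Exception {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1,                            // corePoolSize = 1, like NonPeriodicTasks
                8,                            // maximumPoolSize, never reached
                0, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>()); // unbounded: never reports "full"

        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < 100; i++) {
            pool.execute(() -> {
                try { release.await(); } catch (InterruptedException ignored) { }
            });
        }
        // One core thread runs the first task; the other 99 tasks just queue up.
        System.out.println("pool size = " + pool.getPoolSize());
        System.out.println("queued = " + pool.getQueue().size());

        release.countDown();
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```

Swapping in a bounded queue (e.g. new ArrayBlockingQueue<>(10)) lets the same pool grow toward maximumPoolSize once the queue fills, which is the change proposed above. Note, however, that ScheduledThreadPoolExecutor hard-codes its internal DelayedWorkQueue and ignores maximumPoolSize, so in practice a scheduled pool can only be widened by raising corePoolSize, not by bounding the queue.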
 Below is a case of the cleanup task after compaction: there is a delay of 
nearly 3 hours before file "mc-56525" is removed. 
{quote} 

TRACE [CompactionExecutor:81] 2018-08-16 21:22:29,664 
LifecycleTransaction.java:363 - Staging for obsolescence 
BigTableReader(path='/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big-Data.db')
 ..
 TRACE [CompactionExecutor:81] 2018-08-16 21:22:41,162 Tracker.java:165 - 
removing 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big from 
list of files tracked for test_2.edgestore
 
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,179 SSTableReader.java:2175 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
before barrier
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,180 SSTableReader.java:2181 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, after 
barrier
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,182 SSTableReader.java:2196 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
completed
{quote}
 

 

 

 

 

 

 

[jira] [Updated] (CASSANDRA-14653) The performance of "NonPeriodicTasks" pools defined in class ScheduledExecutors is low

2018-08-17 Thread Peter Xie (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Xie updated CASSANDRA-14653:
--
Description: 
We use cassandra as backend storage for Janusgraph. when we loading huge data 
(~2 billion vertex, ~10 billion edges), we met some problems.

 

At first, we use STCS as compaction strategy , but met below exception.  we 
checked the value of  "max memory lock" is unlimited and "file map count" is 1 
million, these values should enough for loading data. last we found this 
problem is caused by the virtual memory are all cosumed by cassandra.  So not 
additional virtual memory can be used by compaction task , and below exception 
is thrown out.   
{quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 
JVMStabilityInspector.javv
 a:74 - OutOfMemory error letting the JVM handle the error:
 java.lang.OutOfMemoryError: Map failed
{quote}
So, we change compaction strategy to LCS, this change seems can resolve the 
virtual memory problem. But we found another problem : Many sstables which has 
been compacted are still retained on disk,  these old sstables consume so many 
disk space, it's causing no enough disk for saving real data. and we found that 
many files like "mc_txn_compaction_xxx.log" are created under the data 
directory. 

After some times' investigaton, we found that this problem is caused by 
"NonPeriodicTasks" thread pools.  this pools is always using only one thread 
for processing clean task and compaction. this thread pool is instanced with 
class DebuggableScheduledThreadPoolExecutor,

and DebuggableScheduledThreadPoolExecutor is inherit from class  
ScheduledThreadPoolExecutor.

By reading the code of class DebuggableScheduledThreadPoolExecutor,  found 
DebuggableScheduledThreadPoolExecutor is using an unbound task queue, and core 
pool size is 1.   Why here use the unbound queue for queuing submitted tasks?  
If we using unbound queue, the thread pool wouldn't  increasing thread even 
there so many task are blocked in queue, because unbound queue never would be 
full.  I think here should use bound queue, so when task is heavily, more 
threads would created for processing them. 
{quote}public DebuggableScheduledThreadPoolExecutor(int corePoolSize, String 
threadPoolName, int priority)
 Unknown macro: \{ super(corePoolSize, new NamedThreadFactory(threadPoolName, 
priority)); setRejectedExecutionHandler(rejectedExecutionHandler); }
  

public ScheduledThreadPoolExecutor(int corePoolSize,
 ThreadFactory threadFactory)
 Unknown macro: \{ super(corePoolSize, Integer.MAX_VALUE, 0, NANOSECONDS, new 
DelayedWorkQueue(), threadFactory); }
{quote}
 Below is the case about clean task after compaction.  there nearly 3 hours 
delay for removing file "mc-56525". 
{quote} 

TRACE [CompactionExecutor:81] 2018-08-16 21:22:29,664 
LifecycleTransaction.java:363 - Staging for obsolescence 
BigTableReader(path='/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big-Data.db')
 ..
 TRACE [CompactionExecutor:81] 2018-08-16 21:22:41,162 Tracker.java:165 - 
removing 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big from 
list of files tracked for test_2.edgestore
 
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,179 SSTableReader.java:2175 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
before barrier
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,180 SSTableReader.java:2181 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, after 
barrier
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,182 SSTableReader.java:2196 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
completed
{quote}
 

 

 

 

 

 

 

  was:
We use cassandra as backend storage for Janusgraph. when we loading huge data 
(~2 billion vertex, ~10 billion edges), we met some problems.

 

At first, we use STCS as compaction strategy , but met below exception.  we 
checked the value of  "max memory lock" is unlimited and "file map count" is 1 
million, these values should enough for loading data. last we found this 
problem is caused by the virtual memory are all cosumed by cassandra.  So not 
additional virtual memory can be used by compaction task , and below exception 
is thrown out.   
{quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 
JVMStabilityInspector.javv
 a:74 - OutOfMemory error letting the JVM handle the error:
 java.lang.OutOfMemoryError: Map failed
{quote}
So, we change compaction strategy to LCS, this change seems can resolve the 
virtual memory problem. But we found another problem : Many sstables which has 
been compacted are still retained on disk,  these old sstables consume so many 
disk space, it's causing no enough disk for saving real data. we found that so 
many files 

[jira] [Updated] (CASSANDRA-14653) The performance of "NonPeriodicTasks" pools defined in class ScheduledExecutors is low

2018-08-17 Thread Peter Xie (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Xie updated CASSANDRA-14653:
--
Description: 
We use cassandra as backend storage for Janusgraph. when we loading huge data 
(~2 billion vertex, ~10 billion edges), we met some problems.

 

At first, we use STCS as compaction strategy , but met below exception.  we 
checked the value of  "max memory lock" is unlimited and "file map count" is 1 
million, these values should enough for loading data. last we found this 
problem is caused by the virtual memory are all cosumed by cassandra.  So not 
additional virtual memory can be used by compaction task , and below exception 
is throwed out.   
{quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 
JVMStabilityInspector.javv
 a:74 - OutOfMemory error letting the JVM handle the error:
 java.lang.OutOfMemoryError: Map failed
{quote}
So, we change compaction strategy to LCS, this change seems can resolve the 
virtual memory problem. But we found another problem : Many sstables which has 
been compacted are still retained on disk, at last these old sstable consume so 
many disk space, it's causing no enough disk for saving real data. we found 
that so many files like "mc_txn_compaction_xxx.log" are created under the data 
directory. 

After some times' investigaton, we found that this problem is caused by 
"NonPeriodicTasks" thread pools.  this pools is always using only one thread 
for processing clean task and compaction. this thread pool is instanced with 
class DebuggableScheduledThreadPoolExecutor,

and DebuggableScheduledThreadPoolExecutor is inherit from class  
ScheduledThreadPoolExecutor.

By reading the code of class DebuggableScheduledThreadPoolExecutor,  found 
DebuggableScheduledThreadPoolExecutor is using an unbound task queue, and core 
pool size is 1.   Why here use the unbound queue for queuing submitted tasks?  
If we using unbound queue, the thread pool wouldn't  increasing thread even 
there so many task are blocked in queue, because unbound queue never would be 
full.  I think here should use bound queue, so when task is heavily, more 
threads would created for processing them. 
{quote}public DebuggableScheduledThreadPoolExecutor(int corePoolSize, String 
threadPoolName, int priority)
 Unknown macro: \{ super(corePoolSize, new NamedThreadFactory(threadPoolName, 
priority)); setRejectedExecutionHandler(rejectedExecutionHandler); }
  

public ScheduledThreadPoolExecutor(int corePoolSize,
 ThreadFactory threadFactory)
 Unknown macro: \{ super(corePoolSize, Integer.MAX_VALUE, 0, NANOSECONDS, new 
DelayedWorkQueue(), threadFactory); }
{quote}
 Below is the case about clean task after compaction.  there nearly 3 hours 
delay for removing file "mc-56525". 
{quote} 

TRACE [CompactionExecutor:81] 2018-08-16 21:22:29,664 
LifecycleTransaction.java:363 - Staging for obsolescence 
BigTableReader(path='/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big-Data.db')
 ..
 TRACE [CompactionExecutor:81] 2018-08-16 21:22:41,162 Tracker.java:165 - 
removing 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big from 
list of files tracked for test_2.edgestore
 
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,179 SSTableReader.java:2175 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
before barrier
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,180 SSTableReader.java:2181 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, after 
barrier
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,182 SSTableReader.java:2196 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
completed
{quote}
 

 

 

 

 

 

 

  was:
We use cassandra as backend storage for Janusgraph. when we loading huge data 
(~2 billion vertex, ~10 billion edges), we met some problems.

 

At first, we use STCS as compaction strategy , but met below exception.  we 
checked the value of  "max memory lock" is unlimited and "file map count" is 1 
million, these values should enough for loading data. last we found this 
problem is caused by the virtual memory are all cosumed by cassandra.  So not 
additional virtual memory can be used by compaction task , and below exception 
is thrower out.   
{quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 
JVMStabilityInspector.javv
 a:74 - OutOfMemory error letting the JVM handle the error:
 java.lang.OutOfMemoryError: Map failed
{quote}
So, we change compaction strategy to LCS, this change seems can resolve the 
virtual memory problem. But we found another problem : Many sstables which has 
been compacted are still retained on disk, at last these old sstable consume so 
many disk space, it's causing no enough disk for saving real data. we found 
that so 

[jira] [Updated] (CASSANDRA-14653) The performance of "NonPeriodicTasks" pools defined in class ScheduledExecutors is low

2018-08-17 Thread Peter Xie (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Xie updated CASSANDRA-14653:
--
Description: 
We use cassandra as backend storage for Janusgraph. when we loading huge data 
(~2 billion vertex, ~10 billion edges), we met some problems.

 

At first, we use STCS as compaction strategy , but met below exception.  we 
checked the value of  "max memory lock" is unlimited and "file map count" is 1 
million, these values should enough for loading data. last we found this 
problem is caused by the virtual memory are all cosumed by cassandra.  So not 
additional virtual memory can be used by compaction task , and below exception 
is thrown out.   
{quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 
JVMStabilityInspector.javv
 a:74 - OutOfMemory error letting the JVM handle the error:
 java.lang.OutOfMemoryError: Map failed
{quote}
So, we change compaction strategy to LCS, this change seems can resolve the 
virtual memory problem. But we found another problem : Many sstables which has 
been compacted are still retained on disk, at last these old sstable consume so 
many disk space, it's causing no enough disk for saving real data. we found 
that so many files like "mc_txn_compaction_xxx.log" are created under the data 
directory. 

After some times' investigaton, we found that this problem is caused by 
"NonPeriodicTasks" thread pools.  this pools is always using only one thread 
for processing clean task and compaction. this thread pool is instanced with 
class DebuggableScheduledThreadPoolExecutor,

and DebuggableScheduledThreadPoolExecutor is inherit from class  
ScheduledThreadPoolExecutor.

By reading the code of class DebuggableScheduledThreadPoolExecutor,  found 
DebuggableScheduledThreadPoolExecutor is using an unbound task queue, and core 
pool size is 1.   Why here use the unbound queue for queuing submitted tasks?  
If we using unbound queue, the thread pool wouldn't  increasing thread even 
there so many task are blocked in queue, because unbound queue never would be 
full.  I think here should use bound queue, so when task is heavily, more 
threads would created for processing them. 
{quote}public DebuggableScheduledThreadPoolExecutor(int corePoolSize, String 
threadPoolName, int priority)
 Unknown macro: \{ super(corePoolSize, new NamedThreadFactory(threadPoolName, 
priority)); setRejectedExecutionHandler(rejectedExecutionHandler); }
  

public ScheduledThreadPoolExecutor(int corePoolSize,
 ThreadFactory threadFactory)
 Unknown macro: \{ super(corePoolSize, Integer.MAX_VALUE, 0, NANOSECONDS, new 
DelayedWorkQueue(), threadFactory); }
{quote}
 Below is the case about clean task after compaction.  there nearly 3 hours 
delay for removing file "mc-56525". 
{quote} 

TRACE [CompactionExecutor:81] 2018-08-16 21:22:29,664 
LifecycleTransaction.java:363 - Staging for obsolescence 
BigTableReader(path='/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big-Data.db')
 ..
 TRACE [CompactionExecutor:81] 2018-08-16 21:22:41,162 Tracker.java:165 - 
removing 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big from 
list of files tracked for test_2.edgestore
 
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,179 SSTableReader.java:2175 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
before barrier
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,180 SSTableReader.java:2181 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, after 
barrier
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,182 SSTableReader.java:2196 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
completed
{quote}
 

 

 

 

 

 

 

  was:
We use cassandra as backend storage for Janusgraph. when we loading huge data 
(~2 billion vertex, ~10 billion edges), we met some problems.

 

At first, we use STCS as compaction strategy , but met below exception.  we 
checked the value of  "max memory lock" is unlimited and "file map count" is 1 
million, these values should enough for loading data. last we found this 
problem is caused by the virtual memory are all cosumed by cassandra.  So not 
additional virtual memory can be used by compaction task , and below exception 
is throwed out.   
{quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 
JVMStabilityInspector.javv
 a:74 - OutOfMemory error letting the JVM handle the error:
 java.lang.OutOfMemoryError: Map failed
{quote}
So, we change compaction strategy to LCS, this change seems can resolve the 
virtual memory problem. But we found another problem : Many sstables which has 
been compacted are still retained on disk, at last these old sstable consume so 
many disk space, it's causing no enough disk for saving real data. we found 
that so 

[jira] [Updated] (CASSANDRA-14653) The performance of "NonPeriodicTasks" pools defined in class ScheduledExecutors is low

2018-08-17 Thread Peter Xie (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Xie updated CASSANDRA-14653:
--
Description: 
We use cassandra as backend storage for Janusgraph. when we loading huge data 
(~2 billion vertex, ~10 billion edges), we met some problems.

 

At first, we use STCS as compaction strategy , but met below exception.  we 
checked the value of  "max memory lock" is unlimited and "file map count" is 1 
million, these values should enough for loading data. last , we found this 
problem is caused by the virtual memory are all cosumed by cassandra.  So not 
additional virtual memory can be used by compaction task , and below exception 
is thrower out.   
{quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 
JVMStabilityInspector.javv
 a:74 - OutOfMemory error letting the JVM handle the error:
 java.lang.OutOfMemoryError: Map failed
{quote}
So, we change compaction strategy to LCS, this change seems can resolve the 
virtual memory problem. But we found another problem : Many sstables which has 
been compacted are still retained on disk, at last these old sstable consume so 
many disk space, it's causing no enough disk for saving real data. we found 
that so many files like "mc_txn_compaction_xxx.log" are created under the data 
directory. 

After some times' investigaton, we found that this problem is caused by 
"NonPeriodicTasks" thread pools.  this pools is always using only one thread 
for processing clean task and compaction. this thread pool is instanced with 
class DebuggableScheduledThreadPoolExecutor,

and DebuggableScheduledThreadPoolExecutor is inherit from class  
ScheduledThreadPoolExecutor.

By reading the code of class DebuggableScheduledThreadPoolExecutor,  found 
DebuggableScheduledThreadPoolExecutor is using an unbound task queue, and core 
pool size is 1.   Why here use the unbound queue for queuing submitted tasks?  
If we using unbound queue, the thread pool wouldn't  increasing thread even 
there so many task are blocked in queue, because unbound queue never would be 
full.  I think here should use bound queue, so when task is heavily, more 
threads would created for processing them. 
{quote}public DebuggableScheduledThreadPoolExecutor(int corePoolSize, String 
threadPoolName, int priority)
 Unknown macro: \{ super(corePoolSize, new NamedThreadFactory(threadPoolName, 
priority)); setRejectedExecutionHandler(rejectedExecutionHandler); }
  

public ScheduledThreadPoolExecutor(int corePoolSize,
 ThreadFactory threadFactory)
 Unknown macro: \{ super(corePoolSize, Integer.MAX_VALUE, 0, NANOSECONDS, new 
DelayedWorkQueue(), threadFactory); }
{quote}
 Below is the case about clean task after compaction.  there nearly 3 hours 
delay for removing file "mc-56525". 
{quote} 

TRACE [CompactionExecutor:81] 2018-08-16 21:22:29,664 
LifecycleTransaction.java:363 - Staging for obsolescence 
BigTableReader(path='/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big-Data.db')
 ..
 TRACE [CompactionExecutor:81] 2018-08-16 21:22:41,162 Tracker.java:165 - 
removing 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big from 
list of files tracked for test_2.edgestore
 
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,179 SSTableReader.java:2175 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
before barrier
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,180 SSTableReader.java:2181 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, after 
barrier
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,182 SSTableReader.java:2196 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
completed
{quote}
 

 

 

 

 

 

 

  was:
We use cassandra as backend storage for Janusgraph. when we loading huge data 
(~2 billion vertex, ~10 billion edges), we met some problems.

 

At first, we use STCS as compaction strategy , but met below exception.  we 
checked the value of  "max memory lock" is unlimited and "file map count" is 1 
million, these values should enough for loading. last , we found this problem 
is caused by the virtual memory are all cosumed by cassandra.  So not 
additional virtual memory can be used by compaction task , and below exception 
is thrower out.   
{quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 
JVMStabilityInspector.javv
 a:74 - OutOfMemory error letting the JVM handle the error:
 java.lang.OutOfMemoryError: Map failed
{quote}
So, we change compaction strategy to LCS, this change seems can resolve the 
virtual memory problem. But we found another problem : Many sstables which has 
been compacted are still retained on disk, at last these old sstable consume so 
many disk space, it's causing no enough disk for saving real data. we found 
that so 

[jira] [Updated] (CASSANDRA-14653) The performance of "NonPeriodicTasks" pools defined in class ScheduledExecutors is low

2018-08-17 Thread Peter Xie (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Xie updated CASSANDRA-14653:
--
Description: 
We use cassandra as backend storage for Janusgraph. when we loading huge data 
(~2 billion vertex, ~10 billion edges), we met some problems.

 

At first, we use STCS as compaction strategy , but met below exception.  we 
checked the value of  "max memory lock" is unlimited and "file map count" is 1 
million, these values should enough for loading data. last we found this 
problem is caused by the virtual memory are all cosumed by cassandra.  So not 
additional virtual memory can be used by compaction task , and below exception 
is thrower out.   
{quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 
JVMStabilityInspector.javv
 a:74 - OutOfMemory error letting the JVM handle the error:
 java.lang.OutOfMemoryError: Map failed
{quote}
So, we change compaction strategy to LCS, this change seems can resolve the 
virtual memory problem. But we found another problem : Many sstables which has 
been compacted are still retained on disk, at last these old sstable consume so 
many disk space, it's causing no enough disk for saving real data. we found 
that so many files like "mc_txn_compaction_xxx.log" are created under the data 
directory. 

After some times' investigaton, we found that this problem is caused by 
"NonPeriodicTasks" thread pools.  this pools is always using only one thread 
for processing clean task and compaction. this thread pool is instanced with 
class DebuggableScheduledThreadPoolExecutor,

and DebuggableScheduledThreadPoolExecutor is inherit from class  
ScheduledThreadPoolExecutor.

By reading the code of class DebuggableScheduledThreadPoolExecutor,  found 
DebuggableScheduledThreadPoolExecutor is using an unbound task queue, and core 
pool size is 1.   Why here use the unbound queue for queuing submitted tasks?  
If we using unbound queue, the thread pool wouldn't  increasing thread even 
there so many task are blocked in queue, because unbound queue never would be 
full.  I think here should use bound queue, so when task is heavily, more 
threads would created for processing them. 
{quote}public DebuggableScheduledThreadPoolExecutor(int corePoolSize, String 
threadPoolName, int priority)
 Unknown macro: \{ super(corePoolSize, new NamedThreadFactory(threadPoolName, 
priority)); setRejectedExecutionHandler(rejectedExecutionHandler); }
  

public ScheduledThreadPoolExecutor(int corePoolSize,
 ThreadFactory threadFactory)
 Unknown macro: \{ super(corePoolSize, Integer.MAX_VALUE, 0, NANOSECONDS, new 
DelayedWorkQueue(), threadFactory); }
{quote}
 Below is the case about clean task after compaction.  there nearly 3 hours 
delay for removing file "mc-56525". 
{quote} 

TRACE [CompactionExecutor:81] 2018-08-16 21:22:29,664 
LifecycleTransaction.java:363 - Staging for obsolescence 
BigTableReader(path='/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big-Data.db')
 ..
 TRACE [CompactionExecutor:81] 2018-08-16 21:22:41,162 Tracker.java:165 - 
removing 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big from 
list of files tracked for test_2.edgestore
 
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,179 SSTableReader.java:2175 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
before barrier
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,180 SSTableReader.java:2181 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, after 
barrier
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,182 SSTableReader.java:2196 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
completed
{quote}
 

 

 

 

 

 

 

  was:
We use cassandra as backend storage for Janusgraph. when we loading huge data 
(~2 billion vertex, ~10 billion edges), we met some problems.

 

At first, we use STCS as compaction strategy , but met below exception.  we 
checked the value of  "max memory lock" is unlimited and "file map count" is 1 
million, these values should enough for loading data. last , we found this 
problem is caused by the virtual memory are all cosumed by cassandra.  So not 
additional virtual memory can be used by compaction task , and below exception 
is thrower out.   
{quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 
JVMStabilityInspector.javv
 a:74 - OutOfMemory error letting the JVM handle the error:
 java.lang.OutOfMemoryError: Map failed
{quote}
So, we change compaction strategy to LCS, this change seems can resolve the 
virtual memory problem. But we found another problem : Many sstables which has 
been compacted are still retained on disk, at last these old sstable consume so 
many disk space, it's causing no enough disk for saving real data. we found 
that 

[jira] [Updated] (CASSANDRA-14653) The performance of "NonPeriodicTasks" pools defined in class ScheduledExecutors is low

2018-08-17 Thread Peter Xie (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Xie updated CASSANDRA-14653:
--
Description: 
We use cassandra as backend storage for Janusgraph. when we loading huge data 
(~2 billion vertex, ~10 billion edges), we met some problems.

 

At first, we use STCS as compaction strategy , but met below exception.  we 
checked the value of  "max memory lock" is unlimited and "file map count" is 1 
million, these values should enough for loading. last , we found this problem 
is caused by the virtual memory are all cosumed by cassandra.  So not 
additional virtual memory can be used by compaction task , and below exception 
is thrower out.   
{quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 
JVMStabilityInspector.javv
 a:74 - OutOfMemory error letting the JVM handle the error:
 java.lang.OutOfMemoryError: Map failed
{quote}
So, we change compaction strategy to LCS, this change seems can resolve the 
virtual memory problem. But we found another problem : Many sstables which has 
been compacted are still retained on disk, at last these old sstable consume so 
many disk space, it's causing no enough disk for saving real data. we found 
that so many files like "mc_txn_compaction_xxx.log" are created under the data 
directory. 

After some times' investigaton, we found that this problem is caused by 
"NonPeriodicTasks" thread pools.  this pools is always using only one thread 
for processing clean task and compaction. this thread pool is instanced with 
class DebuggableScheduledThreadPoolExecutor,

and DebuggableScheduledThreadPoolExecutor is inherit from class  
ScheduledThreadPoolExecutor.

By reading the code of class DebuggableScheduledThreadPoolExecutor,  found 
DebuggableScheduledThreadPoolExecutor is using an unbound task queue, and core 
pool size is 1.   Why here use the unbound queue for queuing submitted tasks?  
If we using unbound queue, the thread pool wouldn't  increasing thread even 
there so many task are blocked in queue, because unbound queue never would be 
full.  I think here should use bound queue, so when task is heavily, more 
threads would created for processing them. 
{quote}public DebuggableScheduledThreadPoolExecutor(int corePoolSize, String threadPoolName, int priority)
{
    super(corePoolSize, new NamedThreadFactory(threadPoolName, priority));
    setRejectedExecutionHandler(rejectedExecutionHandler);
}

public ScheduledThreadPoolExecutor(int corePoolSize,
                                   ThreadFactory threadFactory)
{
    super(corePoolSize, Integer.MAX_VALUE, 0, NANOSECONDS,
          new DelayedWorkQueue(), threadFactory);
}
{quote}
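As a side note, the no-growth behavior described above can be reproduced with plain JDK classes, independent of Cassandra. A minimal sketch (class name and task counts are illustrative): even with a large backlog, a ScheduledThreadPoolExecutor with corePoolSize=1 never spawns a second worker, because its internal DelayedWorkQueue is unbounded and therefore never reports itself full.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;

public class UnboundedQueueDemo {
    // Observe the worker count while one blocked task occupies the single
    // core thread and a backlog of tasks waits in the unbounded queue.
    static int poolSizeUnderBacklog() {
        ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(1);
        CountDownLatch gate = new CountDownLatch(1);
        pool.execute(() -> {           // occupies the only core thread
            try { gate.await(); } catch (InterruptedException ignored) {}
        });
        for (int i = 0; i < 100; i++)  // backlog: queued, never rejected
            pool.execute(() -> {});
        try { Thread.sleep(200); } catch (InterruptedException ignored) {}
        int size = pool.getPoolSize(); // still 1: the queue is never "full"
        gate.countDown();
        pool.shutdown();
        return size;
    }

    public static void main(String[] args) {
        System.out.println("poolSize=" + poolSizeUnderBacklog()); // prints poolSize=1
    }
}
```

This matches the report: the backlog of 100 tasks sits in the queue while a single thread drains it.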
Below is an example of the cleanup task after compaction: there is a nearly 
three-hour delay before the file "mc-56525" is removed.
{quote} 

TRACE [CompactionExecutor:81] 2018-08-16 21:22:29,664 
LifecycleTransaction.java:363 - Staging for obsolescence 
BigTableReader(path='/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big-Data.db')
 ..
 TRACE [CompactionExecutor:81] 2018-08-16 21:22:41,162 Tracker.java:165 - 
removing 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big from 
list of files tracked for test_2.edgestore
 
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,179 SSTableReader.java:2175 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
before barrier
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,180 SSTableReader.java:2181 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, after 
barrier
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,182 SSTableReader.java:2196 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
completed
{quote}
  was:
We use cassandra as backend storage for Janusgraph. when we loading huge data 
(~2 billion vertex, ~10 billion edges), we met some problems.

 

At first, we use STCS as compaction strategy , but met below exception.  we 
checked the value of  "max memory lock" is unlimited and "map count" is 1 
million, these values should enough for loading. last , we found this problem 
is caused by the virtual memory are all cosumed by cassandra.  So not 
additional virtual memory can be used by compaction task , and below exception 
is thrower out.   
{quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 
JVMStabilityInspector.javv
 a:74 - OutOfMemory error letting the JVM handle the error:
 java.lang.OutOfMemoryError: Map failed
{quote}
So, we change compaction strategy to LCS, this change seems can resolve the 
virtual memory problem. But we found another problem : Many sstables which has 
been compacted are still retained on disk, at last these old sstable consume so 
many disk space, it's causing no enough disk for saving real data. we found 
that so many files 

[jira] [Updated] (CASSANDRA-14653) The performance of "NonPeriodicTasks" pools defined in class ScheduledExecutors is low

2018-08-17 Thread Peter Xie (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Xie updated CASSANDRA-14653:
--
Description: 
We use Cassandra as the backend storage for JanusGraph. While loading a large 
dataset (~2 billion vertices, ~10 billion edges), we ran into some problems.

 

At first we used STCS as the compaction strategy, but hit the exception below. 
We checked that "max memory lock" is unlimited and "map count" is 1 million; 
these values should be sufficient for the load. Eventually we found that 
Cassandra had consumed all of the available virtual memory, so no additional 
virtual memory could be mapped by the compaction tasks, and the exception below 
was thrown.
{quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 
JVMStabilityInspector.java:74 - OutOfMemory error letting the JVM handle the error:
java.lang.OutOfMemoryError: Map failed
{quote}
We then switched the compaction strategy to LCS, which seems to resolve the 
virtual-memory problem. But we found another problem: many SSTables that had 
already been compacted were still retained on disk, and eventually these old 
SSTables consumed so much disk space that not enough was left for the real data. 
We also found a large number of files like "mc_txn_compaction_xxx.log" under the 
data directory.

After some investigation, we found that this problem is caused by the 
"NonPeriodicTasks" thread pool: it always uses only one thread to process the 
cleanup tasks that follow compaction. The pool is instantiated from class 
DebuggableScheduledThreadPoolExecutor, which inherits from 
ScheduledThreadPoolExecutor.

Reading the code of DebuggableScheduledThreadPoolExecutor, we found that it uses 
an unbounded task queue and a core pool size of 1. Why use an unbounded queue for 
the submitted tasks? With an unbounded queue the pool never grows beyond its core 
size, no matter how many tasks are backed up, because the queue is never full. 
I think a bounded queue should be used here, so that more worker threads are 
created when the task load is heavy.
{quote}public DebuggableScheduledThreadPoolExecutor(int corePoolSize, String threadPoolName, int priority)
{
    super(corePoolSize, new NamedThreadFactory(threadPoolName, priority));
    setRejectedExecutionHandler(rejectedExecutionHandler);
}

public ScheduledThreadPoolExecutor(int corePoolSize,
                                   ThreadFactory threadFactory)
{
    super(corePoolSize, Integer.MAX_VALUE, 0, NANOSECONDS,
          new DelayedWorkQueue(), threadFactory);
}
{quote}
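For comparison, the bounded-queue behavior the report asks for can be sketched with a plain JDK ThreadPoolExecutor (all parameters here are illustrative): once the bounded queue fills, the pool grows from its core size toward its maximum. Note that a plain ThreadPoolExecutor cannot schedule delayed tasks, which NonPeriodicTasks relies on, so this only illustrates the queue/growth interaction and is not a drop-in replacement.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedQueueDemo {
    // Submit 6 blocking tasks to a pool with core=1, max=4 and a queue of 2:
    // 1 runs on the core thread, 2 wait in the queue, and the remaining 3
    // each force a new worker, growing the pool to its maximum of 4.
    static int poolSizeUnderBacklog() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 4, 60, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(2));
        CountDownLatch gate = new CountDownLatch(1);
        for (int i = 0; i < 6; i++)
            pool.execute(() -> {
                try { gate.await(); } catch (InterruptedException ignored) {}
            });
        int size = pool.getPoolSize(); // grew to 4, unlike the unbounded case
        gate.countDown();
        pool.shutdown();
        return size;
    }

    public static void main(String[] args) {
        System.out.println("poolSize=" + poolSizeUnderBacklog()); // prints poolSize=4
    }
}
```

Worker creation happens synchronously inside execute(), so the pool size is deterministic here: it grows exactly when the queue rejects an offer.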
Below is an example of the cleanup task after compaction: there is a nearly 
three-hour delay before the file "mc-56525" is removed.
{quote} 

TRACE [CompactionExecutor:81] 2018-08-16 21:22:29,664 
LifecycleTransaction.java:363 - Staging for obsolescence 
BigTableReader(path='/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big-Data.db')
 ..
 TRACE [CompactionExecutor:81] 2018-08-16 21:22:41,162 Tracker.java:165 - 
removing 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big from 
list of files tracked for test_2.edgestore
 
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,179 SSTableReader.java:2175 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
before barrier
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,180 SSTableReader.java:2181 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, after 
barrier
 TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,182 SSTableReader.java:2196 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
completed
{quote}
  was:
We use cassandra as backend storage for Janusgraph. when we loading huge data 
(~2 billion vertex, ~10 billion edges), we met an problems.

 

At first, we use STCS as compaction strategy , but met below exception.  we 
checked the value of  "max memory lock" is unlimited and "map count" is 1 
million, these values should enough for loading. last , we found this problem 
is caused by the virtual memory are all cosumed by cassandra.  So not 
additional virtual memory can be used by compaction task , and below exception 
is thrower out.   
{quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 
JVMStabilityInspector.javv
a:74 - OutOfMemory error letting the JVM handle the error:
java.lang.OutOfMemoryError: Map failed
{quote}
So, we change compaction strategy to LCS, this change seems can resolve the 
virtual memory problem. But we found another problem : Many sstables which has 
been compacted are still retained on disk, at last these old sstable consume so 
many disk space, it's causing no enough disk for saving real data. we found 
that so many files like 

[jira] [Created] (CASSANDRA-14653) The performance of "NonPeriodicTasks" pools defined in class ScheduledExecutors is low

2018-08-17 Thread Peter Xie (JIRA)
Peter Xie created CASSANDRA-14653:
-

 Summary: The performance of "NonPeriodicTasks" pools defined in 
class ScheduledExecutors is low
 Key: CASSANDRA-14653
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14653
 Project: Cassandra
  Issue Type: Improvement
  Components: Compaction
 Environment: Cassandra nodes :

3 nodes, 330G physical memory per node , and four data directory (ssd)  per 
node.
Reporter: Peter Xie


We use Cassandra as the backend storage for JanusGraph. While loading a large 
dataset (~2 billion vertices, ~10 billion edges), we ran into some problems.

 

At first we used STCS as the compaction strategy, but hit the exception below. 
We checked that "max memory lock" is unlimited and "map count" is 1 million; 
these values should be sufficient for the load. Eventually we found that 
Cassandra had consumed all of the available virtual memory, so no additional 
virtual memory could be mapped by the compaction tasks, and the exception below 
was thrown.
{quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 
JVMStabilityInspector.java:74 - OutOfMemory error letting the JVM handle the error:
java.lang.OutOfMemoryError: Map failed
{quote}
We then switched the compaction strategy to LCS, which seems to resolve the 
virtual-memory problem. But we found another problem: many SSTables that had 
already been compacted were still retained on disk, and eventually these old 
SSTables consumed so much disk space that not enough was left for the real data. 
We also found a large number of files like "mc_txn_compaction_xxx.log" under the 
data directory.

After some investigation, we found that this problem is caused by the 
"NonPeriodicTasks" thread pool: it always uses only one thread to process the 
cleanup tasks that follow compaction. The pool is instantiated from class 
DebuggableScheduledThreadPoolExecutor, which inherits from 
ScheduledThreadPoolExecutor.

Reading the code of DebuggableScheduledThreadPoolExecutor, we found that it uses 
an unbounded task queue and a core pool size of 1. Why use an unbounded queue for 
the submitted tasks? With an unbounded queue the pool never grows beyond its core 
size, no matter how many tasks are backed up, because the queue is never full. 
I think a bounded queue should be used here, so that more worker threads are 
created when the task load is heavy.
{quote}public DebuggableScheduledThreadPoolExecutor(int corePoolSize, String 
threadPoolName, int priority)
{
 super(corePoolSize, new NamedThreadFactory(threadPoolName, priority));
 setRejectedExecutionHandler(rejectedExecutionHandler);
}
 

public ScheduledThreadPoolExecutor(int corePoolSize,
 ThreadFactory threadFactory) {
 super(corePoolSize, Integer.MAX_VALUE, 0, NANOSECONDS,
 new DelayedWorkQueue(), threadFactory);
}
{quote}
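One way to keep the delayed scheduling that NonPeriodicTasks needs while still growing under load is to split the two concerns: a single scheduler thread does the delay bookkeeping and hands the task bodies to a growable pool with a bounded queue. This is a hypothetical sketch, not Cassandra's actual code; the class name and all pool parameters are invented for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class TieredNonPeriodicTasks {
    // One thread tracks delays; the actual work runs on a growable pool
    // whose bounded queue lets it add workers (up to 4) under heavy load.
    private static final ScheduledThreadPoolExecutor scheduler =
            new ScheduledThreadPoolExecutor(1);
    private static final ThreadPoolExecutor workers = new ThreadPoolExecutor(
            1, 4, 60, TimeUnit.SECONDS,
            new ArrayBlockingQueue<>(128),
            new ThreadPoolExecutor.CallerRunsPolicy()); // backpressure when saturated

    static ScheduledFuture<?> schedule(Runnable task, long delayMillis) {
        // The scheduler thread only forwards the task; it is never blocked
        // by a long-running cleanup, so delays stay accurate.
        return scheduler.schedule(() -> workers.execute(task),
                                  delayMillis, TimeUnit.MILLISECONDS);
    }

    // Schedule one task and report whether it ran within the timeout.
    static boolean runOnce() {
        CountDownLatch done = new CountDownLatch(1);
        schedule(done::countDown, 10);
        try { return done.await(5, TimeUnit.SECONDS); }
        catch (InterruptedException e) { return false; }
    }

    public static void main(String[] args) {
        System.out.println("ran=" + runOnce());
        scheduler.shutdown();
        workers.shutdown();
    }
}
```

With this split, a three-hour pile-up of tidy tasks behind one slow cleanup (as in the trace below) would instead spill onto extra worker threads.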
Below is an example of the cleanup task after compaction: there is a nearly 
three-hour delay before the file "mc-56525" is removed.
{quote} 

TRACE [CompactionExecutor:81] 2018-08-16 21:22:29,664 
LifecycleTransaction.java:363 - Staging for obsolescence 
BigTableReader(path='/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big-Data.db')
..
TRACE [CompactionExecutor:81] 2018-08-16 21:22:41,162 Tracker.java:165 - 
removing 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big from 
list of files tracked for test_2.edgestore

TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,179 SSTableReader.java:2175 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
before barrier
TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,180 SSTableReader.java:2181 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, after 
barrier
TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,182 SSTableReader.java:2196 - 
Async instance tidier for 
/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
completed
{quote}


--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[Cassandra Wiki] Update of "Committers" by BenjaminLerer

2018-08-17 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "Committers" page has been changed by BenjaminLerer:
https://wiki.apache.org/cassandra/Committers?action=diff=78=79

  ||Josh Mckenzie ||Jul 2014 ||Datastax ||PMC member ||
  ||Robert Stupp ||Jan 2015 ||Datastax || ||
  ||Sam Tunnicliffe ||May 2015 ||Apple || ||
- ||Benjamin Lerer ||Jul 2015 ||Datastax || ||
+ ||Benjamin Lerer ||Jul 2015 ||Datastax ||PMC member ||
  ||Carl Yeksigian ||Jan 2016 ||Datastax ||Also a 
[[http://thrift.apache.org|Thrift]] committer ||
  ||Stefania Alborghetti ||Apr 2016 ||Datastax || ||
  ||Jeff Jirsa ||June 2016 ||Apple|| PMC member ||

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org