[ https://issues.apache.org/jira/browse/CASSANDRA-11218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15502288#comment-15502288 ]

Jeff Jirsa edited comment on CASSANDRA-11218 at 9/19/16 5:04 AM:
-----------------------------------------------------------------

I have a version of this patch that I'll be submitting very soon, but while I 
wait for internal approvals, I'd like to describe the implementation so that 
those of you who care about this can give conceptual feedback before I submit 
the patch for review.

I'm implementing this as a priority queue that uses a custom comparator with 
three tiers:

* Operation type priority (so that certain types - like index rebuild - can run 
at higher priority, and others - scrub / cleanup / verify - at much lower 
priority). This is defined as an int field on the OperationType enum, and can 
be overridden via a system property. There's a lot of opportunity for 
bike-shedding in picking the exact priorities - I've chosen (highest priority 
to lowest):

** Anticompaction / Index Summary Redistribution
** Index Build / View Build
** Key Cache Save / Row Cache Save / Counter Cache Save
** User Defined Compaction
** Compaction (including maximal/major compaction)
** Tombstone Compaction
** Scrub / Cleanup / Upgrade SSTables
** Verify

* Sub-type priority (to allow compaction tasks within a type to take precedence 
over one another - enabling behavior like CASSANDRA-6288 ). This is defined as 
a long and set by the compaction strategies; by default I'm setting it to the 
bytes on disk of the source sstables, so larger transactions (as measured when 
the task was created) are preferred over smaller ones.

* Timestamp priority, where tasks with the same type / sub-type values are 
served FIFO.
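To make the first tier concrete, here's a rough sketch of what the enum-backed 
priority with a system-property override could look like. The enum constants, 
the int values, and the `compaction.priority.*` property name are all 
illustrative assumptions on my part, not the actual patch:

```java
// Illustrative only: an OperationType-style enum whose default priority can be
// overridden per type with a system property, e.g. -Dcompaction.priority.SCRUB=5
// Lower value = higher priority. The exact int values are invented here.
enum OperationType {
    ANTICOMPACTION(10),
    INDEX_SUMMARY_REDISTRIBUTION(10),
    INDEX_BUILD(20),
    VIEW_BUILD(20),
    KEY_CACHE_SAVE(30),
    ROW_CACHE_SAVE(30),
    COUNTER_CACHE_SAVE(30),
    USER_DEFINED_COMPACTION(40),
    COMPACTION(50),
    TOMBSTONE_COMPACTION(60),
    SCRUB(70),
    CLEANUP(70),
    UPGRADE_SSTABLES(70),
    VERIFY(80);

    private final int defaultPriority;

    OperationType(int defaultPriority) {
        this.defaultPriority = defaultPriority;
    }

    // Integer.getInteger reads the named system property, falling back to the
    // enum's default when the property is unset.
    int priority() {
        return Integer.getInteger("compaction.priority." + name(), defaultPriority);
    }
}
```

With this shape, an operator could demote scrub further (or promote verify) at 
startup without a code change, just by setting the property.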

The implementation here is pretty straightforward: we create a new interface 
that exposes the three priority values, then extend AbstractCompactionTask and 
de-anonymize the handful of anonymous runnables / wrapped runnables / callables 
to implement that interface so they can be sorted in the PriorityBlockingQueue.
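As a sketch of how the three tiers compose in the comparator and feed the 
queue - the interface and class names here are placeholders I've invented, and 
the real patch would hang this off AbstractCompactionTask rather than a 
standalone record:

```java
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;

// Hypothetical interface exposing the three priority tiers.
interface PrioritizedTask {
    int typePriority();        // tier 1: lower value runs first
    long subTypePriority();    // tier 2: larger transactions preferred
    long creationTimeMillis(); // tier 3: FIFO among equals
}

// Stand-in for a de-anonymized compaction task.
record SimpleTask(int typePriority, long subTypePriority, long creationTimeMillis)
        implements PrioritizedTask {}

class TaskComparator implements Comparator<PrioritizedTask> {
    @Override
    public int compare(PrioritizedTask a, PrioritizedTask b) {
        int byType = Integer.compare(a.typePriority(), b.typePriority());
        if (byType != 0)
            return byType;
        // Reversed on purpose: a larger sub-type value (bytes on disk of the
        // source sstables) should sort ahead of a smaller one.
        int bySubType = Long.compare(b.subTypePriority(), a.subTypePriority());
        if (bySubType != 0)
            return bySubType;
        // Same type and sub-type: oldest first (FIFO).
        return Long.compare(a.creationTimeMillis(), b.creationTimeMillis());
    }
}

class QueueDemo {
    // Returns the type priority of whichever task the queue serves first.
    static int firstTypePriority() {
        PriorityBlockingQueue<PrioritizedTask> queue =
                new PriorityBlockingQueue<>(16, new TaskComparator());
        queue.add(new SimpleTask(50, 1_000, 1)); // ordinary compaction
        queue.add(new SimpleTask(20, 500, 2));   // index build: dequeued first
        return queue.poll().typePriority();
    }
}
```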

There may be an opportunity to get clever about protecting against starvation 
on under-resourced systems, such as increasing a task's type priority as it 
ages, but I'm leaving that as a potential future optimization - I'm not sure 
it's really needed, and it makes reasoning about compaction harder, but maybe 
there's a use case where it's necessary.
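Purely as illustration of that deferred idea (not something in the patch), an 
age-based boost could be as small as adjusting the effective type priority 
before comparison; the class name and one-level-per-minute interval below are 
invented:

```java
// Illustrative sketch of age-based anti-starvation: a task gains one priority
// level (i.e. its numeric priority drops by one) per interval it has waited,
// clamped at the highest priority (0).
final class AgingPriority {
    static final long BOOST_INTERVAL_MILLIS = 60_000; // assumed knob: 1 level/minute

    static int effectivePriority(int basePriority, long enqueuedAtMillis, long nowMillis) {
        long levelsGained = (nowMillis - enqueuedAtMillis) / BOOST_INTERVAL_MILLIS;
        return (int) Math.max(0, basePriority - levelsGained);
    }
}
```

The comparator would then compare `effectivePriority(...)` instead of the raw 
type priority - which is exactly the extra moving part that makes compaction 
behavior harder to reason about.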

I expect to submit the patch early this week - if either of you (Sankalp / 
Marcus) finds that this approach conflicts with your expectations, or if you 
want to volunteer to review, let me know.



> Prioritize Secondary Index rebuild
> ----------------------------------
>
>                 Key: CASSANDRA-11218
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11218
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: sankalp kohli
>            Assignee: Jeff Jirsa
>            Priority: Minor
>
> We have seen secondary index rebuilds get stuck behind other compactions 
> during bootstrap and other operations, which causes things to not finish. We 
> should prioritize index rebuilds via a separate thread pool or a priority 
> queue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
