abossert opened a new issue #8696: A more granular method of retaining recently 
completed tasks
URL: https://github.com/apache/incubator-druid/issues/8696
 
 
   ### Description
   
   The sys.tasks table has an existing setting, 
"druid.indexer.storage.recentlyFinishedThreshold", that dictates how long 
tasks are kept in history (for viewing within the console or API).
   
   When an unusually large number of tasks are tracked within that retention 
period, significant performance issues can result, especially when trying to 
work with a large number of failed tasks.
   
   I would like to propose two changes:
   
   1. Add "druid.indexer.storage.recentlyFinishedThresholdPeriod" and 
"druid.indexer.storage.recentlyFinishedThresholdCount" settings so that users 
can specify a threshold in terms of both time and raw count.  The server would 
be expected to honor whichever of the two settings triggers first.
   
   2. In addition to the new, more granular settings, allow both of those 
settings to be applied per task status (e.g. SUCCESS, FAILED, etc.), with a 
prioritization scheme for them.  This way one could, for example, discard 
succeeded tasks sooner than failed ones.  Additionally, it might be useful to 
be able to throttle failed tasks, perhaps by sampling.
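   
   To make the intended semantics concrete, here is a minimal sketch of how the 
two proposals above could combine. It is not Druid code: the names 
FinishedTask, RetentionRule and retained_tasks are hypothetical, and max_age 
and max_count merely stand in for the proposed ThresholdPeriod and 
ThresholdCount settings. Sampling and the prioritization scheme are left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass(frozen=True)
class FinishedTask:
    """Hypothetical record of a completed task."""
    task_id: str
    status: str  # e.g. "SUCCESS" or "FAILED"
    finished_at: datetime


@dataclass(frozen=True)
class RetentionRule:
    """Hypothetical stand-in for the proposed period and count settings."""
    max_age: Optional[timedelta] = None  # None means no time limit
    max_count: Optional[int] = None      # None means no count limit


def retained_tasks(
    tasks: List[FinishedTask],
    rules: Dict[str, RetentionRule],
    default_rule: RetentionRule,
    now: datetime,
) -> List[FinishedTask]:
    """Return the tasks still retained, newest first.

    Each status is limited independently. A task is dropped as soon as
    either limit excludes it, i.e. whichever limit triggers first wins.
    """
    by_status: Dict[str, List[FinishedTask]] = {}
    for task in tasks:
        by_status.setdefault(task.status, []).append(task)

    kept: List[FinishedTask] = []
    for status, group in by_status.items():
        rule = rules.get(status, default_rule)
        # Newest first, so the count limit discards the oldest entries.
        group = sorted(group, key=lambda t: t.finished_at, reverse=True)
        if rule.max_age is not None:
            group = [t for t in group if now - t.finished_at <= rule.max_age]
        if rule.max_count is not None:
            group = group[: rule.max_count]
        kept.extend(group)
    return sorted(kept, key=lambda t: t.finished_at, reverse=True)


# Example: keep successes for 24 hours; keep failures for 7 days,
# but never more than the 2 most recent ones.
now = datetime(2020, 1, 1, 12, 0)
tasks = [
    FinishedTask("ok-1", "SUCCESS", now - timedelta(hours=1)),
    FinishedTask("ok-2", "SUCCESS", now - timedelta(hours=30)),
    FinishedTask("bad-1", "FAILED", now - timedelta(hours=2)),
    FinishedTask("bad-2", "FAILED", now - timedelta(hours=3)),
    FinishedTask("bad-3", "FAILED", now - timedelta(hours=4)),
]
rules = {
    "SUCCESS": RetentionRule(max_age=timedelta(hours=24)),
    "FAILED": RetentionRule(max_age=timedelta(days=7), max_count=2),
}
kept = retained_tasks(tasks, rules, RetentionRule(), now)
print([t.task_id for t in kept])  # ['ok-1', 'bad-1', 'bad-2']
```

   Here "ok-2" is dropped by the time limit and "bad-3" by the count limit, so 
a burst of failures cannot grow the retained set beyond the configured count.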
   
   The reason for providing such fine-grained control over these settings is 
that, in the worst case, a large number of failed tasks could cause a 
self-inflicted denial of service, or a degradation of system performance severe 
enough to have a similar effect.  When the glut of tasks is itself due to 
failures, this compounds the difficulty and time taken to troubleshoot the 
system.

