abossert opened a new issue #8696: A more granular method of retaining recently completed tasks
URL: https://github.com/apache/incubator-druid/issues/8696

### Description

The sys.tasks table has an existing setting, "druid.indexer.storage.recentlyFinishedThreshold", that dictates how long tasks are kept in history (for viewing within the console or API). When an unusually large number of tasks falls within that retention period, significant performance issues are possible, especially when working with a large number of failed tasks.

I would like to propose two changes:

1. Add "druid.indexer.storage.recentlyFinishedThresholdPeriod" and "druid.indexer.storage.recentlyFinishedThresholdCount" settings so that users can specify a threshold in terms of both time and raw count. The server would be expected to honor whichever of the two settings triggers first.
2. In addition to the new, more granular settings, allow both settings to be applied per task status (e.g. SUCCESS, FAILED, etc.), with a prioritization scheme across statuses. One could then, for example, discard succeeded tasks sooner than failed ones. It might also be useful to be able to throttle failed tasks, perhaps by sampling.

The reason for providing such fine-grained control is that a large number of failed tasks could, in the worst case, cause a self-inflicted denial of service, or a degradation of system performance severe enough to have a similar effect. This would compound the difficulty and time taken to troubleshoot the system when the glut of tasks is itself due to failures.
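
To make the proposed semantics concrete, here is a minimal Python sketch of "whichever of the two settings triggers first", applied per task status. It is an illustration only, not Druid's implementation: `FinishedTask`, `RetentionRule`, and `retained_tasks` are hypothetical names, and the prioritization scheme and sampling ideas from the proposal are not covered.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass(frozen=True)
class FinishedTask:
    """Hypothetical stand-in for a completed task record."""
    task_id: str
    status: str            # e.g. "SUCCESS" or "FAILED"
    finished_at: datetime


@dataclass(frozen=True)
class RetentionRule:
    """Hypothetical pairing of the two proposed thresholds; None disables one."""
    period: Optional[timedelta] = None  # cf. proposed ...recentlyFinishedThresholdPeriod
    count: Optional[int] = None         # cf. proposed ...recentlyFinishedThresholdCount


def retained_tasks(
    tasks: List[FinishedTask],
    rules: Dict[str, RetentionRule],
    default_rule: RetentionRule,
    now: datetime,
) -> List[FinishedTask]:
    """Return the tasks that survive both thresholds, newest first.

    A task is dropped as soon as either threshold excludes it, which is
    one reading of "honor whichever of the two settings triggers first".
    """
    # Group tasks by status so each status can have its own rule.
    by_status: Dict[str, List[FinishedTask]] = {}
    for task in tasks:
        by_status.setdefault(task.status, []).append(task)

    kept: List[FinishedTask] = []
    for status, group in by_status.items():
        rule = rules.get(status, default_rule)
        group.sort(key=lambda t: t.finished_at, reverse=True)  # newest first
        if rule.period is not None:
            cutoff = now - rule.period
            group = [t for t in group if t.finished_at >= cutoff]
        if rule.count is not None:
            group = group[: rule.count]
        kept.extend(group)

    kept.sort(key=lambda t: t.finished_at, reverse=True)
    return kept
```

With a rule such as `{"SUCCESS": RetentionRule(period=timedelta(hours=2)), "FAILED": RetentionRule(period=timedelta(days=1), count=3)}`, succeeded tasks would age out after two hours, while failed tasks would be kept for up to a day but never more than the three most recent.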
