vanzin commented on issue #24982: [SPARK-28181][CORE] Add a filter interface to KVStore to speed up the entities retrieve
URL: https://github.com/apache/spark/pull/24982#issuecomment-506782293

I haven't read the code (just some of the comments), but I wonder why you're using this approach to implement SPARK-28183. With this approach you have to load (i.e. deserialize, in the case of the disk store) all tasks for a particular stage in order to filter them.

While I think the API you're adding here is OK (it's basically what `KVUtils.viewToSeq` does, and could replace it), it will be terribly slow for large stages (think of a stage with 100k tasks).

SPARK-28183 would be far more efficient if you instead scanned the tasks based on the status you want, applied the offset and limit, and sorted on a different property after that (because of the offset and limit, you wouldn't have many elements to sort).
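
To make the suggestion concrete, here is a minimal, self-contained Java sketch of the idea. It does not use the real KVStore API: `StatusScanSketch`, `Task`, `byStatus`, `tasksRead` and `page` are all hypothetical names, and the in-memory map only stands in for an index on task status. The point is that only the requested page of tasks is ever read, and the sort on the second property (duration here, as an assumed example) runs on at most `limit` elements. Note that the page is chosen in index order and sorted afterwards, so this is not the same as sorting every matching task and then paging.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class StatusScanSketch {
    // Hypothetical stand-in for a stored task record.
    static final class Task {
        final long id;
        final String status;
        final long durationMs;

        Task(long id, String status, long durationMs) {
            this.id = id;
            this.status = status;
            this.durationMs = durationMs;
        }
    }

    // Hypothetical stand-in for an index on status: status -> tasks in insertion order.
    private final Map<String, List<Task>> byStatus = new TreeMap<>();

    // Counts how many tasks were actually read (i.e. would have been deserialized).
    int tasksRead = 0;

    void write(Task t) {
        byStatus.computeIfAbsent(t.status, k -> new ArrayList<>()).add(t);
    }

    // Scan by status, apply offset and limit, then sort only the resulting page.
    List<Task> page(String status, int offset, int limit) {
        List<Task> bucket = byStatus.getOrDefault(status, Collections.emptyList());
        int from = Math.min(Math.max(offset, 0), bucket.size());
        int to = (int) Math.min((long) from + Math.max(limit, 0), bucket.size());
        List<Task> page = new ArrayList<>(bucket.subList(from, to));
        tasksRead += page.size();
        page.sort(Comparator.comparingLong((Task t) -> t.durationMs));
        return page;
    }

    public static void main(String[] args) {
        StatusScanSketch store = new StatusScanSketch();
        // A stage with 100k tasks, 1% of which failed.
        for (long i = 0; i < 100_000; i++) {
            String status = (i % 100 == 0) ? "FAILED" : "SUCCESS";
            store.write(new Task(i, status, (i * 7919) % 1000));
        }
        List<Task> page = store.page("FAILED", 10, 5);
        System.out.println(page.size() + " tasks returned, " + store.tasksRead + " tasks read");
    }
}
```

In this sketch a request for five failed tasks touches five records, whereas the load-then-filter approach would read all 100k tasks of the stage before discarding most of them.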
