jackylk commented on a change in pull request #3514: [FAQ]add faq for how to deal with trailing task
URL: https://github.com/apache/carbondata/pull/3514#discussion_r361909663

##########
File path: docs/faq.md
##########
@@ -227,6 +228,29 @@ This property will enable the DEBUG log for the CarbonLRUCache and UnsafeMemoryM
 **Note:** If `Removed entry from InMemory LRU cache` are frequently observed in logs, you may have to increase the configured LRU size. To observe the LRU cache from heap dump, check the heap used by CarbonLRUCache class.
+
+## How to deal with the trailing task in query?
+
+During the tuning process, it may be found that a few tasks slow down the overall query progress. If the amount of data processed is the same, people will naturally think about the impact of IO, CPU and network bandwidth. Usually these tests can't able to have a quick result. So we need a way to solve and deal with these problems more quickly. spark.locality.wait and spark.speculation configuration it's an attempt, which can make the task that executes overtime retry in other nodes as soon as possible, and finally the task that ends first will be used. This may lose some of the data locality, but the actual verification helps to reduce the time-consuming of the trailing task.

Review comment:
```suggestion
When tuning query performance, users may find that a few tasks slow down the overall query progress. To improve performance in such cases, users can tune spark.locality.wait and set spark.speculation=true to enable speculation in Spark, which launches additional attempts of a slow task and uses the result of whichever attempt finishes first. In addition, users can consider the following configurations to further improve performance in this case.
```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services