keith-turner opened a new pull request, #4432: URL: https://github.com/apache/accumulo/pull/4432
For ondemand tablets, the client tablet cache caches tablets without a location. A bug fixed in #4280 caused the cache to do a metadata table lookup for each mutation when tablets had no location. That fix was only partial: after it, more metadata lookups than needed were still being done. There was also a bug with the batch scanner that #4280 did not address.

Before this change, when tablets had no location, the batch scanner would do a metadata lookup for each range passed to it (more precisely, the client tablet cache would do these metadata lookups on behalf of the batch scanner). For example, if the batch scanner was given 10K ranges that all fell in a single tablet without a location, it would do 10K metadata lookups. After this change it will do a single metadata lookup in that situation.

This change minimizes the metadata lookups done by the batch writer and batch scanner. The fix is to make sure that cached entries populated by looking up one range or mutation are used by subsequent range or mutation lookups, even if there is no location present. This is done by always reusing cache entries that were created after work started on a batch of mutations or ranges. Cache entries without a location that existed before work started on a batch are ignored. Reusing the entries created after work on a batch started is what keeps metadata lookups to a minimum.

A test was also added to ensure the client tablet cache does not do excessive metadata table lookups. If this test had existed, it would have caught the problem.
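The reuse rule described in this pull request can be illustrated with a small, self-contained sketch. It is not the Accumulo implementation: the class `TabletCacheSketch`, its `CachedTablet` type, the logical clock, and the simulated metadata lookup are all hypothetical names chosen for illustration, and the model assumes a single tablet with end row "z" that never has a location.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified model of a client tablet cache (not Accumulo code).
class TabletCacheSketch {

  // A cached tablet; location is null when the tablet is not hosted anywhere.
  static final class CachedTablet {
    final String endRow;
    final String location;
    final long createdAt; // logical time at which this cache entry was created

    CachedTablet(String endRow, String location, long createdAt) {
      this.endRow = endRow;
      this.location = location;
      this.createdAt = createdAt;
    }
  }

  private final TreeMap<String, CachedTablet> cache = new TreeMap<>();
  private long clock = 0;  // logical clock standing in for a real timer
  int metadataLookups = 0; // counts simulated metadata table reads

  // Simulated metadata read. Assumption: one tablet covers every row up to
  // "z" and it has no location.
  private CachedTablet lookupInMetadata(String row) {
    metadataLookups++;
    CachedTablet tablet = new CachedTablet("z", null, ++clock);
    cache.put(tablet.endRow, tablet);
    return tablet;
  }

  // Resolves every row of one batch to a cached tablet.
  List<CachedTablet> lookupBatch(List<String> rows) {
    long batchStart = ++clock; // entries created after this point are reusable
    List<CachedTablet> result = new ArrayList<>();
    for (String row : rows) {
      Map.Entry<String, CachedTablet> entry = cache.ceilingEntry(row);
      CachedTablet tablet = entry == null ? null : entry.getValue();
      // Reuse an entry if it has a location, or if it was created while
      // working on this batch. Location-less entries that predate the
      // batch are ignored and looked up again.
      boolean usable = tablet != null
          && (tablet.location != null || tablet.createdAt > batchStart);
      if (!usable) {
        tablet = lookupInMetadata(row);
      }
      result.add(tablet);
    }
    return result;
  }

  public static void main(String[] args) {
    TabletCacheSketch tabletCache = new TabletCacheSketch();
    List<String> rows = new ArrayList<>();
    for (int i = 0; i < 10_000; i++) {
      rows.add("row" + i);
    }
    tabletCache.lookupBatch(rows);
    System.out.println("lookups after first batch: " + tabletCache.metadataLookups);
    tabletCache.lookupBatch(rows);
    System.out.println("lookups after second batch: " + tabletCache.metadataLookups);
  }
}
```

In this model, 10K rows that fall in one tablet without a location cost one simulated metadata lookup per batch rather than one per row. A second batch triggers one more lookup because the location-less entry left over from the first batch predates it and is therefore ignored.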