On Thu, Jul 9, 2020 at 2:06 PM Chun-Yu Shei <cs...@google.com> wrote:
> Hmm, that's strange... it seems to have made it to the list archives:
> https://archives.gentoo.org/gentoo-portage-dev/message/a4db905a64e3c1f6d88c4876e8291a65
>
> (but it is entirely possible that I used "git send-email" incorrectly)

Ahhh, it's visible there; I'll blame gMail ;)

-A

> On Thu, Jul 9, 2020 at 2:04 PM Alec Warner <anta...@gentoo.org> wrote:
>>
>> On Thu, Jul 9, 2020 at 12:03 AM Chun-Yu Shei <cs...@google.com> wrote:
>>
>>> Awesome! Here's a patch that adds @lru_cache to use_reduce, vercmp, and
>>> catpkgsplit. use_reduce was split into two functions, with the outer one
>>> converting lists/sets to tuples so they can be hashed, and making a copy
>>> of the returned list (since the caller sometimes modifies it). I tried
>>> to select cache sizes that minimized the increase in memory use while
>>> still providing about the same speedup as a cache of unbounded size.
>>> "emerge -uDvpU --with-bdeps=y @world" runtime decreases from 44.32s to
>>> 29.94s -- a 48% speedup -- while the maximum value of the RES column in
>>> htop increases from 280 MB to 290 MB.
>>>
>>> "emerge -ep @world" time decreases slightly from 18.77s to 17.93s, while
>>> the max observed RES value actually decreases from 228 MB to 214 MB
>>> (similar values were observed across a few before/after runs).
>>>
>>> Here are the cache hit stats, max observed RES memory, and runtime in
>>> seconds for various cache sizes in the update case. Caching for each
>>> function was tested independently (only one function had caching enabled
>>> at a time):
>>>
>>> catpkgsplit:
>>> CacheInfo(hits=1222233, misses=21419, maxsize=None, currsize=21419)
>>> 270 MB
>>> 39.217
>>>
>>> CacheInfo(hits=1218900, misses=24905, maxsize=10000, currsize=10000)
>>> 271 MB
>>> 39.112
>>>
>>> CacheInfo(hits=1212675, misses=31022, maxsize=5000, currsize=5000)
>>> 271 MB
>>> 39.217
>>>
>>> CacheInfo(hits=1207879, misses=35878, maxsize=2500, currsize=2500)
>>> 269 MB
>>> 39.438
>>>
>>> CacheInfo(hits=1199402, misses=44250, maxsize=1000, currsize=1000)
>>> 271 MB
>>> 39.348
>>>
>>> CacheInfo(hits=1149150, misses=94610, maxsize=100, currsize=100)
>>> 271 MB
>>> 39.487
>>>
>>> use_reduce:
>>> CacheInfo(hits=45326, misses=18660, maxsize=None, currsize=18561)
>>> 407 MB
>>> 35.77
>>>
>>> CacheInfo(hits=45186, misses=18800, maxsize=10000, currsize=10000)
>>> 353 MB
>>> 35.52
>>>
>>> CacheInfo(hits=44977, misses=19009, maxsize=5000, currsize=5000)
>>> 335 MB
>>> 35.31
>>>
>>> CacheInfo(hits=44691, misses=19295, maxsize=2500, currsize=2500)
>>> 318 MB
>>> 35.85
>>>
>>> CacheInfo(hits=44178, misses=19808, maxsize=1000, currsize=1000)
>>> 301 MB
>>> 36.39
>>>
>>> CacheInfo(hits=41211, misses=22775, maxsize=100, currsize=100)
>>> 299 MB
>>> 37.175
>>>
>>> I didn't bother collecting detailed stats for vercmp, since its
>>> inputs/outputs are quite small and don't cause much memory increase.
>>> Please let me know if there are any other suggestions/improvements (and
>>> thanks, Sid, for the lru_cache suggestion!).
>>
>> I don't see a patch attached; can you link to it?
>>
>> -A
>>
>>> Thanks,
>>> Chun-Yu