niyue commented on code in PR #40041:
URL: https://github.com/apache/arrow/pull/40041#discussion_r1487603910


##########
cpp/src/gandiva/cache.cc:
##########
@@ -23,11 +23,7 @@
 
 namespace gandiva {
 
-#ifdef GANDIVA_ENABLE_OBJECT_CODE_CACHE
 static const size_t DEFAULT_CACHE_SIZE = 500000;

Review Comment:
   Indeed, the unit in question pertains to the number of entries rather than 
bytes, but there's no misunderstanding of the cache size and the original value 
of `500` was also defined in terms of the number of entries. This PR aims to 
address an oversight by removing a previously missed flag.
   
   This link [[1]](https://github.com/apache/arrow/pull/11193#issue-1001547667) 
has more data about module cache vs. object code cache in Gandiva, and in the 
limited expressions tested, the memory is down to 0.8%~6% after using object 
code cache. 
   
   The current default value 500 is likely too small in production, and it will 
probably only use 8MB of memory using the stats in link [1], but you are right 
it seems 500000 is indeed too large (I have no idea how it was defined this way 
and I should do more calculation for it). According to the stats in link [1], 
when using module cache, it will probably take 750MB memory for 500 entries.
   
   Other initiatives [2] have attempted to enhance the cache eviction 
algorithm, but such attempts were reversed due to other issues [3]. I've 
previously reviewed these efforts and believe I have distinct ideas for 
advancing Gandiva's cache. I plan to propose a PR if I fully understand the 
workflow. To avoid complicating this PR, my goal is solely to refine the 
default value. Would it be acceptable to adopt a more conservative default 
value, such as `5000`/`10000`? Thanks.
   
   [1] https://github.com/apache/arrow/pull/11193#issue-1001547667 
   [2] https://github.com/apache/arrow/pull/10465
   [3] https://github.com/apache/arrow/pull/11957



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to