rtpsw commented on PR #34392:
URL: https://github.com/apache/arrow/pull/34392#issuecomment-1532094481

   > Can you give a bit more detail on what about MemoStore maintenance is not 
handled correctly? You can put code in the PR too if that is easier to align 
with the code. It has been a while since I looked at the future asof code, so a 
refresher is appreciated.
   
   It would take a good amount of detail to explain the interaction between the 
various moving parts of the code. I'd prefer to add comments to the code to 
explain. Would that work?
   
   > Can you remind me why we cache the hash in the first place? What is the 
case where we want to hash the same batch more than once?
   
   Hashes are used when the key is not a single fixed-width value that fits within 
64 bits, e.g., when there are multiple key columns; in that case, the hash of each 
row of the key columns is used as the grouping value. For performance, the code 
computes the hashes for all rows of the key columns of a given input batch once 
and caches the result. Later, whenever `GetKey` is invoked, which happens 
frequently, the key hash is looked up in the cache instead of being recomputed.
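
   To illustrate the caching idea, here is a minimal, self-contained sketch. It is 
not the actual Arrow code: `Batch`, `CachingKeyHasher`, `compute_count`, and the 
hash-combining step are hypothetical stand-ins, and only the name `GetKey` comes 
from the discussion above (its signature here is an assumption). The sketch hashes 
every row of a two-column key once per batch and serves later lookups from the 
cache.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for an input batch with two key columns of equal length.
struct Batch {
  std::vector<int64_t> key_a;
  std::vector<std::string> key_b;
};

// Hypothetical hasher that caches the per-row key hashes of the last batch seen.
class CachingKeyHasher {
 public:
  // Returns the hash of the key at `row`, hashing the whole batch only when
  // a different batch is seen. (Assumed signature, for illustration only.)
  uint64_t GetKey(const Batch& batch, std::size_t row) {
    if (cached_batch_ != &batch) {
      HashBatch(batch);
    }
    assert(row < hashes_.size());
    return hashes_[row];
  }

  // Number of times a full batch was hashed; exposed to show the cache works.
  int compute_count() const { return compute_count_; }

 private:
  void HashBatch(const Batch& batch) {
    assert(batch.key_a.size() == batch.key_b.size());
    hashes_.resize(batch.key_a.size());
    for (std::size_t i = 0; i < hashes_.size(); ++i) {
      uint64_t h = std::hash<int64_t>{}(batch.key_a[i]);
      const uint64_t h2 = std::hash<std::string>{}(batch.key_b[i]);
      // Simple illustrative way to combine the column hashes into one row hash.
      h ^= h2 + 0x9e3779b97f4a7c15ULL + (h << 6) + (h >> 2);
      hashes_[i] = h;
    }
    // Simplification: the cache is keyed on the batch address, so it assumes
    // the batch is not modified or replaced at the same address.
    cached_batch_ = &batch;
    ++compute_count_;
  }

  const Batch* cached_batch_ = nullptr;
  std::vector<uint64_t> hashes_;
  int compute_count_ = 0;
};
```

   With this sketch, calling `GetKey` many times on the same batch hashes the key 
columns once; rows with equal key values get equal hashes, so the hash can stand 
in for the multi-column key when grouping. Knowing when the cached hashes must be 
invalidated is the part that needs care in a real implementation.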

