phaniarnab commented on PR #2036:
URL: https://github.com/apache/systemds/pull/2036#issuecomment-2241614037

   Thank you, @Mayaryin and @ingunnaf, for your contribution.
   With some changes, I see a 2x speedup for real use cases, which is very good.
   I will not merge this PR immediately, as the changes are in the critical 
path of transformencode and may impact other running projects. However, after 
improving the robustness of this feature, I will merge it before the next 
release.
   
   The TODOs include:
   - Integrate the lineage trace of the input frame into the key of the
     metadata cache, either by adding the hash/checksum of the lineage trace
     or by making the build tasks lineage traceable. This extension will
     avoid incorrect reuse if the input frame is modified.
   - Add the number of bins to the key, if needed, to avoid incorrect reuse
     across different numbers of bins.
   - Improve the robustness of the hash function.
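
   To make the first two TODOs concrete, here is a minimal sketch of a
   composite cache key. The class name `BuildCacheKey` and all of its fields
   are hypothetical and are not taken from the PR or from the SystemDS code
   base. It only assumes that a hash of the lineage trace is available as a
   `long` and that non-binning encoders can pass 0 as the number of bins.

```java
import java.util.Objects;

// Hypothetical key for the transformencode metadata cache.
// All names here are illustrative assumptions, not the SystemDS API.
final class BuildCacheKey {
    private final long lineageHash;   // assumed: hash/checksum of the input frame's lineage trace
    private final int columnId;       // column the build task operates on
    private final String encoderType; // e.g. "bin" or "recode" (illustrative labels)
    private final int numBins;        // number of bins; 0 for encoders without bins

    BuildCacheKey(long lineageHash, int columnId, String encoderType, int numBins) {
        this.lineageHash = lineageHash;
        this.columnId = columnId;
        this.encoderType = Objects.requireNonNull(encoderType);
        this.numBins = numBins;
    }

    // Two keys match only if every component matches, so a modified input
    // frame (different lineage hash) or a different bin count never reuses
    // a stale cache entry.
    @Override
    public boolean equals(Object o) {
        if (this == o)
            return true;
        if (!(o instanceof BuildCacheKey))
            return false;
        BuildCacheKey other = (BuildCacheKey) o;
        return lineageHash == other.lineageHash
            && columnId == other.columnId
            && numBins == other.numBins
            && encoderType.equals(other.encoderType);
    }

    // Combine all components so that hashCode stays consistent with equals.
    @Override
    public int hashCode() {
        return Objects.hash(lineageHash, columnId, encoderType, numBins);
    }
}
```

   Comparing all fields in `equals` means a hash collision in `hashCode`
   alone cannot cause incorrect reuse. A collision in the lineage hash itself
   could still do so, which is where the robustness of the hash function
   matters.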
   
   Future work outside the scope of this PR:
   - Caching and reusing the results of apply tasks, which requires an
     effective output allocation strategy.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

