Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/19518
Thank you for creating a PR for the latest Spark.
I think it is great to reduce the number of constant pool entries. I have one
high-level comment.
IIUC, this PR **always** performs mutable state compaction. In other words,
all mutable states are placed in arrays.
I am concerned about possible performance degradation, since putting states in
arrays may increase the cost of accessing them.
What do you think about putting mutable states into arrays (i.e. performing
mutable state compaction) only when there are many mutable states or only for
certain mutable states that are rarely accessed?
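
To make the first option concrete, here is a minimal sketch of a threshold-based policy. It is not Spark's code generator: the function name `declare_mutable_states`, the `threshold` parameter, and the `mutableStateArray_<type>` naming are all hypothetical, and it assumes simple Java type names that are valid inside an identifier. With few states it emits plain fields; past the threshold it groups states by type into arrays.

```python
from typing import Dict, List, Tuple


def declare_mutable_states(
    states: List[Tuple[str, str]], threshold: int = 4
) -> Tuple[List[str], Dict[str, str]]:
    """Hypothetical policy: compact mutable states only when there are many.

    `states` is a list of (java_type, name) pairs. Returns the Java field
    declarations to emit and, for each state name, the Java expression that
    generated code should use to access it.
    """
    if len(states) <= threshold:
        # Few states: keep one plain field each, so access stays a field read.
        decls = [f"private {java_type} {name};" for java_type, name in states]
        access = {name: name for _, name in states}
        return decls, access

    # Many states: group by type and store each group in a single array,
    # which needs far fewer constant pool entries than one field per state.
    by_type: Dict[str, List[str]] = {}
    for java_type, name in states:
        by_type.setdefault(java_type, []).append(name)

    decls = []
    access = {}
    for java_type, names in by_type.items():
        array_name = f"mutableStateArray_{java_type}"  # assumed naming scheme
        decls.append(
            f"private {java_type}[] {array_name} = new {java_type}[{len(names)}];"
        )
        for index, name in enumerate(names):
            access[name] = f"{array_name}[{index}]"
    return decls, access
```

The threshold value here is arbitrary; a real one would have to come from benchmarks.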
Alternatively, can we show that mutable state compaction causes no performance
degradation?
What do you think?