Joe McDonnell created IMPALA-12905:
--------------------------------------
Summary: Implement disk-based tuple caching
Key: IMPALA-12905
URL: https://issues.apache.org/jira/browse/IMPALA-12905
Project: IMPALA
Issue Type: Task
Components: Backend
Affects Versions: Impala 4.4.0
Reporter: Joe McDonnell
The TupleCacheNode caches tuples so that later equivalent queries can reuse
them. This issue tracks implementing a disk-based version that serializes
tuples and stores them as files on local disk.
The work has three parts:
# A TupleCacheMgr keeps track of which entries exist in the cache and evicts
entries as needed to make space for new ones. The TupleCacheMgr is configured
using startup flags that specify the cache directory, the cache size, and the
eviction policy.
# The TupleCacheNode interacts with the TupleCacheMgr to determine whether its
entry is available. If it is, the node reads the associated tuple cache file
and returns the RowBatches. If the entry does not exist, the node reads
RowBatches from its child and stores them to a new file in the cache.
# The TupleWriter and TupleReader implement serialization and deserialization
of RowBatches to and from a local file, reusing the existing RowBatch
serialization used for KRPC.
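To make the first two parts concrete, here is a minimal sketch of the bookkeeping a cache manager could do: a size-bounded index with a lookup that reports hit or miss and an insert that evicts older entries to make room. It is not the planned implementation. The class name TupleCacheIndexSketch, its methods, and the choice of LRU eviction are all assumptions made for illustration (the eviction policy is meant to be configurable), and file I/O is left to the caller.

```cpp
#include <cassert>
#include <cstdint>
#include <list>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical illustration only; not Impala's TupleCacheMgr.
// Tracks which cache entries exist and how many bytes they occupy,
// evicting least-recently-used entries (an assumed policy) when full.
class TupleCacheIndexSketch {
 public:
  explicit TupleCacheIndexSketch(int64_t capacity_bytes)
    : capacity_bytes_(capacity_bytes) {}

  // Returns true on a cache hit and marks the entry as most recently used.
  // On a miss the caller would read from its child and write a new file.
  bool Lookup(const std::string& key) {
    auto it = entries_.find(key);
    if (it == entries_.end()) return false;
    lru_.splice(lru_.begin(), lru_, it->second.lru_pos);
    return true;
  }

  // Registers a finished entry of 'size_bytes'. Evicts LRU entries until the
  // new entry fits and appends their keys to 'evicted' so the caller can
  // delete the corresponding files. Returns false if the key already exists
  // or the entry can never fit.
  bool Insert(const std::string& key, int64_t size_bytes,
              std::vector<std::string>* evicted) {
    if (size_bytes < 0 || size_bytes > capacity_bytes_) return false;
    if (entries_.count(key) > 0) return false;
    while (used_bytes_ + size_bytes > capacity_bytes_) {
      const std::string victim = lru_.back();
      auto vit = entries_.find(victim);
      assert(vit != entries_.end());
      used_bytes_ -= vit->second.size_bytes;
      entries_.erase(vit);
      lru_.pop_back();
      evicted->push_back(victim);
    }
    lru_.push_front(key);
    entries_.emplace(key, Entry{size_bytes, lru_.begin()});
    used_bytes_ += size_bytes;
    return true;
  }

  int64_t used_bytes() const { return used_bytes_; }

 private:
  struct Entry {
    int64_t size_bytes;
    std::list<std::string>::iterator lru_pos;
  };

  const int64_t capacity_bytes_;
  int64_t used_bytes_ = 0;
  std::list<std::string> lru_;  // Front is most recently used.
  std::unordered_map<std::string, Entry> entries_;
};
```

In this sketch a node would call Lookup() with its cache key. On a hit it would read the file for that key, and on a miss it would write a new file and then call Insert() with the file's size, deleting the files for any keys returned in 'evicted'.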