[
https://issues.apache.org/jira/browse/IMPALA-12905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joe McDonnell resolved IMPALA-12905.
------------------------------------
Fix Version/s: Impala 4.4.0
Resolution: Fixed
> Implement disk-based tuple caching
> ----------------------------------
>
> Key: IMPALA-12905
> URL: https://issues.apache.org/jira/browse/IMPALA-12905
> Project: IMPALA
> Issue Type: Task
> Components: Backend
> Affects Versions: Impala 4.4.0
> Reporter: Joe McDonnell
> Assignee: Joe McDonnell
> Priority: Major
> Fix For: Impala 4.4.0
>
>
> The TupleCacheNode caches tuples so they can be reused later by equivalent
> queries. This task tracks implementing a version that serializes tuples and
> stores them as files on local disk.
> The work has a few parts:
> # There is a TupleCacheMgr that keeps track of what entries exist in the
> cache and evicts entries as needed to make space for new entries. This will
> be configured using startup flags to specify the directory, size, and cache
> eviction policy.
> # The TupleCacheNode will interact with the TupleCacheMgr to determine if
> the entry is available. If it is, it reads the associated tuple cache file
> and returns the RowBatches. If the entry does not exist, it reads RowBatches
> from its child and stores them to a new file in the cache.
> # The TupleWriter / TupleReader implement serialization / deserialization of
> RowBatches to/from a local file. They reuse the existing serialization used
> for KRPC.
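
The three parts described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the Impala implementation (the backend is C++). Every name in it (`TupleCacheMgrSketch`, `get_row_batches`, `read_from_child`, the `.cache` file suffix) is hypothetical, LRU is assumed as the eviction policy since the issue only says the policy is configurable, and `pickle` stands in for the KRPC RowBatch serialization.

```python
import os
import pickle
import tempfile
from collections import OrderedDict


class TupleCacheMgrSketch:
    """Hypothetical stand-in for TupleCacheMgr: tracks the entries that exist
    on disk and evicts least-recently-used ones to stay under a byte budget."""

    def __init__(self, cache_dir, capacity_bytes):
        self.cache_dir = cache_dir
        self.capacity_bytes = capacity_bytes
        self.entries = OrderedDict()  # key -> file size, least recently used first
        self.used_bytes = 0

    def path_for(self, key):
        # Assumes keys are filename-safe strings (for example, hex digests).
        return os.path.join(self.cache_dir, key + ".cache")

    def _evict(self, key):
        size = self.entries.pop(key)
        os.remove(self.path_for(key))
        self.used_bytes -= size

    def lookup(self, key):
        """Return the cache file path on a hit, or None on a miss."""
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.path_for(key)

    def insert(self, key, payload):
        """Store payload as a new file, evicting LRU entries until it fits."""
        if len(payload) > self.capacity_bytes:
            return False  # larger than the whole cache, so do not store it
        if key in self.entries:
            self._evict(key)
        while self.used_bytes + len(payload) > self.capacity_bytes:
            oldest_key = next(iter(self.entries))
            self._evict(oldest_key)
        with open(self.path_for(key), "wb") as f:
            f.write(payload)
        self.entries[key] = len(payload)
        self.used_bytes += len(payload)
        return True


def get_row_batches(mgr, key, read_from_child):
    """Hypothetical stand-in for the TupleCacheNode flow.

    On a hit, read the batches back from the cache file. On a miss, read them
    from the child and write them to a new cache file. Returns (batches, hit).
    """
    path = mgr.lookup(key)
    if path is not None:
        with open(path, "rb") as f:
            return pickle.load(f), True
    batches = list(read_from_child())
    mgr.insert(key, pickle.dumps(batches))
    return batches, False


if __name__ == "__main__":
    mgr = TupleCacheMgrSketch(tempfile.mkdtemp(), capacity_bytes=1 << 20)
    child = lambda: [[(1, "a"), (2, "b")], [(3, "c")]]
    print(get_row_batches(mgr, "query_key", child)[1])  # first call is a miss
    print(get_row_batches(mgr, "query_key", child)[1])  # second call is a hit
```

The sketch keeps the entry index in memory and materializes all batches before writing. A real implementation would need to stream batches, handle concurrent readers and writers, and take the directory, size, and eviction policy from startup flags as the description says.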