[jira] [Created] (IMPALA-13306) Store resources attached to row batches per-tuple descriptor

Csaba Ringhofer (Jira) Fri, 16 Aug 2024 04:11:12 -0700

Csaba Ringhofer created IMPALA-13306:
----------------------------------------


             Summary: Store resources attached to row batches per-tuple 
descriptor
                 Key: IMPALA-13306
                 URL: https://issues.apache.org/jira/browse/IMPALA-13306
             Project: IMPALA
          Issue Type: Improvement
          Components: Backend
            Reporter: Csaba Ringhofer


Currently RowBatch handles resource-related info (e.g. FlushMode) globally, 
even though it may be different for each tuple descriptor.

An example is a row that comes from a join that didn't spill. In this case the 
memory of the build side tuple remains valid until the join node is closed, 
while the probe side can change more often, e.g. when the scratch batch in the 
Parquet scanner gets full and is attached to the row batch.
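
A minimal sketch of what per-tuple-descriptor bookkeeping could look like. All 
names here (TupleLifetime, PerTupleResourceInfo and their methods) are 
hypothetical and chosen for illustration only; this is not the existing 
RowBatch API and not a proposed patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical: how long the memory behind one tuple slot of a row stays valid.
enum class TupleLifetime {
  kUntilBatchReset,    // may be freed or reused once the batch is reset
  kUntilOperatorClose  // outlives the batch, e.g. build side of an unspilled join
};

// Hypothetical: one lifetime entry per tuple descriptor in the row,
// instead of a single batch-wide flag.
class PerTupleResourceInfo {
 public:
  explicit PerTupleResourceInfo(std::size_t num_tuple_descs)
    : lifetimes_(num_tuple_descs, TupleLifetime::kUntilBatchReset) {}

  void SetLifetime(std::size_t tuple_idx, TupleLifetime lifetime) {
    assert(tuple_idx < lifetimes_.size());
    lifetimes_[tuple_idx] = lifetime;
  }

  TupleLifetime GetLifetime(std::size_t tuple_idx) const {
    assert(tuple_idx < lifetimes_.size());
    return lifetimes_[tuple_idx];
  }

  // Conservative batch-wide view: true only if every tuple outlives the batch.
  // This is all that a single global flag would be able to express.
  bool AllOutliveBatch() const {
    for (TupleLifetime l : lifetimes_) {
      if (l != TupleLifetime::kUntilOperatorClose) return false;
    }
    return true;
  }

 private:
  std::vector<TupleLifetime> lifetimes_;
};
```

In the join example, the probe-side slot would keep the default 
kUntilBatchReset while the build-side slot would be marked 
kUntilOperatorClose, so the batch-wide view stays conservative but the 
per-tuple information is not lost.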

Some operators could benefit from knowing that some tuple pointers remain valid 
for longer. An example is tuple deduplication in the KrpcDataStream sender: if 
more than one row batch could be sent in a single OutboundRowBatch, then it 
would be important to know whether the same tuple pointer really means the same 
tuple in the new RowBatch.
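
The deduplication concern can be illustrated with a small sketch. It assumes 
that deduplication compares each tuple pointer with the one seen in the 
previous row for the same tuple index, and that per-tuple stability 
information is available. CrossBatchDeduper and its methods are hypothetical 
names, not part of the Impala code base.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: remembers the last tuple pointer seen per tuple index
// and only trusts it across a batch boundary if that slot is known to be stable.
class CrossBatchDeduper {
 public:
  // stable_across_batches[i] is true if memory for tuple index i is known to
  // stay valid (and is not reused) after the current batch is released.
  explicit CrossBatchDeduper(std::vector<bool> stable_across_batches)
    : stable_(std::move(stable_across_batches)),
      last_(stable_.size(), nullptr) {}

  // Call when moving on to the next input RowBatch. Pointers into memory that
  // may have been recycled are forgotten, so an equal address in the new batch
  // is not mistaken for the same tuple.
  void StartNewBatch() {
    for (std::size_t i = 0; i < last_.size(); ++i) {
      if (!stable_[i]) last_[i] = nullptr;
    }
  }

  // Returns true if 'tuple' is the same tuple as the previous one seen for
  // this index and can therefore be deduplicated.
  bool IsDuplicate(std::size_t tuple_idx, const void* tuple) {
    assert(tuple_idx < last_.size());
    const bool dup = tuple != nullptr && tuple == last_[tuple_idx];
    last_[tuple_idx] = tuple;
    return dup;
  }

 private:
  std::vector<bool> stable_;
  std::vector<const void*> last_;
};
```

With a global flag the sender would have to forget every pointer at a batch 
boundary; with per-tuple information only the slots whose memory can change 
need to be reset.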
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
