Quanlong Huang created ORC-1143:
-----------------------------------
Summary: [C++] Support reading the PRESENT stream without reading
the column data
Key: ORC-1143
URL: https://issues.apache.org/jira/browse/ORC-1143
Project: ORC
Issue Type: New Feature
Components: C++
Reporter: Quanlong Huang
Queries like "select count(a) from tbl" only require checking whether each
column value is non-NULL. ORC files already have a PRESENT stream for each
column (though it's optional). We can serve such a request by reading only
the PRESENT stream.
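As a rough illustration of what a caller would do with such a result, here is a minimal sketch. PresentOnlyBatch and countNonNull are hypothetical names used only for this example. The struct merely mirrors the numElements, hasNulls and notNull members of ColumnVectorBatch so that the snippet compiles without the ORC headers.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the parts of a ColumnVectorBatch that a
// PRESENT-only read would populate. Not an ORC type.
struct PresentOnlyBatch {
  uint64_t numElements = 0;   // number of rows in this batch
  bool hasNulls = false;      // false means every row is non-NULL
  std::vector<char> notNull;  // per row: 1 = value present, 0 = NULL
};

// Returns this batch's contribution to count(a): the number of non-NULL rows.
// Hypothetical helper, shown only to illustrate the idea.
uint64_t countNonNull(const PresentOnlyBatch& batch) {
  // Without NULLs, notNull need not be consulted at all.
  if (!batch.hasNulls) {
    return batch.numElements;
  }
  uint64_t count = 0;
  for (uint64_t i = 0; i < batch.numElements; ++i) {
    if (batch.notNull[i]) {
      ++count;
    }
  }
  return count;
}
```

Summing countNonNull over all batches of the column would give the count(a) result without touching any of the column's data streams.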
Currently, ReadIntent has two items:
{code:java}
enum ReadIntent {
  ReadIntent_ALL = 0,
  // Only read the offsets of selected type. Do not read the children types.
  ReadIntent_OFFSETS = 1
};
};{code}
We can extend it with a new value such as ReadIntent_PRESENT. With that
intent, only the notNull results of the corresponding ColumnVectorBatch
would be valid.
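To make the proposal a bit more concrete, the extended enum might look like the sketch below. The numeric value 2, the SketchStreamKind enum and the shouldReadStream helper are all assumptions made for illustration. None of them exist in the library, and the real stream selection logic may well be structured differently.

```cpp
// Existing values plus the proposed one. The value 2 is an assumption.
enum ReadIntent {
  ReadIntent_ALL = 0,
  // Only read the offsets of selected type. Do not read the children types.
  ReadIntent_OFFSETS = 1,
  // Proposed: only read the PRESENT stream of the selected type.
  ReadIntent_PRESENT = 2
};

// Simplified, hypothetical stream kinds for this sketch only.
enum SketchStreamKind {
  SketchStream_PRESENT,
  SketchStream_DATA,
  SketchStream_LENGTH
};

// Hypothetical helper: should a stream of the selected column be read
// under the given intent?
bool shouldReadStream(ReadIntent intent, SketchStreamKind kind) {
  switch (intent) {
    case ReadIntent_PRESENT:
      // Only the null information is needed.
      return kind == SketchStream_PRESENT;
    case ReadIntent_OFFSETS:
      // Assumption: offsets are derived from the LENGTH stream, and the
      // PRESENT stream is still needed for the null information.
      return kind == SketchStream_PRESENT || kind == SketchStream_LENGTH;
    case ReadIntent_ALL:
    default:
      return true;
  }
}
```

Under this sketch, a reader asked for ReadIntent_PRESENT would skip the DATA and LENGTH streams of the column entirely.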