Copilot commented on code in PR #12211:
URL: https://github.com/apache/gluten/pull/12211#discussion_r3435356462
##########
cpp/velox/operators/serializer/VeloxColumnarBatchSerializer.h:
##########
@@ -57,11 +58,30 @@ class VeloxColumnarBatchSerializer : public
ColumnarBatchSerializer {
// false so the JVM side falls back to pass-through in buildFilter.
std::vector<ColumnStats> computeStats(facebook::velox::RowVectorPtr
rowVector);
- // Returns framed bytes: [STATS_FRAMED_MAGIC: 4B] [statsLen: u32 LE]
[statsBlob] [bytesLen: u32 LE]
- // [bytesBlob]. statsBlob layout matches the JVM-side reader
(CachedColumnarBatchKryoSerializer
- // .deserializeStats). Magic 0xFE 0xCA 0x53 0x02 aligns with the JVM Kryo
STATS_FRAMED_MAGIC.
+ // V2: Returns framed bytes [STATS_FRAMED_MAGIC=0x02: 4B][statsLen: u32
LE][statsBlob]
+ // [bytesLen: u32 LE][bytesBlob]. statsBlob matches JVM
CachedColumnarBatchKryoSerializer.
std::vector<uint8_t> framedSerializeWithStats(const
std::shared_ptr<ColumnarBatch>& batch) override;
+ // V3: Per-column framed bytes [STATS_FRAMED_MAGIC_V3=0x03: 4B][statsLen=0:
u32 LE]
+ // [numRows: u32 LE][numCols: u32 LE][per-col: colLen(u32 LE) + colBytes].
+ // Each colBytes is produced by PrestoVectorSerde::serializeSingleColumn
(self-contained).
+ std::vector<uint8_t> framedSerializeV3(const std::shared_ptr<ColumnarBatch>&
batch) override;
+
+ // V3: Per-column framed bytes [STATS_FRAMED_MAGIC_V3=0x03: 4B][statsLen:
u32 LE][statsBlob]
+ // [numRows: u32 LE][numCols: u32 LE][per-col: colLen(u32 LE) + colBytes].
+ // Each colBytes is produced by PrestoVectorSerde::serializeSingleColumn
(self-contained).
+ std::vector<uint8_t> framedSerializeWithStatsV3(const
std::shared_ptr<ColumnarBatch>& batch) override;
+
+ // V3: Deserialize with column projection; returns M-column RowVector.
+ // requestedColumns: nullopt=all columns, optional<vector{}>=zero cols,
optional<vector{i,j}>=M cols.
+ std::shared_ptr<ColumnarBatch>
+ deserializeV3(uint8_t* data, int32_t size, const
std::optional<std::vector<int32_t>>& requestedColumns) override;
+
+ private:
+ // Extract statsBlob from per-column stats (shared by V2 and V3 write paths).
+ std::vector<uint8_t> buildStatsBlob(const std::vector<ColumnStats>& perCol,
uint32_t numRows, uint32_t numCols);
Review Comment:
The comment says buildStatsBlob is “shared by V2 and V3 write paths”, but in
the current implementation it’s only used from the V3 include-stats path (V2
still builds the stats blob inline). This is a small documentation mismatch
that can confuse future refactors/bugfixes.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]