Arnab Karmakar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/24392 )

Change subject: IMPALA-15067: Add VariantValue and basic decoding functions
......................................................................


Patch Set 2:

(4 comments)

Thanks for working on this!

http://gerrit.cloudera.org:8080/#/c/24392/2/be/src/runtime/variant-value.cc
File be/src/runtime/variant-value.cc:

http://gerrit.cloudera.org:8080/#/c/24392/2/be/src/runtime/variant-value.cc@43
PS2, Line 43:   offset_size_ = ((header >> 5) & 0x03) + 1;
from Metadata encoding grammar:
header: 1 byte (<version> | <sorted_strings> << 4 | (<offset_size_minus_one> << 
6))
bits 6-7 should be used for offset_size_minus_one.


http://gerrit.cloudera.org:8080/#/c/24392/2/be/src/runtime/variant-value.cc@203
PS2, Line 203: int VariantValue::ObjectFieldIdSize() const {
             :   DCHECK_EQ(GetBasicType(), VariantBasicType::OBJECT);
             :   return ((data_[0] >> 2) & 0x03) + 1;
             : }
             :
             : int VariantValue::ObjectOffsetSize() const {
             :   DCHECK_EQ(GetBasicType(), VariantBasicType::OBJECT);
             :   return ((data_[0] >> 4) & 0x03) + 1;
             : }
As per the variant encoding spec 
https://github.com/apache/parquet-format/blob/master/VariantEncoding.md:
object_header: (is_large << 4 | field_id_size_minus_one << 2 | 
field_offset_size_minus_one)

I think the Object header field_id/offset sizes are swapped.
The test helpers everywhere have header 0x02, and thats why they passed.


http://gerrit.cloudera.org:8080/#/c/24392/2/be/src/util/variant-util-test.cc
File be/src/util/variant-util-test.cc:

PS2:
we could cover some gaps in testing:
1. metadata with offset_size > 1
2. objects with field_id_size != offset_size
3. maybe also long STRING primitive (basic_type=0)


http://gerrit.cloudera.org:8080/#/c/24392/2/be/src/util/variant-util.cc
File be/src/util/variant-util.cc:

http://gerrit.cloudera.org:8080/#/c/24392/2/be/src/util/variant-util.cc@21
PS2, Line 21: #include <sstream>
Unused import.



--
To view, visit http://gerrit.cloudera.org:8080/24392
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I904618570e8c21d099c9a96b496d85e9246483de
Gerrit-Change-Number: 24392
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Arnab Karmakar <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Comment-Date: Mon, 08 Jun 2026 18:44:59 +0000
Gerrit-HasComments: Yes

Reply via email to