WZhuo commented on code in PR #746:
URL: https://github.com/apache/iceberg-cpp/pull/746#discussion_r3417952138
##########
src/iceberg/schema_field.cc:
##########
@@ -55,13 +58,86 @@ bool SchemaField::optional() const { return optional_; }
std::string_view SchemaField::doc() const { return doc_; }
+std::optional<std::reference_wrapper<const Literal>>
SchemaField::initial_default()
+ const {
+ if (initial_default_ == nullptr) {
+ return std::nullopt;
+ }
+ return std::cref(*initial_default_);
+}
+
+std::optional<std::reference_wrapper<const Literal>>
SchemaField::write_default() const {
+ if (write_default_ == nullptr) {
+ return std::nullopt;
+ }
+ return std::cref(*write_default_);
+}
+
+SchemaField SchemaField::WithInitialDefault(Literal initial_default) const {
+ SchemaField copy = *this;
+ copy.initial_default_ = std::make_shared<const
Literal>(std::move(initial_default));
+ return copy;
+}
Review Comment:
I don't think it's need to copy the whole `SchemaField`, can we just set the
`initial_default_` field and return *this.
Also the following `With` methods.
##########
src/iceberg/schema_field.h:
##########
@@ -100,6 +142,11 @@ class ICEBERG_EXPORT SchemaField : public
iceberg::util::Formattable {
std::shared_ptr<Type> type_;
bool optional_;
std::string doc_;
+ // Default values are owned by this field and never mutated after being set;
copies
+ // of the field share the same payload (reference-counted) instead of
deep-copying,
+ // like `type_` above. Sharing is unobservable because the payload is
immutable.
+ std::shared_ptr<const Literal> initial_default_;
+ std::shared_ptr<const Literal> write_default_;
Review Comment:
`ReassignField` constructs a new `SchemaField` via the 5-argument
constructor which initializes `initial_default_` and `write_default_` to
`nullptr`. When schema IDs are reassigned (e.g., copying a schema with fresh
IDs via the `Schema(get_id)` path), all default values on fields are silently
lost. We should copy all field properties including `initialDefault` and
`writeDefault`.
##########
src/iceberg/json_serde.cc:
##########
@@ -552,9 +562,23 @@ Result<std::unique_ptr<SchemaField>> FieldFromJson(const
nlohmann::json& json) {
ICEBERG_ASSIGN_OR_RAISE(auto name, GetJsonValue<std::string>(json, kName));
ICEBERG_ASSIGN_OR_RAISE(auto required, GetJsonValue<bool>(json, kRequired));
ICEBERG_ASSIGN_OR_RAISE(auto doc, GetJsonValueOrDefault<std::string>(json,
kDoc));
-
- return std::make_unique<SchemaField>(field_id, std::move(name),
std::move(type),
- !required, doc);
+ ICEBERG_ASSIGN_OR_RAISE(std::optional<nlohmann::json> initial_default_json,
+ GetJsonValueOptional<nlohmann::json>(json,
kInitialDefault));
+ ICEBERG_ASSIGN_OR_RAISE(std::optional<nlohmann::json> write_default_json,
+ GetJsonValueOptional<nlohmann::json>(json,
kWriteDefault));
+
+ SchemaField field(field_id, std::move(name), std::move(type), !required,
doc);
+ if (initial_default_json.has_value()) {
+ ICEBERG_ASSIGN_OR_RAISE(Literal literal,
+ LiteralFromJson(*initial_default_json,
field.type().get()));
+ field = field.WithInitialDefault(std::move(literal));
+ }
+ if (write_default_json.has_value()) {
+ ICEBERG_ASSIGN_OR_RAISE(Literal literal,
+ LiteralFromJson(*write_default_json,
field.type().get()));
+ field = field.WithWriteDefault(std::move(literal));
+ }
Review Comment:
The deserialization first constructs a bare `SchemaField`, then
conditionally calls `WithInitialDefault`/`WithWriteDefault`, each of which
copies the entire field (including the `shared_ptr<Type>`). This is an
unnecessary intermediate copy.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]