wgtmac commented on code in PR #268:
URL: https://github.com/apache/iceberg-cpp/pull/268#discussion_r2446579643


##########
src/iceberg/table_metadata.h:
##########
@@ -73,6 +73,9 @@ struct ICEBERG_EXPORT TableMetadata {
   static constexpr int64_t kInitialSequenceNumber = 0;
   static constexpr int64_t kInvalidSequenceNumber = -1;
   static constexpr int64_t kInitialRowId = 0;
+  static constexpr int32_t kInitialSpecId = 0;
+  static constexpr int32_t kInitialSortOrderId = 1;

Review Comment:
   ditto, we have `SortOrder::kInitialSortOrderId`.



##########
src/iceberg/table_metadata.h:
##########
@@ -73,6 +73,9 @@ struct ICEBERG_EXPORT TableMetadata {
   static constexpr int64_t kInitialSequenceNumber = 0;
   static constexpr int64_t kInvalidSequenceNumber = -1;
   static constexpr int64_t kInitialRowId = 0;
+  static constexpr int32_t kInitialSpecId = 0;

Review Comment:
   We already have `PartitionSpec::kInitialSpecId`. Should we reuse it?



##########
src/iceberg/table_metadata.h:
##########
@@ -73,6 +73,9 @@ struct ICEBERG_EXPORT TableMetadata {
   static constexpr int64_t kInitialSequenceNumber = 0;
   static constexpr int64_t kInvalidSequenceNumber = -1;
   static constexpr int64_t kInitialRowId = 0;
+  static constexpr int32_t kInitialSpecId = 0;
+  static constexpr int32_t kInitialSortOrderId = 1;
+  static constexpr int64_t kInvalidSnapshotId = -1;

Review Comment:
   We have `Snapshot::kInvalidSnapshotId` too.



##########
src/iceberg/table_metadata.cc:
##########
@@ -201,13 +203,46 @@ Status TableMetadataUtil::Write(FileIO& io, const 
std::string& location,
 
 // TableMetadataBuilder implementation
 
-struct TableMetadataBuilder::Impl {};
+struct TableMetadataBuilder::Impl {
+  // Base metadata (nullptr for new tables)
+  const TableMetadata* base;
+
+  // Working metadata copy
+  TableMetadata metadata;
+
+  // Change tracking
+  std::vector<std::unique_ptr<TableUpdate>> changes;
+
+  // Error collection (since methods return *this and cannot throw)
+  std::vector<Status> errors;

Review Comment:
   ```suggestion
     std::vector<Error> errors;
   ```



##########
src/iceberg/table_metadata.cc:
##########
@@ -238,12 +273,35 @@ TableMetadataBuilder& 
TableMetadataBuilder::SetPreviousMetadataLocation(
 }
 
 TableMetadataBuilder& TableMetadataBuilder::AssignUUID() {
-  throw IcebergError(std::format("{} not implemented", __FUNCTION__));
+  if (impl_->metadata.table_uuid.empty()) {
+    // Generate a random UUID
+    return AssignUUID(Uuid::GenerateV4().ToString());
+  }
+
+  return *this;
 }
 
 TableMetadataBuilder& TableMetadataBuilder::AssignUUID(std::string_view uuid) {
-  throw IcebergError(std::format("{} not implemented", __FUNCTION__));
-  ;
+  std::string uuid_str(uuid);
+
+  // Validation: UUID cannot be null or empty
+  if (uuid_str.empty()) {
+    impl_->errors.emplace_back(InvalidArgument("Cannot assign null or empty 
UUID"));
+    return *this;
+  }
+
+  // Check if UUID is already set to the same value (no-op)
+  if (StringUtils::EqualsIgnoreCase(impl_->metadata.table_uuid, uuid_str)) {

Review Comment:
   nit: is it possible to refactor it to use std::string_view as input?



##########
src/iceberg/table_metadata.cc:
##########
@@ -378,11 +436,50 @@ TableMetadataBuilder& 
TableMetadataBuilder::RemoveEncryptionKey(std::string_view
 }
 
 TableMetadataBuilder& TableMetadataBuilder::DiscardChanges() {
-  throw IcebergError(std::format("{} not implemented", __FUNCTION__));
+  // Clear all changes and errors
+  impl_->changes.clear();
+  impl_->errors.clear();
+
+  // Reset metadata to base state
+  if (impl_->base != nullptr) {
+    impl_->metadata = *impl_->base;
+  } else {
+    // Reset to initial state for new table
+    *impl_ = Impl(impl_->metadata.format_version);
+  }
+
+  return *this;
 }
 
 Result<std::unique_ptr<TableMetadata>> TableMetadataBuilder::Build() {
-  return NotImplemented("TableMetadataBuilder::Build not implemented");
+  // 1. Check for accumulated errors

Review Comment:
   An unrelated comment: we might need to add function `Status 
TableMetadata::Validate() const` to make sure all fields are populated 
correctly. We can add this in a separate PR.



##########
src/iceberg/table_metadata.cc:
##########
@@ -378,11 +436,50 @@ TableMetadataBuilder& 
TableMetadataBuilder::RemoveEncryptionKey(std::string_view
 }
 
 TableMetadataBuilder& TableMetadataBuilder::DiscardChanges() {
-  throw IcebergError(std::format("{} not implemented", __FUNCTION__));
+  // Clear all changes and errors
+  impl_->changes.clear();
+  impl_->errors.clear();
+
+  // Reset metadata to base state

Review Comment:
   I think this is inconsistent with the Java impl where `changes` list is 
ignored but the updated fields are preserved to create the `TableMetadata`.



##########
src/iceberg/table_metadata.cc:
##########
@@ -238,12 +273,35 @@ TableMetadataBuilder& 
TableMetadataBuilder::SetPreviousMetadataLocation(
 }
 
 TableMetadataBuilder& TableMetadataBuilder::AssignUUID() {
-  throw IcebergError(std::format("{} not implemented", __FUNCTION__));
+  if (impl_->metadata.table_uuid.empty()) {
+    // Generate a random UUID
+    return AssignUUID(Uuid::GenerateV4().ToString());
+  }
+
+  return *this;
 }
 
 TableMetadataBuilder& TableMetadataBuilder::AssignUUID(std::string_view uuid) {
-  throw IcebergError(std::format("{} not implemented", __FUNCTION__));
-  ;
+  std::string uuid_str(uuid);
+
+  // Validation: UUID cannot be null or empty
+  if (uuid_str.empty()) {
+    impl_->errors.emplace_back(InvalidArgument("Cannot assign null or empty 
UUID"));

Review Comment:
   ```suggestion
       impl_->errors.emplace_back(InvalidArgument("Cannot assign empty UUID"));
   ```



##########
src/iceberg/test/table_metadata_builder_test.cc:
##########
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include <memory>
+#include <string>
+
+#include <gtest/gtest.h>
+
+#include "gmock/gmock-matchers.h"
+#include "iceberg/table_metadata.h"
+#include "iceberg/table_requirement.h"
+#include "iceberg/table_requirements.h"
+#include "iceberg/table_update.h"
+#include "iceberg/test/matchers.h"
+
+namespace iceberg {
+
+// Helper functions to reduce test boilerplate
+namespace {
+
+// Build and assert success, returning the metadata
+std::unique_ptr<TableMetadata> BuildAndExpectSuccess(TableMetadataBuilder& 
builder) {
+  auto result = builder.Build();
+  EXPECT_TRUE(result.has_value()) << "Build failed: " << 
result.error().message;
+  if (!result.has_value()) {
+    return nullptr;
+  }
+  return std::move(result.value());
+}
+
+// Build and expect failure with message containing substring
+void BuildAndExpectFailure(TableMetadataBuilder& builder,
+                           const std::string& expected_message_substr) {
+  auto result = builder.Build();
+  ASSERT_FALSE(result.has_value()) << "Build should have failed";
+  EXPECT_TRUE(result.error().message.find(expected_message_substr) != 
std::string::npos)

Review Comment:
   Use `EXPECT_THAT` with `HasErrorMessage`?



##########
src/iceberg/test/table_metadata_builder_test.cc:
##########
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include <memory>
+#include <string>
+
+#include <gtest/gtest.h>
+
+#include "gmock/gmock-matchers.h"

Review Comment:
   ```suggestion
   #include <gmock/gmock-matchers.h>
   ```



##########
src/iceberg/table_metadata.cc:
##########
@@ -201,13 +203,46 @@ Status TableMetadataUtil::Write(FileIO& io, const 
std::string& location,
 
 // TableMetadataBuilder implementation
 
-struct TableMetadataBuilder::Impl {};
+struct TableMetadataBuilder::Impl {
+  // Base metadata (nullptr for new tables)
+  const TableMetadata* base;
+
+  // Working metadata copy
+  TableMetadata metadata;
+
+  // Change tracking
+  std::vector<std::unique_ptr<TableUpdate>> changes;
+
+  // Error collection (since methods return *this and cannot throw)
+  std::vector<Status> errors;
+
+  // Metadata location tracking
+  std::optional<std::string> metadata_location;
+  std::optional<std::string> previous_metadata_location;
+
+  // Constructor for new table
+  explicit Impl(int8_t format_version) : base(nullptr), metadata{} {
+    metadata.format_version = format_version;
+    metadata.last_sequence_number = TableMetadata::kInitialSequenceNumber;
+    metadata.last_updated_ms = TimePointMs{std::chrono::milliseconds(0)};

Review Comment:
   We may either follow the same pattern of Java to explicitly declare fields 
in the builder, or initialize fields to invalid values?



##########
src/iceberg/table_metadata.cc:
##########
@@ -378,11 +436,50 @@ TableMetadataBuilder& 
TableMetadataBuilder::RemoveEncryptionKey(std::string_view
 }
 
 TableMetadataBuilder& TableMetadataBuilder::DiscardChanges() {
-  throw IcebergError(std::format("{} not implemented", __FUNCTION__));
+  // Clear all changes and errors
+  impl_->changes.clear();
+  impl_->errors.clear();
+
+  // Reset metadata to base state
+  if (impl_->base != nullptr) {
+    impl_->metadata = *impl_->base;
+  } else {
+    // Reset to initial state for new table
+    *impl_ = Impl(impl_->metadata.format_version);
+  }
+
+  return *this;
 }
 
 Result<std::unique_ptr<TableMetadata>> TableMetadataBuilder::Build() {
-  return NotImplemented("TableMetadataBuilder::Build not implemented");
+  // 1. Check for accumulated errors
+  if (!impl_->errors.empty()) {
+    std::string error_msg = "Failed to build TableMetadata due to validation 
errors:\n";
+    for (const auto& error : impl_->errors) {
+      error_msg += "  - " + error.error().message + "\n";
+    }
+    return CommitFailed("{}", error_msg);
+  }
+
+  // 2. Validate metadata consistency
+
+  // Validate UUID exists for format version > 1
+  if (impl_->metadata.format_version > 1 && 
impl_->metadata.table_uuid.empty()) {
+    return InvalidArgument("UUID is required for format version {}",
+                           impl_->metadata.format_version);
+  }
+
+  // 3. Update last_updated_ms if there are changes
+  if (!impl_->changes.empty() && impl_->base != nullptr) {
+    impl_->metadata.last_updated_ms =
+        TimePointMs{std::chrono::duration_cast<std::chrono::milliseconds>(
+            std::chrono::system_clock::now().time_since_epoch())};
+  }

Review Comment:
   Why do we need this? I didn't see these on the Java side.



##########
src/iceberg/table_requirement.cc:
##########
@@ -20,15 +20,28 @@
 #include "iceberg/table_requirement.h"
 
 #include "iceberg/table_metadata.h"
+#include "util/string_util.h"

Review Comment:
   ```suggestion
   #include "iceberg/util/string_util.h"
   ```



##########
src/iceberg/test/CMakeLists.txt:
##########
@@ -84,6 +84,8 @@ add_iceberg_test(table_test
                  table_test.cc
                  schema_json_test.cc)
 
+add_iceberg_test(table_metadata_builder_test SOURCES 
table_metadata_builder_test.cc)

Review Comment:
   How about adding it to the `table_test` above?



##########
src/iceberg/table_update.cc:
##########
@@ -21,18 +21,32 @@
 
 #include "iceberg/exception.h"
 #include "iceberg/table_metadata.h"
+#include "iceberg/table_requirement.h"
 #include "iceberg/table_requirements.h"
 
 namespace iceberg::table {
 
 // AssignUUID
 
 void AssignUUID::ApplyTo(TableMetadataBuilder& builder) const {
-  throw IcebergError(std::format("{} not implemented", __FUNCTION__));
+  builder.AssignUUID(uuid_);
 }
 
 Status AssignUUID::GenerateRequirements(TableUpdateContext& context) const {

Review Comment:
   It seems that this function is not called any where?



##########
src/iceberg/table_metadata.cc:
##########
@@ -238,12 +273,35 @@ TableMetadataBuilder& 
TableMetadataBuilder::SetPreviousMetadataLocation(
 }
 
 TableMetadataBuilder& TableMetadataBuilder::AssignUUID() {
-  throw IcebergError(std::format("{} not implemented", __FUNCTION__));
+  if (impl_->metadata.table_uuid.empty()) {
+    // Generate a random UUID
+    return AssignUUID(Uuid::GenerateV4().ToString());
+  }
+
+  return *this;
 }
 
 TableMetadataBuilder& TableMetadataBuilder::AssignUUID(std::string_view uuid) {
-  throw IcebergError(std::format("{} not implemented", __FUNCTION__));
-  ;
+  std::string uuid_str(uuid);
+
+  // Validation: UUID cannot be null or empty
+  if (uuid_str.empty()) {
+    impl_->errors.emplace_back(InvalidArgument("Cannot assign null or empty 
UUID"));
+    return *this;
+  }
+
+  // Check if UUID is already set to the same value (no-op)
+  if (StringUtils::EqualsIgnoreCase(impl_->metadata.table_uuid, uuid_str)) {
+    return *this;
+  }
+
+  // Update the metadata
+  impl_->metadata.table_uuid = uuid_str;
+
+  // Record the change
+  impl_->changes.push_back(std::make_unique<table::AssignUUID>(uuid_str));

Review Comment:
   ```suggestion
     
impl_->changes.push_back(std::make_unique<table::AssignUUID>(std::move(uuid_str)));
   ```



##########
src/iceberg/table_requirements.cc:
##########
@@ -26,11 +26,11 @@
 namespace iceberg {
 
 void TableUpdateContext::AddRequirement(std::unique_ptr<TableRequirement> 
requirement) {
-  throw IcebergError("TableUpdateContext::AddRequirement not implemented");
+  requirements_.emplace_back(std::move(requirement));

Review Comment:
   Should we add `AssertUUID` to ForReplaceTable/ForUpdateTable below?



##########
src/iceberg/table_metadata.cc:
##########
@@ -201,13 +203,46 @@ Status TableMetadataUtil::Write(FileIO& io, const 
std::string& location,
 
 // TableMetadataBuilder implementation
 
-struct TableMetadataBuilder::Impl {};
+struct TableMetadataBuilder::Impl {
+  // Base metadata (nullptr for new tables)
+  const TableMetadata* base;
+
+  // Working metadata copy
+  TableMetadata metadata;
+
+  // Change tracking
+  std::vector<std::unique_ptr<TableUpdate>> changes;
+
+  // Error collection (since methods return *this and cannot throw)
+  std::vector<Status> errors;
+
+  // Metadata location tracking
+  std::optional<std::string> metadata_location;
+  std::optional<std::string> previous_metadata_location;
+
+  // Constructor for new table
+  explicit Impl(int8_t format_version) : base(nullptr), metadata{} {
+    metadata.format_version = format_version;
+    metadata.last_sequence_number = TableMetadata::kInitialSequenceNumber;
+    metadata.last_updated_ms = TimePointMs{std::chrono::milliseconds(0)};

Review Comment:
   The Java version only initializes these fields:
   
   ```java
       private Builder(int formatVersion) {
         this.base = null;
         this.formatVersion = formatVersion;
         this.lastSequenceNumber = INITIAL_SEQUENCE_NUMBER;
         this.uuid = UUID.randomUUID().toString();
         this.currentSnapshotId = -1;
         this.nextRowId = INITIAL_ROW_ID;
       }
   ```
   
   There are some inconsistencies from the Java impl. For instance 
`last_updated_ms`, the Java impl declares it as `long` in the TableMetadata but 
`Long` in the TableMetadataBuilder to make it optional. In the Java ctor, 
`lastUpdatedMillis` is initialized to null.
   
   If we want to follow the same logic, we may not directly use `TableMetadata 
metadata` directly since we don't know whether some fields have been updated or 
not.



##########
src/iceberg/test/table_metadata_builder_test.cc:
##########
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include <memory>
+#include <string>
+
+#include <gtest/gtest.h>
+
+#include "gmock/gmock-matchers.h"
+#include "iceberg/table_metadata.h"
+#include "iceberg/table_requirement.h"
+#include "iceberg/table_requirements.h"
+#include "iceberg/table_update.h"
+#include "iceberg/test/matchers.h"
+
+namespace iceberg {
+
+// Helper functions to reduce test boilerplate
+namespace {
+
+// Build and assert success, returning the metadata
+std::unique_ptr<TableMetadata> BuildAndExpectSuccess(TableMetadataBuilder& 
builder) {
+  auto result = builder.Build();
+  EXPECT_TRUE(result.has_value()) << "Build failed: " << 
result.error().message;
+  if (!result.has_value()) {
+    return nullptr;
+  }
+  return std::move(result.value());
+}
+
+// Build and expect failure with message containing substring
+void BuildAndExpectFailure(TableMetadataBuilder& builder,
+                           const std::string& expected_message_substr) {
+  auto result = builder.Build();
+  ASSERT_FALSE(result.has_value()) << "Build should have failed";
+  EXPECT_TRUE(result.error().message.find(expected_message_substr) != 
std::string::npos)
+      << "Expected error message to contain: " << expected_message_substr
+      << "\nActual error: " << result.error().message;
+}
+
+// Generate requirements and return them
+std::vector<std::unique_ptr<TableRequirement>> GenerateRequirements(
+    const TableUpdate& update, const TableMetadata* base) {
+  TableUpdateContext context(base, false);

Review Comment:
   ```suggestion
     TableUpdateContext context(base, /*is_replace=*/false);
   ```



##########
src/iceberg/test/table_metadata_builder_test.cc:
##########
@@ -0,0 +1,311 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include <memory>
+#include <string>
+
+#include <gtest/gtest.h>
+
+#include "gmock/gmock-matchers.h"
+#include "iceberg/table_metadata.h"
+#include "iceberg/table_requirement.h"
+#include "iceberg/table_requirements.h"
+#include "iceberg/table_update.h"
+#include "iceberg/test/matchers.h"
+
+namespace iceberg {
+
+// Helper functions to reduce test boilerplate
+namespace {
+
+// Build and assert success, returning the metadata
+std::unique_ptr<TableMetadata> BuildAndExpectSuccess(TableMetadataBuilder& 
builder) {

Review Comment:
   Why not directly use `ICEBERG_ASSIGN_OR_RAISE` macro or write a similar one?



##########
src/iceberg/table_metadata.h:
##########
@@ -73,6 +73,9 @@ struct ICEBERG_EXPORT TableMetadata {
   static constexpr int64_t kInitialSequenceNumber = 0;
   static constexpr int64_t kInvalidSequenceNumber = -1;
   static constexpr int64_t kInitialRowId = 0;
+  static constexpr int32_t kInitialSpecId = 0;
+  static constexpr int32_t kInitialSortOrderId = 1;
+  static constexpr int64_t kInvalidSnapshotId = -1;

Review Comment:
   An alternative is to move these commonly used constants to 
`iceberg/constant.h`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to