chaokunyang commented on code in PR #3455:
URL: https://github.com/apache/fory/pull/3455#discussion_r2894921550


##########
cpp/fory/serialization/collection_serializer.h:
##########
@@ -677,6 +691,13 @@ struct Serializer<
     if (FORY_PREDICT_FALSE(ctx.has_error())) {
       return std::vector<T, Alloc>();
     }
+
+    if (FORY_PREDICT_FALSE(length > ctx.config().max_collection_size)) {
+      ctx.set_error(
+          Error::invalid_data("Collection length exceeds 
max_collection_size"));
+      return std::vector<T, Alloc>();
+    }
+

Review Comment:
   Nice addition on the read() path. One gap remains: several read_data() paths 
still trust payload size without max_collection_size checks (for example 
vector<T> non-arithmetic, vector<bool>, list, deque, and forward_list). Those 
paths are used in native/nested deserialization, so guardrails can still be 
bypassed there.



##########
cpp/fory/serialization/collection_serializer.h:
##########
@@ -1161,6 +1189,13 @@ template <typename T, typename Alloc> struct 
Serializer<std::deque<T, Alloc>> {
     if (FORY_PREDICT_FALSE(ctx.has_error())) {
       return std::deque<T, Alloc>();
     }
+
+    if (FORY_PREDICT_FALSE(length > ctx.config().max_collection_size)) {
+      ctx.set_error(
+          Error::invalid_data("Collection length exceeds 
max_collection_size"));
+      return std::deque<T, Alloc>();
+    }
+

Review Comment:
   There is still an unguarded preallocation in forward_list::read: 
temp.reserve(length) happens before any max_collection_size validation. A 
malicious length can force a large allocation before we fail. Please validate 
length immediately after reading it, before reserve.



##########
cpp/fory/serialization/collection_serializer_test.cc:
##########
@@ -620,6 +620,47 @@ TEST(CollectionSerializerTest, ForwardListEmptyRoundTrip) {
   EXPECT_TRUE(deserialized.strings.empty());
 }
 
+// Test max_collection_size using objects (e.g., strings)
+TEST(CollectionSerializerTest, MaxCollectionSizeGuardrail) {
+  auto fory = Fory::builder()

Review Comment:
   These tests are helpful, but they only cover xlang(true). The new limits 
should also be validated in xlang(false)/read_data paths, otherwise we can miss 
regressions in native mode (especially for vector/list/deque/forward_list).



##########
cpp/fory/serialization/context.h:
##########
@@ -361,6 +361,9 @@ class WriteContext {
   /// reset context for reuse.
   void reset();
 
+  /// get associated configuration.
+  inline const Config &config() const { return *config_; }

Review Comment:
   Read-side code now uses config(), but I do not see write-side usage of 
WriteContext::config(). If no write-path consumer is planned, we can drop this 
accessor to keep the API surface smaller.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to