CodiumAI-Agent commented on PR #9030: URL: https://github.com/apache/incubator-gluten/pull/9030#issuecomment-2756280146
## PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

<table>
<tr><td>

**🎫 Ticket compliance analysis ✅**

**[9020](https://github.com/apache/incubator-gluten/issues/9020) - PR Code Verified**

Compliant requirements:

- Support bitmapaggregator in delta upsert queries
- Allow multiple expression expansions concurrently
- Remove query context dependency during DeltaDVRoaringBitmapArray initialization

Requires further human verification:

- Integration testing of newly registered expression extensions in diverse runtime scenarios
- Validation of serialization/deserialization with various deletion vector files

</td></tr>
<tr><td>⏱️ <strong>Estimated effort to review</strong>: 4 🔵🔵🔵🔵⚪</td></tr>
<tr><td>🧪 <strong>PR contains tests</strong></td></tr>
<tr><td>🔒 <strong>No security concerns identified</strong></td></tr>
<tr><td>⚡ <strong>Recommended focus areas for review</strong><br><br>

<details><summary><a href='https://github.com/apache/incubator-gluten/pull/9030/files#diff-4d597d84a685592f6dbbe889572da5fcbf0f275fe1bdeb67245ddebd067ff24cR67-R80'><strong>Thread Safety</strong></a>

The new extension registration mechanism appends transformers to shared mutable state (two `var` fields that are reassigned on every call) without explicit synchronization. Verify that concurrent registrations or lookups cannot race, for example by losing a registration or by exposing a signature list that does not match the transformer list.
</summary>

```scala
private var expressionExtensionTransformers: Seq[ExpressionExtensionTrait] =
  Seq.apply(DefaultExpressionExtensionTransformer())

private var expressionExtensionSig = Seq.empty[Sig]

def expressionExtensionSigList: Seq[Sig] = expressionExtensionSig

def findExpressionExtension(clazz: Class[_]): Option[ExpressionExtensionTrait] = {
  expressionExtensionTransformers.find(_.extensionExpressionsMapping.contains(clazz))
}

def registerExpressionExtension(expressionExtension: ExpressionExtensionTrait): Unit = {
  expressionExtensionTransformers = expressionExtensionTransformers :+ expressionExtension
  expressionExtensionSig = expressionExtensionTransformers.flatMap(_.expressionSigList)
}
```

</details>

<details><summary><a href='https://github.com/apache/incubator-gluten/pull/9030/files#diff-f1625a6ac2a39d4b79842f262935a102a793681c99382f3439566aa10c12f776R157-R199'><strong>Serialization Robustness</strong></a>

The updated `serialize` and `deserialize` methods introduce critical logic for data consistency. Confirm that error handling covers any possible mismatches (e.g., data size and magic number) and that the methods are robust under malformed input.
</summary>

```c++
String DeltaDVRoaringBitmapArray::serialize() const
{
    DB::WriteBufferFromOwnString out;
    constexpr Int32 magic_number = 1681511377;
    writeBinaryLittleEndian(magic_number, out);
    Int64 size = roaring_bitmap_array.size();
    writeBinaryLittleEndian(size, out);
    for (Int32 i = 0; i < roaring_bitmap_array.size(); ++i)
    {
        writeBinaryLittleEndian(i, out);
        std::unique_ptr<roaring::Roaring> bitmap = std::make_unique<roaring::Roaring>(roaring_bitmap_array.at(i));
        bitmap->runOptimize();
        auto size_in_bytes = bitmap->getSizeInBytes();
        std::unique_ptr<char[]> buf(new char[size_in_bytes]);
        bitmap->write(buf.get());
        out.write(buf.get(), size_in_bytes);
    }
    return out.str();
}

void DeltaDVRoaringBitmapArray::deserialize(DB::ReadBuffer & buf)
{
    Int32 magic_num;
    readBinaryLittleEndian(magic_num, buf);
    if (magic_num != 1681511377)
        throw DB::Exception(DB::ErrorCodes::BAD_ARGUMENTS, "The magic num is mismatch.");
    int64_t bitmap_array_size;
    readBinaryLittleEndian(bitmap_array_size, buf);
    roaring_bitmap_array.reserve(bitmap_array_size);
    for (size_t i = 0; i < bitmap_array_size; ++i)
    {
        int bitmap_index;
        readBinaryLittleEndian(bitmap_index, buf);
        roaring::Roaring r = roaring::Roaring::read(buf.position());
        size_t current_bitmap_size = r.getSizeInBytes();
        buf.ignore(current_bitmap_size);
        roaring_bitmap_array.push_back(r);
    }
}
```

</details>

</td></tr>
</table>
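
To make the malformed-input concern under Serialization Robustness more concrete, here is a minimal sketch of a bounds-checked header read. It is not the PR's code: it reads from a plain `std::string` rather than a `DB::ReadBuffer`, and `readLE`, `appendLE`, `readHeader`, and `kMagic` are hypothetical names introduced only for illustration. The magic value and the layout (Int32 magic, Int64 count, then one Int32 index per bitmap) are taken from the quoted `serialize` method. The idea is to reject a truncated buffer, a wrong magic number, or an implausible bitmap count before calling `reserve`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <type_traits>

// Magic number taken from the quoted serialize()/deserialize() code.
constexpr int32_t kMagic = 1681511377;

// Hypothetical helper: append an integer in little-endian byte order.
template <typename T>
void appendLE(std::string & out, T value)
{
    using U = std::make_unsigned_t<T>;
    U bits = static_cast<U>(value);
    for (size_t i = 0; i < sizeof(T); ++i)
        out.push_back(static_cast<char>((bits >> (8 * i)) & 0xFF));
}

// Hypothetical helper: read a little-endian integer, refusing to read
// past the end of the input, and advance the offset on success.
template <typename T>
T readLE(const std::string & data, size_t & offset)
{
    using U = std::make_unsigned_t<T>;
    if (offset > data.size() || data.size() - offset < sizeof(T))
        throw std::runtime_error("truncated input");
    U value = 0;
    for (size_t i = 0; i < sizeof(T); ++i)
        value |= static_cast<U>(static_cast<unsigned char>(data[offset + i])) << (8 * i);
    offset += sizeof(T);
    return static_cast<T>(value);
}

// Validate the header and return the bitmap count.
// Every serialized bitmap is preceded by a 4-byte index, so a count larger
// than remaining_bytes / 4 cannot be genuine and is rejected before any
// allocation is sized from it.
int64_t readHeader(const std::string & data, size_t & offset)
{
    const auto magic = readLE<int32_t>(data, offset);
    if (magic != kMagic)
        throw std::runtime_error("magic number mismatch");

    const auto count = readLE<int64_t>(data, offset);
    const size_t remaining = data.size() - offset;
    if (count < 0 || static_cast<uint64_t>(count) > remaining / sizeof(int32_t))
        throw std::runtime_error("implausible bitmap count");
    return count;
}
```

The same pattern would extend to the per-bitmap loop: check that the index matches the expected position and bound each bitmap read by the bytes that remain. CRoaring's C++ wrapper offers `Roaring::readSafe(buf, maxbytes)` for bounded reads; whether it fits the `DB::ReadBuffer` usage in this PR has not been verified here.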
