rustyconover commented on code in PR #136:
URL: https://github.com/apache/arrow-swift/pull/136#discussion_r2835745584


##########
Sources/Arrow/ArrowReader.swift:
##########
@@ -221,12 +221,26 @@ public class ArrowReader { // swiftlint:disable:this 
type_body_length
         }
     }
 
+    private func extractCustomMetadata(
+        from message: org_apache_arrow_flatbuf_Message
+    ) -> [String: String] {
+        var metadata: [String: String] = [:]
+        for index in 0..<message.customMetadataCount {
+            if let kv = message.customMetadata(at: index),
+               let key = kv.key {
+                metadata[key] = kv.value ?? ""

Review Comment:
   Considered this, but decided to keep `kv.value ?? ""`. The Arrow FlatBuffers 
spec doesn't mark `value` as `required`, so nil is a valid representation. 
Silently dropping entries would be worse — it loses data without signaling the 
caller. Coercing nil to empty string is more permissive and avoids that problem.



##########
Sources/Arrow/ArrowWriter.swift:
##########
@@ -243,11 +243,28 @@ public class ArrowWriter { // swiftlint:disable:this 
type_body_length
         org_apache_arrow_flatbuf_RecordBatch.add(length: Int64(batch.length), 
&fbb)
         let recordBatchOffset = 
org_apache_arrow_flatbuf_RecordBatch.endRecordBatch(&fbb, start: startRb)
         let bodySize = Int64(bufferOffset)
+
+        var customMetadataOffset = Offset()
+        if !batch.customMetadata.isEmpty {
+            var kvOffsets = [Offset]()
+            for (key, value) in batch.customMetadata {
+                let keyOffset = fbb.create(string: key)
+                let valueOffset = fbb.create(string: value)
+                kvOffsets.append(
+                    org_apache_arrow_flatbuf_KeyValue.createKeyValue(
+                        &fbb, keyOffset: keyOffset, valueOffset: valueOffset))
+            }
+            customMetadataOffset = fbb.createVector(ofOffsets: kvOffsets)

Review Comment:
   Good catch. Fixed in 082a8d7 — metadata keys are now sorted before 
serialization.



##########
Sources/Arrow/ArrowTable.swift:
##########
@@ -165,6 +169,18 @@ public class RecordBatch {
             return self
         }
 
+        @discardableResult
+        public func addMetadata(_ key: String, value: String) -> Builder {
+            self.customMetadata[key] = value
+            return self
+        }
+
+        @discardableResult
+        public func addMetadata(_ metadata: [String: String]) -> Builder {
+            self.customMetadata.merge(metadata) { _, new in new }
+            return self
+        }
+
         public func finish() -> Result<RecordBatch, ArrowError> {
             if columns.count > 0 {
                 let columnLength = columns[0].length

Review Comment:
   Fixed `Builder.finish()` to return `.failure(.arrayHasNoElements)` when no 
columns have been added in 082a8d7. The public `RecordBatch.init` has the same 
vulnerability but that's pre-existing — keeping it out of scope for this PR.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to