rustyconover commented on code in PR #136:
URL: https://github.com/apache/arrow-swift/pull/136#discussion_r2835745584
##########
Sources/Arrow/ArrowReader.swift:
##########
@@ -221,12 +221,26 @@ public class ArrowReader { // swiftlint:disable:this
type_body_length
}
}
+ private func extractCustomMetadata(
+ from message: org_apache_arrow_flatbuf_Message
+ ) -> [String: String] {
+ var metadata: [String: String] = [:]
+ for index in 0..<message.customMetadataCount {
+ if let kv = message.customMetadata(at: index),
+ let key = kv.key {
+ metadata[key] = kv.value ?? ""
Review Comment:
Considered this, but decided to keep `kv.value ?? ""`. The Arrow FlatBuffers
spec doesn't mark `value` as `required`, so nil is a valid representation.
Silently dropping entries would be worse — it loses data without signaling the
caller. Coercing nil to empty string is more permissive and avoids that problem.
##########
Sources/Arrow/ArrowWriter.swift:
##########
@@ -243,11 +243,28 @@ public class ArrowWriter { // swiftlint:disable:this
type_body_length
org_apache_arrow_flatbuf_RecordBatch.add(length: Int64(batch.length),
&fbb)
let recordBatchOffset =
org_apache_arrow_flatbuf_RecordBatch.endRecordBatch(&fbb, start: startRb)
let bodySize = Int64(bufferOffset)
+
+ var customMetadataOffset = Offset()
+ if !batch.customMetadata.isEmpty {
+ var kvOffsets = [Offset]()
+ for (key, value) in batch.customMetadata {
+ let keyOffset = fbb.create(string: key)
+ let valueOffset = fbb.create(string: value)
+ kvOffsets.append(
+ org_apache_arrow_flatbuf_KeyValue.createKeyValue(
+ &fbb, keyOffset: keyOffset, valueOffset: valueOffset))
+ }
+ customMetadataOffset = fbb.createVector(ofOffsets: kvOffsets)
Review Comment:
Good catch. Fixed in 082a8d7 — metadata keys are now sorted before
serialization.
##########
Sources/Arrow/ArrowTable.swift:
##########
@@ -165,6 +169,18 @@ public class RecordBatch {
return self
}
+ @discardableResult
+ public func addMetadata(_ key: String, value: String) -> Builder {
+ self.customMetadata[key] = value
+ return self
+ }
+
+ @discardableResult
+ public func addMetadata(_ metadata: [String: String]) -> Builder {
+ self.customMetadata.merge(metadata) { _, new in new }
+ return self
+ }
+
public func finish() -> Result<RecordBatch, ArrowError> {
if columns.count > 0 {
let columnLength = columns[0].length
Review Comment:
Fixed `Builder.finish()` to return `.failure(.arrayHasNoElements)` when no
columns have been added in 082a8d7. The public `RecordBatch.init` has the same
vulnerability but that's pre-existing — keeping it out of scope for this PR.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]