zixi-bwang commented on a change in pull request #12068:
URL: https://github.com/apache/arrow/pull/12068#discussion_r781623395
##########
File path: csharp/examples/IoTDataPipelineExample/Program.cs
##########
@@ -75,14 +75,23 @@ public static async Task Main(string[] args)
+ recordBatch.Column(j).Data.NullCount);
}
- if (recordBatch.Schema.HasMetadata && recordBatch.Schema.Metadata.TryGetValue("SubjectId", out string subjectId))
+ var col = (Int32Array)recordBatch.Column(0);
+ var subjectId = col.Values[0].ToString();
+
+ if (!recordBatchDict.ContainsKey(subjectId))
{
- if (!recordBatchDict.ContainsKey(subjectId))
- {
- recordBatchDict.Add(subjectId, new List<RecordBatch>());
- }
- recordBatchDict[subjectId].Add(recordBatch);
+ recordBatchDict.Add(subjectId, new List<RecordBatch>());
}
+ recordBatchDict[subjectId].Add(recordBatch);
+
+ //if (recordBatch.Schema.HasMetadata && recordBatch.Schema.Metadata.TryGetValue("SubjectId", out string subjectId))
Review comment:
@eerhardt,
Something looks wrong with the schema metadata handling.
When the record batches are built, each one carries a unique SubjectId as its
schema metadata value. However, when the Arrow file is written using the schema
of the first record batch as the file's schema, it seems that the SubjectId in
every record batch's schema metadata is overridden by the first record batch's
SubjectId.
I'm not sure what's going on here. Could you review this?
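
For reference, the column-based grouping this change uses as a workaround can be sketched on its own as follows. This is only a minimal sketch, assuming the subject id is stored as the first value of an Int32 column at index 0; `BatchGrouping` and `GroupBySubjectId` are hypothetical names that are not part of this PR, and skipping batches without a usable id is my own assumption rather than behavior in the example.

```csharp
using System.Collections.Generic;
using Apache.Arrow;

public static class BatchGrouping
{
    // Hypothetical helper (not in the PR): groups record batches by the
    // first value of column 0, which is assumed to hold the subject id.
    public static Dictionary<string, List<RecordBatch>> GroupBySubjectId(
        IEnumerable<RecordBatch> batches)
    {
        var groups = new Dictionary<string, List<RecordBatch>>();
        foreach (RecordBatch batch in batches)
        {
            // Assumption: batches that are empty, whose first column is not
            // an Int32Array, or whose first value is null are skipped.
            if (batch.Length == 0
                || !(batch.Column(0) is Int32Array col)
                || col.IsNull(0))
            {
                continue;
            }

            string subjectId = col.GetValue(0).Value.ToString();

            // TryGetValue avoids the separate ContainsKey + indexer lookups.
            if (!groups.TryGetValue(subjectId, out List<RecordBatch> list))
            {
                list = new List<RecordBatch>();
                groups.Add(subjectId, list);
            }
            list.Add(batch);
        }
        return groups;
    }
}
```

Unlike the direct cast and `col.Values[0]` in the diff, this version does not throw on an empty batch or on a column of another type, at the cost of silently dropping such batches.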