[GitHub] [arrow] zixi-bwang commented on a change in pull request #12068: ARROW-15037: [C#] A stream processing example of IoT sensor data

GitBox Mon, 10 Jan 2022 14:30:51 -0800


zixi-bwang commented on a change in pull request #12068:
URL: https://github.com/apache/arrow/pull/12068#discussion_r781623395




##########
File path: csharp/examples/IoTDataPipelineExample/Program.cs
##########
@@ -75,14 +75,23 @@ public static async Task Main(string[] args)
                                     + recordBatch.Column(j).Data.NullCount);
                             }
 
-                            if (recordBatch.Schema.HasMetadata && recordBatch.Schema.Metadata.TryGetValue("SubjectId", out string subjectId))
+                            var col = (Int32Array)recordBatch.Column(0);
+                            var subjectId = col.Values[0].ToString();
+
+                            if (!recordBatchDict.ContainsKey(subjectId))
                             {
-                                if (!recordBatchDict.ContainsKey(subjectId))
-                                {
-                                    recordBatchDict.Add(subjectId, new List<RecordBatch>());
-                                }
-                                recordBatchDict[subjectId].Add(recordBatch);
+                                recordBatchDict.Add(subjectId, new List<RecordBatch>());
                             }
+                            recordBatchDict[subjectId].Add(recordBatch);
+
+                            //if (recordBatch.Schema.HasMetadata && recordBatch.Schema.Metadata.TryGetValue("SubjectId", out string subjectId))

Review comment:
       @eerhardt,
   
   Something seems off with the schema metadata:
   
   When building the record batches, each record batch gets a unique SubjectId 
as its schema metadata value. However, when writing the Arrow file, the schema 
of the first record batch is used as the file's schema, and it seems that the 
SubjectId in every record batch's schema metadata is then overridden by the 
first record batch's SubjectId.
   
   I'm not sure what's going on here. Could you review this?
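   
   For reference, here is a minimal sketch that should reproduce the behavior 
described above. It assumes the Apache.Arrow NuGet package; the class name 
`SchemaMetadataSketch` and the helpers `BuildBatch` and `RoundTripAsync` are 
hypothetical and not part of this PR. It writes two batches with different 
SubjectId schema metadata to one Arrow file, passing only the first batch's 
schema to the writer, then reads them back and collects both the metadata 
value and the value stored in column 0.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;
using Apache.Arrow;
using Apache.Arrow.Ipc;
using Apache.Arrow.Types;

public static class SchemaMetadataSketch
{
    // Hypothetical helper: a one-row batch that stores the subject ID both in
    // the schema metadata and in column 0.
    private static RecordBatch BuildBatch(int subjectId)
    {
        Schema schema = new Schema.Builder()
            .Field(f => f.Name("SubjectId").DataType(Int32Type.Default).Nullable(false))
            .Metadata("SubjectId", subjectId.ToString())
            .Build();
        Int32Array column = new Int32Array.Builder().Append(subjectId).Build();
        return new RecordBatch(schema, new IArrowArray[] { column }, 1);
    }

    // Writes two batches to an in-memory Arrow file and reads them back.
    public static async Task<List<(string MetadataId, int? ColumnId)>> RoundTripAsync()
    {
        RecordBatch first = BuildBatch(1);
        RecordBatch second = BuildBatch(2);

        using var stream = new MemoryStream();
        // The writer receives exactly one schema: the first batch's.
        // The third argument (leave the stream open) keeps it usable for reading.
        using (var writer = new ArrowFileWriter(stream, first.Schema, true))
        {
            await writer.WriteRecordBatchAsync(first);
            await writer.WriteRecordBatchAsync(second);
            await writer.WriteEndAsync();
        }

        stream.Position = 0;
        var results = new List<(string MetadataId, int? ColumnId)>();
        using var reader = new ArrowFileReader(stream);
        RecordBatch batch;
        while ((batch = await reader.ReadNextRecordBatchAsync()) != null)
        {
            // Metadata as seen after the round trip (may be null if absent).
            string metadataId = null;
            if (batch.Schema.HasMetadata)
            {
                batch.Schema.Metadata.TryGetValue("SubjectId", out metadataId);
            }
            // Value stored in the data itself, which travels with each batch.
            int? columnId = ((Int32Array)batch.Column(0)).GetValue(0);
            results.Add((metadataId, columnId));
        }
        return results;
    }
}
```

   Calling `await SchemaMetadataSketch.RoundTripAsync()` and comparing 
`MetadataId` with `ColumnId` for each entry should show whether the per-batch 
metadata survives. The column value is stored with each batch, which is why 
the change above groups by `recordBatch.Column(0)` instead of by the schema 
metadata.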




