westonpace commented on code in PR #13245:
URL: https://github.com/apache/arrow/pull/13245#discussion_r885060427


##########
cpp/src/arrow/ipc/reader.cc:
##########
@@ -1121,6 +1121,29 @@ class ARROW_EXPORT SelectiveIpcFileRecordBatchGenerator {
   int index_;
 };
 
+struct AtomicReadStats {
+  std::atomic<int64_t> num_messages{0};
+  std::atomic<int64_t> num_record_batches{0};
+  std::atomic<int64_t> num_dictionary_batches{0};
+  std::atomic<int64_t> num_dictionary_deltas{0};
+  std::atomic<int64_t> num_replaced_dictionaries{0};
+
+  /// \brief Capture a copy of the current counters
+  ///
+  /// It's possible to get inconsistent values.  For example, if
+  /// this method is called in the middle of a read you might have
+  /// a case where num_messages != num_record_batches + num_dictionary_batches

Review Comment:
   Does it matter that this is not new behavior?  We increment `num_messages`
as soon as we detect a new message, but we don't increment
`num_record_batches` until we have confirmed that the new message is, in fact,
a record batch.  That confirmation happens in an entirely different method.
   
   I can move things around to fix this, but this is mainly an internal
counter used for unit tests, so I'm not very motivated to do so.  I wrote this
comment mainly because I was trying to figure out whether this was an
invariant we already maintained (we didn't) and that I would therefore need to
preserve with the new approach.
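
   To make the window concrete, here is a minimal sketch of the two-step update described above. It is not the code in this PR: `ReadStatsSnapshot`, `AtomicReadStatsSketch`, `Snapshot()` and `SnapshotMidRead()` are hypothetical names chosen for illustration, and the "observer" is simulated on a single thread rather than run concurrently.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Plain (non-atomic) copy of the counters. Hypothetical name, for illustration.
struct ReadStatsSnapshot {
  int64_t num_messages = 0;
  int64_t num_record_batches = 0;
  int64_t num_dictionary_batches = 0;
};

// Reduced stand-in for the struct in the diff (only three counters kept).
struct AtomicReadStatsSketch {
  std::atomic<int64_t> num_messages{0};
  std::atomic<int64_t> num_record_batches{0};
  std::atomic<int64_t> num_dictionary_batches{0};

  // Each load is atomic on its own, but the three loads together are not
  // one atomic operation, so a snapshot can straddle an in-flight update.
  ReadStatsSnapshot Snapshot() const {
    ReadStatsSnapshot out;
    out.num_messages = num_messages.load();
    out.num_record_batches = num_record_batches.load();
    out.num_dictionary_batches = num_dictionary_batches.load();
    return out;
  }
};

// Simulates a snapshot taken between "message seen" and "message classified".
inline ReadStatsSnapshot SnapshotMidRead() {
  AtomicReadStatsSketch stats;

  stats.num_messages.fetch_add(1);           // step 1: a new message was detected
  ReadStatsSnapshot mid = stats.Snapshot();  // observer runs inside the window
  stats.num_record_batches.fetch_add(1);     // step 2: it turned out to be a record batch

  // Once the read has settled, the counters agree again.
  ReadStatsSnapshot done = stats.Snapshot();
  assert(done.num_messages ==
         done.num_record_batches + done.num_dictionary_batches);

  return mid;
}
```

   The snapshot returned by `SnapshotMidRead()` has `num_messages` one ahead of `num_record_batches + num_dictionary_batches`, which is the transient state the doc comment warns about. Closing that window would require updating both counters under one lock or in one step, which seems like more machinery than a test-oriented counter needs.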



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

