yeandy opened a new issue, #23142: URL: https://github.com/apache/beam/issues/23142
### What would you like to happen?

Add the following metric definitions to the RunInference documentation (a sketch of how they can be queried from a pipeline result follows after the issue metadata below).

- `num_inferences` - The cumulative count of all samples passed to RunInference, i.e. the total number of examples across all batches. This count increases monotonically.
- `inference_request_batch_size` - The number of samples in a particular batch of examples (created by `beam.BatchElements`) passed to `run_inference()`. This varies over time depending on the dynamic batching decisions of `BatchElements()`.
- `inference_request_batch_byte_size` - The size, in bytes, of all elements in a particular batch of examples (created by `beam.BatchElements`) passed to `run_inference()`. This varies over time depending on the dynamic batching decisions of `BatchElements()` and on the values/dtypes of the elements.
- `inference_batch_latency_micro_secs` - The time, in microseconds, taken to perform inference on a batch of examples, i.e. the time to call `model_handler.run_inference()`. This varies over time depending on the dynamic batching decisions of `BatchElements()` and on the values/dtypes of the elements.
- `model_byte_size` - The memory, in bytes, used to load and initialize the model, i.e. the increase in memory usage from calling `model_handler.load_model()`.
- `load_model_latency_milli_secs` - The time, in milliseconds, taken to load and initialize the model, i.e. the time to call `model_handler.load_model()`.

### Issue Priority

Priority: 2

### Issue Component

Component: run-inference
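A minimal sketch of how the metrics listed above could be inspected after a pipeline run, assuming a scikit-learn model handler and a hypothetical model path (`/tmp/model.pkl`); the metric-type split (counter vs. distribution) in the comments is also an assumption, not taken from the documentation being requested here:

```python
import apache_beam as beam
import numpy as np
from apache_beam.metrics.metric import MetricsFilter
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.sklearn_inference import SklearnModelHandlerNumpy

# Hypothetical model path used only for illustration.
model_handler = SklearnModelHandlerNumpy(model_uri='/tmp/model.pkl')

pipeline = beam.Pipeline()
_ = (
    pipeline
    | 'CreateExamples' >> beam.Create([np.array([1.0, 2.0]), np.array([3.0, 4.0])])
    | 'RunInference' >> RunInference(model_handler)
)

result = pipeline.run()
result.wait_until_finish()

# Query each metric by name from the pipeline result. num_inferences is assumed
# to be a counter; the batch size, byte size, and latency metrics are assumed
# to be distributions, so both result types are printed for each name.
metric_names = (
    'num_inferences',
    'inference_request_batch_size',
    'inference_request_batch_byte_size',
    'inference_batch_latency_micro_secs',
    'model_byte_size',
    'load_model_latency_milli_secs',
)
for name in metric_names:
    query_result = result.metrics().query(MetricsFilter().with_name(name))
    for counter in query_result['counters']:
        print(counter.key.metric.name, counter.committed)
    for distribution in query_result['distributions']:
        print(distribution.key.metric.name, distribution.committed)
```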
