Copilot commented on code in PR #38822:
URL: https://github.com/apache/beam/pull/38822#discussion_r3359248856
##########
sdks/python/apache_beam/ml/inference/base_test.py:
##########
@@ -1044,6 +1044,25 @@ def test_timing_metrics(self):
self.assertEqual(load_model_latency.result.count, 1)
self.assertEqual(load_model_latency.result.mean, 500)
+ def test_setup_caches_model_load_metrics_until_finish_bundle(self):
+ fake_clock = FakeClock()
+ dofn = base._RunInferenceDoFn(
+ FakeModelHandler(clock=fake_clock), fake_clock, None)
+ metrics_collector = unittest.mock.Mock(spec=base._MetricsCollector)
+ dofn.get_metrics_collector = unittest.mock.Mock(
+ return_value=metrics_collector)
+
+ with unittest.mock.patch.object(
+ base, '_get_current_process_memory_in_bytes', side_effect=[100, 125]):
+ dofn.setup()
Review Comment:
This test is brittle because `side_effect=[100, 125]` hard-codes the exact
number of calls to `_get_current_process_memory_in_bytes()`. If the
implementation adds another memory sampling (while still computing the same
delta), this will fail with `StopIteration` even though behavior is correct.
Consider using a `side_effect` function (or an iterator that repeats the last
value) and asserting the expected calls/delta explicitly.
##########
sdks/python/apache_beam/ml/inference/base.py:
##########
@@ -2133,8 +2134,8 @@ def process(
return self._run_inference(batch, inference_args)
def finish_bundle(self):
- # TODO(https://github.com/apache/beam/issues/21435): Figure out why there
- # is a cache.
+ # setup() may load the model before bundle-scoped metrics are active, so
+ # flush the cached model load metrics once the bundle lifecycle is running.
Review Comment:
The phrase \"once the bundle lifecycle is running\" is a bit ambiguous
because `finish_bundle()` itself is part of that lifecycle. Rewording to
something like \"once bundle-scoped metrics are active\" (or similar) would
make the intent clearer to readers without needing Beam lifecycle context.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]