AnandInguva commented on code in PR #28780:
URL: https://github.com/apache/beam/pull/28780#discussion_r1346206664


##########
sdks/python/apache_beam/testing/analyzers/perf_analysis_utils.py:
##########
@@ -170,7 +170,22 @@ def find_latest_change_point_index(metric_values: List[Union[float, int]]):
   if not change_points_indices:
     return None
   change_points_indices.sort()
-  return change_points_indices[-1]
+  # Remove the change points that are at the edges of the data.
+  # https://github.com/apache/beam/issues/28757
+  # Remove this workaround once we have a good solution to deal
+  # with the edge change points.
+  change_point_index = change_points_indices[-1]
+  if is_edge_change_point(change_point_index,
+                          len(metric_values),
+                          constants._EDGE_SEGMENT_SIZE):
+    logging.info(
+        'The change point %s is located at the edge of the data with an edge '
+        'segment size of %s. This change point will be ignored for now, '
+        'awaiting additional data. Should the change point persist after '
+        'gathering more data, an alert will be raised.' %
+        (change_point_index, constants._EDGE_SEGMENT_SIZE))
+    return None

Review Comment:
   I think we should return the latest change point itself. Even if we ignore 
it for, say, 3 days, it still gets filed afterwards. 
   
   We also ignore change points that occurred more than 14 days ago. Most 
often, `change_points_indices[-2]` lies outside that window or doesn't exist, 
so we could just follow the current approach.
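
To make the reasoning above concrete, here is a minimal sketch of one possible reading of it. Everything in it is an assumption for illustration: the constant values, the `in_alert_window` helper, the `pick_change_point` name, and the body of `is_edge_change_point` (the diff only shows its call site, not its definition). It also assumes one data point per day, so that a 14-point window stands in for the 14 days mentioned.

```python
from typing import List, Optional

# Hypothetical stand-ins; the real values live in Beam's constants module.
EDGE_SEGMENT_SIZE = 3
WINDOW_SIZE = 14  # assumes one data point per day


def is_edge_change_point(index: int, data_size: int, edge_size: int) -> bool:
    # Assumed definition: the index falls within the last `edge_size` points.
    return index >= data_size - edge_size


def in_alert_window(index: int, data_size: int, window: int) -> bool:
    # Hypothetical helper: only the last `window` points are considered.
    return index >= data_size - window


def pick_change_point(indices: List[int], data_size: int) -> Optional[int]:
    # Look only at the latest change point; never fall back to indices[-2].
    if not indices:
        return None
    latest = max(indices)
    if not in_alert_window(latest, data_size, WINDOW_SIZE):
        return None
    if is_edge_change_point(latest, data_size, EDGE_SEGMENT_SIZE):
        return None  # wait for more data before alerting
    return latest
```

With 30 data points and change points at indices 10 and 28, index 28 is at the edge and is skipped for now, while index 10 is already outside the 14-point window, so falling back to it would not produce an alert either.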



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.