Sowiks commented on code in PR #96:
URL: https://github.com/apache/otava/pull/96#discussion_r2531499127
##########
otava/analysis.py:
##########
@@ -170,10 +126,42 @@ def compare(self, left: np.ndarray, right: np.ndarray) ->
ComparativeStats:
)
else:
p = 1.0
- return ComparativeStats(mean_l, mean_r, std_l, std_r, p)
+ return TTestStats(mean_1=mean_l, mean_2=mean_r, std_1=std_l,
std_2=std_r, pvalue=p)
+
+ def change_point(
+ self, candidate: CandidateChangePoint, series:
Sequence[SupportsFloat], intervals: List[slice]
+ ) -> ChangePoint[TTestStats]:
+ """
+ Computes properties of the change point if the Candidate Change Point
based on the provided intervals.
+
+ The method works for two cases:
+ 1. Divisive algorithm.
+ if the candidate is a new potential change point, i.e., its index
is inside any interval, then
+ we split the interval by the candidate's index to get left and
right subseries.
+ 2. Merge step in t-test algorithm.
Review Comment:
I agree that terminology is sloppy here :( By "Divisive algorithm" I meant
the process of splitting the interval `(0, len(series))` into subintervals via
change points. By "Merge step" I meant the process of merging split intervals
when we eliminate weak change points. It doesn't have to be t-test
specifically, correct.
##########
otava/analysis.py:
##########
@@ -278,35 +266,31 @@ def split(series: np.array, window_len: int = 30,
max_pvalue: float = 0.001,
tester = TTestSignificanceTester(max_pvalue)
while start < len(series):
end = min(start + window_len, len(series))
- calculator = cext_calculator
- algo = EDivisive(seed=None, calculator=calculator,
significance_tester=tester)
- pts = algo.get_change_points(series[start:end])
- new_indexes = [p.index + start for p in pts]
- new_indexes.sort()
- last_new_change_point_index = next(iter(new_indexes[-1:]), 0)
+ algo = ChangePointDetector(significance_tester=tester,
calculator=PairDistanceCalculator)
+ new_change_points = algo.get_change_points(series, start, end)
+ last_new_change_point_index = new_change_points[-1].index if
new_change_points else 0
start = max(last_new_change_point_index, start + step)
# incremental Otava can duplicate an old cp
- for i in new_indexes:
- if i not in indexes:
- indexes += [i]
+ for cp in new_change_points:
+ if cp not in change_points:
+ change_points += [cp]
- window_endpoints = [0] + indexes + [len(series)]
- return [tester.change_point(i, series, window_endpoints) for i in indexes]
+ intervals = tester.get_intervals(change_points)
+ return [tester.change_point(cp.to_candidate(), series, intervals) for cp
in change_points]
-def compute_change_points_orig(series: np.array, max_pvalue: float = 0.001) ->
List[ChangePoint]:
- calculator = cext_calculator
- tester = QHatPermutationsSignificanceTester(calculator, pvalue=max_pvalue,
permutations=100)
- algo = EDivisive(seed=None, calculator=calculator,
significance_tester=tester)
- pts = algo.get_change_points(series)
- return pts, None
+def compute_change_points_orig(series: Sequence[SupportsFloat], max_pvalue:
float = 0.001, seed: Optional[int] = None) -> Tuple[PermCPList,
Optional[PermCPList]]:
+ tester = PermutationsSignificanceTester(alpha=max_pvalue,
permurations=100, calculator=PairDistanceCalculator, seed=seed)
Review Comment:
Thanks! Will correct the typo. But at least it's a consistent typo :)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]