Re: [PR] test(prophet): pin yhat_lower can be negative for negative series (#21734) [superset]

via GitHub Thu, 14 May 2026 18:51:27 -0700


codeant-ai-for-open-source[bot] commented on code in PR #40141:
URL: https://github.com/apache/superset/pull/40141#discussion_r3245318885



##########
tests/unit_tests/pandas_postprocessing/test_prophet.py:
##########
@@ -184,3 +184,112 @@ def test_prophet_incorrect_time_grain():
             periods=10,
             confidence_interval=0.8,
         )
+
+
+def test_prophet_uncertainty_lower_bound_can_be_negative_for_negative_series():
+    """
+    Regression for #21734: when the input series contains negative values,
+    the forecast's lower confidence bound (``__yhat_lower``) must be allowed
+    to go below zero. The original bug claimed Superset clipped the lower
+    bound at 0, hiding the natural shape of the uncertainty interval for
+    series like temperatures or signed deltas.
+
+    Superset's wrapper passes through Prophet's output unchanged (no
+    clipping in ``superset/utils/pandas_postprocessing/prophet.py``); this
+    test pins that contract end-to-end. If a future refactor introduces
+    a ``max(0, lower)`` clamp, this test fails immediately.
+    """
+    if find_spec("prophet") is None:
+        pytest.skip("prophet not installed")
+
+    # All-negative monthly series — any reasonable forecast must predict
+    # negative values (and therefore negative uncertainty bounds) too.
+    negative_df = pd.DataFrame(
+        {
+            DTTM_ALIAS: [datetime(2020, m, 1) for m in range(1, 13)]
+            + [datetime(2021, m, 1) for m in range(1, 13)],
+            "temperature": [
+                -5.0,
+                -7.0,
+                -3.0,
+                1.0,
+                8.0,
+                14.0,
+                17.0,
+                16.0,
+                11.0,
+                5.0,
+                -1.0,
+                -4.0,
+                -6.0,
+                -8.0,
+                -2.0,
+                2.0,
+                9.0,
+                15.0,
+                18.0,
+                17.0,
+                12.0,
+                6.0,
+                0.0,
+                -3.0,
+            ],
+        }
+    )
+
+    result = prophet(
+        df=negative_df,
+        time_grain="P1M",
+        periods=3,
+        confidence_interval=0.9,
+    )
+
+    assert "temperature__yhat_lower" in result.columns
+    # At least one forecast point's lower bound must be negative for a
+    # series whose actuals span both signs. A clamp at zero would force
+    # this assertion to fail.
+    assert (result["temperature__yhat_lower"] < 0).any(), (
+        "Forecast lower bound was non-negative everywhere despite a "
+        "series with negative actuals — suggests an unexpected clamp at "
+        "zero was reintroduced (regression of #21734)."
+    )
+
+
+def test_prophet_does_not_clamp_yhat_below_zero_for_negative_actuals():
+    """
+    Companion to the lower-bound test above: the central forecast
+    (``__yhat``) must also be allowed to go negative.
+    A bug that clamps the central forecast at zero would force the lower
+    bound non-negative as a side effect, masking the wider issue.
+    """
+    if find_spec("prophet") is None:
+        pytest.skip("prophet not installed")
+
+    negative_df = pd.DataFrame(
+        {
+            DTTM_ALIAS: [datetime(2020, m, 1) for m in range(1, 13)],
+            "balance": [
+                -100.0,
+                -110.0,
+                -95.0,
+                -120.0,
+                -130.0,
+                -125.0,
+                -140.0,
+                -135.0,
+                -150.0,
+                -145.0,
+                -160.0,
+                -155.0,
+            ],
+        }
+    )
+
+    result = prophet(
+        df=negative_df,
+        time_grain="P1M",
+        periods=2,
+        confidence_interval=0.8,
+    )
+
+    assert (result["balance__yhat"] < 0).any()

Review Comment:
   **Suggestion:** This assertion also evaluates all rows, including historical 
fitted points, rather than just predicted future points. If clamping is 
introduced only for future predictions, the test can still pass due to negative 
in-sample fitted values. Limit the check to forecast rows (the final `periods` 
rows) to correctly detect regressions. [incorrect condition logic]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ❌ Future-only central-forecast clamp not caught by current test.
   - ⚠️ Prophet wrapper regression for negative balances goes unguarded.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. As in suggestion 1, `_prophet_fit_and_predict()` in
   `superset/utils/pandas_postprocessing/prophet.py:75-77` builds `future =
   model.make_future_dataframe(periods=periods, freq=freq)` and joins the 
original `df` on
   `ds`, so the resulting dataframe contains both historical and forecast rows 
for each
   series.
   
   2. The `prophet()` wrapper in 
`superset/utils/pandas_postprocessing/prophet.py:80-170`
   uses that helper to produce a `target_df` where each series gets `__yhat`, 
`__yhat_lower`,
   `__yhat_upper`, and the original series column over the combined in-sample + 
future index.
   
   3. In `tests/unit_tests/pandas_postprocessing/test_prophet.py:109-145`,
   `test_prophet_does_not_clamp_yhat_below_zero_for_negative_actuals` builds an 
all-negative
   `balance` series and calls `prophet(df=negative_df, time_grain="P1M", 
periods=2,
   confidence_interval=0.8)`, so `result` again includes both the original 
balance history
   and two forecast rows.
   
   4. The final assertion at 
`tests/unit_tests/pandas_postprocessing/test_prophet.py:146`
   (`assert (result["balance__yhat"] < 0).any()`) evaluates the entire 
`"balance__yhat"`
   column; any negative in-sample fitted `yhat` value is enough for this to 
pass, so if a
   future-only clamp forced the last `periods` forecast `yhat` values to be 
non-negative, the
   regression test would still pass, meaning the condition should instead focus 
on the
   forecast horizon (e.g., the last `periods` rows).
   ```
   </details>
   
   [Fix in 
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt=This%20is%20a%20comment%20left%20during%20a%20code%20review.%0A%0A%2A%2APath%3A%2A%2A%20tests%2Funit_tests%2Fpandas_postprocessing%2Ftest_prophet.py%0A%2A%2ALine%3A%2A%2A%20295%3A295%0A%2A%2AComment%3A%2A%2A%0A%09%2AIncorrect%20Condition%20Logic%3A%20This%20assertion%20also%20evaluates%20all%20rows%2C%20including%20historical%20fitted%20points%2C%20rather%20than%20just%20predicted%20future%20points.%20If%20clamping%20is%20introduced%20only%20for%20future%20predictions%2C%20the%20test%20can%20still%20pass%20due%20to%20negative%20in-sample%20fitted%20values.%20Limit%20the%20check%20to%20forecast%20rows%20%28the%20final%20%60periods%60%20rows%29%20to%20correctly%20detect%20regressions.%0A%0AValidate%20the%20correctness%20of%20the%20flagged%20issue.%20If%20correct%2C%20How%20can%20I%20resolve%20this%3F%20If%20you%20propose%20a%20fix%2C%20implement%20it%20and%20please%20make%20it%20concise.%0AOnce%20fix%20is%20impleme
 
nted%2C%20also%20check%20other%20comments%20on%20the%20same%20PR%2C%20and%20ask%20user%20if%20the%20user%20wants%20to%20fix%20the%20rest%20of%20the%20comments%20as%20well.%20if%20said%20yes%2C%20then%20fetch%20all%20the%20comments%20validate%20the%20correctness%20and%20implement%20a%20minimal%20fix%0A)
 | [Fix in VSCode 
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt=This%20is%20a%20comment%20left%20during%20a%20code%20review.%0A%0A%2A%2APath%3A%2A%2A%20tests%2Funit_tests%2Fpandas_postprocessing%2Ftest_prophet.py%0A%2A%2ALine%3A%2A%2A%20295%3A295%0A%2A%2AComment%3A%2A%2A%0A%09%2AIncorrect%20Condition%20Logic%3A%20This%20assertion%20also%20evaluates%20all%20rows%2C%20including%20historical%20fitted%20points%2C%20rather%20than%20just%20predicted%20future%20points.%20If%20clamping%20is%20introduced%20only%20for%20future%20predictions%2C%20the%20test%20can%20still%20pass%20due%20to%20negative%20in-sample%20fitted%20values.%20Limit%20the%20check%20to%20forecast%20rows%
 
20%28the%20final%20%60periods%60%20rows%29%20to%20correctly%20detect%20regressions.%0A%0AValidate%20the%20correctness%20of%20the%20flagged%20issue.%20If%20correct%2C%20How%20can%20I%20resolve%20this%3F%20If%20you%20propose%20a%20fix%2C%20implement%20it%20and%20please%20make%20it%20concise.%0AOnce%20fix%20is%20implemented%2C%20also%20check%20other%20comments%20on%20the%20same%20PR%2C%20and%20ask%20user%20if%20the%20user%20wants%20to%20fix%20the%20rest%20of%20the%20comments%20as%20well.%20if%20said%20yes%2C%20then%20fetch%20all%20the%20comments%20validate%20the%20correctness%20and%20implement%20a%20minimal%20fix%0A)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** tests/unit_tests/pandas_postprocessing/test_prophet.py
   **Line:** 295:295
   **Comment:**
        *Incorrect Condition Logic: This assertion also evaluates all rows, 
including historical fitted points, rather than just predicted future points. 
If clamping is introduced only for future predictions, the test can still pass 
due to negative in-sample fitted values. Limit the check to forecast rows (the 
final `periods` rows) to correctly detect regressions.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40141&comment_hash=75c943f6b89ce77ab4c1ab6b7ece355c082eddf8aad185b7bf5293d31bba3228&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40141&comment_hash=75c943f6b89ce77ab4c1ab6b7ece355c082eddf8aad185b7bf5293d31bba3228&reaction=dislike'>👎</a>



##########
tests/unit_tests/pandas_postprocessing/test_prophet.py:
##########
@@ -184,3 +184,112 @@ def test_prophet_incorrect_time_grain():
             periods=10,
             confidence_interval=0.8,
         )
+
+
+def test_prophet_uncertainty_lower_bound_can_be_negative_for_negative_series():
+    """
+    Regression for #21734: when the input series contains negative values,
+    the forecast's lower confidence bound (``__yhat_lower``) must be allowed
+    to go below zero. The original bug claimed Superset clipped the lower
+    bound at 0, hiding the natural shape of the uncertainty interval for
+    series like temperatures or signed deltas.
+
+    Superset's wrapper passes through Prophet's output unchanged (no
+    clipping in ``superset/utils/pandas_postprocessing/prophet.py``); this
+    test pins that contract end-to-end. If a future refactor introduces
+    a ``max(0, lower)`` clamp, this test fails immediately.
+    """
+    if find_spec("prophet") is None:
+        pytest.skip("prophet not installed")
+
+    # All-negative monthly series — any reasonable forecast must predict
+    # negative values (and therefore negative uncertainty bounds) too.
+    negative_df = pd.DataFrame(
+        {
+            DTTM_ALIAS: [datetime(2020, m, 1) for m in range(1, 13)]
+            + [datetime(2021, m, 1) for m in range(1, 13)],
+            "temperature": [
+                -5.0,
+                -7.0,
+                -3.0,
+                1.0,
+                8.0,
+                14.0,
+                17.0,
+                16.0,
+                11.0,
+                5.0,
+                -1.0,
+                -4.0,
+                -6.0,
+                -8.0,
+                -2.0,
+                2.0,
+                9.0,
+                15.0,
+                18.0,
+                17.0,
+                12.0,
+                6.0,
+                0.0,
+                -3.0,
+            ],
+        }
+    )
+
+    result = prophet(
+        df=negative_df,
+        time_grain="P1M",
+        periods=3,
+        confidence_interval=0.9,
+    )
+
+    assert "temperature__yhat_lower" in result.columns
+    # At least one forecast point's lower bound must be negative for a
+    # series whose actuals span both signs. A clamp at zero would force
+    # this assertion to fail.
+    assert (result["temperature__yhat_lower"] < 0).any(), (
+        "Forecast lower bound was non-negative everywhere despite a "
+        "series with negative actuals — suggests an unexpected clamp at "
+        "zero was reintroduced (regression of #21734)."
+    )
+

Review Comment:
   **Suggestion:** The assertion checks `temperature__yhat_lower` across the 
entire returned dataframe, which includes both historical fit rows and forecast 
rows. This can pass even if a future-only clamp is introduced, because 
historical fitted points may still be negative. Restrict the assertion to the 
forecast horizon (e.g., the last `periods` rows) so the test actually validates 
future forecast behavior. [incorrect condition logic]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ❌ Future-only lower-bound clamp escapes detection in regression test.
   - ⚠️ Negative historical bounds can satisfy current assertion condition.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Observe `_prophet_fit_and_predict()` in
   `superset/utils/pandas_postprocessing/prophet.py:75-77`, which calls
   `model.make_future_dataframe(periods=periods, freq=freq)` and then 
`model.predict(future)`
   before joining the original `df` on `ds`, so the returned `forecast` 
includes both
   in-sample and forecast rows.
   
   2. The public `prophet()` wrapper in
   `superset/utils/pandas_postprocessing/prophet.py:80-170` iterates series, 
calls
   `_prophet_fit_and_predict()` with `periods` from the caller, and stores its 
result in
   `fit_df`, so each resulting dataframe used by callers also contains 
historical and future
   rows together.
   
   3. In `tests/unit_tests/pandas_postprocessing/test_prophet.py:40-106`,
   `test_prophet_uncertainty_lower_bound_can_be_negative_for_negative_series` 
calls
   `prophet(df=negative_df, time_grain="P1M", periods=3, 
confidence_interval=0.9)`, so the
   `result` dataframe has `len(negative_df) + 3` rows (24 in-sample + 3 
forecast) based on
   the implementation above.
   
   4. The assertion at 
`tests/unit_tests/pandas_postprocessing/test_prophet.py:98-106`
   (`assert (result["temperature__yhat_lower"] < 0).any()`) checks the entire
   `"temperature__yhat_lower"` column without restricting to the last `periods` 
rows, so any
   negative lower bound on the in-sample portion can make the test pass even if 
a future-only
   clamp forces all forecast lower bounds to be non-negative, meaning the 
test's condition is
   too broad for the intended regression guard.
   ```
   </details>
   
   [Fix in 
Cursor](https://app.codeant.ai/fix-in-ide?tool=cursor&prompt=This%20is%20a%20comment%20left%20during%20a%20code%20review.%0A%0A%2A%2APath%3A%2A%2A%20tests%2Funit_tests%2Fpandas_postprocessing%2Ftest_prophet.py%0A%2A%2ALine%3A%2A%2A%20253%3A256%0A%2A%2AComment%3A%2A%2A%0A%09%2AIncorrect%20Condition%20Logic%3A%20The%20assertion%20checks%20%60temperature__yhat_lower%60%20across%20the%20entire%20returned%20dataframe%2C%20which%20includes%20both%20historical%20fit%20rows%20and%20forecast%20rows.%20This%20can%20pass%20even%20if%20a%20future-only%20clamp%20is%20introduced%2C%20because%20historical%20fitted%20points%20may%20still%20be%20negative.%20Restrict%20the%20assertion%20to%20the%20forecast%20horizon%20%28e.g.%2C%20the%20last%20%60periods%60%20rows%29%20so%20the%20test%20actually%20validates%20future%20forecast%20behavior.%0A%0AValidate%20the%20correctness%20of%20the%20flagged%20issue.%20If%20correct%2C%20How%20can%20I%20resolve%20this%3F%20If%20you%20propose%20a%20fix%2C%20
 
implement%20it%20and%20please%20make%20it%20concise.%0AOnce%20fix%20is%20implemented%2C%20also%20check%20other%20comments%20on%20the%20same%20PR%2C%20and%20ask%20user%20if%20the%20user%20wants%20to%20fix%20the%20rest%20of%20the%20comments%20as%20well.%20if%20said%20yes%2C%20then%20fetch%20all%20the%20comments%20validate%20the%20correctness%20and%20implement%20a%20minimal%20fix%0A)
 | [Fix in VSCode 
Claude](https://app.codeant.ai/fix-in-ide?tool=vscode-claude&prompt=This%20is%20a%20comment%20left%20during%20a%20code%20review.%0A%0A%2A%2APath%3A%2A%2A%20tests%2Funit_tests%2Fpandas_postprocessing%2Ftest_prophet.py%0A%2A%2ALine%3A%2A%2A%20253%3A256%0A%2A%2AComment%3A%2A%2A%0A%09%2AIncorrect%20Condition%20Logic%3A%20The%20assertion%20checks%20%60temperature__yhat_lower%60%20across%20the%20entire%20returned%20dataframe%2C%20which%20includes%20both%20historical%20fit%20rows%20and%20forecast%20rows.%20This%20can%20pass%20even%20if%20a%20future-only%20clamp%20is%20introduced%2C%20because%20hi
 
storical%20fitted%20points%20may%20still%20be%20negative.%20Restrict%20the%20assertion%20to%20the%20forecast%20horizon%20%28e.g.%2C%20the%20last%20%60periods%60%20rows%29%20so%20the%20test%20actually%20validates%20future%20forecast%20behavior.%0A%0AValidate%20the%20correctness%20of%20the%20flagged%20issue.%20If%20correct%2C%20How%20can%20I%20resolve%20this%3F%20If%20you%20propose%20a%20fix%2C%20implement%20it%20and%20please%20make%20it%20concise.%0AOnce%20fix%20is%20implemented%2C%20also%20check%20other%20comments%20on%20the%20same%20PR%2C%20and%20ask%20user%20if%20the%20user%20wants%20to%20fix%20the%20rest%20of%20the%20comments%20as%20well.%20if%20said%20yes%2C%20then%20fetch%20all%20the%20comments%20validate%20the%20correctness%20and%20implement%20a%20minimal%20fix%0A)
   
   *(Use Cmd/Ctrl + Click for best experience)*
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** tests/unit_tests/pandas_postprocessing/test_prophet.py
   **Line:** 253:256
   **Comment:**
        *Incorrect Condition Logic: The assertion checks 
`temperature__yhat_lower` across the entire returned dataframe, which includes 
both historical fit rows and forecast rows. This can pass even if a future-only 
clamp is introduced, because historical fitted points may still be negative. 
Restrict the assertion to the forecast horizon (e.g., the last `periods` rows) 
so the test actually validates future forecast behavior.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   Once fix is implemented, also check other comments on the same PR, and ask 
user if the user wants to fix the rest of the comments as well. if said yes, 
then fetch all the comments validate the correctness and implement a minimal fix
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40141&comment_hash=e6c0898696431ef7ef92103ef7278bdee3fc292ae0c26a2cfd824957c6ed5e6a&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F40141&comment_hash=e6c0898696431ef7ef92103ef7278bdee3fc292ae0c26a2cfd824957c6ed5e6a&reaction=dislike'>👎</a>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] test(prophet): pin yhat_lower can be negative for negative series (#21734) [superset]

Reply via email to