(texera) branch main updated: fix(amber-python): harden Udon debug command translation against bad input (#4512)

yiconghuang Sun, 26 Apr 2026 17:00:24 -0700

This is an automated email from the ASF dual-hosted git repository.

Yicong-Huang pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/texera.git



The following commit(s) were added to refs/heads/main by this push:
     new d33603a452 fix(amber-python): harden Udon debug command translation 
against bad input (#4512)
d33603a452 is described below

commit d33603a452b7711409f7f1d9edea3d47891964c8
Author: Yicong Huang <[email protected]>
AuthorDate: Sun Apr 26 17:00:09 2026 -0700

    fix(amber-python): harden Udon debug command translation against bad input 
(#4512)
    
    ### What changes were proposed in this PR?
    
    Fixes four correctness gaps in
    `WorkerDebugCommandHandler.translate_debug_command` that were pinned as
    quirks by the unit tests added in #4510:
    
    | Input | Before | After |
    | --- | --- | --- |
    | `""` / `" "` | `ValueError: not enough values to unpack` |
    `ValueError("debug command cannot be empty")` |
    | `b foo.py:5` | `b my_udf:foo.py:5` (rejected by pdb) | `b foo.py:5`
    (passed through) |
    | `b my_func` | `b my_udf:my_func` (rejected by pdb — filename prefix
    requires lineno) | `b my_func` (passed through; pdb resolves the symbol)
    |
    | `b 5` with `operator_module_name = None` | `b None:5` (no such module)
    | `ValueError("executor module not initialized; cannot set breakpoint")`
    |
    
    The rule is now: **only prepend `module:` when the target is a numeric
    line in the operator's own UDF module.** Function-name and explicit
    `filename:lineno` forms — both of which pdb already accepts — pass
    through unchanged.
    
    ### Any related issues, documentation, discussions?
    
    Closes #4511. Follow-up to #4510 (which introduced the tests that
    exposed these gaps).
    
    ### How was this PR tested?
    
    - Updated the four pinning tests from #4510
    (`test_break_with_function_name_*`,
    `test_break_with_explicit_filename_*`, `test_module_name_none_*`,
    `test_empty_command_*`) to assert the new intentional behaviors.
    - Added one new test for the function-name fallback when the executor
    module isn't initialized yet.
    - Local: `python -m pytest
    core/architecture/managers/test_debug_manager.py
    core/architecture/handlers/control/test_debug_command_handler.py` — 35
    passed.
    - Local: `ruff format --check .` and `ruff check .` — clean.
    
    ### Was this PR authored or co-authored using generative AI tooling?
    
    Generated-by: Claude Code (Opus 4.7)
---
 .../handlers/control/debug_command_handler.py      | 45 +++++++++++++------
 .../handlers/control/test_debug_command_handler.py | 50 +++++++++++-----------
 2 files changed, 56 insertions(+), 39 deletions(-)

diff --git 
a/amber/src/main/python/core/architecture/handlers/control/debug_command_handler.py
 
b/amber/src/main/python/core/architecture/handlers/control/debug_command_handler.py
index e7d35a6417..cf041b47ea 100644
--- 
a/amber/src/main/python/core/architecture/handlers/control/debug_command_handler.py
+++ 
b/amber/src/main/python/core/architecture/handlers/control/debug_command_handler.py
@@ -41,25 +41,42 @@ class WorkerDebugCommandHandler(ControlHandler):
     @staticmethod
     def translate_debug_command(command: str, context: Context) -> str:
         """
-        This method cleans up, reformats, and then translates a debug command 
into
-        a command that can be understood by the debugger.
+        Cleans up and translates a debug command into one pdb can consume.
 
-        For example, it adds the UDF code context.
+        For `b`/`break` with a numeric line target, the operator's UDF module
+        name is prepended so the breakpoint lands inside the user's code:
+        ``b 5`` becomes ``b my_udf:5``.
 
-        :param command:
-        :param context:
-        :return:
+        Three forms are passed through unchanged because pdb already accepts
+        them and the module rewrite would corrupt them:
+
+        - bare ``b`` / ``break`` with no args
+        - ``b <function_name>`` (pdb resolves the symbol itself)
+        - ``b <filename>:<lineno>`` (the user already specified a file)
+
+        :raises ValueError: if the command is empty/whitespace-only, or if a
+            ``b``/``break`` with a numeric target is issued before the
+            operator module has been initialized.
         """
-        debug_command, *debug_args = command.strip().split()
-        module_name = context.executor_manager.operator_module_name
-        if debug_command in ["b", "break"] and len(debug_args) > 0:
-            # b(reak) ([filename:]lineno | function) [, condition]¶
-            translated_command = (
+        parts = command.strip().split()
+        if not parts:
+            raise ValueError("debug command cannot be empty")
+        debug_command, *debug_args = parts
+
+        is_break_with_lineno = (
+            debug_command in ("b", "break") and debug_args and 
debug_args[0].isdigit()
+        )
+        if is_break_with_lineno:
+            module_name = context.executor_manager.operator_module_name
+            if module_name is None:
+                raise ValueError(
+                    "executor module not initialized; cannot set breakpoint"
+                )
+            translated = (
                 f"{debug_command} {module_name}:{debug_args[0]} "
                 f"{' '.join(debug_args[1:])}"
             )
         else:
-            translated_command = f"{debug_command} {' '.join(debug_args)}"
+            translated = f"{debug_command} {' '.join(debug_args)}"
 
-        translated_command = translated_command.strip()
-        return translated_command
+        return translated.strip()
diff --git 
a/amber/src/main/python/core/architecture/handlers/control/test_debug_command_handler.py
 
b/amber/src/main/python/core/architecture/handlers/control/test_debug_command_handler.py
index f6e3bfa641..382d30412d 100644
--- 
a/amber/src/main/python/core/architecture/handlers/control/test_debug_command_handler.py
+++ 
b/amber/src/main/python/core/architecture/handlers/control/test_debug_command_handler.py
@@ -91,21 +91,16 @@ class TestTranslateDebugCommand:
 
     # ----- edge cases / invalid input -----
 
-    def test_empty_command_raises_value_error(self, context):
-        # `command.strip().split()` on "" returns [], so the unpacking
-        #     debug_command, *debug_args = ...
-        # raises ValueError. The handler does not guard against this — the
-        # frontend is expected to never send empty commands. Pin the current
-        # behavior so any future guard is a deliberate change.
-        with pytest.raises(ValueError):
+    def test_empty_command_raises_descriptive_error(self, context):
+        with pytest.raises(ValueError, match="cannot be empty"):
             WorkerDebugCommandHandler.translate_debug_command("", context)
 
-    def test_whitespace_only_command_raises_value_error(self, context):
-        with pytest.raises(ValueError):
+    def test_whitespace_only_command_raises_descriptive_error(self, context):
+        with pytest.raises(ValueError, match="cannot be empty"):
             WorkerDebugCommandHandler.translate_debug_command("   \t  ", 
context)
 
     def test_uppercase_break_is_not_recognized(self, context):
-        # The match list is case-sensitive: ["b", "break"]. "BREAK" / "B" fall
+        # The match list is case-sensitive: ("b", "break"). "BREAK" / "B" fall
         # through to the pass-through branch and won't get the module prefix.
         assert (
             WorkerDebugCommandHandler.translate_debug_command("BREAK 5", 
context)
@@ -115,31 +110,36 @@ class TestTranslateDebugCommand:
             WorkerDebugCommandHandler.translate_debug_command("B 5", context) 
== "B 5"
         )
 
-    def test_break_with_function_name_is_also_module_prefixed(self, context):
-        # pdb's `b` accepts either a lineno or a function name. The
-        # translation prefixes the module unconditionally; document that.
+    def test_break_with_function_name_passes_through(self, context):
+        # pdb's `b` accepts a bare function name and resolves it itself; the
+        # `module:funcname` form is invalid (pdb expects a lineno after a
+        # filename prefix). So we leave function-name args unchanged.
         assert (
             WorkerDebugCommandHandler.translate_debug_command("b my_func", 
context)
-            == "b my_udf:my_func"
+            == "b my_func"
         )
 
-    def test_break_with_explicit_filename_is_re_prefixed(self, context):
-        # If the user already typed `b foo.py:5`, the translator naively
-        # prepends the module again, yielding `b my_udf:foo.py:5`. Pin this.
+    def test_break_with_explicit_filename_passes_through(self, context):
+        # The user already typed `filename:lineno` — don't double-prefix.
         assert (
             WorkerDebugCommandHandler.translate_debug_command("b foo.py:5", 
context)
-            == "b my_udf:foo.py:5"
+            == "b foo.py:5"
         )
 
-    def test_module_name_none_is_rendered_as_string_none(self, context):
-        # If the executor hasn't been initialized yet, operator_module_name is
-        # None; the f-string interpolates it as the literal "None". The
-        # frontend isn't expected to send debug commands in this state, but
-        # if it does, this is what comes out.
+    def test_break_with_lineno_before_module_init_raises(self, context):
+        # Without an initialized executor module we cannot construct
+        # `module:lineno`, so refuse instead of emitting `b None:5`.
         context.executor_manager.operator_module_name = None
-        assert (
+        with pytest.raises(ValueError, match="executor module not 
initialized"):
             WorkerDebugCommandHandler.translate_debug_command("b 5", context)
-            == "b None:5"
+
+    def test_break_with_function_name_before_module_init_passes_through(self, 
context):
+        # Function-name and filename:lineno forms don't need the module name,
+        # so they should still work even before the executor is initialized.
+        context.executor_manager.operator_module_name = None
+        assert (
+            WorkerDebugCommandHandler.translate_debug_command("b my_func", 
context)
+            == "b my_func"
         )

(texera) branch main updated: fix(amber-python): harden Udon debug command translation against bad input (#4512)

Reply via email to