codeant-ai-for-open-source[bot] commented on code in PR #38077:
URL: https://github.com/apache/superset/pull/38077#discussion_r2824273998
##########
tests/unit_tests/utils/json_tests.py:
##########
@@ -264,6 +266,119 @@ def test_json_int_dttm_ser():
json.json_int_dttm_ser(np.datetime64())
[email protected](
+ "payload,path_values,expected_result",
+ [
+ (
+ {
+ "credentials_info": {
+ "type": "service_account",
+ "private_key": "XXXXXXXXXX",
+ },
+ },
+ {"$.credentials_info.private_key": "NEW_KEY"},
+ {
+ "credentials_info": {
+ "type": "service_account",
+ "private_key": "NEW_KEY",
+ },
+ },
+ ),
+ (
+ {
+ "auth_params": {
+ "privatekey_body": "XXXXXXXXXX",
+ "privatekey_pass": "XXXXXXXXXX",
+ },
+ "other": "value",
+ },
+ {
+ "$.auth_params.privatekey_body": "-----BEGIN PRIVATE KEY-----",
+ "$.auth_params.privatekey_pass": "passphrase",
+ },
+ {
+ "auth_params": {
+ "privatekey_body": "-----BEGIN PRIVATE KEY-----",
+ "privatekey_pass": "passphrase",
+ },
+ "other": "value",
+ },
+ ),
+ (
+ {"existing": "value"},
+ {"$.nonexistent.path": "new_value"},
+ {"existing": "value"},
+ ),
+ ],
+)
+def test_set_masked_fields(
+ payload: dict[str, Any],
+ path_values: dict[str, Any],
+ expected_result: dict[str, Any],
+) -> None:
+ """
+ Test setting a value at a JSONPath location.
+ """
+ result = json.set_masked_fields(payload, path_values)
+ assert result == expected_result
+
+
[email protected](
+ "payload,sensitive_fields,expected_result",
+ [
+ (
+ {
+ "credentials_info": {
+ "type": "service_account",
+ "private_key": PASSWORD_MASK,
+ },
+ },
+ {"$.credentials_info.private_key", "$.credentials_info.type"},
+ ["$.credentials_info.private_key"],
+ ),
+ (
+ {
+ "credentials_info": {
+ "private_key": "ACTUAL_KEY",
+ },
+ },
+ {"$.credentials_info.private_key"},
+ [],
+ ),
+ (
+ {
+ "auth_params": {
+ "privatekey_body": PASSWORD_MASK,
+ "privatekey_pass": "actual_pass",
+ },
+ "oauth2_client_info": {
+ "secret": PASSWORD_MASK,
+ },
+ },
+ {
+ "$.auth_params.privatekey_body",
+ "$.auth_params.privatekey_pass",
+ "$.oauth2_client_info.secret",
+ },
+ [
+ "$.auth_params.privatekey_body",
+ "$.oauth2_client_info.secret",
+ ],
+ ),
+ ],
+)
+def test_get_masked_fields(
+ payload: dict[str, Any],
+ sensitive_fields: set[str],
+ expected_result: dict[str, Any],
+) -> None:
+ """
+ Test that get_masked_fields returns paths where value equals PASSWORD_MASK.
+ """
+ masked = json.get_masked_fields(payload, sensitive_fields)
+ assert masked == sorted(expected_result)
Review Comment:
**Suggestion:** The assertion in `test_get_masked_fields` compares the list
returned by `get_masked_fields` directly against a sorted `expected_result`,
but since `get_masked_fields` iterates over a `set` of sensitive fields, the
order of items in `masked` is non-deterministic, which can cause this test to
fail intermittently depending on set iteration order; both sides should be
order-independent in the comparison. [logic error]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ⚠️ Flaky unit test `test_get_masked_fields` in json_tests.
- ⚠️ CI runs may intermittently fail on unchanged logic.
- ⚠️ Reduces trust in tests around sensitive-field masking.
```
</details>
```suggestion
assert sorted(masked) == sorted(expected_result)
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Note implementation of `get_masked_fields` in
`superset/utils/json.py:307-325`, which
iterates over `sensitive_fields: set[str]` and appends matching JSONPath
strings to a
Python list `masked` in the order yielded by the set.
2. In `tests/unit_tests/utils/json_tests.py:326-379`, observe
`test_get_masked_fields`
parametrization where `sensitive_fields` is passed as a `set[str]` (see
lines 336, 345,
358-362) and `expected_result` is a list of JSONPath strings (lines 337,
346, 363-366).
3. At `tests/unit_tests/utils/json_tests.py:378-379` the test calls `masked =
json.get_masked_fields(payload, sensitive_fields)` and asserts `masked ==
sorted(expected_result)`, i.e. it compares an unsorted list produced in
set-iteration
order against a lexicographically sorted list.
4. Run `pytest tests/unit_tests/utils/json_tests.py::test_get_masked_fields`
multiple
times with varying `PYTHONHASHSEED` (set by the test runner or environment);
because
Python set iteration order is not guaranteed, the order of items in `masked`
can differ
from `sorted(expected_result)` (for example `["$.oauth2_client_info.secret",
"$.auth_params.privatekey_body"]` vs `["$.auth_params.privatekey_body",
"$.oauth2_client_info.secret"]`), causing intermittent assertion failures
even though the
underlying `get_masked_fields` behavior is correct.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** tests/unit_tests/utils/json_tests.py
**Line:** 379:379
**Comment:**
*Logic Error: The assertion in `test_get_masked_fields` compares the
list returned by `get_masked_fields` directly against a sorted
`expected_result`, but since `get_masked_fields` iterates over a `set` of
sensitive fields, the order of items in `masked` is non-deterministic, which
can cause this test to fail intermittently depending on set iteration order;
both sides should be order-independent in the comparison.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38077&comment_hash=e01ff3a92833de2458a2a2c1879b6377e112e696ffa80c9b57c296348ed23895&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38077&comment_hash=e01ff3a92833de2458a2a2c1879b6377e112e696ffa80c9b57c296348ed23895&reaction=dislike'>👎</a>
##########
superset/databases/api.py:
##########
@@ -1591,6 +1591,14 @@ def import_(self) -> Response:
the private_key should be provided in the following
format:
`{"databases/MyDatabase.yaml":
"my_private_key_password"}`.
type: string
+ encrypted_extra_secrets:
+ description: >-
+ JSON map of sensitive values for
masked_encrypted_extra fields.
+ Keys are file paths (e.g., "databases/db.yaml") and
values
+ are JSON objects with the restricted fields
+ (e.g., `{"databases/MyDatabase.yaml":
+ {"credentials_info": {"private_key": "actual_key"}}}`).
Review Comment:
**Suggestion:** The OpenAPI documentation for `encrypted_extra_secrets`
describes the payload as a nested JSON object keyed by field names (e.g.
`{"credentials_info": {"private_key": "actual_key"}}`), but the backend
`set_masked_fields` helper expects a map of JSONPath strings to values (e.g.
`"$.credentials_info.private_key": "actual_key"`); clients following the
current docs will send a structure that doesn't match what the backend expects,
so their secrets may not be applied correctly during import. [logic error]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ⚠️ Database import endpoint `/import/` misdocuments secret payload
structure.
- ⚠️ Importers following docs fail to update masked_encrypted_extra secrets.
- ⚠️ Imported database connections may use stale or missing credentials.
```
</details>
```suggestion
Keys are file paths (e.g.,
"databases/MyDatabase.yaml") and values
are JSON objects mapping JSONPath expressions to
secret values
(e.g., `{"databases/MyDatabase.yaml":
{"$.credentials_info.private_key": "actual_key"}}`).
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Call the database import endpoint handled by `DatabaseRestApi.import_` in
`superset/databases/api.py:1535-1669` (exposed at `@expose("/import/",
methods=("POST",))`) with a multipart/form-data request.
2. In the form data, include a ZIP file under `formData` containing a
database YAML config
at `databases/MyDatabase.yaml` that has a `masked_encrypted_extra` field
(this is later
read and processed in `superset/commands/importers/v1/utils.load_configs` at
lines
196-211).
3. Following the current OpenAPI docs in
`superset/databases/api.py:1555-1601`, set the
`encrypted_extra_secrets` form field to a JSON string like:
`{"databases/MyDatabase.yaml": {"credentials_info": {"private_key":
"actual_key"}}}`
(i.e. nested by field names, not JSONPath).
4. On the server, `DatabaseRestApi.import_` parses this via
`json.loads(request.form["encrypted_extra_secrets"])` at
`superset/databases/api.py:1653-1655` and passes it to
`ImportDatabasesCommand(...,
encrypted_extra_secrets=encrypted_extra_secrets)` at line 1659, which in
turn calls
`load_configs(..., encrypted_extra_secrets=...)` in
`superset/commands/importers/v1/utils.py:105-116`.
5. Inside `load_configs`, at `utils.py:196-207`, the code assumes
`encrypted_extra_secrets[file_name]` is a mapping from JSONPath to scalar
values (see the
comment and example `{"$.oauth2_client_info.secret": "actual_value"}`), and
passes it to
`json.set_masked_fields` in `superset/utils/json.py:328-346`, which
explicitly expects
`path_values: dict[str, Any]` where keys are JSONPath expressions.
6. Because the structure sent by the client (per current docs) is nested
JSON keyed by
field names (`"credentials_info": {"private_key": ...}`) instead of JSONPath
strings
(`"$.credentials_info.private_key": "..."`), `set_masked_fields` operates on
unexpected
keys/values, so the intended secret locations in `masked_encrypted_extra`
are not updated
correctly. The import completes, but the masked encrypted fields either
remain masked or
are misapplied, contrary to what the OpenAPI documentation promises.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/databases/api.py
**Line:** 1597:1600
**Comment:**
*Logic Error: The OpenAPI documentation for `encrypted_extra_secrets`
describes the payload as a nested JSON object keyed by field names (e.g.
`{"credentials_info": {"private_key": "actual_key"}}`), but the backend
`set_masked_fields` helper expects a map of JSONPath strings to values (e.g.
`"$.credentials_info.private_key": "actual_key"`); clients following the
current docs will send a structure that doesn't match what the backend expects,
so their secrets may not be applied correctly during import.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
<a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38077&comment_hash=bbaf6db3fe187981d36ed1ecaf2168ae6856570382d9906415671008a0b9dd28&reaction=like'>👍</a>
| <a
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38077&comment_hash=bbaf6db3fe187981d36ed1ecaf2168ae6856570382d9906415671008a0b9dd28&reaction=dislike'>👎</a>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]