gnodet commented on PR #22344:
URL: https://github.com/apache/camel/pull/22344#issuecomment-4168380573
_Claude Code on behalf of Guillaume Nodet_
A broader suggestion on the sanitization approach — rather than tag-name
heuristics, consider a **location-aware** approach:
### Current approach (tag-name regex)
Matches any XML element whose name contains "password", "token", etc. This
leads to false positives
(`<token-refresh-interval>300</token-refresh-interval>`) and false negatives
(credentials in elements with non-obvious names).
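
To make both failure modes concrete, here is a minimal sketch of a tag-name heuristic. The pattern and the `mask_by_tag_name` helper are hypothetical stand-ins written for illustration, not the regex used in this PR, and `deploy.key` is an invented property name.

```python
import re

# Hypothetical tag-name heuristic: mask any element whose name contains a keyword.
# Illustrative only; this is NOT the pattern from the PR.
SENSITIVE_TAG = re.compile(
    r"<([\w.-]*(?:password|token|secret)[\w.-]*)>([^<]*)</\1>",
    re.IGNORECASE,
)


def mask_by_tag_name(xml_text: str) -> str:
    """Replace the text of every element whose name matches the keyword list."""
    return SENSITIVE_TAG.sub(
        lambda m: f"<{m.group(1)}>***</{m.group(1)}>", xml_text
    )


pom = """<properties>
  <token-refresh-interval>300</token-refresh-interval>
  <deploy.key>superSecret</deploy.key>
</properties>"""

masked = mask_by_tag_name(pom)
print(masked)
```

In this sketch the harmless interval is masked because its name contains "token", while the credential stored under `deploy.key` passes through untouched because its name contains no keyword.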
### Suggested approach (known locations + property tracing)
1. **Define known sensitive POM paths**: a curated list of XML locations
   that can hold credentials in a Maven POM:
   - `//plugin/configuration/password`, `//plugin/configuration/passphrase`
   - `//scm//password`
   - Plugin-specific configs (e.g., database, deployment, JNDI plugins)
2. **Trace property references**: parse the POM as XML, find `${prop.name}`
   references in those sensitive locations, resolve them back to `<properties>`,
   and mask only the property values that flow into credential fields.
### Example
```xml
<properties>
  <db.password>superSecret</db.password> <!-- MASK: referenced in sensitive location -->
  <token-refresh-interval>300</token-refresh-interval> <!-- KEEP: not referenced in any sensitive location -->
</properties>
<build>
  <plugins>
    <plugin>
      <configuration>
        <password>${db.password}</password> <!-- known sensitive path -->
      </configuration>
    </plugin>
  </plugins>
</build>
```
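
The tracing step could be sketched roughly as follows. This is an illustration under stated assumptions, not a proposed implementation: the names `SENSITIVE_CONFIG_LEAVES`, `find_sensitive_properties`, and `mask_pom` are hypothetical, only the `//plugin/configuration/password` and `passphrase` paths are covered, and only top-level `<properties>` are masked (no profiles, no parent POM).

```python
import re
import xml.etree.ElementTree as ET

# Keep namespaced POMs unprefixed when serializing.
ET.register_namespace("", "http://maven.apache.org/POM/4.0.0")

# Assumed subset of sensitive leaves under <plugin>/<configuration>;
# a real list would be curated per plugin.
SENSITIVE_CONFIG_LEAVES = ("password", "passphrase")
PROP_REF = re.compile(r"\$\{([^}]+)\}")


def local(tag: str) -> str:
    """Strip a namespace prefix such as {http://maven.apache.org/POM/4.0.0}."""
    return tag.rsplit("}", 1)[-1]


def find_sensitive_properties(root: ET.Element) -> set:
    """Collect property names referenced from known sensitive locations."""
    names = set()
    for plugin in root.iter():
        if local(plugin.tag) != "plugin":
            continue
        for config in plugin:
            if local(config.tag) != "configuration":
                continue
            for leaf in config:
                if local(leaf.tag) in SENSITIVE_CONFIG_LEAVES and leaf.text:
                    names.update(PROP_REF.findall(leaf.text))
    return names


def mask_pom(xml_text: str) -> str:
    """Mask only the top-level properties that flow into credential fields."""
    root = ET.fromstring(xml_text)
    sensitive = find_sensitive_properties(root)
    for child in root:
        if local(child.tag) != "properties":
            continue
        for prop in child:
            if local(prop.tag) in sensitive:
                prop.text = "***"
    return ET.tostring(root, encoding="unicode")


EXAMPLE_POM = """<project>
  <properties>
    <db.password>superSecret</db.password>
    <token-refresh-interval>300</token-refresh-interval>
  </properties>
  <build>
    <plugins>
      <plugin>
        <configuration>
          <password>${db.password}</password>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>"""

masked = mask_pom(EXAMPLE_POM)
print(masked)
```

Here `db.password` is masked because it is referenced from a `<password>` element, while `token-refresh-interval` keeps its value. Note that `ElementTree` discards XML comments by default, so a real implementation that must preserve the original file layout would need a different parser or a targeted text replacement.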
### Benefits
- **No false positives**: config values like
`<password-policy>strict</password-policy>` are never masked
- **Catches indirect secrets**: a property with an innocent name is still
flagged if it flows into a `<password>` config
- **No need to strip `<servers>` or `<distributionManagement>`**: `<servers>`
belongs in `settings.xml` rather than a POM, and `<distributionManagement>`
holds repository IDs and URLs, not credentials
### Limitations
- Single-file parsing only (no parent POM / effective POM resolution), but
that's acceptable for a safety net
- Requires maintaining a list of known sensitive paths, but that list is
well-defined in the Maven ecosystem
This would be more code but significantly more accurate, and avoids the
false positive/negative issues of the current regex approach.