gnodet commented on PR #22344:
URL: https://github.com/apache/camel/pull/22344#issuecomment-4168380573

   _Claude Code on behalf of Guillaume Nodet_
   
   A broader suggestion on the sanitization approach — rather than tag-name 
heuristics, consider a **location-aware** approach:
   
   ### Current approach (tag-name regex)
   Matches any XML element whose name contains "password", "token", etc. This 
leads to false positives 
(`<token-refresh-interval>300</token-refresh-interval>`) and false negatives 
(credentials in elements with non-obvious names).
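
To make both failure modes concrete, here is a minimal Java sketch. The pattern is a hypothetical stand-in for a tag-name heuristic, not the regex actually used in this PR, and `TagNameHeuristicDemo`, `maskedTags`, and the `<db.pw>` property are names invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagNameHeuristicDemo {

    // Hypothetical tag-name heuristic (NOT the PR's regex): matches any
    // element whose name contains "password" or "token".
    static final Pattern SENSITIVE_TAG = Pattern.compile(
            "<([\\w.-]*(?:password|token)[\\w.-]*)>([^<]*)</\\1>",
            Pattern.CASE_INSENSITIVE);

    /** Returns the names of all elements the heuristic would mask. */
    static List<String> maskedTags(String xml) {
        List<String> names = new ArrayList<>();
        Matcher m = SENSITIVE_TAG.matcher(xml);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        String pom = "<properties>"
                + "<token-refresh-interval>300</token-refresh-interval>"
                + "<db.pw>superSecret</db.pw>"
                + "</properties>";
        // The harmless interval is flagged; the real secret is missed.
        System.out.println(maskedTags(pom)); // prints [token-refresh-interval]
    }
}
```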
   
   ### Suggested approach (known locations + property tracing)
   
   1. **Define known sensitive POM paths** — a curated list of XML locations 
that can hold credentials in a Maven POM:
      - `//plugin/configuration/password`, `//plugin/configuration/passphrase`
      - `//scm//password`
      - Plugin-specific configs (e.g., database, deployment, JNDI plugins)
   
   2. **Trace property references** — parse the POM as XML, find `${prop.name}` 
references in those sensitive locations, resolve them back to `<properties>`, 
and mask only the property values that flow into credential fields.
   
   ### Example
   ```xml
   <properties>
       <db.password>superSecret</db.password>           <!-- MASK: referenced in sensitive location -->
       <token-refresh-interval>300</token-refresh-interval>  <!-- KEEP: not referenced in any sensitive location -->
   </properties>
   <build>
       <plugins>
           <plugin>
               <configuration>
                   <password>${db.password}</password>   <!-- known sensitive path -->
               </configuration>
           </plugin>
       </plugins>
   </build>
   ```
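
A rough Java sketch of the two steps, run against the example above. This is one possible shape under stated assumptions, not the PR's implementation: `LocationAwarePomMasker`, `parse`, `maskSensitiveProperties`, and the `***` mask are hypothetical names, the path list is only an illustrative subset, and the sample is wrapped in a `<project>` root with no XML namespace (namespace handling is left out).

```java
import java.io.StringReader;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class LocationAwarePomMasker {

    // Illustrative subset of sensitive locations; a real list would be curated.
    static final List<String> SENSITIVE_PATHS = List.of(
            "//plugin/configuration/password",
            "//plugin/configuration/passphrase",
            "//scm//password");

    // Matches ${prop.name} and captures prop.name.
    static final Pattern PROPERTY_REF = Pattern.compile("\\$\\{([^}]+)\\}");

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Reject DOCTYPE declarations, since POM content may be untrusted.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    /** Masks properties referenced from sensitive locations; returns their names. */
    static Set<String> maskSensitiveProperties(Document pom) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();

        // Step 1: collect property names referenced in known sensitive paths.
        Set<String> referenced = new LinkedHashSet<>();
        for (String path : SENSITIVE_PATHS) {
            NodeList hits = (NodeList) xpath.evaluate(path, pom, XPathConstants.NODESET);
            for (int i = 0; i < hits.getLength(); i++) {
                Matcher m = PROPERTY_REF.matcher(hits.item(i).getTextContent());
                while (m.find()) {
                    referenced.add(m.group(1));
                }
            }
        }

        // Step 2: mask only those entries under <properties>.
        Set<String> masked = new LinkedHashSet<>();
        NodeList props = (NodeList) xpath.evaluate(
                "/project/properties/*", pom, XPathConstants.NODESET);
        for (int i = 0; i < props.getLength(); i++) {
            Node prop = props.item(i);
            if (referenced.contains(prop.getNodeName())) {
                prop.setTextContent("***");
                masked.add(prop.getNodeName());
            }
        }
        return masked;
    }

    public static void main(String[] args) throws Exception {
        Document pom = parse(
                "<project><properties>"
                + "<db.password>superSecret</db.password>"
                + "<token-refresh-interval>300</token-refresh-interval>"
                + "</properties><build><plugins><plugin><configuration>"
                + "<password>${db.password}</password>"
                + "</configuration></plugin></plugins></build></project>");
        System.out.println(maskSensitiveProperties(pom)); // prints [db.password]
    }
}
```

In this sketch a `${...}` reference whose property is not defined in the same file is simply left alone, which matches the single-file scope of the suggestion.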
   
   ### Benefits
   - **Fewer false positives**: config values like `<password-policy>strict</password-policy>` are not masked unless they are referenced from a sensitive location
   - **Catches indirect secrets**: a property with an innocent name is still 
flagged if it flows into a `<password>` config
   - **No need to strip `<servers>` or `<distributionManagement>`**: `<servers>` belongs in `settings.xml` rather than a POM, and `<distributionManagement>` holds repository IDs and URLs, not credentials
   
   ### Limitations
   - Single-file parsing only (no parent POM / effective POM resolution), so a `${...}` reference whose property is defined in a parent POM cannot be traced; that's acceptable for a safety net
   - Requires maintaining a list of known sensitive paths, but that list is 
well-defined in the Maven ecosystem
   
   This would take more code but should be significantly more accurate than the current regex approach, with far fewer false positives and false negatives.

