andrewmusselman opened a new issue, #1256:
URL: https://github.com/apache/tooling-trusted-releases/issues/1256
### Summary
`AddProjectForm` rejects display names for real, current Apache projects
that contain `-` or `_` — including `Apache Empire-db` and every `Apache mod_*`
project. The whitelist of irregular words is honoured by the per-word case
check but not by the subsequent whole-string character check.
### Affected projects
Real TLPs/podlings that cannot currently be added or renamed to their actual
display name:
- `Apache Empire-db`
- `Apache mod_jk`, `Apache mod_perl`, `Apache mod_ftp`, `Apache mod_python`,
and the rest of the `mod_*` family
### Repro
From the repo root:
```bash
uv run python -c "
from atr.shared.projects import AddProjectForm
AddProjectForm(
    csrf_token='',
    committee_key='empire-db',
    display_name='Apache Empire-db',
    key='empire-db',
)
"
```
Produces:
```
display_name
  Value error, Name must be alphanumeric and may include spaces or dots or plus signs. [type=value_error, input_value='Apache Empire-db', input_type=str]
```
Same error for `Apache mod_jk`, `Apache mod_perl`, etc.
### Where the bug lives
`atr/shared/projects.py`, in the display-name validation in
`AddProjectForm.validate_fields`. Two checks run in sequence:
1. **Per-word case check.** Each word after `Apache` must match
`PascalCase`, `camelCase`, or `^mod(_[0-9a-z]+)+$`, or be in
`allowed_irregular_words = {".NET", "C++", "Empire-db", "Lucene.NET", "for",
"jclouds"}`. ✅ Respects the whitelist.
2. **Whole-string character check.**
`display_name.replace(" ", "").replace(".", "").replace("+", "").isalnum()`
must be true. ❌ Does **not** respect the whitelist, and strips only spaces,
dots, and plus signs. Hyphens and underscores are not stripped, so
`Empire-db` and `mod_jk` fail.
So `Apache Empire-db` clears check 1 (whitelist match) and dies on check 2
(the `-` is not alphanumeric). `Apache mod_jk` clears check 1 via the
`r_mod_case` regex and dies on check 2 on the `_`.
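The failure in check 2 can be reproduced without importing ATR at all. The sketch below copies only the expression and the `mod_*` pattern quoted above; the function name `passes_whole_string_check` is made up for illustration and is not the name used in `atr/shared/projects.py`.

```python
import re

# Pattern quoted in this issue for mod_* words (check 1).
R_MOD_CASE = re.compile(r"^mod(_[0-9a-z]+)+$")


def passes_whole_string_check(display_name: str) -> bool:
    """Mirror of check 2 as quoted in this issue (hypothetical helper name)."""
    return display_name.replace(" ", "").replace(".", "").replace("+", "").isalnum()


if __name__ == "__main__":
    # Names with only spaces, dots, or plus signs survive; '-' and '_' do not.
    for name in ["Apache Lucene.NET", "Apache C++", "Apache Empire-db", "Apache mod_jk"]:
        print(f"{name!r}: {passes_whole_string_check(name)}")
```

`mod_jk` matches `R_MOD_CASE`, so the word itself is structurally fine; it is only the later whole-string pass that rejects it.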
### Suggested fixes
Two options, either works:
**(a) Strip the extra characters too.** Smaller change, preserves the "every
character must be one of these" invariant:
```python
stripped = (
    display_name.replace(" ", "").replace(".", "").replace("+", "")
    .replace("-", "").replace("_", "")
)
if not stripped.isalnum():
    raise ValueError(
        "Name must be alphanumeric and may include spaces, dots, plus signs, hyphens, or underscores."
    )
```
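For anyone who wants to try option (a) outside the form class, here is a minimal standalone sketch. The names `EXTRA_ALLOWED` and `check_display_name_chars` are hypothetical, and `str.translate` is used in place of the chained `replace` calls purely for brevity; the behaviour is intended to be the same.

```python
# Characters that are removed before the isalnum() test (option (a)).
EXTRA_ALLOWED = " .+-_"


def check_display_name_chars(display_name: str) -> None:
    """Raise ValueError unless the name, minus EXTRA_ALLOWED characters, is alphanumeric.

    Hypothetical helper for illustration; not the ATR implementation.
    """
    # Three-argument maketrans: every character in EXTRA_ALLOWED is deleted.
    stripped = display_name.translate(str.maketrans("", "", EXTRA_ALLOWED))
    # Note: isalnum() is False for an empty string, so punctuation-only names fail too.
    if not stripped.isalnum():
        raise ValueError(
            "Name must be alphanumeric and may include spaces, dots, "
            "plus signs, hyphens, or underscores."
        )
```

With this version `Apache Empire-db` and `Apache mod_jk` pass, while a name such as `Apache Foo!` is still rejected.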
**(b) Skip check 2 for words that already passed check 1.** Cleaner
conceptually — once a word is on the whitelist or matches a structural regex,
it's by definition allowed:
```python
for display_name_word in display_name_words[1:]:
    if display_name_word in allowed_irregular_words:
        continue
    if (
        r_pascal_case.match(display_name_word)
        or r_camel_case.match(display_name_word)
        or r_mod_case.match(display_name_word)
    ):
        continue
    raise ValueError("Display name words must be in PascalCase, camelCase, or mod_ case.")
# drop the .isalnum() check entirely
```
I'd lean toward (b) since check 2 is already partially redundant with check
1 — if every word individually passed a structural check, the whole string is
well-formed by construction. (a) is the safer change if you want to keep the
belt-and-braces structure.
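To sanity-check the "well-formed by construction" argument for (b), the per-word check can be run on its own. In the sketch below, the whitelist and `r_mod_case` are taken from this issue, but the PascalCase and camelCase patterns are stand-ins I made up, since the real ones are not quoted here; the function name `word_check_only` is likewise hypothetical. Like the snippet above, it skips the first word.

```python
import re

# From this issue: the mod_* pattern and the irregular-word whitelist.
r_mod_case = re.compile(r"^mod(_[0-9a-z]+)+$")
allowed_irregular_words = {".NET", "C++", "Empire-db", "Lucene.NET", "for", "jclouds"}

# ASSUMPTION: stand-ins for the real patterns in atr/shared/projects.py.
r_pascal_case = re.compile(r"^(?:[A-Z][a-z0-9]+)+$")
r_camel_case = re.compile(r"^[a-z]+(?:[A-Z][a-z0-9]+)+$")


def word_check_only(display_name: str) -> None:
    """Option (b): per-word check with no whole-string isalnum() pass."""
    for word in display_name.split(" ")[1:]:
        if word in allowed_irregular_words:
            continue
        if r_pascal_case.match(word) or r_camel_case.match(word) or r_mod_case.match(word):
            continue
        raise ValueError("Display name words must be in PascalCase, camelCase, or mod_ case.")
```

Under these assumed patterns, stray punctuation such as `Foo!` or `foo_bar` is still rejected by the per-word check alone, which is the property the whole-string check was guarding.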
Either way, please add `Apache Empire-db` and `Apache mod_jk` to the test
cases so this doesn't regress:
```python
from atr.shared.projects import AddProjectForm


def test_empire_db_is_accepted():
    # Whitelisted irregular word containing a hyphen.
    AddProjectForm(csrf_token='', committee_key='empire-db',
                   display_name='Apache Empire-db', key='empire-db')


def test_mod_jk_is_accepted():
    # mod_* word containing an underscore.
    AddProjectForm(csrf_token='', committee_key='httpd',
                   display_name='Apache mod_jk', key='httpd-mod_jk')
```
### Found while doing
#1254.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]