michael-s-molina commented on code in PR #38167: URL: https://github.com/apache/superset/pull/38167#discussion_r2840655549
##########
docs/developer_portal/extensions/quick-start.md:
##########
@@ -54,24 +54,33 @@ superset-extensions init
 The CLI will prompt you for information:

 ```
-Extension ID (unique identifier, alphanumeric only): hello_world
-Extension name (human-readable display name): Hello World
+Extension name (e.g. Hello World): Hello World
+Extension ID [hello-world]: hello-world
 Initial version [0.1.0]: 0.1.0
 License [Apache-2.0]: Apache-2.0
 Include frontend? [Y/n]: Y
 Include backend? [Y/n]: Y
 ```

+**Important**: The extension ID must be **globally unique** across all Superset extensions and serves as the basis for all technical identifiers:

Review Comment:
@villebro I had a discussion with AI and here's the summary:

## Naming Convention: Flat IDs May Not Scale for a Multi-Org Marketplace

This PR makes a solid improvement by enforcing consistent kebab-case IDs and deriving all technical names from a single source of truth. The validation logic and `ExtensionNames` structure are well-designed.

One concern worth raising before this convention is locked in: **flat kebab-case IDs do not provide collision-safe namespacing for an open marketplace where multiple organizations publish extensions independently.**

---

### The Problem

The current model places the entire collision-avoidance burden on the author: pick a unique name like `sql-visualizer` and hope no other organization has the same idea.
This works for a small, centrally curated registry but breaks at marketplace scale:

- Two organizations independently building a `data-quality` or `sql-visualizer` extension will collide
- There is no way to attribute or verify authorship from the ID alone
- Name squatting becomes possible once the marketplace is open

---

### How Other Extension Ecosystems Handle This

Every major extension ecosystem solves this with namespaced IDs:

| Ecosystem | Format | Example |
|---|---|---|
| VSCode | `publisher.extension` | `airbnb.sql-visualizer` |
| npm | `@scope/package` | `@airbnb/sql-visualizer` |
| IntelliJ / OSGi | `com.org.extension` | `com.airbnb.sql-visualizer` |
| Maven / Gradle | `group:artifact` | `com.airbnb:sql-visualizer` |

---

### Recommendation: VSCode-style `publisher.extension-name`

The two-part dot-separated model (`airbnb.sql-visualizer`) is the best fit here:

- **One org segment is enough.** Full reverse-domain (`com.airbnb.extensions.sql.visualizer`) is Java-era verbosity. Modern ecosystems use one org identifier plus one extension name.
- **Maps cleanly to Python namespaces.** `airbnb.sql-visualizer` → `superset_extensions.airbnb.sql_visualizer`, which is a valid two-level namespace package supported natively by Python.
- **Simpler than npm scopes.** `@airbnb/sql-visualizer` is semantically equivalent but `@` and `/` require escaping in file paths, CLI args, and URLs.
- **Globally unique by construction.** Owning `airbnb.*` means no other org can collide with your extensions without going through a registry-level conflict.
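
As a rough illustration of the two-part convention, the sketch below validates a `publisher.extension-name` ID and derives the Python package and npm name from it. It is not the code in this PR: `NAMESPACED_ID_PATTERN`, `NamespacedNames`, and `derive_names` are hypothetical names, and the segment rule (lowercase kebab-case starting with a letter) is an assumption about what the registry would accept.

```python
import re
from typing import TypedDict

# Assumption: each segment is lowercase kebab-case and starts with a letter.
# The actual EXTENSION_NAME_PATTERN in the PR may be stricter or looser.
_SEGMENT = r"[a-z][a-z0-9]*(?:-[a-z0-9]+)*"
NAMESPACED_ID_PATTERN = re.compile(rf"({_SEGMENT})\.({_SEGMENT})")


class NamespacedNames(TypedDict):
    """Hypothetical counterpart of ExtensionNames with an extra org field."""

    id: str
    org: str
    name: str
    python_package: str
    npm_name: str
    module_federation: str


def derive_names(extension_id: str) -> NamespacedNames:
    """Split '<org>.<name>' and derive the technical identifiers from it."""
    match = NAMESPACED_ID_PATTERN.fullmatch(extension_id)
    if match is None:
        raise ValueError(
            f"Invalid extension ID {extension_id!r}: expected '<org>.<name>' "
            "with lowercase kebab-case segments"
        )
    org, name = match.groups()

    # Python identifiers cannot contain hyphens, so map them to underscores.
    org_snake = org.replace("-", "_")
    name_snake = name.replace("-", "_")

    # camelCase across both segments, e.g. airbnb + sql-visualizer.
    words = f"{org}-{name}".split("-")
    camel = words[0] + "".join(word.capitalize() for word in words[1:])

    return {
        "id": extension_id,
        "org": org,
        "name": name,
        "python_package": f"superset_extensions.{org_snake}.{name_snake}",
        "npm_name": f"@{org}/{name}",
        "module_federation": camel,
    }
```

Calling `derive_names("airbnb.sql-visualizer")` yields `superset_extensions.airbnb.sql_visualizer` as the Python package and `@airbnb/sql-visualizer` as the npm name, while a flat ID such as `sql-visualizer` raises `ValueError`. Validating each segment with the same rule keeps the org and extension parts independently checkable.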
#### Generated name variants for `airbnb.sql-visualizer`

| Variant | Value |
|---|---|
| Display name | `SQL Visualizer` |
| ID | `airbnb.sql-visualizer` |
| Directory | `airbnb.sql-visualizer/` |
| Python package | `superset_extensions.airbnb.sql_visualizer` |
| npm name | `@airbnb/sql-visualizer` |
| Module Federation | `airbnbSqlVisualizer` |

---

### What Would Need to Change

The architecture introduced in this PR is well-positioned to support this. The main changes would be:

1. **`EXTENSION_NAME_PATTERN`**: allow one dot separating org and name segments
2. **`generate_extension_names()`**: split on the dot to derive `org` and `name` components separately
3. **`validate_extension_id()`**: enforce `<org>.<name>` format, validate each segment independently
4. **Backend scaffold**: create a two-level namespace: `superset_extensions/<org>/<name>/`
5. **`extension.json` template**: update `files` glob and `entryPoints` to reflect the deeper path
6. **CLI prompts**: add an `--org` option or prompt, or accept the full dot-separated ID directly

The `ExtensionNames` TypedDict would grow an `org` field, but all downstream consumers already read from it by key, so the blast radius is contained.

---

If the plan is to keep extensions internal or fully curated for now, the flat model is fine as a starting point. But if there is any intention to open a marketplace, it is worth establishing the two-part convention now before extension IDs appear in published `.supx` files, `extension.json` configs, and Python package names that become hard to migrate.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
