morningman opened a new pull request, #64096:
URL: https://github.com/apache/doris/pull/64096
### What problem does this PR solve?
Related PR: #63582 (P0 — SPI baseline), #63641 (P1 — nereids plugin-driven
routing)
Problem Summary:
This is **P2** of the catalog SPI migration and targets the
`branch-catalog-spi` feature branch (continuing P0 #63582 and P1 #63641). It
fully migrates `trino-connector` off the legacy in-tree
`fe-core/datasource/trinoconnector/` implementation and onto the connector SPI
module `fe-connector-trino`, making `trino-connector` the first connector to
complete the SPI consumption playbook that later connectors will reuse as a
template.
All five batches land together so there is no intermediate state where a
newly-created trino catalog cannot be serialized.
**Batch A — complete the SPI surface (`fe-connector-trino` only, no fe-core
changes)**
- `TrinoConnectorProvider.validateProperties`: enforce the required
`trino.connector.name` property at `CREATE CATALOG` time (ported from the
legacy `checkProperties`).
- `TrinoDorisConnector.preCreateValidation`: call `ensureInitialized()` so
plugin loading + connector-factory resolution happen at catalog creation
instead of being deferred to the first `SELECT`.
- `TrinoConnectorDorisMetadata.applyFilter` / `applyProjection`: bridge
Trino native filter/projection pushdown, reusing `TrinoPredicateConverter` to
translate a Doris `ConnectorExpression` into a Trino `TupleDomain`.
`remainingFilter` is conservatively returned as the original expression to
match legacy behavior (conjuncts are not stripped; BE re-evaluates them).
**Batch B — fe-core bridge for image compatibility**
- `GsonUtils`: atomically replace the three legacy `registerSubtype` entries
(`TrinoConnectorExternalCatalog` / `Database` / `Table`) with
`registerCompatibleSubtype` redirects onto the `PluginDrivenExternal*`
hierarchy. This must be atomic — `RuntimeTypeAdapterFactory` rejects duplicate
labels, so keeping both bindings would throw at static init. Mirrors what
ES/JDBC already did.
- `PluginDrivenExternalCatalog.gsonPostProcess`: extract a
`legacyLogTypeToCatalogType()` helper that maps `Type.TRINO_CONNECTOR` →
`"trino-connector"`; the generic `name().toLowerCase()` would otherwise produce
the wrong `"trino_connector"` (underscore) that `CatalogFactory` does not
recognize.
- `PluginDrivenExternalTable.getEngine()` / `getEngineTableTypeName()`: add
`trino-connector` branches that preserve the legacy engine-name / table-type
display across `SHOW TABLE STATUS` and `information_schema`.
**Batch C — flip the switch**
- Add `"trino-connector"` to `CatalogFactory.SPI_READY_TYPES` so catalog
creation routes through the SPI path.
**Batch D — remove legacy code**
- Drop the `instanceof TrinoConnectorExternalTable` scan branch in
`PhysicalPlanTranslator` (the `PluginDrivenExternalTable` SPI branch already
handles it).
- Drop `case "trino-connector"` in `CatalogFactory`.
- Delete `fe-core/datasource/trinoconnector/` (10 files) and the now-dead
legacy `TrinoConnectorPredicateTest`.
- Route the `TRINO_CONNECTOR` db-build case in `ExternalCatalog` to
`PluginDrivenExternalDatabase` (mirrors the migrated JDBC case).
- **Retained for image compatibility**: the
`InitCatalogLog.Type.TRINO_CONNECTOR` and
`TableType.TRINO_CONNECTOR_EXTERNAL_TABLE` enums, the GsonUtils redirects, and
the `MetastoreProperties` trino-connector entry.
**Batch E — tests + tracking docs**
- 29 JUnit 5 unit tests over the plugin-free converters:
- `TrinoPredicateConverterTest` — `ConnectorExpression` pushdown trees →
Trino `TupleDomain` (EQ / range / NE / IN / IS [NOT] NULL / AND / OR, Slice
encoding), plus graceful degradation to `TupleDomain.all()` on null/unsupported
input.
- `TrinoTypeMappingTest` — Trino SPI type → Doris `ConnectorType`
(scalars, decimal precision/scale, timestamp precision clamp, array/map/struct,
unsupported-type failure).
- `TrinoConnectorProviderTest` — `validateProperties` fast-fails when
`trino.connector.name` is missing/empty.
- No Trino plugin/cluster required; plugin-dependent paths remain covered
by the existing `external_table_p0/p2` `trino_connector` regression suites.
- Sync the migration tracking docs under `plan-doc/` (already carried on
this feature branch since P0).
**Net effect**: 28 files, +1025 / −2681 (~1656 LOC net removed). Old FE
images holding legacy trino catalogs / databases / tables deserialize onto the
`PluginDrivenExternal*` hierarchy through the GsonUtils string-name redirect,
with engine-name display preserved.
**Deferred (follow-ups, not in this PR)**:
- `trino_connector_migration_compat` regression test (old-image
deserialization) — requires a running cluster + Trino plugin + docker,
unavailable in this dev environment; tracked as a CI/cluster follow-up.
- The plugin-install documentation update lives in the `doris-website` repo
and is handled separately.
### Release note
None
### Check List (For Author)
- Test
- [x] Unit Test — 29 new tests in `fe-connector-trino` (predicate
converter / type mapping / property validation).
- [ ] Regression test — existing `trino_connector` suites cover plugin
paths; the new old-image compat regression is deferred to a CI/cluster
follow-up.
- [ ] Manual test (add detailed scripts or steps below)
- [ ] No need to test or manual test. Explain why:
- [ ] This is a refactor/code format and no logic has been changed.
- [ ] Previous test can cover this change.
- [ ] No code files have been changed.
- [ ] Other reason
- Behavior changed:
- [x] No. Internal routing moves from the legacy fe-core path to the SPI
path; image compatibility, engine-name display, and pushdown semantics all
mirror the legacy behavior. All batches land together, so there is no
serialization-gap window.
- Does this need documentation?
- [x] Yes. The trino-connector plugin-install doc update is a follow-up
in the `doris-website` repo.
### Check List (For Reviewer who merge this PR)
- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]