merlimat opened a new pull request, #25917: URL: https://github.com/apache/pulsar/pull/25917
## Motivation

PIP-475 added the V5 client SDK, which transparently routes against both regular and scalable topics. The `pulsar-client` CLI (`produce` / `consume` / `read`) still used the v4 client API and therefore could not work against scalable topics.

This PR migrates `pulsar-client` to the V5 client API so the same commands work against regular and scalable topics out of the box. It follows the same approach already merged for `pulsar-perf` (#25887).

A small set of typed V5 `Schema` factories is added because the CLI needs to build schemas from a raw definition string. They are deliberately typed: the V5 API intentionally does **not** resurrect the untyped v4 `loadConf` / `GenericSchema` plumbing.

## Modifications

**`pulsar-client-api-v5` / `pulsar-client-v5`**

- `SchemaInfo.of(name, type, schema, properties)`: build a schema descriptor from a raw definition (e.g. an Avro/JSON schema document).
- `Schema.generic(SchemaInfo)`: a generic schema for that definition.
- `Schema.autoProduceBytesOf(Schema base)`: the wrapping form of `autoProduceBytes()`; the producer sends pre-encoded bytes validated against the base schema (and the topic schema).
- These bridge to the v4 `Schema.generic` / `Schema.AUTO_PRODUCE_BYTES` implementations via `SchemaAdapter`. KeyValue schemas are intentionally **not** added yet. Covered by `SchemaFactoryTest`.

**`pulsar-client-tools`**

- `PulsarClientTool` builds the V5 client from the typed `RootParams`. There is **no V5 `loadConf`**; the `client.conf` TLS keys that have no dedicated flag (`tlsAllowInsecureConnection`, `tlsEnableHostnameVerification`, the mTLS cert/key paths) are mapped onto a typed `TlsPolicy`. TLS is enabled only for `pulsar+ssl://` URLs or `useTls=true`, so a plaintext broker is never contacted over TLS (calling `tlsPolicy()` always flips `useTls` on). Keystore TLS has no V5 equivalent and is reported as unsupported.
- `CmdProduce` → V5 `Producer`. `bytes` / `string` / `avro:<def>` / `json:<def>` value schemas via the new factories; `file://` encryption via a `PemFileKeyProvider`-backed `ProducerEncryptionPolicy`. KeyValue schemas (`--key-value-encoding-type`) are rejected with a clear message.
- `CmdConsume` → V5 `QueueConsumer` for `Shared` / `Key_Shared` and `StreamConsumer` for `Exclusive` / `Failover` (V5 has no single `SubscriptionType`). `--regex` maps to a namespace subscription over the pattern's `tenant/namespace`. `--start-timestamp` / seek removed; `auto_consume` rejected (deferred); `NonDurable` / `--pool-messages` / chunked-message knobs warn.
- `CmdRead` → V5 `CheckpointConsumer`. `--start-message-id` accepts only `latest` / `earliest`; the `<ledgerId>:<entryId>` form is rejected.
- `AbstractCmdConsume` formats V5 `byte[]` messages; a shared name-agnostic file-decryption provider mirrors v4 `defaultCryptoKeyReader(uri)` (the producer's logical key name travels in the message metadata, so a name-keyed provider would not resolve it).
- `CommanderFactory` enables case-insensitive enum parsing so the mixed-case v4 flag spellings (`Latest`, `FAIL`, ...) keep working against the uppercase V5 enums.

The WebSocket `produce` / `consume` / `read` paths are unchanged (they speak HTTP and are not part of the binary-only V5 client).

### Behavior changes / deferred

CLI flags with no V5 equivalent now warn or error clearly: KeyValue produce, `--subscription-mode NonDurable`, `--pool-messages`, chunked-message knobs, `--start-message-id-inclusive`, the `<ledgerId>:<entryId>` read form, `--start-timestamp` (consume seek), and `-st auto_consume` (deferred to a follow-up). Schema-aware (`auto_consume`) message formatting and KeyValue schemas are the main deferred features.

## Verifying this change

- New unit tests: `SchemaFactoryTest` (V5 schema factories), updated `TestCmdProduce` / `TestCmdRead`.
- Existing `pulsar-client-tools` test suite passes.
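
For reviewers, three of the CLI mapping rules listed under Modifications can be sketched in plain Java: when TLS is enabled, which V5 consumer flavor a v4 subscription type selects, and which `--start-message-id` values `read` accepts. This is only an illustrative sketch, not code from this PR. The class `CliMappingSketch`, its enums, and its method names are hypothetical, it does not call the V5 client API, and the case-insensitive matching is an assumption in the spirit of the `CommanderFactory` change.

```java
import java.util.Locale;

/** Hypothetical helper (not part of this PR) mirroring the mapping rules in the description. */
final class CliMappingSketch {

    /** Stand-ins for the V5 QueueConsumer / StreamConsumer choice. */
    enum ConsumerKind { QUEUE, STREAM }

    /** The only start positions the migrated read command accepts. */
    enum StartPosition { LATEST, EARLIEST }

    /** TLS is enabled only for pulsar+ssl:// service URLs or an explicit useTls=true. */
    static boolean shouldEnableTls(String serviceUrl, boolean useTls) {
        return useTls || serviceUrl.startsWith("pulsar+ssl://");
    }

    /** Shared / Key_Shared select a queue consumer; Exclusive / Failover select a stream consumer. */
    static ConsumerKind consumerKindFor(String subscriptionType) {
        // Assumption: flag values are matched case-insensitively.
        switch (subscriptionType.toLowerCase(Locale.ROOT)) {
            case "shared":
            case "key_shared":
                return ConsumerKind.QUEUE;
            case "exclusive":
            case "failover":
                return ConsumerKind.STREAM;
            default:
                throw new IllegalArgumentException(
                        "Unknown subscription type: " + subscriptionType);
        }
    }

    /** Accepts only latest / earliest; anything else, including ledgerId:entryId, is rejected. */
    static StartPosition parseStartPosition(String startMessageId) {
        switch (startMessageId.toLowerCase(Locale.ROOT)) {
            case "latest":
                return StartPosition.LATEST;
            case "earliest":
                return StartPosition.EARLIEST;
            default:
                throw new IllegalArgumentException(
                        "Unsupported --start-message-id: " + startMessageId);
        }
    }

    public static void main(String[] args) {
        System.out.println(shouldEnableTls("pulsar://localhost:6650", false));
        System.out.println(consumerKindFor("Key_Shared"));
        System.out.println(parseStartPosition("earliest"));
    }
}
```

Keeping the decisions as small pure functions like these makes each rule testable without a broker; the actual implementation in `PulsarClientTool`, `CmdConsume`, and `CmdRead` may be structured differently.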
## Does this pull request potentially affect one of the following parts:

- The public API: the `pulsar-client` CLI surface; flags are preserved, and the behavior changes above are logged as warnings/errors.
- The schema: no
- The default values of configurations: no
- The wire protocol: no
- The rest endpoints: no
- The admin cli options: no
- Anything that affects deployment: no
