xiangfu0 opened a new pull request, #18446:
URL: https://github.com/apache/pinot/pull/18446
## Summary
Replaces the closed `PartitionFunctionFactory` enum/switch with an
annotation-based dynamic registry so plug-in partition functions no longer
require touching `pinot-segment-spi`.
- New `@PartitionFunctionType(names = {...})` annotation in `pinot-spi`.
Multiple aliases supported (e.g. `Murmur` / `Murmur2`).
- `PartitionFunctionFactory` becomes a classpath-scanning registry (regex
`.*\.partition\.function\..*`); the factory keeps its same static API, all
callers unchanged. New `init()` mirrors `FunctionRegistry.init()` and is wired
into broker / server / controller starters.
- The seven built-in impls (`Modulo`, `Murmur` / `Murmur2`, `Murmur3`,
`Fnv`, `HashCode`, `ByteArray`, `BoundedColumnValue`) move out of
`pinot-segment-spi` into `org.apache.pinot.common.partition.function.*`,
standardize on `(int numPartitions, Map<String,String> functionConfig)`
constructor, and gain the annotation.
- `PartitionFunction` interface gains a default `getPartitionIdNormalizer()`
method, and a new `PartitionIntNormalizer` enum (`POSITIVE_MODULO` / `ABS` /
`MASK`) lands in `pinot-segment-spi`. Each impl reports the closest-match
normalizer name. The framework uses this only for identity / staleness matching
between config-side and segment-side metadata; legacy impls still compute their
own modulo internally. Javadoc spells out the descriptive-only nature for
legacy functions. (This pulls in just the `getPartitionIdNormalizer` slice from
#18165 so it can land independently.)
## Why
Today, registering a custom partition function requires editing
`PartitionFunctionFactory.PartitionFunctionType` and the `switch` block in the
same SPI module. After this change, plug-ins drop a class on the classpath
under `*.partition.function.*`, implement `PartitionFunction` with the standard
ctor, and annotate `@PartitionFunctionType(names = "MyFn")` — the registry
picks it up at startup.
The `getPartitionIdNormalizer()` slice gives the framework a uniform way to
describe int-to-partition-id semantics, paving the way for the expression-mode
partition pipeline in #18165 to land cleanly without churning legacy impls
again.
## Backward compatibility
**This change is `backward-incompat`** for any external code that referenced
the moved classes:
- The seven impl classes' fully-qualified names changed:
`org.apache.pinot.segment.spi.partition.*` →
`org.apache.pinot.common.partition.function.*`.
- The legacy single-arg constructors on `ModuloPartitionFunction`,
`HashCodePartitionFunction`, and `ByteArrayPartitionFunction` are gone; all
impls now use `(int numPartitions, Map<String, String> functionConfig)` (config
may be `null`).
- `PartitionFunctionFactory.PartitionFunctionType` enum is removed (it was
an internal switch helper).
**Segment-on-disk format is unaffected.** `getName()` strings (`Modulo`,
`Murmur`, `Murmur3`, `FNV`, `HashCode`, `ByteArray`, `BoundedColumnValue`) are
unchanged, so existing segment metadata resolves through the new registry by
name.
## Plug-in path going forward
```java
package com.acme.partition.function;
@PartitionFunctionType(names = "MyFn")
public class MyPartitionFunction implements PartitionFunction {
public MyPartitionFunction(int numPartitions, @Nullable Map<String,
String> config) { ... }
// ...
}
```
Drop the jar on broker / server / controller / minion classpaths — done.
## Test plan
- [x] `./mvnw spotless:apply checkstyle:check license:format license:check
-pl
pinot-spi,pinot-segment-spi,pinot-common,pinot-broker,pinot-server,pinot-controller,pinot-segment-local`
- [x] `./mvnw -pl pinot-common -am
-Dtest=PartitionFunctionTest,PartitionFunctionFactoryTest,PartitionIntNormalizerTest
-Dsurefire.failIfNoSpecifiedTests=false test` (33 tests pass)
- [x] `./mvnw -pl pinot-segment-local -am
-Dtest=SegmentPartitionTest,ColumnMetadataTest,MutableSegmentImplDropRecordOnPartitionMismatchTest
-Dsurefire.failIfNoSpecifiedTests=false test` (18 tests pass)
- [x] `./mvnw -pl pinot-core -am -Dtest=ColumnValueSegmentPrunerTest
-Dsurefire.failIfNoSpecifiedTests=false test` (3 tests pass)
- [x] `./mvnw -pl
pinot-broker,pinot-server,pinot-controller,pinot-segment-local,pinot-core,pinot-tools
-am test-compile` (full test-compile clean)
New regression coverage:
- `PartitionFunctionFactoryTest`: registration completeness across all 7
impls; alias resolution (`Murmur` / `Murmur2`); case-insensitive lookup;
idempotent `init()`; unknown-name rejection; per-impl
`getPartitionIdNormalizer()` label.
- `PartitionIntNormalizerTest`: per-normalizer math at `Integer.MIN_VALUE` /
`Integer.MAX_VALUE` / negative / `Long.MIN_VALUE`; range invariant `[0,
numPartitions)` across all normalizers and a sweep of partition counts;
`fromConfigString` round-trip, blank / null rejection, unknown-name rejection.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]