xiangfu0 opened a new pull request, #18446:
URL: https://github.com/apache/pinot/pull/18446

   ## Summary
   
   Replaces the closed `PartitionFunctionFactory` enum/switch with an 
annotation-based dynamic registry so plug-in partition functions no longer 
require touching `pinot-segment-spi`.
   
   - New `@PartitionFunctionType(names = {...})` annotation in `pinot-spi`. 
Multiple aliases supported (e.g. `Murmur` / `Murmur2`).
   - `PartitionFunctionFactory` becomes a classpath-scanning registry (regex 
`.*\.partition\.function\..*`); the factory keeps its same static API, all 
callers unchanged. New `init()` mirrors `FunctionRegistry.init()` and is wired 
into broker / server / controller starters.
   - The seven built-in impls (`Modulo`, `Murmur` / `Murmur2`, `Murmur3`, 
`Fnv`, `HashCode`, `ByteArray`, `BoundedColumnValue`) move out of 
`pinot-segment-spi` into `org.apache.pinot.common.partition.function.*`, 
standardize on `(int numPartitions, Map<String,String> functionConfig)` 
constructor, and gain the annotation.
   - `PartitionFunction` interface gains a default `getPartitionIdNormalizer()` 
method, and a new `PartitionIntNormalizer` enum (`POSITIVE_MODULO` / `ABS` / 
`MASK`) lands in `pinot-segment-spi`. Each impl reports the closest-match 
normalizer name. The framework uses this only for identity / staleness matching 
between config-side and segment-side metadata; legacy impls still compute their 
own modulo internally. Javadoc spells out the descriptive-only nature for 
legacy functions. (This pulls in just the `getPartitionIdNormalizer` slice from 
#18165 so it can land independently.)
   
   ## Why
   
   Today, registering a custom partition function requires editing 
`PartitionFunctionFactory.PartitionFunctionType` and the `switch` block in the 
same SPI module. After this change, plug-ins drop a class on the classpath 
under `*.partition.function.*`, implement `PartitionFunction` with the standard 
ctor, and annotate `@PartitionFunctionType(names = "MyFn")` — the registry 
picks it up at startup.
   
   The `getPartitionIdNormalizer()` slice gives the framework a uniform way to 
describe int-to-partition-id semantics, paving the way for the expression-mode 
partition pipeline in #18165 to land cleanly without churning legacy impls 
again.
   
   ## Backward compatibility
   
   **This change is `backward-incompat`** for any external code that referenced 
the moved classes:
   
   - The seven impl classes' fully-qualified names changed: 
`org.apache.pinot.segment.spi.partition.*` → 
`org.apache.pinot.common.partition.function.*`.
   - The legacy single-arg constructors on `ModuloPartitionFunction`, 
`HashCodePartitionFunction`, and `ByteArrayPartitionFunction` are gone; all 
impls now use `(int numPartitions, Map<String, String> functionConfig)` (config 
may be `null`).
   - `PartitionFunctionFactory.PartitionFunctionType` enum is removed (it was 
an internal switch helper).
   
   **Segment-on-disk format is unaffected.** `getName()` strings (`Modulo`, 
`Murmur`, `Murmur3`, `FNV`, `HashCode`, `ByteArray`, `BoundedColumnValue`) are 
unchanged, so existing segment metadata resolves through the new registry by 
name.
   
   ## Plug-in path going forward
   
   ```java
   package com.acme.partition.function;
   
   @PartitionFunctionType(names = "MyFn")
   public class MyPartitionFunction implements PartitionFunction {
     public MyPartitionFunction(int numPartitions, @Nullable Map<String, 
String> config) { ... }
     // ...
   }
   ```
   
   Drop the jar on broker / server / controller / minion classpaths — done.
   
   ## Test plan
   
   - [x] `./mvnw spotless:apply checkstyle:check license:format license:check 
-pl 
pinot-spi,pinot-segment-spi,pinot-common,pinot-broker,pinot-server,pinot-controller,pinot-segment-local`
   - [x] `./mvnw -pl pinot-common -am 
-Dtest=PartitionFunctionTest,PartitionFunctionFactoryTest,PartitionIntNormalizerTest
 -Dsurefire.failIfNoSpecifiedTests=false test` (33 tests pass)
   - [x] `./mvnw -pl pinot-segment-local -am 
-Dtest=SegmentPartitionTest,ColumnMetadataTest,MutableSegmentImplDropRecordOnPartitionMismatchTest
 -Dsurefire.failIfNoSpecifiedTests=false test` (18 tests pass)
   - [x] `./mvnw -pl pinot-core -am -Dtest=ColumnValueSegmentPrunerTest 
-Dsurefire.failIfNoSpecifiedTests=false test` (3 tests pass)
   - [x] `./mvnw -pl 
pinot-broker,pinot-server,pinot-controller,pinot-segment-local,pinot-core,pinot-tools
 -am test-compile` (full test-compile clean)
   
   New regression coverage:
   - `PartitionFunctionFactoryTest`: registration completeness across all 7 
impls; alias resolution (`Murmur` / `Murmur2`); case-insensitive lookup; 
idempotent `init()`; unknown-name rejection; per-impl 
`getPartitionIdNormalizer()` label.
   - `PartitionIntNormalizerTest`: per-normalizer math at `Integer.MIN_VALUE` / 
`Integer.MAX_VALUE` / negative / `Long.MIN_VALUE`; range invariant `[0, 
numPartitions)` across all normalizers and a sweep of partition counts; 
`fromConfigString` round-trip, blank / null rejection, unknown-name rejection.
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to