xiangfu0 commented on PR #18869:
URL: https://github.com/apache/pinot/pull/18869#issuecomment-4860707548

   Rebased onto latest `master` and addressed all review feedback. Summary of 
changes:
   
   ### @Jackie-Jiang
   - **`UuidUtils` javadoc → markdown (`///`)** — converted throughout 
`UuidUtils` and the extracted `UuidKey`.
   - **Private constructor next to class declaration** — moved to the top of 
`UuidUtils`.
   - **Remove per-value `null` checks** — removed all `validateNotNull(...)` 
calls and the helper; callers must pass non-null (documented on the class). 
Kept only the 16-byte length validation and the `isUuid(...)` `null → false` 
contract.
   - **`toBytes(byte[])` defensive copy** — removed; it now validates the width 
and returns the caller-owned buffer as-is.
   - **Move `UuidKey` to a dedicated class** — extracted to `UuidKey.java`.
   - **`FieldSpec.getDefaultNullValue()` copy** — removed; returns the stored 
default directly (test updated to `assertSame`).
   - **Enum ordinal-append hack** — `UUID` is now placed **right after its 
stored type `BYTES`** (not at the end), and the fragile "must stay last / 
AnyValueAggregationFunction" comment is gone. Making 
`AnyValueAggregationFunction` self-contained is left as the separate follow-up 
you suggested.
   - **`case UUID` after `case BYTES`** — reordered in 
`convert()`/`convertInternal()` (the other switches already had this order).
   - **`getDefaultNullValueString` / `getStringDefaultNullValue` "why the check 
/ why specialize UUID"** — the UUID branch is required for correctness: `byte[] 
→ hex` does not re-parse, so UUID must stringify to its canonical `8-4-4-4-12` 
form to round-trip through `getDefaultNullValue(...)`. For every non-UUID type 
`_dataType.toString(v) == getStringValue(v)`, so behavior is unchanged 
elsewhere. I folded the `instanceof String` branch into 
`getStringDefaultNullValue` so `setDefaultNullValue` is simpler.
   - **`toBytes(String)` canonical check / length check as hotspots** — kept 
the validated methods as the ingestion boundary; the hot compute path goes 
through `toUUID(long,long)` / `UuidKey.fromLongs(...)`, which skip validation 
entirely. Happy to add an explicit unchecked variant if a profiled call site 
needs one, but I left it out for now to avoid an unused API.
   
   ### @gortiz
   - **`StringFunctions.fromUUIDBytes` back-compat** — restored the 
null-on-exception behavior (try/catch → `null`) and corrected the Javadoc.
   - **Time-based helper tests** — added `UuidUtilsTest` coverage: `getVersion` 
returns 4/7 for `randomV4`/`randomV7`, `getTimestampMillis(randomV7())` is 
within `[before, after]` of `System.currentTimeMillis()`, a fixed-layout v7 
assertion, and `getTimestampMillis` throws for a v4 UUID.
   - **Description bullet** — fixed: MV UUID is **allowed** (validation is 
consistent with the rest of the stack), not rejected.
   - **Nits** — `///` javadoc done; the `TableIndexingTest` comment no longer 
names tests that only exist in later PRs.
   
   ### @Copilot
   - **`PinotDataType.STRING.toUUID` trim** — now trims (matches the other 
STRING→primitive conversions and `UUID.convert(..., STRING)`).
   - **MV UUID validation** — intentional (see description fix above); behavior 
kept.
   - **`fromUUIDBytes` / version+timestamp coverage** — covered above.
   
   CI is green except the **Binary Compatibility Check**, which flags the 
additive `DataType.UUID` enum value and the `UuidUtils` API changes — expected 
for a new logical type.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to