preetiughrejiyanextgen-bit opened a new issue, #18505:
URL: https://github.com/apache/pinot/issues/18505
I want to upsert rows in my OFFLINE table based on the id column, with
comparisonColumn set to created_timestamp and routing set to
strictReplicaGroup. This config works fine with a single server, but when I
configure it for multiple servers, it does not upsert rows by id. Instead it
appends new rows, which results in multiple rows for the same id.
As far as I can tell, because the rows are not partitioned by id, each segment
is assigned to an arbitrary server, say server1. When I ingest a file with the
same ids again, the new segment may be assigned to a different server, say
server0. That server has no upsert metadata for those ids, so it stores the
segment as is, which results in multiple rows for the same id.
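The effect described above can be illustrated with a toy model. This is only a sketch of the reasoning, not Pinot code: `load_segments`, `visible_rows`, and the segment names are hypothetical, and the model assumes each server resolves upserts only among the segments it hosts.

```python
def load_segments(segments, num_servers, assignment):
    """Toy model: each server keeps its own primary-key map.

    `assignment` maps a segment name to the index of the server hosting it.
    A server only sees keys from its own segments, so it cannot invalidate
    an older row that lives on a different server.
    """
    servers = [dict() for _ in range(num_servers)]
    for segment in segments:
        key_map = servers[assignment[segment["name"]]]
        for row in segment["rows"]:
            current = key_map.get(row["id"])
            # Keep the row with the larger comparison column value.
            if current is None or row["created_timestamp"] >= current["created_timestamp"]:
                key_map[row["id"]] = row
    return servers


def visible_rows(servers):
    """Rows a query would see when it fans out to every server."""
    return [row for key_map in servers for row in key_map.values()]


segments = [
    {"name": "seg_a", "rows": [{"id": 1, "created_timestamp": 100}]},
    {"name": "seg_b", "rows": [{"id": 1, "created_timestamp": 200}]},
]

# Same id on two different servers: both rows stay visible.
split = load_segments(segments, 2, {"seg_a": 1, "seg_b": 0})
# Same id on one server: only the newest row survives.
colocated = load_segments(segments, 2, {"seg_a": 1, "seg_b": 1})

print(len(visible_rows(split)), len(visible_rows(colocated)))
```

In this model the duplicate disappears only when every segment containing a given id lands on the same server, which is what partitioning on the primary key is meant to guarantee.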
So I added partitioning on the id column with the Murmur function and
numPartitions = 3. With this config the SegmentGenerationAndPushTask completes,
but the segment stays in BAD state with an error like:
{
"segmentName":
"RAW_WITH_PARTITION1_CLIENT_UPSERT_OFFLINE_2026-04-16_2026-04-26_0_34624b1d-c101-41aa-a374-7b11052e1abf",
"serverState": {
"Server_pinot-server-2.pinot-server-headless.cp-pinot.svc.cluster.local_8098": {
"idealState": "ONLINE",
"externalView": "ERROR",
"segmentSize": "0 bytes",
"consumerInfo": null,
"errorInfo": {
"timestamp": "2026-05-15 08:29:16 GMT",
"errorMessage": "Caught exception while adding ONLINE segment",
"stackTrace": "java.lang.NullPointerException: Failed to get
partition id for segment:
RAW_WITH_PARTITION1_CLIENT_UPSERT_OFFLINE_2026-04-16_2026-04-26_0_34624b1d-c101-41aa-a374-7b11052e1abf
(upsert-enabled table: RAW_WITH_PARTITION1_CLIENT_UPSERT_OFFLINE). Segment
must follow a naming convention that encodes partition id (e.g. LLCSegmentName,
UploadedRealtimeSegmentName), or have partition metadata configured via
SegmentPartitionConfig ...
The metadata of this segment is below:
{
"segment.start.time": "1776297600000",
"segment.time.unit": "MILLISECONDS",
"segment.size.in.bytes": "188134",
"segment.end.time": "1777161600000",
"segment.total.docs": "5000",
"segment.creation.time": "1778832621470",
"segment.push.time": "1778832621499",
"segment.end.time.raw": "1777161600000",
"segment.start.time.raw": "1776297600000",
"segment.index.version": "v3",
"segment.data.crc": "334204552",
"custom.map":
"{\"input.data.file.uri\":\"s3://pinot-dch-test/tims/with_primary_key/cdr_2026-04-26.csv\"}",
"segment.crc": "2602756192",
"segment.partition.metadata":
"{\"columnPartitionMap\":{\"client_pmn\":{\"partitions\":[0,1,2],\"functionName\":\"Murmur\",\"numPartitions\":3,\"functionConfig\":null}}}",
"segment.download.url":
"http://pinot-controller-0.pinot-controller-headless.cp-pinot.svc.cluster.local:9000/segments/RAW_WITH_PARTITION1_CLIENT_UPSERT/RAW_WITH_PARTITION1_CLIENT_UPSERT_OFFLINE_2026-04-16_2026-04-26_0_34624b1d-c101-41aa-a374-7b11052e1abf"
}
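As a quick diagnostic, the partition list can be read straight out of the `segment.partition.metadata` field shown above. The sketch below is a hypothetical helper (the function names are not part of Pinot) that parses that field and reports whether a segment covers more than one partition of a column. That condition would be consistent with the server being unable to derive a single partition id for the segment.

```python
import json


def partitions_by_column(segment_metadata):
    """Return {column: [partition ids]} parsed from segment metadata.

    `segment_metadata` is the flat key/value map shown above, where
    "segment.partition.metadata" holds a JSON-encoded string.
    """
    raw = segment_metadata.get("segment.partition.metadata")
    if not raw:
        return {}
    column_map = json.loads(raw)["columnPartitionMap"]
    return {column: info["partitions"] for column, info in column_map.items()}


def spans_multiple_partitions(segment_metadata, column):
    """True if the segment holds rows from more than one partition of `column`."""
    return len(partitions_by_column(segment_metadata).get(column, [])) > 1


# Value copied from the segment metadata in this report.
metadata = {
    "segment.partition.metadata": (
        '{"columnPartitionMap":{"client_pmn":{"partitions":[0,1,2],'
        '"functionName":"Murmur","numPartitions":3,"functionConfig":null}}}'
    )
}

print(partitions_by_column(metadata))
print(spans_multiple_partitions(metadata, "client_pmn"))
```

For the segment in this report the helper returns all three partitions for client_pmn, so the segment is not tied to a single partition.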
**Pinot OFFLINE Upsert Partitioning Issue**
Using Apache Pinot 1.5.0 with batch ingestion via
SegmentGenerationAndPushTask.
I observed different behavior between:
- Normal OFFLINE partitioned table
- OFFLINE upsert-enabled partitioned table
**Case 1 — Normal OFFLINE Partitioned Table (Without Upsert) (Works)**
Configuration:
- partition column = client_pmn
- numPartitions = 3
Generated segment metadata:
"partitions":[0,1,2]
Segment loads successfully and table works fine.
This indicates:
- multi-partition segments are valid for normal OFFLINE tables
- partition metadata generation works correctly
**Case 2 — OFFLINE Upsert Table (Fails)**
Configuration:
- same partitioning setup
- added:
  - upsertConfig
  - strictReplicaGroup
Generated segment metadata:
"partitions":[0,1]
Segment fails to load with:
- Failed to get partition id for segment
- Caught exception while adding ONLINE segment
**My understanding is:**
- normal OFFLINE tables allow segments spanning multiple partitions
- OFFLINE upsert may require one segment == one partition for deterministic
  partition ownership with strictReplicaGroup
**Additional Observation**
prePartition=true appears to add partition metadata but does NOT physically
isolate rows into partition-specific segments during
SegmentGenerationAndPushTask.
**Question**
Is this expected behavior/limitation for OFFLINE upsert tables?
If yes, is external preprocessing (e.g. Spark repartition by
hash(primaryKey)%N) currently the recommended approach for scalable OFFLINE
upsert batch ingestion?
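For reference, the external preprocessing mentioned in the question could look roughly like the sketch below. It splits input rows into one file per partition so that each generated segment would contain a single partition. All names here are hypothetical, and `standin_partition` uses CRC32 only as a placeholder: it is NOT Pinot's Murmur partition function. A real pipeline would have to plug in a function that produces exactly the same partition ids as the function configured in the table's segmentPartitionConfig.

```python
import csv
import zlib
from collections import defaultdict


def standin_partition(key, num_partitions):
    """Placeholder hash (CRC32). NOT Pinot's Murmur function; swap in a
    function that matches the table's configured partition function."""
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions


def split_rows_by_partition(rows, key_column, num_partitions,
                            partition_fn=standin_partition):
    """Group rows so that every row with the same key lands in one bucket."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[partition_fn(row[key_column], num_partitions)].append(row)
    return dict(buckets)


def write_partition_files(buckets, fieldnames, prefix):
    """Write one CSV per partition id and return {partition id: path}."""
    paths = {}
    for partition_id, rows in sorted(buckets.items()):
        path = f"{prefix}_partition_{partition_id}.csv"
        with open(path, "w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
        paths[partition_id] = path
    return paths


rows = [
    {"id": "a", "created_timestamp": 100},
    {"id": "b", "created_timestamp": 100},
    {"id": "a", "created_timestamp": 200},
    {"id": "c", "created_timestamp": 150},
]
buckets = split_rows_by_partition(rows, key_column="id", num_partitions=3)
for partition_id, bucket in sorted(buckets.items()):
    print(partition_id, [row["id"] for row in bucket])
```

Each output file could then be ingested as its own segment. Whether this is the recommended approach for OFFLINE upsert tables is exactly the open question of this issue.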
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.