ShivamS136 opened a new issue, #15162: URL: https://github.com/apache/pinot/issues/15162
## Bug Description There appears to be a critical bug in Pinot v1.3.0 related to deduplication functionality when using `forceCommit` with dedup enabled. The consuming segments enter ERROR state on all servers except one, causing segments to go into BAD state and queries to fail. ## Environment - **Pinot Version**: 1.3.0 - **Image Used**: [release-1.3.0-21-amazoncorretto](https://hub.docker.com/layers/apachepinot/pinot/release-1.3.0-21-amazoncorretto/images/sha256-663341717988f7c7fe336497d2a5cee9cbfd3a0ec8f1cb05de99bcf7abbfd05c) - **Deployment Type**: Kubernates - **Configuration**: Deduplication enabled (Irrespective of preload) ## Steps to Reproduce 1. Create a table with deduplication enabled 2. Configure dedupConfig with dedupTimeColumn and metadataTTL 3. Ingest data using Kafka 4. Perform a forceCommit operation via controller ui 5. Observe consuming segments entering ERROR state on all servers except one 6. New segments get up in CONSUMING state ## Expected Behavior - All consuming segments should remain healthy/ONLINE after forceCommit - Deduplication should work consistently across all servers ## Actual Behavior - Consuming segments enter ERROR state on all servers except one - Segments go into BAD state - Queries fail ## Workaround Downgrading to `release-1.2.0-segment-lock-fix-21-amazoncorretto` resolves the issue, though there are warnings during table addition: ```json { "unrecognizedProperties": { "/dedupConfig/dedupTimeColumn": "insertion_time", "/dedupConfig/metadataTTL": 600000 }, "status": "Table leaderboard_entries_REALTIME successfully added" } ``` ## Table Configuration <details> <summary>Table Schema</summary> ```json { "schemaName": "leaderboard_entries", "dimensionFieldSpecs": [ { "name": "leaderboard_id", "dataType": "LONG" }, { "name": "participant_id", "dataType": "STRING" }, { "name": "attempt_number", "dataType": "INT", "defaultNullValue": 1 }, { "name": "entry_meta", "dataType": "JSON", "defaultNullValue": "{}" } ], "metricFieldSpecs": [ { "name": "score", "dataType": "INT", "defaultNullValue": 0 } ], "dateTimeFieldSpecs": [ { "name": "insertion_time", "dataType": "LONG", "format": "1:MILLISECONDS:EPOCH", "granularity": "1:MILLISECONDS" }, { "name": "attempt_time", "dataType": "LONG", "format": "1:MILLISECONDS:EPOCH", "granularity": "1:MILLISECONDS" } ], "primaryKeyColumns": ["leaderboard_id", "participant_id", "attempt_number"] } ``` </details> <details> <summary>Table Config</summary> ```json { "tableName": "leaderboard_entries", "tableType": "REALTIME", "segmentsConfig": { "timeColumnName": "insertion_time", "replication": "2", "retentionTimeUnit": "DAYS", "retentionTimeValue": "90", "timeType": "MILLISECONDS" }, "query": { "timeoutMs": "5000" }, "tenants": {}, "tableIndexConfig": { "sortedColumn": ["score"] }, "fieldConfigList": [ { "name": "leaderboard_id", "indexes": { "inverted": {} } }, { "name": "participant_id", "indexes": { "bloom": {} } } ], "ingestionConfig": { "streamIngestionConfig": { "streamConfigMaps": [ { "streamType": "kafka", "stream.kafka.consumer.type": "lowlevel", "stream.kafka.topic.name": "leaderboard-entry", "stream.kafka.broker.list": "kafka:9092", "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.inputformat.json.JSONMessageDecoder", "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka30.KafkaConsumerFactory", "stream.kafka.consumer.prop.auto.offset.reset": "smallest", "stream.kafka.consumer.prop.format": "JSON", "realtime.segment.flush.threshold.time": "4h", "realtime.segment.flush.threshold.rows": "0", "realtime.segment.flush.threshold.segment.rows": "0", "realtime.segment.flush.threshold.segment.size": "20M" } ] } }, "metadata": { "customConfigs": {} }, "routing": { "instanceSelectorType": "strictReplicaGroup" }, "dedupConfig": { "dedupEnabled": true, "hashFunction": "NONE", "dedupTimeColumn": "insertion_time", "metadataTTL": 600000, "enablePreload": true } } ``` </details> ## Additional Notes - `kafka30.KafkaConsumerFactory` is only available in 1.3.0. In earlier versions like 1.2.0, `kafka20.KafkaConsumerFactory` was used instead. - The issue occurs with both Kafka consumer factory classes in version 1.3.0. ## Additional Information Related Slack thread with more info attached: https://apache-pinot.slack.com/archives/C011C9JHN7R/p1740757158048619 ## Screenshots    There seems to be a testing code running in the release tool as on hover of the segment state I see just "testing" instead of proper error:  -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
