Copilot commented on code in PR #17536:
URL: https://github.com/apache/pinot/pull/17536#discussion_r2707881028
##########
pinot-core/src/main/java/org/apache/pinot/core/data/manager/offline/DimensionTableDataManager.java:
##########
@@ -320,7 +325,12 @@ private DimensionTable createMemOptimisedDimensionTable() {
Object[] primaryKey = recordReader.getRecordValues(i, pkIndexes);
long readerIdxAndDocId = (((long) readerIdx) << 32) | (i &
0xffffffffL);
- lookupTable.put(primaryKey, readerIdxAndDocId);
+ long previousValue = lookupTable.put(primaryKey,
readerIdxAndDocId);
+ if (!_enableUpsert && _errorOnDuplicatePrimaryKey && previousValue
!= Long.MIN_VALUE) {
+ throw new IllegalStateException(
+ "Caught exception while reading records from segment: " +
indexSegment.getSegmentName()
+ + "primary key already exist for: " +
Arrays.toString(primaryKey));
Review Comment:
Corrected grammar in error message from 'primary key already exist' to
'primary key already exists'.
```suggestion
+ "primary key already exists for: " +
Arrays.toString(primaryKey));
```
##########
pinot-core/src/main/java/org/apache/pinot/core/data/manager/offline/DimensionTableDataManager.java:
##########
@@ -320,7 +325,12 @@ private DimensionTable createMemOptimisedDimensionTable() {
Object[] primaryKey = recordReader.getRecordValues(i, pkIndexes);
long readerIdxAndDocId = (((long) readerIdx) << 32) | (i &
0xffffffffL);
- lookupTable.put(primaryKey, readerIdxAndDocId);
+ long previousValue = lookupTable.put(primaryKey,
readerIdxAndDocId);
+ if (!_enableUpsert && _errorOnDuplicatePrimaryKey && previousValue
!= Long.MIN_VALUE) {
Review Comment:
The duplicate detection logic assumes `Long.MIN_VALUE` is the sentinel for
'no previous value', but `LongObjectHashMap.put()` returns the previous value
or a default. If `Long.MIN_VALUE` is a legitimate doc ID or the map's default
differs, this check will fail. Verify the actual return value convention for
`LongObjectHashMap` and use the correct sentinel or check method.
```suggestion
boolean hasExistingKey = lookupTable.containsKey(primaryKey);
lookupTable.put(primaryKey, readerIdxAndDocId);
if (!_enableUpsert && _errorOnDuplicatePrimaryKey &&
hasExistingKey) {
```
##########
pinot-core/src/main/java/org/apache/pinot/core/data/manager/offline/DimensionTableDataManager.java:
##########
@@ -320,7 +325,12 @@ private DimensionTable createMemOptimisedDimensionTable() {
Object[] primaryKey = recordReader.getRecordValues(i, pkIndexes);
long readerIdxAndDocId = (((long) readerIdx) << 32) | (i &
0xffffffffL);
- lookupTable.put(primaryKey, readerIdxAndDocId);
+ long previousValue = lookupTable.put(primaryKey,
readerIdxAndDocId);
+ if (!_enableUpsert && _errorOnDuplicatePrimaryKey && previousValue
!= Long.MIN_VALUE) {
+ throw new IllegalStateException(
+ "Caught exception while reading records from segment: " +
indexSegment.getSegmentName()
+ + "primary key already exist for: " +
Arrays.toString(primaryKey));
Review Comment:
Missing space between segment name and 'primary key' in the concatenated
error message. Should be: `+ \" primary key already exist for: \"` to match the
formatting in the fast-lookup path at line 256.
```suggestion
+ " primary key already exist for: " +
Arrays.toString(primaryKey));
```
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/utils/TableConfigUtils.java:
##########
@@ -385,6 +387,16 @@ private static void validateValidationConfig(TableConfig
tableConfig, Schema sch
validateRetentionConfig(tableConfig);
}
+ private static void validateDimensionTableConfig(TableConfig tableConfig) {
+ DimensionTableConfig dimensionTableConfig =
tableConfig.getDimensionTableConfig();
+ if (dimensionTableConfig == null) {
+ return;
+ }
+ Preconditions.checkState(!(dimensionTableConfig.isEnableUpsert()
+ && dimensionTableConfig.isErrorOnDuplicatePrimaryKey()),
+ "Dimension table config cannot enable upsert and
errorOnDuplicatePrimaryKey at the same time");
Review Comment:
The error message uses mixed naming styles: 'upsert' (lowercase) and
'errorOnDuplicatePrimaryKey' (camelCase). For consistency and clarity, either
use all user-facing terms ('upsert' and 'error-on-duplicate-primary-key') or
all code identifiers. Consider: 'Dimension table config cannot have both
enableUpsert and errorOnDuplicatePrimaryKey enabled'.
```suggestion
"Dimension table config cannot have both enableUpsert and
errorOnDuplicatePrimaryKey enabled");
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]