Jackie-Jiang commented on code in PR #13789:
URL: https://github.com/apache/pinot/pull/13789#discussion_r1715921352
##########
pinot-server/src/main/java/org/apache/pinot/server/api/resources/TablesResource.java:
##########
@@ -952,4 +956,42 @@ public TableSegmentValidationInfo
validateTableSegmentState(
}
return new TableSegmentValidationInfo(true, maxEndTimeMs);
}
+
+ @GET
+ @Path("/tables/{tableName}/segments/mismatch")
+ @Produces(MediaType.APPLICATION_JSON)
+ @ApiOperation(value = "Checks if there is any mismatch of columns in a
segment", notes =
+ "Returns true if reload is required on" + " any segment in a given
server")
+ @ApiResponses(value = {
+ @ApiResponse(code = 200, message = "Success", response =
TableSegments.class), @ApiResponse(code = 500,
+ message = "Server initialization error", response = ErrorInfo.class)
+ })
+ public String checkMismatchedSegments(
+ @ApiParam(value = "Table Name with type", required = true)
@PathParam("tableName") String tableName,
+ @ApiParam(value = "Column name", allowMultiple = true)
@QueryParam("columns") @DefaultValue("")
+ List<String> columns, @Context HttpHeaders headers) {
+ tableName = DatabaseUtils.translateTableName(tableName, headers);
+ TableDataManager tableDataManager =
ServerResourceUtils.checkGetTableDataManager(_serverInstance, tableName);
+ Pair<TableConfig, Schema> tableConfigSchema =
tableDataManager.fetchTableConfigAndSchema();
+ IndexLoadingConfig indexLoadingConfig =
+ tableDataManager.getIndexLoadingConfig(tableConfigSchema.getLeft(),
tableConfigSchema.getRight());
+ List<SegmentDataManager> segmentDataManagers =
tableDataManager.acquireAllSegments();
+ try {
+ boolean mismatchCheck = false;
+ for (SegmentDataManager segmentDataManager : segmentDataManagers) {
+ SegmentZKMetadata segmentZKMetadata =
tableDataManager.fetchZKMetadata(segmentDataManager.getSegmentName());
+ if (tableDataManager.checkReloadSegment(segmentZKMetadata,
Review Comment:
I'd suggest pushing down the logic completely into the `TableDataManager` so
that it is easier to manage and extend in the future. Basically add an API
`needReload()` to the `TableDataManager`
##########
pinot-server/src/main/java/org/apache/pinot/server/api/resources/TablesResource.java:
##########
@@ -952,4 +956,42 @@ public TableSegmentValidationInfo
validateTableSegmentState(
}
return new TableSegmentValidationInfo(true, maxEndTimeMs);
}
+
+ @GET
+ @Path("/tables/{tableName}/segments/mismatch")
+ @Produces(MediaType.APPLICATION_JSON)
+ @ApiOperation(value = "Checks if there is any mismatch of columns in a
segment", notes =
+ "Returns true if reload is required on" + " any segment in a given
server")
+ @ApiResponses(value = {
+ @ApiResponse(code = 200, message = "Success", response =
TableSegments.class), @ApiResponse(code = 500,
+ message = "Server initialization error", response = ErrorInfo.class)
+ })
+ public String checkMismatchedSegments(
+ @ApiParam(value = "Table Name with type", required = true)
@PathParam("tableName") String tableName,
+ @ApiParam(value = "Column name", allowMultiple = true)
@QueryParam("columns") @DefaultValue("")
+ List<String> columns, @Context HttpHeaders headers) {
+ tableName = DatabaseUtils.translateTableName(tableName, headers);
+ TableDataManager tableDataManager =
ServerResourceUtils.checkGetTableDataManager(_serverInstance, tableName);
+ Pair<TableConfig, Schema> tableConfigSchema =
tableDataManager.fetchTableConfigAndSchema();
+ IndexLoadingConfig indexLoadingConfig =
+ tableDataManager.getIndexLoadingConfig(tableConfigSchema.getLeft(),
tableConfigSchema.getRight());
Review Comment:
```suggestion
IndexLoadingConfig indexLoadingConfig =
tableDataManager.fetchIndexLoadingConfig();
```
##########
pinot-server/src/main/java/org/apache/pinot/server/api/resources/TablesResource.java:
##########
@@ -952,4 +956,42 @@ public TableSegmentValidationInfo
validateTableSegmentState(
}
return new TableSegmentValidationInfo(true, maxEndTimeMs);
}
+
+ @GET
+ @Path("/tables/{tableName}/segments/mismatch")
+ @Produces(MediaType.APPLICATION_JSON)
+ @ApiOperation(value = "Checks if there is any mismatch of columns in a
segment", notes =
+ "Returns true if reload is required on" + " any segment in a given
server")
+ @ApiResponses(value = {
+ @ApiResponse(code = 200, message = "Success", response =
TableSegments.class), @ApiResponse(code = 500,
+ message = "Server initialization error", response = ErrorInfo.class)
+ })
+ public String checkMismatchedSegments(
+ @ApiParam(value = "Table Name with type", required = true)
@PathParam("tableName") String tableName,
+ @ApiParam(value = "Column name", allowMultiple = true)
@QueryParam("columns") @DefaultValue("")
+ List<String> columns, @Context HttpHeaders headers) {
Review Comment:
Is `columns` needed?
##########
pinot-server/src/main/java/org/apache/pinot/server/api/resources/TablesResource.java:
##########
@@ -952,4 +956,42 @@ public TableSegmentValidationInfo
validateTableSegmentState(
}
return new TableSegmentValidationInfo(true, maxEndTimeMs);
}
+
+ @GET
+ @Path("/tables/{tableName}/segments/mismatch")
+ @Produces(MediaType.APPLICATION_JSON)
+ @ApiOperation(value = "Checks if there is any mismatch of columns in a
segment", notes =
Review Comment:
The API path and doc is a little bit misleading. We want to check if reload
is needed, instead of checking column mismatch
##########
pinot-core/src/main/java/org/apache/pinot/core/data/manager/BaseTableDataManager.java:
##########
@@ -1013,6 +1013,47 @@ public boolean tryLoadExistingSegment(SegmentZKMetadata
zkMetadata, IndexLoading
}
}
+ @Override
+ public boolean checkReloadSegment(SegmentZKMetadata zkMetadata,
IndexLoadingConfig indexLoadingConfig) {
+ String segmentName = zkMetadata.getSegmentName();
+
+ // Creates the SegmentDirectory object to access the segment metadata.
+ // The metadata is null if the segment doesn't exist yet.
+ SegmentDirectory segmentDirectory =
+ tryInitSegmentDirectory(segmentName,
String.valueOf(zkMetadata.getCrc()), indexLoadingConfig);
+ SegmentMetadataImpl segmentMetadata = (segmentDirectory == null) ? null :
segmentDirectory.getSegmentMetadata();
+
+ // If the segment doesn't exist on server or its CRC has changed, then we
+ // need to fall back to download the segment from deep store to load it.
+ if (segmentMetadata == null || !hasSameCRC(zkMetadata, segmentMetadata)) {
+ if (segmentMetadata == null) {
+ _logger.info("Segment: {} does not exist", segmentName);
+ } else if (!hasSameCRC(zkMetadata, segmentMetadata)) {
+ _logger.info("Segment: {} has CRC changed from: {} to: {}",
segmentName, segmentMetadata.getCrc(),
+ zkMetadata.getCrc());
+ }
+ closeSegmentDirectoryQuietly(segmentDirectory);
+ return true;
+ }
+
+ try {
+ Schema schema = indexLoadingConfig.getSchema();
+ //if the reload of the segment is not needed then return false
+ if (!ImmutableSegmentLoader.needPreprocess(segmentDirectory,
indexLoadingConfig, schema)) {
+ _logger.info("Segment: {} is consistent with latest table config and
schema", segmentName);
Review Comment:
This can flood the log (consider thousands of segments on each server)
##########
pinot-common/src/main/java/org/apache/pinot/common/restlet/resources/SegmentColumnMismatchResponse.java:
##########
@@ -0,0 +1,48 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.restlet.resources;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+
+/**
+ * This class gives the data of a server if there exists any segments that
need to be reloaded
+ *
+ * It has details of server id and returns true/false if there are any
segments to be reloaded or not.
+ */
+public class SegmentColumnMismatchResponse {
Review Comment:
Suggest renaming it to `SegmentReloadCheckResponse` and rename the fields
accordingly
##########
pinot-core/src/main/java/org/apache/pinot/core/data/manager/BaseTableDataManager.java:
##########
@@ -1013,6 +1013,47 @@ public boolean tryLoadExistingSegment(SegmentZKMetadata
zkMetadata, IndexLoading
}
}
+ @Override
+ public boolean checkReloadSegment(SegmentZKMetadata zkMetadata,
IndexLoadingConfig indexLoadingConfig) {
+ String segmentName = zkMetadata.getSegmentName();
+
+ // Creates the SegmentDirectory object to access the segment metadata.
+ // The metadata is null if the segment doesn't exist yet.
+ SegmentDirectory segmentDirectory =
+ tryInitSegmentDirectory(segmentName,
String.valueOf(zkMetadata.getCrc()), indexLoadingConfig);
+ SegmentMetadataImpl segmentMetadata = (segmentDirectory == null) ? null :
segmentDirectory.getSegmentMetadata();
+
+ // If the segment doesn't exist on server or its CRC has changed, then we
+ // need to fall back to download the segment from deep store to load it.
+ if (segmentMetadata == null || !hasSameCRC(zkMetadata, segmentMetadata)) {
Review Comment:
Let's not read ZK metadata or do CRC check here because it can drastically
increase the ZK load. `needPreprocess()` check should be enough
##########
pinot-common/src/main/java/org/apache/pinot/common/restlet/resources/SegmentColumnMismatchResponse.java:
##########
@@ -0,0 +1,48 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.restlet.resources;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+
+/**
+ * This class gives the data of a server if there exists any segments that
need to be reloaded
+ *
+ * It has details of server id and returns true/false if there are any
segments to be reloaded or not.
+ */
+public class SegmentColumnMismatchResponse {
+ boolean _isMismatch;
+ String _serverInstanceId;
Review Comment:
They can be defined as `private final`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]