sandynz commented on code in PR #24293:
URL: https://github.com/apache/shardingsphere/pull/24293#discussion_r1113898689
##########
kernel/data-pipeline/api/src/main/java/org/apache/shardingsphere/data/pipeline/spi/sqlbuilder/PipelineSQLBuilder.java:
##########
@@ -135,6 +135,16 @@ default Optional<String> buildCreateSchemaSQL(String
schemaName) {
// TODO keep it for now, it might be used later
String buildCountSQL(String schemaName, String tableName);
+ /**
+ * Build Estimate count SQL.
+ *
+ * @param databaseName database name
+ * @param schemaName schema name
+ * @param tableName table name
+ * @return estimate count sql
+ */
+ String buildEstimateCountSQL(String databaseName, String schemaName,
String tableName);
Review Comment:
`buildEstimateCountSQL` could be `buildEstimatedCountSQL`
##########
kernel/data-pipeline/core/src/main/java/org/apache/shardingsphere/data/pipeline/core/prepare/InventoryTaskSplitter.java:
##########
@@ -171,12 +171,20 @@ private long getTableRecordsCount(final
InventoryIncrementalJobItemContext jobIt
PipelineJobConfiguration jobConfig = jobItemContext.getJobConfig();
String schemaName = dumperConfig.getSchemaName(new
LogicTableName(dumperConfig.getLogicTableName()));
String actualTableName = dumperConfig.getActualTableName();
- // TODO with a large amount of data, count the full table will have
performance problem
- String sql =
PipelineTypedSPILoader.getDatabaseTypedService(PipelineSQLBuilder.class,
jobConfig.getSourceDatabaseType()).buildCountSQL(schemaName, actualTableName);
+ PipelineSQLBuilder pipelineSQLBuilder =
PipelineTypedSPILoader.getDatabaseTypedService(PipelineSQLBuilder.class,
jobConfig.getSourceDatabaseType());
+ String sql =
pipelineSQLBuilder.buildEstimateCountSQL(dumperConfig.getDataSourceName(),
schemaName, actualTableName);
try (
Connection connection = dataSource.getConnection();
PreparedStatement preparedStatement =
connection.prepareStatement(sql)) {
+ long estimateCount;
Review Comment:
`estimateCount` could be `result`
##########
kernel/data-pipeline/dialect/postgresql/src/main/java/org/apache/shardingsphere/data/pipeline/postgresql/sqlbuilder/PostgreSQLPipelineSQLBuilder.java:
##########
@@ -85,6 +85,12 @@ private String buildConflictSQL(final DataRecord dataRecord)
{
return result.toString();
}
+ @Override
+ public String buildEstimateCountSQL(final String databaseName, final
String schemaName, final String tableName) {
+ // TODO Support estimate count later.
+ return buildCountSQL(schemaName, tableName);
+ }
Review Comment:
It's not supported, so it should return `Optional<String>`
##########
kernel/data-pipeline/core/src/main/java/org/apache/shardingsphere/data/pipeline/core/prepare/InventoryTaskSplitter.java:
##########
@@ -171,12 +171,20 @@ private long getTableRecordsCount(final
InventoryIncrementalJobItemContext jobIt
PipelineJobConfiguration jobConfig = jobItemContext.getJobConfig();
String schemaName = dumperConfig.getSchemaName(new
LogicTableName(dumperConfig.getLogicTableName()));
String actualTableName = dumperConfig.getActualTableName();
- // TODO with a large amount of data, count the full table will have
performance problem
- String sql =
PipelineTypedSPILoader.getDatabaseTypedService(PipelineSQLBuilder.class,
jobConfig.getSourceDatabaseType()).buildCountSQL(schemaName, actualTableName);
+ PipelineSQLBuilder pipelineSQLBuilder =
PipelineTypedSPILoader.getDatabaseTypedService(PipelineSQLBuilder.class,
jobConfig.getSourceDatabaseType());
+ String sql =
pipelineSQLBuilder.buildEstimateCountSQL(dumperConfig.getDataSourceName(),
schemaName, actualTableName);
try (
Connection connection = dataSource.getConnection();
PreparedStatement preparedStatement =
connection.prepareStatement(sql)) {
+ long estimateCount;
try (ResultSet resultSet = preparedStatement.executeQuery()) {
+ resultSet.next();
+ estimateCount = resultSet.getLong(1);
+ }
+ if (estimateCount > 0) {
+ return estimateCount;
+ }
+ try (ResultSet resultSet =
connection.createStatement().executeQuery(pipelineSQLBuilder.buildCountSQL(schemaName,
actualTableName))) {
Review Comment:
1, `buildCountSQL` might cost too much time, so add some log for it,
includes cost time.
2, connection will return to pool but not closed, so will
`connection.createStatement()` be closed automatically?
##########
kernel/data-pipeline/dialect/mysql/src/main/java/org/apache/shardingsphere/data/pipeline/mysql/sqlbuilder/MySQLPipelineSQLBuilder.java:
##########
@@ -89,6 +89,11 @@ public Optional<String> buildCRC32SQL(final String
schemaName, final String tabl
return Optional.of(String.format("SELECT BIT_XOR(CAST(CRC32(%s) AS
UNSIGNED)) AS checksum, COUNT(1) AS cnt FROM %s", quote(column),
quote(tableName)));
}
+ @Override
+ public String buildEstimateCountSQL(final String databaseName, final
String schemaName, final String tableName) {
+ return String.format("SELECT TABLE_ROWS from INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = '%s' AND TABLE_NAME = '%s'", databaseName,
getQualifiedTableName(schemaName, tableName));
Review Comment:
`from` could be `FROM`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]