platinumhamburg commented on code in PR #2161:
URL: https://github.com/apache/fluss/pull/2161#discussion_r2650268505
##########
fluss-server/src/main/java/org/apache/fluss/server/kv/KvTablet.java:
##########
@@ -468,8 +474,9 @@ private long applyInsert(
PaddingRow latestSchemaRow,
long logOffset)
throws Exception {
- walBuilder.append(ChangeType.INSERT, latestSchemaRow.replaceRow(currentValue.row));
- kvPreWriteBuffer.put(key, currentValue.encodeValue(), logOffset);
+ BinaryValue newValue = autoIncProcessor.processAutoInc(currentValue);
Review Comment:
I've been wondering about the relationship between auto-increment column
handling and the merge engine. They should probably be integrated in the
design so that row data is encoded only once; otherwise the redundant
encoding would add significant performance overhead.
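To illustrate the encode-once concern, here is a minimal plain-Java sketch (all names are hypothetical stand-ins, not Fluss APIs): the auto-increment column is filled in and the merge-engine decision is applied while the row is still decoded, and serialization to value bytes happens exactly once at the end.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: run auto-increment assignment and the merge-engine
// decision on the in-memory row, then encode the value bytes exactly once.
public class EncodeOnceSketch {

    // Simplified stand-in for a decoded row: column values as strings.
    static String[] withAutoInc(String[] row, int autoIncCol, long nextId) {
        String[] out = Arrays.copyOf(row, row.length);
        if (out[autoIncCol] == null) { // only fill when the column is unset
            out[autoIncCol] = Long.toString(nextId);
        }
        return out;
    }

    // The merge decision operates on decoded rows
    // (here: "first_row" semantics keep the existing row).
    static String[] merge(String[] oldRow, String[] newRow, boolean firstRow) {
        return (oldRow != null && firstRow) ? oldRow : newRow;
    }

    // Single encoding step at the very end of the write path.
    static byte[] encode(String[] row) {
        return String.join("|", row).getBytes(StandardCharsets.UTF_8);
    }
}
```

The point of the ordering is that neither `withAutoInc` nor `merge` ever needs the encoded bytes, so interleaving them before a single `encode` call avoids encoding a row that the merge engine may immediately discard.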
##########
fluss-common/src/main/java/org/apache/fluss/config/ConfigOptions.java:
##########
@@ -1424,6 +1424,13 @@ public class ConfigOptions {
+ "The `first_row` merge engine will keep the first row of the same primary key. "
+ "The `versioned` merge engine will keep the row with the largest version of the same primary key.");
+ public static final ConfigOption<Long> TABLE_AUTO_INC_BATCH_SIZE =
+ key("table.auto-inc.batch-size")
+ .longType()
+ .defaultValue(100000L)
+ .withDescription(
"The batch size of auto-increment IDs fetched from the distributed counter each time. This value determines the length of the ID segment cached locally. Default: 100000.");
Review Comment:
New configuration options should be documented, including recommended values
and the behavioral impact of different settings.
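For example, the trade-off behind `table.auto-inc.batch-size` is worth spelling out in the docs. A minimal sketch of segment-based allocation (hypothetical names, not the Fluss implementation): each time the locally cached segment is exhausted, one call to the shared counter reserves the next `batchSize` IDs. A larger batch size means fewer remote round-trips, but also larger ID gaps when a server restarts and discards the unused remainder of its segment.

```java
import java.util.function.LongUnaryOperator;

// Hypothetical sketch of segment-based auto-increment allocation: fetch a
// block of `batchSize` IDs from a distributed counter, then serve them from
// the local cache until the block is exhausted.
public class SegmentAllocator {
    private final long batchSize;
    // Takes the requested segment length, returns the segment's start value
    // and atomically advances the shared counter by that length.
    private final LongUnaryOperator fetchSegment;
    private long next; // next ID to hand out
    private long end;  // exclusive end of the cached segment

    public SegmentAllocator(long batchSize, LongUnaryOperator fetchSegment) {
        this.batchSize = batchSize;
        this.fetchSegment = fetchSegment;
    }

    public long nextId() {
        if (next >= end) {
            // One remote call per segment, not per ID.
            next = fetchSegment.applyAsLong(batchSize);
            end = next + batchSize;
        }
        return next++;
    }
}
```

With the default of 100000, a steady writer contacts the counter once per 100000 rows; documenting this helps users pick smaller values when gap-free-ish sequences matter more than throughput.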
##########
fluss-client/src/test/java/org/apache/fluss/client/table/FlussTableITCase.java:
##########
@@ -456,6 +457,47 @@ void testInvalidPrefixLookup() throws Exception {
+ "because the lookup columns [b, a] must contain all bucket keys [a, b] in order.");
}
+ @Test
+ void testPutAutoIncColumnAndLookup() throws Exception {
Review Comment:
To achieve more precise test assertions, I recommend explicitly creating a
single-bucket table, writing multiple batches of data (some batches with
duplicate primary keys, others with distinct primary keys), and then verifying
that the auto-increment column behaves correctly in each case.
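To make the expected assertions concrete, here is a small plain-Java model of the semantics I would verify (hypothetical names, and assuming the intended behavior is that a duplicate primary key retains the auto-increment value it was first assigned, while each distinct key advances the sequence):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of single-bucket auto-increment semantics: a duplicate
// primary key keeps the value assigned on its first insert, while each new
// primary key receives the next ID in the sequence.
public class AutoIncModel {
    private final Map<String, Long> assigned = new HashMap<>();
    private long nextId = 1L;

    // Upsert by primary key; returns the auto-inc value a lookup would see.
    public long upsert(String pk) {
        return assigned.computeIfAbsent(pk, k -> nextId++);
    }
}
```

In the IT case, the same checks would run against lookups on the real table after each batch: duplicate-key batches must not change the looked-up auto-inc value, and distinct-key batches must extend the sequence monotonically.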
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]