JingsongLi commented on code in PR #6635:
URL: https://github.com/apache/paimon/pull/6635#discussion_r2552731215
##########
paimon-core/src/main/java/org/apache/paimon/manifest/IndexManifestEntrySerializer.java:
##########
@@ -44,6 +44,12 @@ public int getVersion() {
@Override
public InternalRow convertTo(IndexManifestEntry record) {
IndexFileMeta indexFile = record.indexFile();
+ InternalRow globalIndexRow =
Review Comment:
This row should be nullable.
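A minimal sketch of one way to read this, assuming plain `Object[]` arrays stand in for `InternalRow`. `NullableRowSketch` and its methods are hypothetical names, not Paimon APIs. An entry without global index information carries `null` in the nested slot rather than a row of null fields.

```java
// Hypothetical sketch: Object[] stands in for InternalRow; none of these names are Paimon APIs.
final class NullableRowSketch {

    // Builds the nested global index row, or returns null when the entry has no global index info.
    static Object[] globalIndexRow(
            Long rowRangeStart, Long rowRangeEnd, Integer indexFieldId, byte[] indexMeta) {
        if (rowRangeStart == null
                && rowRangeEnd == null
                && indexFieldId == null
                && indexMeta == null) {
            return null;
        }
        return new Object[] {rowRangeStart, rowRangeEnd, indexFieldId, indexMeta};
    }

    // Places the (possibly null) nested row into the last slot of the outer row.
    static Object[] convertTo(String indexType, String fileName, Object[] globalIndexRow) {
        return new Object[] {indexType, fileName, globalIndexRow};
    }
}
```

Readers of the outer row would then check the nested slot for `null` before touching its fields.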
##########
paimon-core/src/main/java/org/apache/paimon/index/IndexPathFactory.java:
##########
@@ -23,6 +23,8 @@
/** Path factory to create an index path. */
public interface IndexPathFactory {
+ Path getPath(String fileName);
Review Comment:
Please name this `toPath`, so the interface is unified on `toPath(file)`.
##########
paimon-core/src/main/java/org/apache/paimon/index/IndexFileHandler.java:
##########
@@ -109,6 +110,29 @@ public List<IndexManifestEntry> scan(Snapshot snapshot, String indexType) {
return result;
}
+ public List<IndexManifestEntry> scan(Filter<IndexManifestEntry> readTFilter) {
Review Comment:
Please provide only a `scan` that takes a `Snapshot`. The caller should obtain the
Snapshot outside this method, so it is not loaded many times.
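A minimal sketch of the intended call pattern, assuming simplified stand-in classes. `SnapshotSketch`, `SnapshotLoaderSketch`, and `IndexScanSketch` are hypothetical and only illustrate loading the snapshot once and passing it to every scan.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for a snapshot that references some index entries.
final class SnapshotSketch {
    final long id;
    final List<String> indexEntries;

    SnapshotSketch(long id, List<String> indexEntries) {
        this.id = id;
        this.indexEntries = indexEntries;
    }
}

// Hypothetical loader that counts how often a snapshot is loaded.
final class SnapshotLoaderSketch {
    int loads = 0;

    SnapshotSketch latest() {
        loads++;
        return new SnapshotSketch(1L, Arrays.asList("bitmap-0", "btree-0", "btree-1"));
    }
}

final class IndexScanSketch {
    // The scan receives the snapshot from the caller and never loads one itself.
    static List<String> scan(SnapshotSketch snapshot, Predicate<String> filter) {
        List<String> result = new ArrayList<>();
        for (String entry : snapshot.indexEntries) {
            if (filter.test(entry)) {
                result.add(entry);
            }
        }
        return result;
    }
}
```

A caller would invoke `loader.latest()` once and reuse the returned snapshot for each filtered `scan` call.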
##########
paimon-core/src/main/java/org/apache/paimon/index/IndexFileMeta.java:
##########
@@ -53,13 +55,40 @@ public class IndexFileMeta {
4,
"_DELETIONS_VECTORS_RANGES",
new ArrayType(true, DeletionVectorMeta.SCHEMA)),
- new DataField(5, "_EXTERNAL_PATH", new StringType(true))));
+ new DataField(5, "_EXTERNAL_PATH", new StringType(true)),
+ new DataField(
+ 6,
+ "_GLOBAL_INDEX",
+ new RowType(
+ true,
+ Arrays.asList(
+ new DataField(
+ 0,
+ "_ROW_RANGE_START",
+ new BigIntType(true)),
+ new DataField(
+ 1,
+ "_ROW_RANGE_END",
+ new BigIntType(true)),
+ new DataField(
+ 2,
+ "_INDEX_FIELD_ID",
+ new IntType(true)),
+ new DataField(
+ 3,
+ "_INDEX_META",
+ new VarBinaryType()))))));
Review Comment:
Use `DataTypes.BYTES` instead of `new VarBinaryType()`.
##########
paimon-api/src/main/java/org/apache/paimon/CoreOptions.java:
##########
@@ -2076,6 +2076,12 @@ public InlineElement getDescription() {
.withDescription(
"Whether to write the data into fixed bucket for batch writing a postpone bucket table.");
+ public static final ConfigOption<Long> GLOBAL_INDEX_ROW_COUNT_PER_SHARD =
+ key("global-index.row_count_per_shard")
Review Comment:
Use '-' instead of '_' in the option key.
##########
paimon-core/src/main/java/org/apache/paimon/index/IndexFileMeta.java:
##########
@@ -53,13 +55,40 @@ public class IndexFileMeta {
4,
"_DELETIONS_VECTORS_RANGES",
new ArrayType(true, DeletionVectorMeta.SCHEMA)),
- new DataField(5, "_EXTERNAL_PATH", new StringType(true))));
+ new DataField(5, "_EXTERNAL_PATH", new StringType(true)),
+ new DataField(
+ 6,
+ "_GLOBAL_INDEX",
+ new RowType(
+ true,
+ Arrays.asList(
+ new DataField(
+ 0,
+ "_ROW_RANGE_START",
+ new BigIntType(true)),
+ new DataField(
+ 1,
+ "_ROW_RANGE_END",
+ new BigIntType(true)),
+ new DataField(
+ 2,
+ "_INDEX_FIELD_ID",
+ new IntType(true)),
+ new DataField(
+ 3,
+ "_INDEX_META",
+ new VarBinaryType()))))));
private final String indexType;
private final String fileName;
private final long fileSize;
private final long rowCount;
+ @Nullable private final Long rowRangeStart;
Review Comment:
Do not add these fields to `IndexFileMeta` directly. Introduce a `GlobalIndexMeta`
class to hold them.
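A minimal sketch of that grouping, under stated assumptions: the class and accessor names below are hypothetical, the field names mirror the `_GLOBAL_INDEX` schema in the diff, and primitives are used for the range and field id even though the schema declares them nullable.

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical sketch; not necessarily the shape of the class that ends up in the PR.
final class GlobalIndexMetaSketch {
    private final long rowRangeStart;
    private final long rowRangeEnd;
    private final int indexFieldId;
    private final byte[] indexMeta; // may be null

    GlobalIndexMetaSketch(long rowRangeStart, long rowRangeEnd, int indexFieldId, byte[] indexMeta) {
        this.rowRangeStart = rowRangeStart;
        this.rowRangeEnd = rowRangeEnd;
        this.indexFieldId = indexFieldId;
        this.indexMeta = indexMeta;
    }

    long rowRangeStart() { return rowRangeStart; }
    long rowRangeEnd() { return rowRangeEnd; }
    int indexFieldId() { return indexFieldId; }
    byte[] indexMeta() { return indexMeta; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof GlobalIndexMetaSketch)) return false;
        GlobalIndexMetaSketch that = (GlobalIndexMetaSketch) o;
        return rowRangeStart == that.rowRangeStart
                && rowRangeEnd == that.rowRangeEnd
                && indexFieldId == that.indexFieldId
                && Arrays.equals(indexMeta, that.indexMeta);
    }

    @Override
    public int hashCode() {
        return 31 * Objects.hash(rowRangeStart, rowRangeEnd, indexFieldId)
                + Arrays.hashCode(indexMeta);
    }
}

// Simplified stand-in for IndexFileMeta: one nullable reference replaces four loose fields.
final class IndexFileMetaSketch {
    private final String indexType;
    private final String fileName;
    private final long fileSize;
    private final long rowCount;
    private final GlobalIndexMetaSketch globalIndexMeta; // null for non-global index files

    IndexFileMetaSketch(
            String indexType,
            String fileName,
            long fileSize,
            long rowCount,
            GlobalIndexMetaSketch globalIndexMeta) {
        this.indexType = indexType;
        this.fileName = fileName;
        this.fileSize = fileSize;
        this.rowCount = rowCount;
        this.globalIndexMeta = globalIndexMeta;
    }

    String indexType() { return indexType; }
    String fileName() { return fileName; }
    long fileSize() { return fileSize; }
    long rowCount() { return rowCount; }
    GlobalIndexMetaSketch globalIndexMeta() { return globalIndexMeta; }
}
```

Keeping the global index fields in one object means existing index types only see a single extra nullable reference.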
##########
paimon-core/src/main/java/org/apache/paimon/index/IndexFileMeta.java:
##########
@@ -53,13 +55,40 @@ public class IndexFileMeta {
4,
"_DELETIONS_VECTORS_RANGES",
new ArrayType(true, DeletionVectorMeta.SCHEMA)),
- new DataField(5, "_EXTERNAL_PATH", new StringType(true))));
+ new DataField(5, "_EXTERNAL_PATH", new StringType(true)),
+ new DataField(
+ 6,
+ "_GLOBAL_INDEX",
Review Comment:
Extract a static field:
public static RowType GLOBAL_INDEX = XXX;