steFaiz opened a new pull request, #6702:
URL: https://github.com/apache/paimon/pull/6702

   
   ### SST File format
   Linked issue: None
   
   This PR introduces an SST file format for Paimon, which is useful in the following scenarios:
   1. Accelerating primary-key (PK) lookups
   2. BTree global index
   
   An SST file is designed to serve:
   1. Point queries: look up a specified key
   2. Range queries: seek to the first record whose key is greater than or equal to the target key, then scan the remaining records (the caller decides when to stop)
   3. Random access: directly return the records specified by a selection
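
The three access patterns above can be illustrated with a minimal sketch. It assumes a sorted, in-memory list of string keys and values; `SortedTableSketch` and all of its method names are hypothetical and are not part of this PR or of Paimon's API. It shows only the access semantics, not the on-disk layout.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;

/** Hypothetical in-memory stand-in for a sorted file; not Paimon's SST reader. */
final class SortedTableSketch {
    private final List<String> keys = new ArrayList<>();
    private final List<String> values = new ArrayList<>();

    /** Copies entries in ascending key order, mirroring a sorted file. */
    SortedTableSketch(TreeMap<String, String> sorted) {
        for (Map.Entry<String, String> e : sorted.entrySet()) {
            keys.add(e.getKey());
            values.add(e.getValue());
        }
    }

    /** Binary search for the first position whose key is >= target. */
    int lowerBound(String target) {
        int lo = 0;
        int hi = keys.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (keys.get(mid).compareTo(target) < 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    /** Point query: returns the value for an exact key match, or null. */
    String get(String key) {
        int pos = lowerBound(key);
        boolean found = pos < keys.size() && keys.get(pos).equals(key);
        return found ? values.get(pos) : null;
    }

    /** Range query: seek to the first key >= target, scan until the caller's stop test passes. */
    List<String> scanFrom(String target, Predicate<String> stop) {
        List<String> out = new ArrayList<>();
        for (int i = lowerBound(target); i < keys.size(); i++) {
            if (stop.test(keys.get(i))) {
                break;
            }
            out.add(values.get(i));
        }
        return out;
    }

    /** Random access: returns the values at the selected positions, in the given order. */
    List<String> select(int[] positions) {
        List<String> out = new ArrayList<>();
        for (int p : positions) {
            out.add(values.get(p));
        }
        return out;
    }
}
```

A real SST reader would presumably run the same kind of search against an on-disk index rather than an in-memory list, but the seek-then-scan contract is the same idea.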
   
   
   ### Tests
   Please see:
   1. org.apache.paimon.format.sst.SstFileFormatTest
   2. org.apache.paimon.format.sst.SstFileTest
   
   
   ### API and Format
   This PR adds a pair of new methods to FileFormat:
   ```
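   public abstract class FileFormat {

       // Key/value-aware overload. By default it ignores keyType and valueType
       // and delegates to the existing three-argument method.
       public FormatReaderFactory createReaderFactory(
               RowType dataSchemaRowType,
               RowType projectedRowType,
               @Nullable List<Predicate> filters,
               RowType keyType,
               RowType valueType) {
           return createReaderFactory(dataSchemaRowType, projectedRowType, filters);
       }

       // Key/value-aware overload. By default it delegates to the existing
       // single-argument method.
       public FormatWriterFactory createWriterFactory(
               RowType type, RowType keyType, RowType valueType) {
           return createWriterFactory(type);
       }
   }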
   ```
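   These two methods are meant to be overridden by file formats that need to distinguish keys from values in an input row. By default they ignore the key and value types and delegate to the existing methods, so existing formats are unaffected.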
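
The delegation pattern can be sketched in isolation as follows. This is only an illustration under simplifying assumptions: lists of column names stand in for RowType, strings stand in for the writer factories, and `FormatSketch`, `PlainFormatSketch`, and `KeyedFormatSketch` are hypothetical names that do not exist in Paimon.

```java
import java.util.List;

/** Hypothetical, simplified stand-in for a file format base class. */
abstract class FormatSketch {
    /** Existing-style method that every format implements. */
    abstract String createWriterFactory(List<String> rowType);

    /** Key/value-aware overload: ignores the extra types and delegates by default. */
    String createWriterFactory(
            List<String> rowType, List<String> keyType, List<String> valueType) {
        return createWriterFactory(rowType);
    }
}

/** A format that does not care about keys; it inherits the default overload. */
final class PlainFormatSketch extends FormatSketch {
    @Override
    String createWriterFactory(List<String> rowType) {
        return "plain writer for " + rowType;
    }
}

/** A key-aware format that overrides the overload to use the key/value split. */
final class KeyedFormatSketch extends FormatSketch {
    @Override
    String createWriterFactory(List<String> rowType) {
        return "keyed writer for " + rowType;
    }

    @Override
    String createWriterFactory(
            List<String> rowType, List<String> keyType, List<String> valueType) {
        return "keyed writer with keys " + keyType + " and values " + valueType;
    }
}
```

Callers can always invoke the key/value-aware overload: formats that do not override it behave exactly as before, while a key-aware format receives the split it needs.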
   
   
   ### Documentation
   
   Documentation will be added as soon as possible.
   

