Aitozi commented on code in PR #7886:
URL: https://github.com/apache/paimon/pull/7886#discussion_r3263122140
##########
paimon-core/src/main/java/org/apache/paimon/append/AppendOnlyWriter.java:
##########
@@ -159,11 +164,18 @@ public AppendOnlyWriter(
this.statsCollectorFactories = statsCollectorFactories;
this.maxDiskSize = maxDiskSize;
this.fileIndexOptions = fileIndexOptions;
+ this.coreOptions = coreOptions;
- this.sinkWriter =
- useWriteBuffer
- ? createBufferedSinkWriter(spillable)
- : new DirectSinkWriter<>(this::createRollingRowWriter);
+ // Determine if we need to enable sorting based on clustering configuration
+ List<String> clusteringColumns = coreOptions.clusteringColumns();
+ this.sortEnabled =
+ coreOptions.clusteringIncrementalEnabled()
+ && coreOptions.clusteringIncrementalOptimizeWrite()
+ && coreOptions.clusteringIncrementalMode()
+ == CoreOptions.ClusteringIncrementalMode.LOCAL_SORT
Review Comment:
`LOCAL_SORT` is described as task-level sorting, but what we have actually
implemented is file-level sorting.
Do we need to introduce a separate mode, for example "file_local", to
represent this file-level sorting granularity?
```
/**
 * Sort rows only within each compaction task (no global shuffle). Every output file is
 * internally ordered by the clustering columns, which is sufficient for per-file Parquet
 * lookup optimizations.
 */
LOCAL_SORT(
        "local-sort",
        "Sort rows only within each compaction task without global shuffle. "
                + "Every output file is internally ordered.");
```
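To make the distinction concrete, here is a minimal sketch of how the two granularities could differ. It is not Paimon code: `SortGranularitySketch`, `taskLevelSort`, and `fileLevelSort` are hypothetical names, integers stand in for the clustering key, and a fixed `rowsPerFile` stands in for whatever actually triggers file rolling in the writer. It also assumes that "task-level" means sorting everything a task receives before rolling files, which is one possible reading of the Javadoc above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SortGranularitySketch {

    /** Task-level (assumed meaning): sort all rows of the task, then roll into files. */
    static List<List<Integer>> taskLevelSort(List<Integer> rows, int rowsPerFile) {
        List<Integer> sorted = new ArrayList<>(rows);
        Collections.sort(sorted);
        List<List<Integer>> files = new ArrayList<>();
        for (int i = 0; i < sorted.size(); i += rowsPerFile) {
            int end = Math.min(i + rowsPerFile, sorted.size());
            files.add(new ArrayList<>(sorted.subList(i, end)));
        }
        return files;
    }

    /** File-level: roll into files in arrival order, then sort each file on its own. */
    static List<List<Integer>> fileLevelSort(List<Integer> rows, int rowsPerFile) {
        List<List<Integer>> files = new ArrayList<>();
        for (int i = 0; i < rows.size(); i += rowsPerFile) {
            int end = Math.min(i + rowsPerFile, rows.size());
            List<Integer> file = new ArrayList<>(rows.subList(i, end));
            Collections.sort(file);
            files.add(file);
        }
        return files;
    }

    public static void main(String[] args) {
        List<Integer> rows = Arrays.asList(5, 1, 4, 2, 6, 3);
        // Files are sorted internally and their key ranges do not overlap.
        System.out.println(taskLevelSort(rows, 3)); // [[1, 2, 3], [4, 5, 6]]
        // Files are sorted internally, but their key ranges overlap.
        System.out.println(fileLevelSort(rows, 3)); // [[1, 4, 5], [2, 3, 6]]
    }
}
```

In both cases every file is internally ordered, so the second sentence of the Javadoc holds either way. The difference is only visible across files of the same task, which is why a separate mode name might be worth considering.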