JackieTien97 commented on code in PR #744:
URL: https://github.com/apache/tsfile/pull/744#discussion_r2944758562
##########
java/tsfile/src/main/java/org/apache/tsfile/read/common/block/TsBlockUtil.java:
##########
@@ -75,10 +76,32 @@ public static TsBlock applyFilterAndLimitOffsetToTsBlock(
TsBlockBuilder builder,
Filter pushDownFilter,
PaginationController paginationController) {
+ return applyFilterAndLimitOffsetToTsBlock(
+ unFilteredBlock, builder, pushDownFilter, paginationController, null);
+ }
+
+ public static TsBlock applyFilterAndLimitOffsetToTsBlock(
+ TsBlock unFilteredBlock,
+ TsBlockBuilder builder,
+ Filter pushDownFilter,
+ PaginationController paginationController,
+ Consumer<Long> filterRowsRecorder) {
Review Comment:
```suggestion
LongConsumer filterRowsRecorder) {
```
##########
java/tsfile/src/main/java/org/apache/tsfile/read/reader/IPageReader.java:
##########
@@ -39,6 +40,8 @@ default BatchData getAllSatisfiedPageData() throws
IOException {
TsBlock getAllSatisfiedData() throws IOException;
+ TsBlock getAllSatisfiedData(Consumer<Long> filterRowsRecorder) throws
IOException;
Review Comment:
add some java doc comments about this method, explain its difference between
original `getAllSatisfiedData()` method.
##########
java/tsfile/src/main/java/org/apache/tsfile/read/filter/basic/Filter.java:
##########
@@ -117,6 +118,44 @@ public abstract class Filter {
*/
public abstract boolean[] satisfyTsBlock(boolean[] selection, TsBlock
tsBlock);
+ public final boolean[] satisfyTsBlock(
+ boolean[] selection, TsBlock tsBlock, Consumer<Long> filterRowsRecorder)
{
Review Comment:
```suggestion
boolean[] selection, TsBlock tsBlock, LongConsumer filterRowsRecorder)
{
```
##########
java/tsfile/src/main/java/org/apache/tsfile/read/reader/page/PageReader.java:
##########
@@ -212,6 +213,11 @@ public BatchData getAllSatisfiedPageData(boolean
ascending) throws IOException {
@Override
public TsBlock getAllSatisfiedData() throws IOException {
+ return getAllSatisfiedData(null);
+ }
+
+ @Override
+ public TsBlock getAllSatisfiedData(Consumer<Long> filterRowsRecorder) throws
IOException {
Review Comment:
```suggestion
public TsBlock getAllSatisfiedData(LongConsumer filterRowsRecorder) throws
IOException {
```
##########
java/tsfile/src/main/java/org/apache/tsfile/read/reader/page/AbstractAlignedPageReader.java:
##########
@@ -215,6 +217,55 @@ public TsBlock getAllSatisfiedData() throws IOException {
unFilteredBlock, builder, pushDownFilter, paginationController);
}
+ /**
+ * get all satisfied data while record the number of filter rows. if one
tuple is not satisfy by
+ * the filter and is deleted at the same time, the tuple cannot be
considered as a filtered data.
+ */
+ @Override
+ public TsBlock getAllSatisfiedData(Consumer<Long> filterRowsRecorder) throws
IOException {
Review Comment:
```suggestion
public TsBlock getAllSatisfiedData(LongConsumer filterRowsRecorder) throws
IOException {
```
##########
java/tsfile/src/main/java/org/apache/tsfile/read/reader/IPageReader.java:
##########
@@ -39,6 +40,8 @@ default BatchData getAllSatisfiedPageData() throws
IOException {
TsBlock getAllSatisfiedData() throws IOException;
+ TsBlock getAllSatisfiedData(Consumer<Long> filterRowsRecorder) throws
IOException;
Review Comment:
```suggestion
TsBlock getAllSatisfiedData(LongConsumer filterRowsRecorder) throws
IOException;
```
##########
java/tsfile/src/main/java/org/apache/tsfile/read/filter/basic/Filter.java:
##########
@@ -117,6 +118,44 @@ public abstract class Filter {
*/
public abstract boolean[] satisfyTsBlock(boolean[] selection, TsBlock
tsBlock);
+ public final boolean[] satisfyTsBlock(
+ boolean[] selection, TsBlock tsBlock, Consumer<Long> filterRowsRecorder)
{
+
+ int inputCount = countSelectedRows(selection);
+ boolean[] result = satisfyTsBlock(selection, tsBlock);
+ int outputCount = countSelectedRows(result);
+ if (inputCount > outputCount) {
+ filterRowsRecorder.accept((long) (inputCount - outputCount));
+ }
+
+ return result;
+ }
+
+ private static int countSelectedRows(boolean[] selection) {
+ if (selection == null) return 0;
+ int count = 0;
+ int length = selection.length;
+ int i = 0;
+
+ // calculate multi times
+ for (; i < length - 7; i += 8) {
+ count +=
+ (selection[i] ? 1 : 0)
+ + (selection[i + 1] ? 1 : 0)
+ + (selection[i + 2] ? 1 : 0)
+ + (selection[i + 3] ? 1 : 0)
+ + (selection[i + 4] ? 1 : 0)
+ + (selection[i + 5] ? 1 : 0)
+ + (selection[i + 6] ? 1 : 0)
+ + (selection[i + 7] ? 1 : 0);
+ }
+ for (; i < length; i++) {
+ count += (selection[i] ? 1 : 0);
+ }
Review Comment:
On modern JVMs, a simple loop for (boolean b : selection) if (b) count++
would likely be auto-vectorized by the JIT compiler just as efficiently. The
manual unrolling adds code complexity without guaranteed benefit. Consider
simplifying, or at minimum add a comment explaining the performance rationale.
##########
java/tsfile/src/main/java/org/apache/tsfile/read/filter/basic/Filter.java:
##########
@@ -117,6 +118,44 @@ public abstract class Filter {
*/
public abstract boolean[] satisfyTsBlock(boolean[] selection, TsBlock
tsBlock);
+ public final boolean[] satisfyTsBlock(
+ boolean[] selection, TsBlock tsBlock, Consumer<Long> filterRowsRecorder)
{
+
+ int inputCount = countSelectedRows(selection);
+ boolean[] result = satisfyTsBlock(selection, tsBlock);
+ int outputCount = countSelectedRows(result);
+ if (inputCount > outputCount) {
+ filterRowsRecorder.accept((long) (inputCount - outputCount));
+ }
+
+ return result;
+ }
+
+ private static int countSelectedRows(boolean[] selection) {
+ if (selection == null) return 0;
+ int count = 0;
+ int length = selection.length;
+ int i = 0;
+
+ // calculate multi times
Review Comment:
The comment // calculate multi times (line 140) is unclear. Perhaps //
process 8 elements at a time for better throughput?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]