wgtmac commented on code in PR #1328:
URL: https://github.com/apache/parquet-mr/pull/1328#discussion_r1578917312
##########
parquet-column/src/main/java/org/apache/parquet/filter2/predicate/FilterApi.java:
##########
@@ -257,6 +266,16 @@ public static <T extends Comparable<T>, C extends
Column<T> & SupportsEqNotEq> N
return new NotIn<>(column, values);
}
+ public static <T extends Comparable<T>, C extends Column<T> &
SupportsContains> Contains<T> contains(
Review Comment:
BTW, now these new functions only accept single value. Does it make sense to
support variable number of values? I also see some databases support CONTAINS(a
OR b) and CONTAINS(a AND b). They can be composited by logical operators but
the cost would be high compared to support them natively,
##########
parquet-hadoop/src/main/java/org/apache/parquet/filter2/bloomfilterlevel/BloomFilterImpl.java:
##########
@@ -119,6 +119,16 @@ public <T extends Comparable<T>> Boolean
visit(Operators.GtEq<T> gtEq) {
return BLOCK_MIGHT_MATCH;
}
+ @Override
+ public <T extends Comparable<T>> Boolean visit(Operators.Contains<T>
contains) {
+ return BLOCK_MIGHT_MATCH;
Review Comment:
BloomFilters are created for primitive columns when dictionary encoding is
disabled. Would you try that again with dictionary encoding disabled?
##########
parquet-column/src/main/java/org/apache/parquet/filter2/predicate/FilterApi.java:
##########
@@ -257,6 +266,16 @@ public static <T extends Comparable<T>, C extends
Column<T> & SupportsEqNotEq> N
return new NotIn<>(column, values);
}
+ public static <T extends Comparable<T>, C extends Column<T> &
SupportsContains> Contains<T> contains(
Review Comment:
It seems that Contains/DoesNotContain is not a standard SQL function. In
which case they are used?
##########
parquet-column/src/main/java/org/apache/parquet/filter2/predicate/FilterApi.java:
##########
@@ -91,6 +95,11 @@ public static BinaryColumn binaryColumn(String columnPath) {
return new BinaryColumn(ColumnPath.fromDotString(columnPath));
}
+ public static <T extends Comparable<T>, C extends Column<T> &
SupportsEqNotEq> ArrayColumn<T> arrayColumn(
Review Comment:
Is it possible to reuse this for map key or value column as well?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]