[
https://issues.apache.org/jira/browse/PARQUET-1968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17420409#comment-17420409
]
ASF GitHub Bot commented on PARQUET-1968:
-----------------------------------------
huaxingao commented on a change in pull request #923:
URL: https://github.com/apache/parquet-mr/pull/923#discussion_r716278313
##########
File path:
parquet-hadoop/src/test/java/org/apache/parquet/filter2/recordlevel/TestRecordLevelFilters.java
##########
@@ -146,6 +147,33 @@ public void testAllFilter() throws Exception {
assertEquals(new ArrayList<Group>(), found);
}
+ @Test
+ public void testInFilter() throws Exception {
+ BinaryColumn name = binaryColumn("name");
+
+ HashSet<Binary> nameSet = new HashSet<>();
+ nameSet.add(Binary.fromString("thing2"));
+ nameSet.add(Binary.fromString("thing1"));
+ for (int i = 100; i < 200; i++) {
+ nameSet.add(Binary.fromString("p" + i));
+ }
+ FilterPredicate pred = in(name, nameSet);
+ List<Group> found = PhoneBookWriter.readFile(phonebookFile, FilterCompat.get(pred));
+
+ List<String> expectedNames = new ArrayList<>();
+ expectedNames.add("thing1");
+ expectedNames.add("thing2");
+ for (int i = 100; i < 200; i++) {
+ expectedNames.add("p" + i);
+ }
+
+ assertEquals(expectedNames.get(0), ((Group)(found.get(0))).getString("name", 0));
+ assertEquals(expectedNames.get(1), ((Group)(found.get(1))).getString("name", 0));
+ for (int i = 2; i < 102; i++) {
+ assertEquals(expectedNames.get(i), ((Group)(found.get(i))).getString("name", 0));
+ }
Review comment:
I added `assert(found.size() == 102)`. The test already checks that `found`
contains `"thing1"`, `"thing2"`, and `"p100"` through `"p199"` at positions 0
to 101, so I think the size assertion is sufficient to confirm that `found`
contains nothing else.
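
The reasoning above can be sketched without any Parquet classes: if every
position from 0 to 101 matches the expected name and the list has exactly 102
entries, there is no room for an extra record. The class and method names
below (`InFilterAssertionSketch`, `expectedNames`, `matchesExactly`) are
hypothetical and not part of the pull request, and the sketch assumes the
records come back in the same order as `expectedNames`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone illustration; plain strings stand in for the
// "name" field of each Group returned by the filtered read.
public class InFilterAssertionSketch {

  // Builds the expected names in the order the test assumes:
  // thing1, thing2, then p100 through p199 (102 entries in total).
  static List<String> expectedNames() {
    List<String> names = new ArrayList<>();
    names.add("thing1");
    names.add("thing2");
    for (int i = 100; i < 200; i++) {
      names.add("p" + i);
    }
    return names;
  }

  // Returns true only when `found` has the same size as `expected` and the
  // same element at every position. The size check is what rules out
  // additional, unexpected records after the last expected one.
  static boolean matchesExactly(List<String> expected, List<String> found) {
    if (found.size() != expected.size()) {
      return false;
    }
    for (int i = 0; i < expected.size(); i++) {
      if (!expected.get(i).equals(found.get(i))) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<String> expected = expectedNames();

    // A result identical to the expectation passes.
    List<String> exact = new ArrayList<>(expected);
    System.out.println("exact result matches: " + matchesExactly(expected, exact));

    // A result with one extra trailing record passes every positional
    // check but fails the size check.
    List<String> withExtra = new ArrayList<>(expected);
    withExtra.add("unexpected");
    System.out.println("result with extra record matches: "
        + matchesExactly(expected, withExtra));
  }
}
```

A positional check alone would accept the second list, which is why the size
assertion is needed alongside it.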
> FilterApi support In predicate
> ------------------------------
>
> Key: PARQUET-1968
> URL: https://issues.apache.org/jira/browse/PARQUET-1968
> Project: Parquet
> Issue Type: Improvement
> Components: parquet-mr
> Affects Versions: 1.12.0
> Reporter: Yuming Wang
> Priority: Major
>
> FilterApi should support a native In predicate.
> Spark:
> https://github.com/apache/spark/blob/d6a68e0b67ff7de58073c176dd097070e88ac831/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala#L600-L605
> Impala:
> https://issues.apache.org/jira/browse/IMPALA-3654