Re: Review Request 36942: HIVE-11401: Predicate push down does not work with Parquet when partitions are in the expression

Reuben Kuhnert Thu, 30 Jul 2015 09:30:54 -0700

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36942/#review93587
-----------------------------------------------------------




ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetFilterPredicateConverter.java
 (line 54)
<https://reviews.apache.org/r/36942/#comment147977>

    If the goal here is to get just the top-level fields, can we do something 
like:
    
    ```
    for (Type field : schema.getFields()) {
      columns.add(field.getName());
    }
    ``` 
    
    This might be a little bit clearer.



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetFilterPredicateConverter.java
 (line 64)
<https://reviews.apache.org/r/36942/#comment147969>

    Minor nit: since we have the opportunity to fix it, can we change 'leafs' 
to 'leaves'?



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetFilterPredicateConverter.java
 (line 102)
<https://reviews.apache.org/r/36942/#comment147978>

    List<T>.contains() is an O(N) lookup. Can we store this in a Set<T> 
(expected O(1) for a HashSet<T>) instead?
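
    To illustrate the lookup-cost point, here is a minimal, self-contained 
sketch. The class name ColumnLookupSketch, the method keepKnownColumns, and 
the sample column names are hypothetical and not taken from the patch; the 
only assumption is that the schema's column names are held in a 
java.util.HashSet.

    ```java
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    // Hypothetical illustration only; none of these names come from the patch.
    public class ColumnLookupSketch {

      // Keeps only the candidate names that are present in schemaColumns.
      // HashSet.contains is expected O(1), so the loop is O(M) for M candidates,
      // versus O(N * M) if schemaColumns were a List with N entries.
      public static List<String> keepKnownColumns(Set<String> schemaColumns,
                                                  List<String> candidates) {
        List<String> kept = new ArrayList<>();
        for (String name : candidates) {
          if (schemaColumns.contains(name)) {
            kept.add(name);
          }
        }
        return kept;
      }

      public static void main(String[] args) {
        Set<String> schemaColumns =
            new HashSet<>(Arrays.asList("id", "name", "price"));
        // "part_date" stands in for a partition column missing from the schema.
        List<String> referenced = Arrays.asList("id", "part_date", "price");
        System.out.println(keepKnownColumns(schemaColumns, referenced));
      }
    }
    ```

    The result keeps the order of the candidate list, so switching the 
container to a Set should not change which columns survive, only how quickly 
each membership check runs.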


- Reuben Kuhnert


On July 30, 2015, 3:43 p.m., Sergio Pena wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/36942/
> -----------------------------------------------------------
> 
> (Updated July 30, 2015, 3:43 p.m.)
> 
> 
> Review request for hive, Aihua Xu, cheng xu, Dong Chen, and Szehon Ho.
> 
> 
> Bugs: HIVE-11401
>     https://issues.apache.org/jira/browse/HIVE-11401
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> The following patch inspects the predicate created by Hive and removes any 
> column that does not belong to the Parquet schema, such as partition 
> columns. This way Parquet can apply the filter correctly.
> 
> 
> Diffs
> -----
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetFilterPredicateConverter.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java
>  49e52da2e26fd7213df1db88716eaee94cb536b8 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetRecordReaderWrapper.java
>  87dd344534f09c7fc565fdc467ac82a51f37ebba 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/read/TestParquetFilterPredicate.java
>  PRE-CREATION 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestConvertAstToSearchArg.java 
> 85e952fb6855a2a03902ed971f54191837b32dac 
>   ql/src/test/queries/clientpositive/parquet_predicate_pushdown.q 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/parquet_predicate_pushdown.q.out 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/36942/diff/
> 
> 
> Testing
> -------
> 
> Unit tests: TestParquetFilterPredicate.java
> Integration tests: parquet_predicate_pushdown.q
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>
