[
https://issues.apache.org/jira/browse/DRILL-6331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451893#comment-16451893
]
ASF GitHub Bot commented on DRILL-6331:
---------------------------------------
Github user arina-ielchiieva commented on a diff in the pull request:
https://github.com/apache/drill/pull/1214#discussion_r183980928
--- Diff:
contrib/storage-hive/core/src/test/java/org/apache/drill/exec/TestHiveDrillNativeParquetReader.java
---
@@ -0,0 +1,247 @@
+/*
+* Licensed to the Apache Software Foundation (ASF) under one or more
+* contributor license agreements. See the NOTICE file distributed with
+* this work for additional information regarding copyright ownership.
+* The ASF licenses this file to you under the Apache License, Version 2.0
+* (the "License"); you may not use this file except in compliance with
+* the License. You may obtain a copy of the License at
+*
+* http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
+* limitations under the License.
+*/
+package org.apache.drill.exec;
+
+import org.apache.drill.PlanTestBase;
+import org.apache.drill.categories.HiveStorageTest;
+import org.apache.drill.categories.SlowTest;
+import org.apache.drill.common.exceptions.UserRemoteException;
+import org.apache.drill.exec.hive.HiveTestBase;
+import org.apache.drill.exec.planner.physical.PlannerSettings;
+import org.hamcrest.CoreMatchers;
+import org.joda.time.DateTime;
+import org.junit.AfterClass;
+import org.junit.BeforeClass;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.junit.rules.ExpectedException;
+
+import java.math.BigDecimal;
+import java.sql.Date;
+import java.sql.Timestamp;
+
+import static org.hamcrest.CoreMatchers.containsString;
+import static org.junit.Assert.assertEquals;
+
+@Category({SlowTest.class, HiveStorageTest.class})
+public class TestHiveDrillNativeParquetReader extends HiveTestBase {
+
+ @BeforeClass
+ public static void init() {
+ setSessionOption(ExecConstants.HIVE_OPTIMIZE_SCAN_WITH_NATIVE_READERS, true);
+ setSessionOption(PlannerSettings.ENABLE_DECIMAL_DATA_TYPE_KEY, true);
+ }
+
+ @AfterClass
+ public static void cleanup() {
+ resetSessionOption(ExecConstants.HIVE_OPTIMIZE_SCAN_WITH_NATIVE_READERS);
+ resetSessionOption(PlannerSettings.ENABLE_DECIMAL_DATA_TYPE_KEY);
+ }
+
+ @Rule
+ public ExpectedException thrown = ExpectedException.none();
+
+ @Test
+ public void testFilterPushDownForManagedTable() throws Exception {
+ String query = "select * from hive.kv_native where key > 1";
+
+ int actualRowCount = testSql(query);
+ assertEquals("Expected and actual row count should match", 2, actualRowCount);
+
+ testPlanMatchingPatterns(query,
+ new String[]{"HiveDrillNativeParquetScan", "numFiles=1"}, new String[]{});
+ }
+
+ @Test
+ public void testFilterPushDownForExternalTable() throws Exception {
+ String query = "select * from hive.kv_native_ext where key = 1";
+
+ int actualRowCount = testSql(query);
+ assertEquals("Expected and actual row count should match", 1, actualRowCount);
+
+ testPlanMatchingPatterns(query,
+ new String[]{"HiveDrillNativeParquetScan", "numFiles=1"}, new String[]{});
--- End diff --
Agree, replaced with null. Not adding this method to avoid merge conflicts.
> Parquet filter pushdown does not support the native hive reader
> ---------------------------------------------------------------
>
> Key: DRILL-6331
> URL: https://issues.apache.org/jira/browse/DRILL-6331
> Project: Apache Drill
> Issue Type: Improvement
> Components: Storage - Hive
> Affects Versions: 1.13.0
> Reporter: Arina Ielchiieva
> Assignee: Arina Ielchiieva
> Priority: Major
> Fix For: 1.14.0
>
>
> Initially HiveDrillNativeParquetGroupScan was based mainly on HiveScan; the
> core difference between them was that HiveDrillNativeParquetScanBatchCreator
> created a ParquetRecordReader instead of a HiveReader.
> This allowed Hive parquet files to be read with Drill's native parquet reader,
> but it did not expose Hive data to Drill optimizations such as filter push
> down, limit push down, and the count-to-direct-scan optimization.
> The Hive code had to be refactored to use the same interfaces as
> ParquetGroupScan in order to be exposed to such optimizations.
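As background for the optimization named above, a minimal sketch of the filter push down idea (the class and method names are illustrative only, not Drill APIs): when the predicate is evaluated inside the scan itself, rows that fail it are never materialized and never reach downstream operators.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

public class PushdownSketch {

    // Without pushdown: the scan materializes every row and any
    // filtering happens in a separate operator afterwards.
    static List<Integer> scanAll(int[] rows) {
        List<Integer> out = new ArrayList<>();
        for (int r : rows) {
            out.add(r);
        }
        return out;
    }

    // With pushdown: the predicate is evaluated inside the scan,
    // so non-matching rows are never materialized at all.
    static List<Integer> scanWithFilter(int[] rows, IntPredicate pred) {
        List<Integer> out = new ArrayList<>();
        for (int r : rows) {
            if (pred.test(r)) {
                out.add(r);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] keys = {1, 2, 3};
        // Mirrors the test query "select * from hive.kv_native where key > 1".
        System.out.println(scanWithFilter(keys, k -> k > 1)); // prints [2, 3]
    }
}
```

The tests in the diff check for this at the plan level: `testPlanMatchingPatterns` asserts that the filtered query is planned as a `HiveDrillNativeParquetScan` reading a reduced file set (`numFiles=1`), rather than scanning everything and filtering later.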
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)