[
https://issues.apache.org/jira/browse/HIVE-21407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807861#comment-16807861
]
Hive QA commented on HIVE-21407:
--------------------------------
Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12964539/HIVE-21407.patch
{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15870 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.metastore.TestGetPartitionsUsingProjectionAndFilterSpecs.org.apache.hadoop.hive.metastore.TestGetPartitionsUsingProjectionAndFilterSpecs (batchId=222)
{noformat}
Test results:
https://builds.apache.org/job/PreCommit-HIVE-Build/16822/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16822/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16822/
Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}
This message is automatically generated.
ATTACHMENT ID: 12964539 - PreCommit-HIVE-Build
> Parquet predicate pushdown is not working correctly for char column types
> -------------------------------------------------------------------------
>
> Key: HIVE-21407
> URL: https://issues.apache.org/jira/browse/HIVE-21407
> Project: Hive
> Issue Type: Bug
> Affects Versions: 4.0.0
> Reporter: Marta Kuczora
> Assignee: Marta Kuczora
> Priority: Major
> Attachments: HIVE-21407.patch
>
>
> If the 'hive.optimize.index.filter' parameter is false, the filter predicate
> is not pushed to Parquet, so the filtering only happens within Hive. If the
> parameter is true, the filter is pushed to Parquet, but for a char type, the
> value which is pushed to Parquet will be padded with spaces:
> {noformat}
> @Override
> public void setValue(String val, int len) {
>   super.setValue(HiveBaseChar.getPaddedValue(val, len), -1);
> }
> {noformat}
> So if we have a char(10) column which contains the value "apple" and the
> where condition is 'where c='apple'', the value pushed to Parquet will
> be 'apple' followed by 5 spaces. But the stored values are not padded, so no
> rows will be returned from Parquet.
> How to reproduce:
> {noformat}
> $ create table ppd (c char(10), v varchar(10), i int) stored as parquet;
> $ insert into ppd values ('apple', 'bee', 1),('apple', 'tree', 2),('hello', 'world', 1),('hello','vilag',3);
> $ set hive.optimize.ppd.storage=true;
> $ set hive.vectorized.execution.enabled=true;
> $ set hive.vectorized.execution.enabled=false;
> $ set hive.optimize.ppd=true;
> $ set hive.optimize.index.filter=true;
> $ set hive.parquet.timestamp.skip.conversion=false;
> $ select * from ppd where c='apple';
> +--------+--------+--------+
> | ppd.c | ppd.v | ppd.i |
> +--------+--------+--------+
> +--------+--------+--------+
> $ set hive.optimize.index.filter=false; (or: set hive.optimize.ppd.storage=false;)
> $ select * from ppd where c='apple';
> +-------------+--------+--------+
> | ppd.c | ppd.v | ppd.i |
> +-------------+--------+--------+
> | apple | bee | 1 |
> | apple | tree | 2 |
> +-------------+--------+--------+
> {noformat}
> The issue surfaced after the fix for
> [HIVE-21327|https://issues.apache.org/jira/browse/HIVE-21327] was uploaded
> upstream. Before the HIVE-21327 fix, setting the parameter
> 'hive.parquet.timestamp.skip.conversion' to true in the parquet_ppd_char.q
> test hid this issue.
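The mismatch described in the issue can be sketched in a few lines. The `getPaddedValue` helper below is a simplified stand-in for `HiveBaseChar.getPaddedValue` (right-padding with spaces to the declared char(n) length), not the actual Hive implementation; `CharPaddingDemo` and its variable names are illustrative only:

```java
// Minimal sketch of why the pushed-down char predicate matches no rows.
public class CharPaddingDemo {

    // Hypothetical stand-in for HiveBaseChar.getPaddedValue:
    // right-pads the value with spaces up to the char(n) length.
    static String getPaddedValue(String val, int len) {
        StringBuilder sb = new StringBuilder(val);
        while (sb.length() < len) {
            sb.append(' ');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Parquet stores the unpadded value...
        String stored = "apple";
        // ...but the predicate literal pushed down for char(10) is padded.
        String predicateLiteral = getPaddedValue("apple", 10);
        // The padded literal never equals the unpadded stored value,
        // so Parquet's row-group filtering returns no rows.
        System.out.println(stored.equals(predicateLiteral)); // prints "false"
    }
}
```

With 'hive.optimize.index.filter' set to false the comparison happens inside Hive, which normalizes the char padding, so the rows are found.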
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)