[jira] [Commented] (HIVE-7239) Fix bug in HiveIndexedInputFormat implementation that causes incorrect query result when input backed by Sequence/RC files

Hive QA (JIRA) Mon, 16 Jun 2014 21:49:23 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-7239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033432#comment-14033432
 ]


Hive QA commented on HIVE-7239:
-------------------------------



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12650622/HIVE-7239.patch

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 5611 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_columnar
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas
org.apache.hadoop.hive.ql.exec.tez.TestTezTask.testSubmit
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/487/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/487/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-487/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12650622

> Fix bug in HiveIndexedInputFormat implementation that causes incorrect query 
> result when input backed by Sequence/RC files
> --------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-7239
>                 URL: https://issues.apache.org/jira/browse/HIVE-7239
>             Project: Hive
>          Issue Type: Bug
>          Components: Indexing
>    Affects Versions: 0.13.1
>            Reporter: Sumit Kumar
>            Assignee: Sumit Kumar
>         Attachments: HIVE-7239.patch
>
>
> For sequence files, it is crucial that splits are calculated around the 
> boundaries enforced by the input sequence file. By default, however, Hadoop 
> creates input splits based on configuration parameters, which may not 
> match the boundaries of the input sequence file. Hive provides 
> HiveIndexedInputFormat, which adds extra logic and recalculates the 
> boundaries of each split according to the sequence file's boundaries.
> However, we noticed "over"-reporting of results from data backed by 
> sequence files. We experimented on a sample data set and fixed this 
> bug, and we verified the fix by comparing the query output for input in 
> sequence file format, RC file format, and regular format. However, we have 
> not been able to find the right place to include this as a unit test that 
> would execute as part of the Hive tests. We tried writing a "clientpositive" 
> test as part of the ql module, but the output seems quite verbose and I 
> couldn't interpret it well. Can someone please review this change and 
> advise on how to write a test that will execute as part of Hive testing?
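
To illustrate why split boundaries matter here, the following is a minimal, self-contained simulation. It is not the logic of the attached patch and does not use any Hadoop or Hive API. The block layout, the file length, the split size, and all function names are hypothetical and chosen only for illustration. The sketch assumes a file made of blocks that each start at a sync marker, and compares two ways of assigning blocks to byte-range splits: one where a split owns a block only if the block's sync marker falls inside the split, and a naive one where any overlapping block is counted.

```python
# Hypothetical file layout: each block starts at a sync marker and holds
# some records. Entries are (sync_offset_in_bytes, record_count).
BLOCKS = [(0, 3), (100, 2), (250, 4), (400, 1)]
FILE_LENGTH = 500


def make_splits(file_length, split_size):
    """Cut [0, file_length) into fixed-size byte ranges, ignoring sync markers."""
    return [(start, min(start + split_size, file_length))
            for start in range(0, file_length, split_size)]


def count_aligned(blocks, split):
    """A split owns a block only if the block's sync marker lies in [start, end)."""
    start, end = split
    return sum(n for offset, n in blocks if start <= offset < end)


def count_overlapping(blocks, file_length, split):
    """Naive variant: count every block that overlaps the split at all."""
    start, end = split
    total = 0
    for i, (offset, n) in enumerate(blocks):
        # A block ends where the next one begins, or at end of file.
        block_end = blocks[i + 1][0] if i + 1 < len(blocks) else file_length
        if offset < end and block_end > start:
            total += n
    return total


splits = make_splits(FILE_LENGTH, 180)
aligned_total = sum(count_aligned(BLOCKS, s) for s in splits)
naive_total = sum(count_overlapping(BLOCKS, FILE_LENGTH, s) for s in splits)
print("splits:", splits)
print("aligned:", aligned_total, "naive:", naive_total)
```

With this made-up layout the file holds 10 records. The aligned assignment counts 10 across all splits, while the naive one counts 16, because blocks that straddle a split boundary are read by both neighbouring splits. This is only one plausible way "over" reporting can arise, not a diagnosis of this issue.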



--
This message was sent by Atlassian JIRA
(v6.2#6252)
