[
https://issues.apache.org/jira/browse/HIVE-21924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16939905#comment-16939905
]
Hive QA commented on HIVE-21924:
--------------------------------
Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12981595/HIVE-21924.2.patch
{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 17011 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[skiphf_aggr] (batchId=179)
{noformat}
Test results:
https://builds.apache.org/job/PreCommit-HIVE-Build/18765/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18765/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18765/
Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}
This message is automatically generated.
ATTACHMENT ID: 12981595 - PreCommit-HIVE-Build
> Split text files even if header/footer exists
> ---------------------------------------------
>
> Key: HIVE-21924
> URL: https://issues.apache.org/jira/browse/HIVE-21924
> Project: Hive
> Issue Type: Improvement
> Components: File Formats
> Affects Versions: 2.4.0, 4.0.0, 3.2.0
> Reporter: Prasanth Jayachandran
> Assignee: Mustafa Iman
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-21924.2.patch, HIVE-21924.patch
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> https://github.com/apache/hive/blob/967a1cc98beede8e6568ce750ebeb6e0d048b8ea/ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java#L494-L503
>
> {code}
> int headerCount = 0;
> int footerCount = 0;
> if (table != null) {
>   headerCount = Utilities.getHeaderCount(table);
>   footerCount = Utilities.getFooterCount(table, conf);
>   if (headerCount != 0 || footerCount != 0) {
>     // Input file has header or footer, cannot be splitted.
>     HiveConf.setLongVar(conf, ConfVars.MAPREDMINSPLITSIZE, Long.MAX_VALUE);
>   }
> }
> {code}
> This piece of code makes CSV files (or any text files with a header/footer)
> unsplittable whenever a header or footer is present.
> If only a header is present, we can find the offset after the first line break
> and use that to split. Similarly, for a footer, we could read a few KB of data
> at the end and find the last line break offset, then use that to determine the
> data range that can be split. A few reads during split generation are
> cheaper than not splitting the file at all.
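
The offset search proposed above could look roughly like the following. This is a minimal sketch, not Hive's implementation: the class `SplitRangeSketch` and its methods are hypothetical names, the file is modeled as an in-memory byte array rather than a stream read from the file system, and lines are assumed to end with `\n` only.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; none of these names exist in Hive.
final class SplitRangeSketch {

    /** Returns the offset just past the first headerCount line breaks. */
    static int dataStart(byte[] content, int headerCount) {
        int pos = 0;
        int skipped = 0;
        while (skipped < headerCount && pos < content.length) {
            if (content[pos] == '\n') {
                skipped++;
            }
            pos++;
        }
        return pos;
    }

    /** Returns the exclusive end of the data: the offset where the last footerCount lines begin. */
    static int dataEnd(byte[] content, int footerCount) {
        if (footerCount == 0) {
            return content.length;
        }
        int pos = content.length;
        // A trailing newline terminates the last footer line; it does not start a new one.
        if (pos > 0 && content[pos - 1] == '\n') {
            pos--;
        }
        int seen = 0;
        while (pos > 0) {
            if (content[pos - 1] == '\n') {
                seen++;
                if (seen == footerCount) {
                    return pos; // first byte of the footer
                }
            }
            pos--;
        }
        return 0; // the file has no more lines than the footer
    }

    /** Returns {start, end} of the splittable range, empty if header and footer cover the file. */
    static int[] dataRange(byte[] content, int headerCount, int footerCount) {
        int start = dataStart(content, headerCount);
        int end = Math.max(start, dataEnd(content, footerCount));
        return new int[] {start, end};
    }

    public static void main(String[] args) {
        byte[] file = "id,name\n1,a\n2,b\ntotal=2\n".getBytes(StandardCharsets.UTF_8);
        int[] range = dataRange(file, 1, 1);
        System.out.println("splittable range: [" + range[0] + ", " + range[1] + ")");
        System.out.print(new String(file, range[0], range[1] - range[0], StandardCharsets.UTF_8));
    }
}
```

In a real input format the backward scan would presumably run over only the last few KB of the file, as the description suggests, and splits would then be generated over the resulting range instead of forcing the minimum split size to `Long.MAX_VALUE`.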
--
This message was sent by Atlassian Jira
(v8.3.4#803005)