[
https://issues.apache.org/jira/browse/HIVE-17963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16238882#comment-16238882
]
Hive QA commented on HIVE-17963:
--------------------------------
Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12895954/HIVE-17963.2.patch
{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 19 failed/errored test(s), 11354 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=62)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=156)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc] (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=111)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=206)
org.apache.hadoop.hive.llap.cache.TestBuddyAllocatorForceEvict.testMtt (batchId=295)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testAmPoolInteractions (batchId=281)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testApplyPlanQpChanges (batchId=281)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testApplyPlanUserMapping (batchId=281)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testAsyncSessionInitFailures (batchId=281)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testClusterFractions (batchId=281)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testDestroyAndReturn (batchId=281)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testQueueing (batchId=281)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testReopen (batchId=281)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testReuse (batchId=281)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testReuseWithDifferentPool (batchId=281)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testReuseWithQueueing (batchId=281)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=223)
{noformat}
Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7632/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7632/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7632/
Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 19 tests failed
{noformat}
This message is automatically generated.
ATTACHMENT ID: 12895954 - PreCommit-HIVE-Build
> Fix for HIVE-17113 can be improved for non-blobstore filesystems
> ----------------------------------------------------------------
>
> Key: HIVE-17963
> URL: https://issues.apache.org/jira/browse/HIVE-17963
> Project: Hive
> Issue Type: Bug
> Reporter: Jason Dere
> Assignee: Jason Dere
> Priority: Major
> Attachments: HIVE-17963.1.patch, HIVE-17963.2.patch
>
>
> HIVE-17113/HIVE-17813 fix the duplicate file issue by performing file moves
> on a file-by-file basis. For non-blobstore filesystems this results in many
> more filesystem/namenode operations compared to the previous
> Utilities.mvFileToFinalPath() behavior (dedup files in src dir, rename src
> dir to final dir).
> For non-blobstore filesystems, a better solution would be the one described
> [here|https://issues.apache.org/jira/browse/HIVE-17113?focusedCommentId=16100564&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16100564]:
> 1) Move the temp directory to a new directory name, to prevent additional
> files from being added by any runaway processes.
> 2) Run removeTempOrDuplicateFiles() on this renamed temp directory.
> 3) Run renameOrMoveFiles() to move the renamed temp directory to the final
> location.
> This results in only one additional file operation in non-blobstore FSes
> compared to the original Utilities.mvFileToFinalPath() behavior.
> The proposal is to remove the config setting
> hive.exec.move.files.from.source.dir and always use behavior that
> handles the duplicate file issue described in HIVE-17113. For
> non-blobstore filesystems we will do steps 1-3 described above. For blobstore
> filesystems we will keep the HIVE-17113/HIVE-17813 solution, which does
> the file-by-file copy. This should need the same number of file operations
> as a directory rename on a blobstore, which effectively results in file
> moves on a file-by-file basis.
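The three steps proposed for non-blobstore filesystems can be illustrated with a minimal sketch. This is not the patch's code: it uses java.nio.file on a local filesystem instead of the Hadoop FileSystem API, the class and method names (MoveToFinalPathSketch, isolate, removeDuplicates, commit, demo) are hypothetical, and the dedup rule (keep the largest file among those sharing the prefix before the last underscore) is a simplifying assumption standing in for removeTempOrDuplicateFiles(). It also assumes the final directory does not yet exist.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MoveToFinalPathSketch {

    // Step 1: rename the temp directory so that runaway writers still
    // targeting the old name can no longer add files to it.
    static Path isolate(Path tmpDir) throws IOException {
        Path renamed = tmpDir.resolveSibling(tmpDir.getFileName() + ".moved");
        return Files.move(tmpDir, renamed, StandardCopyOption.ATOMIC_MOVE);
    }

    // Step 2 (simplified assumption): among files that share the prefix
    // before the last '_', keep the largest one and delete the others.
    static void removeDuplicates(Path dir) throws IOException {
        List<Path> files;
        try (Stream<Path> listing = Files.list(dir)) {
            files = listing.collect(Collectors.toList());
        }
        Map<String, Path> keep = new HashMap<>();
        for (Path f : files) {
            String name = f.getFileName().toString();
            int idx = name.lastIndexOf('_');
            String taskId = idx < 0 ? name : name.substring(0, idx);
            Path best = keep.get(taskId);
            if (best == null) {
                keep.put(taskId, f);
            } else if (Files.size(f) > Files.size(best)) {
                Files.delete(best);
                keep.put(taskId, f);
            } else {
                Files.delete(f);
            }
        }
    }

    // Steps 1-3 together; step 3 is a single directory rename.
    static Path commit(Path tmpDir, Path finalDir) throws IOException {
        Path isolated = isolate(tmpDir);
        removeDuplicates(isolated);
        return Files.move(isolated, finalDir, StandardCopyOption.ATOMIC_MOVE);
    }

    // Small demonstration: two attempts of task 000000 and one of task 000001.
    static List<String> demo() {
        try {
            Path root = Files.createTempDirectory("sketch");
            Path tmp = Files.createDirectory(root.resolve("_tmp.out"));
            Files.write(tmp.resolve("000000_0"), new byte[] {1});
            Files.write(tmp.resolve("000000_1"), new byte[] {1, 2});
            Files.write(tmp.resolve("000001_0"), new byte[] {1});
            Path fin = commit(tmp, root.resolve("out"));
            try (Stream<Path> s = Files.list(fin)) {
                return s.map(p -> p.getFileName().toString())
                        .sorted()
                        .collect(Collectors.toList());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the only operation added relative to "dedup in place, then rename" is the first rename, which matches the description's count of one additional file operation.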
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)