[
https://issues.apache.org/jira/browse/HIVE-15355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vihang Karajgaonkar updated HIVE-15355:
---------------------------------------
Attachment: HIVE-15355.02.patch
Updating the patch with a simple test case that checks that the AclEntries within
HadoopFileStatus are unmodifiable. This guards against future code changes that
inadvertently try to modify the List<AclEntry> directly from setFullFileStatus.
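For illustration, here is a minimal sketch of the kind of check described above. It uses plain strings in place of AclEntry objects so that it runs without Hadoop on the classpath. The class FileStatusSketch and its accessor are hypothetical stand-ins, not the actual HadoopFileStatus code or the test in the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class UnmodifiableAclSketch {
    // Hypothetical stand-in for HadoopFileStatus; strings replace AclEntry.
    static final class FileStatusSketch {
        private final List<String> aclEntries;

        FileStatusSketch(List<String> entries) {
            // Copy first so later changes to the caller's list are not visible,
            // then wrap so callers cannot mutate the stored entries.
            this.aclEntries = Collections.unmodifiableList(new ArrayList<>(entries));
        }

        List<String> getAclEntries() {
            return aclEntries;
        }
    }

    // Returns true if an attempt to modify the exposed list is rejected.
    static boolean rejectsModification(FileStatusSketch status) {
        try {
            status.getAclEntries().add("user::rwx");
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        FileStatusSketch status =
            new FileStatusSketch(Arrays.asList("user:foo:rwx", "group::r-x"));
        System.out.println(rejectsModification(status));
    }
}
```

With a wrapper like this, a caller that tries to edit the list in place fails immediately with UnsupportedOperationException instead of silently racing with other threads.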
I also checked the test failures above. They seem to be unrelated, since they
also fail on master without the patch, and some other recent JIRAs show the
same test failures.
[~spena] [~rajesh.balamohan] Can you please review the patch? Thanks.
> Concurrency issues during parallel moveFile due to HDFSUtils.setFullFileStatus
> ------------------------------------------------------------------------------
>
> Key: HIVE-15355
> URL: https://issues.apache.org/jira/browse/HIVE-15355
> Project: Hive
> Issue Type: Bug
> Affects Versions: 2.1.0, 2.2.0
> Reporter: Vihang Karajgaonkar
> Assignee: Vihang Karajgaonkar
> Attachments: HIVE-15355.01.patch, HIVE-15355.02.patch
>
>
> Concurrency issues can occur during the multi-threaded moveFile calls issued
> when processing queries like {{INSERT OVERWRITE TABLE ... SELECT ..}}
> when the staging directory contains multiple files and is a
> subdirectory of the target directory. The issue is hard to reproduce, but the
> following stack trace is one example:
> {noformat}
> INFO : Loading data to table functional_text_gzip.alltypesaggmultifilesnopart from hdfs://localhost:20500/test-warehouse/alltypesaggmultifilesnopart_text_gzip/.hive-staging_hive_2016-12-01_19-58-21_712_8968735301422943318-1/-ext-10000
> ERROR : Failed with exception java.lang.ArrayIndexOutOfBoundsException
> org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException
> at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2858)
> at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:3124)
> at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1701)
> at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:313)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1976)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1689)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1421)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1205)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1200)
> at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237)
> at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88)
> at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796)
> Getting log thread is interrupted, since query is done!
> at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ArrayIndexOutOfBoundsException
> at java.lang.System.arraycopy(Native Method)
> at java.util.ArrayList.removeRange(ArrayList.java:616)
> at java.util.ArrayList$SubList.removeRange(ArrayList.java:1021)
> at java.util.AbstractList.clear(AbstractList.java:234)
> at com.google.common.collect.Iterables.removeIfFromRandomAccessList(Iterables.java:213)
> at com.google.common.collect.Iterables.removeIf(Iterables.java:184)
> at org.apache.hadoop.hive.shims.Hadoop23Shims.removeBaseAclEntries(Hadoop23Shims.java:865)
> at org.apache.hadoop.hive.shims.Hadoop23Shims.setFullFileStatus(Hadoop23Shims.java:757)
> at org.apache.hadoop.hive.ql.metadata.Hive$3.call(Hive.java:2835)
> at org.apache.hadoop.hive.ql.metadata.Hive$3.call(Hive.java:2828)
> ... 4 more
> ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
> {noformat}
> A quick online search also shows other instances, such as the one mentioned in
> http://stackoverflow.com/questions/38900333/get-concurrentmodificationexception-in-step-2-create-intermediate-flat-hive-tab
> The issue seems to come from the code below:
> {code}
> if (aclEnabled) {
>   aclStatus = sourceStatus.getAclStatus();
>   if (aclStatus != null) {
>     LOG.trace(aclStatus.toString());
>     aclEntries = aclStatus.getEntries();
>     removeBaseAclEntries(aclEntries);
>     //the ACL api's also expect the tradition user/group/other permission in the form of ACL
>     aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.USER, sourcePerm.getUserAction()));
>     aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.GROUP, sourcePerm.getGroupAction()));
>     aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.OTHER, sourcePerm.getOtherAction()));
>   }
> }
> {code}
> removeBaseAclEntries removes objects from {{List<AclEntry> aclEntries}}. When
> the HDFSUtils.setFullFileStatus() method is called from multiple threads, as in
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2835
> several threads can try to modify the same {{List<AclEntry> aclEntries}} at
> once, leading to concurrency issues.
> We should either move that block into a thread-safe region or call
> setFullFileStatus when all the threads converge.
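As a rough sketch of one way to keep that block thread-safe, each call can work on its own copy of the ACL entries so that the shared list is only ever read. This is an illustration under assumptions, not necessarily the approach taken in the attached patch: strings stand in for AclEntry, the predicate is a simplified stand-in for removeBaseAclEntries, the added permission values are placeholders, and the names entriesForTarget and runInParallel are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AclCopySketch {
    // Hypothetical helper: builds the entries to apply to one target file.
    // It mutates only a private copy, so the shared source list is just read.
    static List<String> entriesForTarget(List<String> sharedSourceEntries) {
        List<String> copy = new ArrayList<>(sharedSourceEntries);
        // Simplified stand-in for removeBaseAclEntries: drop entries without a name.
        copy.removeIf(e -> e.contains("::"));
        // Placeholder values standing in for the source's user/group/other permissions.
        copy.add("user::rwx");
        copy.add("group::r-x");
        copy.add("other::r-x");
        return copy;
    }

    // Runs the helper from several threads against the same shared list.
    static List<List<String>> runInParallel(List<String> shared, int tasks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                Callable<List<String>> task = () -> entriesForTarget(shared);
                futures.add(pool.submit(task));
            }
            List<List<String>> results = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Unmodifiable wrapper makes any accidental in-place edit fail fast.
        List<String> shared = Collections.unmodifiableList(
            Arrays.asList("user::rwx", "user:foo:rw-", "group::r-x", "other::r--"));
        for (List<String> result : runInParallel(shared, 8)) {
            System.out.println(result);
        }
    }
}
```

The copy costs one small allocation per file, which avoids locking around the whole block.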