[jira] [Updated] (HIVE-22001) AcidUtils.getAcidState() can fail if Cleaner is removing files at the same time
[ https://issues.apache.org/jira/browse/HIVE-22001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-22001: -- Resolution: Fixed Fix Version/s: 4.0.0 Status: Resolved (was: Patch Available) Committed to master, thanks for review [~ashutoshc] > AcidUtils.getAcidState() can fail if Cleaner is removing files at the same > time > --- > > Key: HIVE-22001 > URL: https://issues.apache.org/jira/browse/HIVE-22001 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Jason Dere >Assignee: Jason Dere >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-22001.1.patch > > > Had one user hit the following error during getSplits > {noformat} > 2019-07-06T14:33:03,067 ERROR [4640181a-3eb7-4b3e-9a40-d7a8de9a570c > HiveServer2-HttpHandler-Pool: Thread-415519]: SessionState > (SessionState.java:printError(1247)) - Vertex failed, vertexName=Map 1, > vertexId=vertex_1560947172646_2452_6199_00, diagnostics=[Vertex > vertex_1560947172646_2452_6199_00 [Map 1] killed/failed due > to:ROOT_INPUT_INIT_FAILURE, Vertex Input: hive_table initializer failed, > vertex=vertex_1560947172646_2452_6199_00 [Map 1], java.lang.RuntimeException: > ORC split generation failed with exception: java.io.FileNotFoundException: > File hdfs://path/to/hive_table/oiddatemmdd=20190706/delta_0987070_0987070 > does not exist. > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1870) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1958) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:524) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:779) > at > org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:243) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253) > at > com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108) > at > com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41) > at > com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.util.concurrent.ExecutionException: > java.io.FileNotFoundException: File > hdfs://path/to/hive_table/oiddatemmdd=20190706/delta_0987070_0987070 does > not exist. > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1809) > ... 17 more > Caused by: java.io.FileNotFoundException: File > hdfs://path/to/hive_table/oiddatemmdd=20190706/delta_0987070_0987070 does > not exist. > at > org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1059) > at > org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131) > at > org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1119) > at > org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1116) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1126) > at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1868) > at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1953) > at >
[jira] [Updated] (HIVE-22001) AcidUtils.getAcidState() can fail if Cleaner is removing files at the same time
[ https://issues.apache.org/jira/browse/HIVE-22001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-22001: -- Attachment: HIVE-22001.1.patch > AcidUtils.getAcidState() can fail if Cleaner is removing files at the same > time > --- > > Key: HIVE-22001 > URL: https://issues.apache.org/jira/browse/HIVE-22001 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Jason Dere >Assignee: Jason Dere >Priority: Major > Attachments: HIVE-22001.1.patch > > > Had one user hit the following error during getSplits > {noformat} > 2019-07-06T14:33:03,067 ERROR [4640181a-3eb7-4b3e-9a40-d7a8de9a570c > HiveServer2-HttpHandler-Pool: Thread-415519]: SessionState > (SessionState.java:printError(1247)) - Vertex failed, vertexName=Map 1, > vertexId=vertex_1560947172646_2452_6199_00, diagnostics=[Vertex > vertex_1560947172646_2452_6199_00 [Map 1] killed/failed due > to:ROOT_INPUT_INIT_FAILURE, Vertex Input: hive_table initializer failed, > vertex=vertex_1560947172646_2452_6199_00 [Map 1], java.lang.RuntimeException: > ORC split generation failed with exception: java.io.FileNotFoundException: > File hdfs://path/to/hive_table/oiddatemmdd=20190706/delta_0987070_0987070 > does not exist. > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1870) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1958) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:524) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:779) > at > org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:243) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253) > at > com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108) > at > com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41) > at > com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.util.concurrent.ExecutionException: > java.io.FileNotFoundException: File > hdfs://path/to/hive_table/oiddatemmdd=20190706/delta_0987070_0987070 does > not exist. > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1809) > ... 17 more > Caused by: java.io.FileNotFoundException: File > hdfs://path/to/hive_table/oiddatemmdd=20190706/delta_0987070_0987070 does > not exist. > at > org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1059) > at > org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131) > at > org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1119) > at > org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1116) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1126) > at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1868) > at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1953) > at > org.apache.hadoop.hive.ql.io.AcidUtils$MetaDataFile.chooseFile(AcidUtils.java:1903) > at >
[jira] [Updated] (HIVE-22001) AcidUtils.getAcidState() can fail if Cleaner is removing files at the same time
[ https://issues.apache.org/jira/browse/HIVE-22001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-22001: -- Status: Patch Available (was: Open) > AcidUtils.getAcidState() can fail if Cleaner is removing files at the same > time > --- > > Key: HIVE-22001 > URL: https://issues.apache.org/jira/browse/HIVE-22001 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Jason Dere >Assignee: Jason Dere >Priority: Major > Attachments: HIVE-22001.1.patch > > > Had one user hit the following error during getSplits > {noformat} > 2019-07-06T14:33:03,067 ERROR [4640181a-3eb7-4b3e-9a40-d7a8de9a570c > HiveServer2-HttpHandler-Pool: Thread-415519]: SessionState > (SessionState.java:printError(1247)) - Vertex failed, vertexName=Map 1, > vertexId=vertex_1560947172646_2452_6199_00, diagnostics=[Vertex > vertex_1560947172646_2452_6199_00 [Map 1] killed/failed due > to:ROOT_INPUT_INIT_FAILURE, Vertex Input: hive_table initializer failed, > vertex=vertex_1560947172646_2452_6199_00 [Map 1], java.lang.RuntimeException: > ORC split generation failed with exception: java.io.FileNotFoundException: > File hdfs://path/to/hive_table/oiddatemmdd=20190706/delta_0987070_0987070 > does not exist. > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1870) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1958) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:524) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:779) > at > org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:243) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253) > at > com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108) > at > com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41) > at > com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.util.concurrent.ExecutionException: > java.io.FileNotFoundException: File > hdfs://path/to/hive_table/oiddatemmdd=20190706/delta_0987070_0987070 does > not exist. > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1809) > ... 17 more > Caused by: java.io.FileNotFoundException: File > hdfs://path/to/hive_table/oiddatemmdd=20190706/delta_0987070_0987070 does > not exist. > at > org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1059) > at > org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131) > at > org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1119) > at > org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1116) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1126) > at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1868) > at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1953) > at > org.apache.hadoop.hive.ql.io.AcidUtils$MetaDataFile.chooseFile(AcidUtils.java:1903) > at >