[jira] [Commented] (HIVE-13278) Avoid FileNotFoundException when map/reduce.xml is not available
[ https://issues.apache.org/jira/browse/HIVE-13278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15760001#comment-15760001 ]

Rui Li commented on HIVE-13278:
---

Thanks [~csun] for the update. The latest patch looks good to me. +1
I also ran some of the failing tests locally and could not reproduce the failures.
One possible improvement: we could make TestInputOutputFormat also use Utilities to set the MapWork. We can leave that as a follow-up.

> Avoid FileNotFoundException when map/reduce.xml is not available
>
>
> Key: HIVE-13278
> URL: https://issues.apache.org/jira/browse/HIVE-13278
> Project: Hive
> Issue Type: Bug
> Environment: Hive on Spark engine
> Found based on :
> Apache Hive 2.0.0
> Apache Spark 1.6.0
> Reporter: Xin Hao
> Assignee: Chao Sun
> Priority: Minor
> Attachments: HIVE-13278.1.patch, HIVE-13278.2.patch,
> HIVE-13278.3.patch, HIVE-13278.4.patch, HIVE-13278.5.patch, HIVE-13278.6.patch
>
>
> Many redundant 'File not found' messages appeared in container log during
> query execution with Hive on Spark.
> Certainly, it doesn't prevent the query from running successfully. So mark it
> as Minor currently.
> Error message example:
> {noformat}
> 16/03/14 01:45:06 INFO exec.Utilities: File not found: File does not exist: /tmp/hive/hadoop/2d378538-f5d3-493c-9276-c62dd6634fb4/hive_2016-03-14_01-44-16_835_623058724409492515-6/-mr-10010/0a6d0cae-1eb3-448c-883b-590b3b198a73/reduce.xml
>         at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)
>         at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1932)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1873)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1853)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1825)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:565)
>         at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:87)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:363)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-13278) Avoid FileNotFoundException when map/reduce.xml is not available
[ https://issues.apache.org/jira/browse/HIVE-13278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15759684#comment-15759684 ]

Hive QA commented on HIVE-13278:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12843770/HIVE-13278.6.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 10821 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234)
TestVectorizedColumnReaderBase - did not produce a TEST-*.xml file (likely timed out) (batchId=251)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_sort_array] (batchId=59)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=133)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=135)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=93)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] (batchId=92)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2628/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2628/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2628/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12843770 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-13278) Avoid FileNotFoundException when map/reduce.xml is not available
[ https://issues.apache.org/jira/browse/HIVE-13278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15755255#comment-15755255 ]

Hive QA commented on HIVE-13278:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12843621/HIVE-13278.5.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 20 failed/errored test(s), 10820 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234)
TestVectorizedColumnReaderBase - did not produce a TEST-*.xml file (likely timed out) (batchId=251)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_sort_array] (batchId=59)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=135)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[metadataonly1] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_1] (batchId=92)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=93)
org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testCombinationInputFormat (batchId=254)
org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testCombinationInputFormatWithAcid (batchId=254)
org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testVectorReaderFooterSerialize (batchId=254)
org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testVectorReaderNoFooterSerialize (batchId=254)
org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testVectorization (batchId=254)
org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testVectorizationWithAcid (batchId=254)
org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testVectorizationWithBuckets (batchId=254)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2613/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2613/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2613/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 20 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12843621 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-13278) Avoid FileNotFoundException when map/reduce.xml is not available
[ https://issues.apache.org/jira/browse/HIVE-13278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15753143#comment-15753143 ]

Rui Li commented on HIVE-13278:
---

Hi [~csun], sorry, I may have been unclear. What I have in mind is something like this:
{code}
// In Utilities::setMapWork
public static Path setMapWork(Configuration conf, MapWork w, Path hiveScratchDir, boolean useCache) {
  // Record that a map plan has been associated with this conf.
  conf.setBoolean(HAS_MAP_WORK, true);
  return setBaseWork(conf, w, hiveScratchDir, MAP_PLAN_NAME, useCache);
}

// In Utilities::getMapWork
public static MapWork getMapWork(Configuration conf) {
  // No map work was ever set, so skip the plan lookup entirely.
  if (!conf.getBoolean(HAS_MAP_WORK, false)) {
    return null;
  }
  // ... rest of the method unchanged
{code}
The same applies to set/get ReduceWork. So if we haven't called the setter, we simply get null when getting the work. Do you think that makes sense?
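The flag idea in the comment above can be exercised in isolation. What follows is only a minimal sketch under stated assumptions, not Hive code: `WorkFlagSketch`, the key string `has.map.work`, the `planStore` map and the `planLookups` counter are all hypothetical stand-ins, and a plain `Map` replaces Hadoop's `Configuration`. It is meant to show one thing: when the setter was never called, the getter returns null without attempting the plan lookup that produces the 'File not found' log entry.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the set/get pair discussed above; not the real Hive Utilities class. */
class WorkFlagSketch {
    // Assumed key name, modeled on the "has.map/reduce.work" wording in this thread.
    static final String HAS_MAP_WORK = "has.map.work";

    // Stand-in for the serialized plan file (map.xml) in the scratch directory.
    private final Map<String, String> planStore = new HashMap<>();

    // Counts simulated plan lookups so the effect of the guard can be observed.
    int planLookups = 0;

    /** Stores the plan and records in the conf that map work exists. */
    void setMapWork(Map<String, String> conf, String work) {
        conf.put(HAS_MAP_WORK, "true");
        planStore.put("map.xml", work);
    }

    /** Returns the plan, or null without any lookup when the flag was never set. */
    String getMapWork(Map<String, String> conf) {
        boolean hasWork = Boolean.parseBoolean(conf.getOrDefault(HAS_MAP_WORK, "false"));
        if (!hasWork) {
            return null; // flag defaults to false: no lookup, hence no missing-file message
        }
        planLookups++;
        return planStore.get("map.xml");
    }

    public static void main(String[] args) {
        WorkFlagSketch util = new WorkFlagSketch();
        Map<String, String> conf = new HashMap<>();
        System.out.println("before set: " + util.getMapWork(conf) + ", lookups=" + util.planLookups);
        util.setMapWork(conf, "map-plan");
        System.out.println("after set: " + util.getMapWork(conf) + ", lookups=" + util.planLookups);
    }
}
```

The counter is there only to make the skipped lookup visible; in the real code the saving would be the avoided file-system call and its logged exception.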
[jira] [Commented] (HIVE-13278) Avoid FileNotFoundException when map/reduce.xml is not available
[ https://issues.apache.org/jira/browse/HIVE-13278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15753034#comment-15753034 ]

Hive QA commented on HIVE-13278:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12843458/HIVE-13278.4.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 10788 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234)
TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=144)
	[vectorized_rcfile_columnar.q,vector_elt.q,explainuser_1.q,multi_insert.q,tez_dml.q,vector_bround.q,schema_evol_orc_acid_table.q,vector_when_case_null.q,orc_ppd_schema_evol_1b.q,vector_join30.q,vectorization_11.q,cte_3.q,update_tmp_table.q,vector_decimal_cast.q,groupby_grouping_id2.q,vector_decimal_round.q,tez_smb_empty.q,orc_merge6.q,vector_decimal_trailing.q,cte_5.q,tez_union.q,cbo_rp_subq_not_in.q,vector_decimal_2.q,columnStatsUpdateForStatsOptimizer_1.q,vector_outer_join3.q,schema_evol_text_vec_part_all_complex.q,tez_dynpart_hashjoin_2.q,auto_sortmerge_join_12.q,offset_limit.q,tez_union_multiinsert.q]
TestVectorizedColumnReaderBase - did not produce a TEST-*.xml file (likely timed out) (batchId=251)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_sort_array] (batchId=59)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=135)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[metadataonly1] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=93)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2599/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2599/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2599/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12843458 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-13278) Avoid FileNotFoundException when map/reduce.xml is not available
[ https://issues.apache.org/jira/browse/HIVE-13278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15751899#comment-15751899 ]

Xuefu Zhang commented on HIVE-13278:

Yeah, let's try [~lirui]'s idea to cover more cases if possible. It's important to note that the current patch at least improves Hive (though it might be incomplete) and does no harm.
[jira] [Commented] (HIVE-13278) Avoid FileNotFoundException when map/reduce.xml is not available
[ https://issues.apache.org/jira/browse/HIVE-13278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15750856#comment-15750856 ]

Rui Li commented on HIVE-13278:
---

Hi [~csun], I think Tez also calls setMapWork/setReduceWork to associate the work with the JobConf. Could you give it a try? If it works, the fix will be transparent to the different tasks. Thanks!
[jira] [Commented] (HIVE-13278) Avoid FileNotFoundException when map/reduce.xml is not available
[ https://issues.apache.org/jira/browse/HIVE-13278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15750793#comment-15750793 ]

Chao Sun commented on HIVE-13278:
-

[~lirui] sorry, I didn't see your post about using the RS for this. I think that solution also looks good. Note that with the flag approach we can also set the flag in Task#initialize(), instead of setting it in each of the different tasks. This could make it cleaner.
As for setting the default value to false, I think it may break some Tez queries, which also call the {{getMapRedWork}} method. This patch doesn't handle Tez.
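The concern about a false default can be made concrete with a small sketch. Everything here is hypothetical and only illustrates the reasoning: `DefaultFalseSketch`, the keys `has.map.work` and `map.plan`, and the `storePlanDirectly` helper are invented for the example, and a plain `Map` stands in for the job configuration. Whether any real Tez code path stores a plan without going through the setter is not established in this thread.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of how the flag's default interacts with callers that bypass the setter. */
class DefaultFalseSketch {
    static final String HAS_MAP_WORK = "has.map.work"; // assumed key name
    static final String MAP_PLAN = "map.plan";         // hypothetical stand-in for the stored plan

    /** Setter path: stores the plan and raises the flag. */
    static void setMapWork(Map<String, String> conf, String plan) {
        conf.put(HAS_MAP_WORK, "true");
        conf.put(MAP_PLAN, plan);
    }

    /** Hypothetical path that stores a plan but never raises the flag. */
    static void storePlanDirectly(Map<String, String> conf, String plan) {
        conf.put(MAP_PLAN, plan);
    }

    /** Getter guarded by the flag; flagDefault is used when the flag is absent from the conf. */
    static String getMapWork(Map<String, String> conf, boolean flagDefault) {
        String raw = conf.getOrDefault(HAS_MAP_WORK, String.valueOf(flagDefault));
        return Boolean.parseBoolean(raw) ? conf.get(MAP_PLAN) : null;
    }

    public static void main(String[] args) {
        Map<String, String> viaSetter = new HashMap<>();
        setMapWork(viaSetter, "plan-a");

        Map<String, String> bypassed = new HashMap<>();
        storePlanDirectly(bypassed, "plan-b");

        System.out.println("setter, default false: " + getMapWork(viaSetter, false));
        System.out.println("bypass, default false: " + getMapWork(bypassed, false));
        System.out.println("bypass, default true:  " + getMapWork(bypassed, true));
    }
}
```

With a false default, correctness depends on every producer of a plan going through the setter (or on one shared place such as the Task#initialize() hook mentioned above); with a true default, a missed producer only costs the old lookup.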
[jira] [Commented] (HIVE-13278) Avoid FileNotFoundException when map/reduce.xml is not available
[ https://issues.apache.org/jira/browse/HIVE-13278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15750696#comment-15750696 ] Rui Li commented on HIVE-13278: --- Hi [~xuefuz], I just think it'll be even simpler to go the checking RS way - we can constrain the fix in just one method {{HiveOutputFormatImpl.checkOutputSpecs}}, rather than making changes to all these different tasks. Besides, with the flag it seems we're adding extra burden to ourselves to keep the logic consistent during plan generation. On the other hand, if we decide to add the flag, I also have one suggestion. We can make {{has.map/reduce.work}} default to false. And we set them to true respectively in {{Utilities::setMapWork/setReduceWork}}. The logic behind this is if you haven't set a work with the JobConf, you shouldn't try to get one from it. Does this make sense? > Avoid FileNotFoundException when map/reduce.xml is not available > > > Key: HIVE-13278 > URL: https://issues.apache.org/jira/browse/HIVE-13278 > Project: Hive > Issue Type: Bug > Environment: Hive on Spark engine > Found based on : > Apache Hive 2.0.0 > Apache Spark 1.6.0 >Reporter: Xin Hao >Assignee: Chao Sun >Priority: Minor > Attachments: HIVE-13278.1.patch, HIVE-13278.2.patch, > HIVE-13278.3.patch > > > Many redundant 'File not found' messages appeared in container log during > query execution with Hive on Spark. > Certainly, it doesn't prevent the query from running successfully. So mark it > as Minor currently. 
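The flag suggestion in Rui Li's comment (default to false, set to true only when a work is stored) could look roughly like the sketch below. This is only an illustration under stated assumptions: {{Conf}} is a hypothetical stand-in for JobConf, the key names {{has.map.work}} and {{has.reduce.work}} are guesses at what the shorthand {{has.map/reduce.work}} expands to, and none of this is taken from the actual patch.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: Conf is a hypothetical stand-in for JobConf,
// and the flag names are assumed, not taken from the HIVE-13278 patch.
class WorkFlagSketch {
    static final String HAS_MAP_WORK = "has.map.work";       // assumed key name
    static final String HAS_REDUCE_WORK = "has.reduce.work"; // assumed key name

    /** Minimal configuration holder: boolean flags plus the stored plans. */
    static class Conf {
        final Map<String, Boolean> flags = new HashMap<>();
        final Map<String, Object> plans = new HashMap<>();
        int planLookups = 0; // counts lookups that would otherwise hit the file system

        boolean getBoolean(String key, boolean defaultValue) {
            return flags.getOrDefault(key, defaultValue);
        }
    }

    static void setMapWork(Conf conf, Object work) {
        conf.plans.put("map.xml", work);
        conf.flags.put(HAS_MAP_WORK, true); // the flag is only ever set here
    }

    static void setReduceWork(Conf conf, Object work) {
        conf.plans.put("reduce.xml", work);
        conf.flags.put(HAS_REDUCE_WORK, true);
    }

    static Object getMapWork(Conf conf) {
        return getWork(conf, HAS_MAP_WORK, "map.xml");
    }

    static Object getReduceWork(Conf conf) {
        return getWork(conf, HAS_REDUCE_WORK, "reduce.xml");
    }

    // Flags default to false, so a work that was never set is never looked up.
    private static Object getWork(Conf conf, String flag, String planName) {
        if (!conf.getBoolean(flag, false)) {
            return null;
        }
        conf.planLookups++;
        return conf.plans.get(planName);
    }

    public static void main(String[] args) {
        Conf conf = new Conf();
        setMapWork(conf, "map-only plan");
        System.out.println("map work: " + getMapWork(conf));
        System.out.println("reduce work: " + getReduceWork(conf));
        System.out.println("plan lookups: " + conf.planLookups);
    }
}
```

With this shape, a map-only job never attempts to read reduce.xml, because nothing ever set the reduce flag. The cost, as the comment notes, is that every code path that stores a work has to go through the setters for the flags to stay consistent.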
[jira] [Commented] (HIVE-13278) Avoid FileNotFoundException when map/reduce.xml is not available
[ https://issues.apache.org/jira/browse/HIVE-13278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15750533#comment-15750533 ]

Xuefu Zhang commented on HIVE-13278:

[~lirui], the concern is valid and shared, but on the other hand the current approach is simple and easy to understand. Granted, there could be new cases where the problem appears, but this change doesn't make things worse, and we don't expect many such cases now or in the future. Further thoughts?

> Avoid FileNotFoundException when map/reduce.xml is not available
>
> Key: HIVE-13278
> URL: https://issues.apache.org/jira/browse/HIVE-13278
> Project: Hive
> Issue Type: Bug
> Environment: Hive on Spark engine
> Found based on :
> Apache Hive 2.0.0
> Apache Spark 1.6.0
> Reporter: Xin Hao
> Assignee: Chao Sun
> Priority: Minor
> Attachments: HIVE-13278.1.patch, HIVE-13278.2.patch, HIVE-13278.3.patch
>
> Many redundant 'File not found' messages appeared in container log during
> query execution with Hive on Spark.
> Certainly, it doesn't prevent the query from running successfully. So mark it
> as Minor currently.
[jira] [Commented] (HIVE-13278) Avoid FileNotFoundException when map/reduce.xml is not available
[ https://issues.apache.org/jira/browse/HIVE-13278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15750452#comment-15750452 ]

Rui Li commented on HIVE-13278:
---

Sorry about the delay. I have a concern about using a flag: it seems difficult to make it exhaustive and to maintain. What about the solution Xuefu mentioned:
bq. Following your idea, can we first check if the mapwork ends a RS and use this to determine if reduce.xml is expected?
Will this be cleaner and more reliable?

> Avoid FileNotFoundException when map/reduce.xml is not available
>
> Key: HIVE-13278
> URL: https://issues.apache.org/jira/browse/HIVE-13278
> Project: Hive
> Issue Type: Bug
> Environment: Hive on Spark engine
> Found based on :
> Apache Hive 2.0.0
> Apache Spark 1.6.0
> Reporter: Xin Hao
> Assignee: Chao Sun
> Priority: Minor
> Attachments: HIVE-13278.1.patch, HIVE-13278.2.patch, HIVE-13278.3.patch
>
> Many redundant 'File not found' messages appeared in container log during
> query execution with Hive on Spark.
> Certainly, it doesn't prevent the query from running successfully. So mark it
> as Minor currently.
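The alternative raised in the comment above, checking whether the map work ends in an RS (reduce sink) before expecting reduce.xml, might be sketched as follows. Everything here is hypothetical: {{Op}} is a simplified stand-in for an operator node rather than Hive's own operator class, the operator tree is modelled as a plain tree, and the short operator names (TS, SEL, GBY, FS, RS) are used only as labels.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: Op is a hypothetical, simplified operator node,
// not Hive's Operator class.
class ReduceSinkCheckSketch {
    static class Op {
        final String name;
        final List<Op> children = new ArrayList<>();

        Op(String name, Op... kids) {
            this.name = name;
            this.children.addAll(Arrays.asList(kids));
        }
    }

    /** Returns true if any leaf reachable from op is a reduce sink ("RS"). */
    static boolean endsWithReduceSink(Op op) {
        if (op.children.isEmpty()) {
            return "RS".equals(op.name);
        }
        for (Op child : op.children) {
            if (endsWithReduceSink(child)) {
                return true;
            }
        }
        return false;
    }

    /** reduce.xml is expected only when some map-side root ends in a reduce sink. */
    static boolean expectsReducePlan(List<Op> mapRoots) {
        for (Op root : mapRoots) {
            if (endsWithReduceSink(root)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Op mapOnly = new Op("TS", new Op("SEL", new Op("FS")));
        Op withShuffle = new Op("TS", new Op("GBY", new Op("RS")));
        System.out.println("map-only expects reduce.xml: " + expectsReducePlan(Arrays.asList(mapOnly)));
        System.out.println("shuffle expects reduce.xml: " + expectsReducePlan(Arrays.asList(withShuffle)));
    }
}
```

The appeal of this variant, as argued in the thread, is that the decision is derived from the plan itself in one place instead of from a flag that every task has to keep in sync.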