[jira] [Commented] (HIVE-9580) Server returns incorrect result from JOIN ON VARCHAR columns
[ https://issues.apache.org/jira/browse/HIVE-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496203#comment-14496203 ] Aihua Xu commented on HIVE-9580:

Attached the new patch to fix the testCliDriver_mapjoin_decimal unit test failure. The other failures seem to be unrelated.

Server returns incorrect result from JOIN ON VARCHAR columns
Key: HIVE-9580
URL: https://issues.apache.org/jira/browse/HIVE-9580
Project: Hive
Issue Type: Bug
Components: HiveServer2
Affects Versions: 0.12.0, 0.13.0, 0.14.0
Reporter: Mike
Assignee: Aihua Xu
Attachments: HIVE-9580.patch

The database erroneously returns rows when joining two tables which each contain a VARCHAR column and the join's ON condition uses the equality operator on the VARCHAR columns. The following JDBC method exhibits the problem:

{code}
static void joinIssue() throws SQLException {
    String sql;
    int rowsAffected;
    ResultSet rs;
    Statement stmt = con.createStatement();
    String table1_Name = "blahtab1";
    String table1A_Name = "blahtab1A";
    String table1B_Name = "blahtab1B";
    String table2_Name = "blahtab2";

    try {
        sql = "drop table " + table1_Name;
        System.out.println("\nsql=" + sql);
        rowsAffected = stmt.executeUpdate(sql);
    } catch (SQLException se) {
        println("Drop table error: " + se.getMessage());
    }
    try {
        sql = "CREATE TABLE " + table1_Name + " (" +
              " VCHARCOL VARCHAR(10)" +
              " ,INTEGERCOL INT" +
              " )";
        System.out.println("\nsql=" + sql);
        rowsAffected = stmt.executeUpdate(sql);
    } catch (SQLException se) {
        println("create table error: " + se.getMessage());
    }
    sql = "insert into " + table1_Name + " values ('jklmnopqrs', 99)";
    System.out.println("\nsql=" + sql);
    stmt.executeUpdate(sql);
    System.out.println("===");

    try {
        sql = "drop table " + table1A_Name;
        System.out.println("\nsql=" + sql);
        rowsAffected = stmt.executeUpdate(sql);
    } catch (SQLException se) {
        println("Drop table error: " + se.getMessage());
    }
    try {
        sql = "CREATE TABLE " + table1A_Name + " (" +
              " VCHARCOL VARCHAR(10)" +
              " )";
        System.out.println("\nsql=" + sql);
        rowsAffected = stmt.executeUpdate(sql);
    } catch (SQLException se) {
        println("create table error: " + se.getMessage());
    }
    sql = "insert into " + table1A_Name + " values ('jklmnopqrs')";
    System.out.println("\nsql=" + sql);
    stmt.executeUpdate(sql);
    System.out.println("===");

    try {
        sql = "drop table " + table1B_Name;
        System.out.println("\nsql=" + sql);
        rowsAffected = stmt.executeUpdate(sql);
    } catch (SQLException se) {
        println("Drop table error: " + se.getMessage());
    }
    try {
        sql = "CREATE TABLE " + table1B_Name + " (" +
              " VCHARCOL VARCHAR(11)" +
              " ,INTEGERCOL INT" +
              " )";
        System.out.println("\nsql=" + sql);
        rowsAffected = stmt.executeUpdate(sql);
    } catch (SQLException se) {
        println("create table error: " + se.getMessage());
    }
    sql = "insert into " + table1B_Name + " values ('jklmnopqrs', 99)";
{code}
[jira] [Commented] (HIVE-9252) Linking custom SerDe jar to table definition.
[ https://issues.apache.org/jira/browse/HIVE-9252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496092#comment-14496092 ] Niels Basjes commented on HIVE-9252:

After the initial patch I no longer see anything happening. What is the status?

Linking custom SerDe jar to table definition.
-
Key: HIVE-9252
URL: https://issues.apache.org/jira/browse/HIVE-9252
Project: Hive
Issue Type: New Feature
Components: Serializers/Deserializers
Reporter: Niels Basjes
Assignee: Ferdinand Xu
Attachments: HIVE-9252.1.patch

In HIVE-6047 the option was created to hook a jar file to the definition of a function. (See: [Language Manual DDL: Permanent Functions|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-PermanentFunctions]) I propose adding something similar that can be used when defining an external table that relies on a custom SerDe (I expect to usually only have the Deserializer). Something like this:

{code}
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
  ...
  STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]
  [USING JAR|FILE|ARCHIVE 'file_uri' [, JAR|FILE|ARCHIVE 'file_uri'] ];
{code}

Using this you can define (and share !!!) a Hive table on top of a custom fileformat without the need to let the IT operations people deploy a custom SerDe jar file on all nodes.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
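A hypothetical instance of the proposed syntax (the database, table, handler class, and jar path below are invented for illustration; only the clause shape comes from the proposal above):

{code}
CREATE EXTERNAL TABLE IF NOT EXISTS weblogs.access_log (
  ip      STRING,
  request STRING
)
STORED BY 'com.example.hive.AccessLogStorageHandler'
WITH SERDEPROPERTIES ('log.format' = 'combined')
USING JAR 'hdfs:///shared/serdes/access-log-serde.jar';
{code}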
[jira] [Commented] (HIVE-10306) We need to print tez summary when hive.server2.logging.level = PERFORMANCE.
[ https://issues.apache.org/jira/browse/HIVE-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496097#comment-14496097 ] Hive QA commented on HIVE-10306:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12725477/HIVE-10306.4.patch

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 8694 tests executed

*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file
TestOperationLoggingAPIBase - did not produce a TEST-*.xml file
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3441/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3441/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3441/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12725477 - PreCommit-HIVE-TRUNK-Build

We need to print tez summary when hive.server2.logging.level = PERFORMANCE.
-
Key: HIVE-10306
URL: https://issues.apache.org/jira/browse/HIVE-10306
Project: Hive
Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
Attachments: HIVE-10306.1.patch, HIVE-10306.2.patch, HIVE-10306.3.patch, HIVE-10306.4.patch

We need to print the tez summary when hive.server2.logging.level = PERFORMANCE. We introduced this parameter via HIVE-10119. The logging-level param is only relevant to HS2, so for hive-cli users hive.tez.exec.print.summary still makes sense. We can check the log-level param as well, in the places where we check the value of hive.tez.exec.print.summary, i.e., consider hive.tez.exec.print.summary=true if log.level = PERFORMANCE.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
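The check described in the issue can be sketched in plain Java. The config names mirror the issue text; the helper method and class are hypothetical illustrations, not Hive's actual implementation:

```java
// Sketch: treat hive.tez.exec.print.summary as effectively true whenever the
// HS2 logging level is PERFORMANCE. The explicit config wins; otherwise we
// fall back to the logging level.
public class SummaryCheck {
    public static boolean shouldPrintTezSummary(boolean printSummaryConf, String hs2LoggingLevel) {
        return printSummaryConf || "PERFORMANCE".equalsIgnoreCase(hs2LoggingLevel);
    }

    public static void main(String[] args) {
        System.out.println(shouldPrintTezSummary(false, "PERFORMANCE")); // true
        System.out.println(shouldPrintTezSummary(false, "EXECUTION"));   // false
        System.out.println(shouldPrintTezSummary(true, "NONE"));         // true
    }
}
```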
[jira] [Updated] (HIVE-9917) After HIVE-3454 is done, make int to timestamp conversion configurable
[ https://issues.apache.org/jira/browse/HIVE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-9917: --- Attachment: HIVE-9917.patch

After HIVE-3454 is done, make int to timestamp conversion configurable
--
Key: HIVE-9917
URL: https://issues.apache.org/jira/browse/HIVE-9917
Project: Hive
Issue Type: Improvement
Reporter: Aihua Xu
Assignee: Aihua Xu
Attachments: HIVE-9917.patch

After HIVE-3454 is fixed, converting int to timestamp will behave correctly. Since customers have relied on the incorrect behavior for so long, it is better to make the conversion configurable: in one release it will default to the old/inconsistent way, the next release will default to the new/consistent way, and the option will then be deprecated.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
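The two behaviors the proposed flag would toggle can be sketched as follows. This is an illustration, not Hive's code, and it assumes the legacy path interprets the integer as epoch milliseconds while the corrected path interprets it as epoch seconds:

```java
import java.sql.Timestamp;

// Sketch of the two int-to-timestamp interpretations. java.sql.Timestamp's
// long constructor takes milliseconds, so the "seconds" interpretation
// multiplies by 1000.
public class IntToTimestamp {
    public static Timestamp convert(long value, boolean legacyMillis) {
        return new Timestamp(legacyMillis ? value : value * 1000L);
    }

    public static void main(String[] args) {
        long v = 1_400_000_000L;
        System.out.println(convert(v, true).getTime());  // 1400000000
        System.out.println(convert(v, false).getTime()); // 1400000000000
    }
}
```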
[jira] [Updated] (HIVE-10036) Writing ORC format big table causes OOM - too many fixed sized stream buffers
[ https://issues.apache.org/jira/browse/HIVE-10036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-10036: Labels: orcfile (was: )

Writing ORC format big table causes OOM - too many fixed sized stream buffers
-
Key: HIVE-10036
URL: https://issues.apache.org/jira/browse/HIVE-10036
Project: Hive
Issue Type: Improvement
Reporter: Selina Zhang
Assignee: Selina Zhang
Labels: orcfile
Attachments: HIVE-10036.1.patch, HIVE-10036.2.patch, HIVE-10036.3.patch, HIVE-10036.5.patch, HIVE-10036.6.patch

The ORC writer keeps multiple output streams for each column, and each output stream is allocated a fixed-size ByteBuffer (configurable, default 256K). For a big table, the memory cost is unbearable, especially when HCatalog dynamic partitioning is involved and several hundred files may be open for writing at the same time (the same problem applies to FileSinkOperator). The global ORC memory manager controls the buffer size, but it only kicks in at 5000-row intervals. An enhancement could be done there, but the problem is that reducing the buffer size introduces worse compression and more IOs on the read path, and sacrificing read performance is never a good choice. I changed the fixed-size ByteBuffer to a dynamically growing buffer bounded by the existing configurable buffer size. Most streams do not need a large buffer, so performance improved significantly: compared to Facebook's hive-dwrf, I measured a 2x performance gain with this fix. Solving OOM for ORC completely may need a lot of effort, but this is definitely low-hanging fruit.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
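The idea of a dynamically growing, bounded buffer can be sketched like this. This is an illustration of the technique, not the actual ORC writer code; the class and its growth policy are invented:

```java
import java.nio.ByteBuffer;

// Sketch: start small and grow on demand, never past the configured maximum
// (256K by default for ORC stream buffers). Growth doubles the capacity
// until the pending write fits, capped at maxSize.
public class GrowableBuffer {
    private final int maxSize;
    private ByteBuffer buf;

    public GrowableBuffer(int initialSize, int maxSize) {
        this.maxSize = maxSize;
        this.buf = ByteBuffer.allocate(initialSize);
    }

    public void write(byte[] data) {
        if (buf.remaining() < data.length) {
            int needed = buf.position() + data.length;
            if (needed > maxSize) {
                throw new IllegalStateException("buffer full; flush required");
            }
            int newCap = Math.max(buf.capacity(), 1);
            while (newCap < needed) newCap *= 2;
            newCap = Math.min(newCap, maxSize);
            // Copy the already-written bytes into the larger buffer.
            ByteBuffer bigger = ByteBuffer.allocate(newCap);
            buf.flip();
            bigger.put(buf);
            buf = bigger;
        }
        buf.put(data);
    }

    public int capacity() { return buf.capacity(); }

    public static void main(String[] args) {
        GrowableBuffer b = new GrowableBuffer(4, 64);
        b.write(new byte[10]);
        System.out.println(b.capacity()); // 16: doubled from 4 until 10 bytes fit
    }
}
```

Streams that stay small never pay for the full 256K allocation, which is the memory saving the patch describes.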
[jira] [Commented] (HIVE-10331) ORC : Is null SARG filters out all row groups written in old ORC format
[ https://issues.apache.org/jira/browse/HIVE-10331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496153#comment-14496153 ] Hive QA commented on HIVE-10331:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12725481/HIVE-10331.02.patch

{color:red}ERROR:{color} -1 due to 50 failed/errored test(s), 8688 tests executed

*Failed tests:*
{noformat}
TestCustomAuthentication - did not produce a TEST-*.xml file
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.ql.io.orc.TestColumnStatistics.testHasNull
org.apache.hadoop.hive.ql.io.orc.TestFileDump.testBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestFileDump.testBloomFilter2
org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDictionaryThreshold
org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDump
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.test1[0]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.test1[1]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testReadFormat_0_11[0]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testReadFormat_0_11[1]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testStringAndBinaryStatistics[0]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testStringAndBinaryStatistics[1]
org.apache.hadoop.hive.ql.io.orc.TestOrcNullOptimization.testMultiStripeWithoutNull
org.apache.hadoop.hive.ql.io.orc.TestOrcSerDeStats.testOrcSerDeStatsComplex
org.apache.hadoop.hive.ql.io.orc.TestOrcSerDeStats.testOrcSerDeStatsComplexOldFormat
org.apache.hadoop.hive.ql.io.orc.TestOrcSerDeStats.testSerdeStatsOldFormat
org.apache.hadoop.hive.ql.io.orc.TestOrcSerDeStats.testStringAndBinaryStatistics
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testBetween
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testDateWritableEqualsBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testDateWritableInBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testDecimalEqualsBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testDecimalInBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testDoubleEqualsBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testDoubleInBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testEquals
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testIn
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testIntEqualsBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testIntInBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testIsNull
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testLessThan
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testLessThanEquals
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testNullsInBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testStringEqualsBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testStringInBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testTimestampEqualsBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testTimestampInBloomFilter
{noformat}
[jira] [Updated] (HIVE-10342) Nested parenthesis for derived table in from clause - is not working
[ https://issues.apache.org/jira/browse/HIVE-10342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sanjiv singh updated HIVE-10342:

Description: Hi All, Nested parenthesis for a derived table in the from clause is not working. The following query with a derived table works perfectly in Hive:

{code}
select count(*) from (
  select distinct em_last_name, em_first_name, em_d_date from employee
  UNION ALL
  select distinct cu_last_name, cu_first_name, cu_d_date from customer
  UNION ALL
  select distinct cl_last_name, cl_first_name, cl_d_date from client
) cool_cust;
{code}

When I added additional parentheses enclosing the derived tables, it failed in parsing. It seems the Hive ANTLR grammar does not accept such syntax.

Failed query:

{code}
select count(*) from (
  (select distinct em_last_name, em_first_name, em_d_date from employee)
  UNION ALL
  (select distinct cu_last_name, cu_first_name, cu_d_date from customer)
  UNION ALL
  (select distinct cl_last_name, cl_first_name, cl_d_date from client)
) cool_cust;
{code}

Exception:

{noformat}
NoViableAltException(283@[147:5: ( ( Identifier LPAREN )=> partitionedTableFunction | tableSource | subQuerySource | virtualTableSource )])
 at org.antlr.runtime.DFA.noViableAlt(DFA.java:158)
 at org.antlr.runtime.DFA.predict(DFA.java:144)
 at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:3625)
 at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:1814)
 at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1471)
 at org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:42804)
 at org.apache.hadoop.hive.ql.parse.HiveParser.singleSelectStatement(HiveParser.java:40229)
 at org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:39914)
 at org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:39851)
 at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:38904)
 at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:38780)
 at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1514)
 at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1052)
 at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:199)
 at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:389)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:303)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1067)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1129)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1004)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:994)
 at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:201)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:153)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:364)
 at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:712)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:631)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:570)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
FAILED: ParseException line 1:41 cannot recognize input near '(' '(' 'SELECT' in from source
{noformat}
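A possible workaround, besides simply dropping the extra parentheses as in the first query, is to wrap each parenthesized branch in an aliased subquery. This is an untested sketch; it assumes the grammar at this version accepts a subquery source as each union branch:

{code}
select count(*) from (
  select * from (select distinct em_last_name, em_first_name, em_d_date from employee) e
  UNION ALL
  select * from (select distinct cu_last_name, cu_first_name, cu_d_date from customer) c
  UNION ALL
  select * from (select distinct cl_last_name, cl_first_name, cl_d_date from client) cl
) cool_cust;
{code}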
[jira] [Commented] (HIVE-9580) Server returns incorrect result from JOIN ON VARCHAR columns
[ https://issues.apache.org/jira/browse/HIVE-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496208#comment-14496208 ] Aihua Xu commented on HIVE-9580:

[~szehon] Can you help review the code change?

Server returns incorrect result from JOIN ON VARCHAR columns
Key: HIVE-9580
URL: https://issues.apache.org/jira/browse/HIVE-9580
Project: Hive
Issue Type: Bug
Components: HiveServer2
Affects Versions: 0.12.0, 0.13.0, 0.14.0
Reporter: Mike
Assignee: Aihua Xu
Attachments: HIVE-9580.patch
[jira] [Commented] (HIVE-10288) Cannot call permanent UDFs
[ https://issues.apache.org/jira/browse/HIVE-10288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496252#comment-14496252 ] Hive QA commented on HIVE-10288:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12725492/HIVE-10288.1.patch

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 8689 tests executed

*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3443/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3443/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3443/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12725492 - PreCommit-HIVE-TRUNK-Build

Cannot call permanent UDFs
--
Key: HIVE-10288
URL: https://issues.apache.org/jira/browse/HIVE-10288
Project: Hive
Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Nezih Yigitbasi
Assignee: Chinna Rao Lalam
Attachments: HIVE-10288.1.patch, HIVE-10288.patch

Just pulled the trunk and built the hive binary. If I create a permanent udf and exit the cli, and then open the cli and try calling the udf, it fails with the exception below. However, the call succeeds if I call the udf right after registering the permanent udf (without exiting the cli). The call also succeeds with the apache-hive-1.0.0 release.
{code}
2015-04-13 17:04:54,004 INFO org.apache.hadoop.hive.ql.log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=parse start=1428969893115 end=1428969894004 duration=889 from=org.apache.hadoop.hive.ql.Driver>
2015-04-13 17:04:54,007 DEBUG org.apache.hadoop.hive.ql.Driver (Driver.java:recordValidTxns(939)) - Encoding valid txns info 9223372036854775807:
2015-04-13 17:04:54,007 INFO org.apache.hadoop.hive.ql.log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
2015-04-13 17:04:54,052 INFO org.apache.hadoop.hive.ql.parse.CalcitePlanner (SemanticAnalyzer.java:analyzeInternal(9997)) - Starting Semantic Analysis
2015-04-13 17:04:54,053 DEBUG org.apache.hadoop.hive.ql.exec.FunctionRegistry (FunctionRegistry.java:getGenericUDAFResolver(942)) - Looking up GenericUDAF: hour_now
2015-04-13 17:04:54,053 INFO org.apache.hadoop.hive.ql.parse.CalcitePlanner (SemanticAnalyzer.java:genResolvedParseTree(9980)) - Completed phase 1 of Semantic Analysis
2015-04-13 17:04:54,053 INFO org.apache.hadoop.hive.ql.parse.CalcitePlanner (SemanticAnalyzer.java:getMetaData(1530)) - Get metadata for source tables
2015-04-13 17:04:54,054 INFO
[jira] [Commented] (HIVE-10228) Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics
[ https://issues.apache.org/jira/browse/HIVE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496376#comment-14496376 ] Sushanth Sowmyan commented on HIVE-10228: - RB link : https://reviews.apache.org/r/33224/

Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics
--
Key: HIVE-10228
URL: https://issues.apache.org/jira/browse/HIVE-10228
Project: Hive
Issue Type: Sub-task
Components: Import/Export
Affects Versions: 1.2.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
Attachments: HIVE-10228.2.patch, HIVE-10228.3.patch, HIVE-10228.patch

We need to update a couple of hive commands to support replication semantics. To wit, we need the following:

EXPORT ... [FOR [METADATA] REPLICATION("comment")]
Export will now support an extra optional clause to tell it that this export is being prepared for the purpose of replication. There is also an additional optional clause here that allows the export to be metadata-only, to handle cases of capturing the diff for alter statements, for example. Also, if done for replication, the non-presence of a table, or a table being a view/offline table/non-native table, is not considered an error and instead results in a successful no-op.

IMPORT ... (as normal) - but handles new semantics
No syntax changes for import, but import will have to change to handle all the permutations of export dumps possible. Import will also have to ensure that it updates the object only if the update being imported is not older than the state of the object. In addition, import currently does not work with the dbname.tablename kind of specification; this should be fixed to work.

DROP TABLE ... FOR REPLICATION('eventid')
Drop Table now has an additional clause, to specify that this drop table is being done for replication purposes, and that the drop should not actually drop the table if the table is newer than the event id specified.

ALTER TABLE ... DROP PARTITION (...) FOR REPLICATION('eventid')
Similarly, Drop Partition has a change equivalent to the Drop Table one.

In addition, we introduce a new property repl.last.id, which, when tagged on to table properties or partition properties on a replication destination, holds the effective state identifier of the object.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
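Taken together, hypothetical uses of the proposed clauses might look like the following. The table names, paths, and event ids are invented for illustration; only the clause shapes come from the proposal above:

{code}
EXPORT TABLE sales TO '/repl/sales_dump' FOR REPLICATION('bootstrap dump');
EXPORT TABLE sales TO '/repl/sales_meta' FOR METADATA REPLICATION('alter-only diff');
IMPORT FROM '/repl/sales_dump';
DROP TABLE sales FOR REPLICATION('10042');
ALTER TABLE sales DROP PARTITION (dt='2015-04-15') FOR REPLICATION('10042');
{code}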
[jira] [Commented] (HIVE-10324) Hive metatool should take table_param_key to allow for changes to avro serde's schema url key
[ https://issues.apache.org/jira/browse/HIVE-10324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496425#comment-14496425 ] Hive QA commented on HIVE-10324:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12725504/HIVE-10324.patch

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8688 tests executed

*Failed tests:*
{noformat}
TestCustomAuthentication - did not produce a TEST-*.xml file
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.metastore.TestHiveMetaTool.testUpdateFSRootLocation
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3444/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3444/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3444/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12725504 - PreCommit-HIVE-TRUNK-Build

Hive metatool should take table_param_key to allow for changes to avro serde's schema url key
-
Key: HIVE-10324
URL: https://issues.apache.org/jira/browse/HIVE-10324
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 1.1.0
Reporter: Szehon Ho
Assignee: Ferdinand Xu
Attachments: HIVE-10324.patch, HIVE-10324.patch.WIP

HIVE-3443 added support to change the serdeParams from the 'metatool updateLocation' command.
However, in avro it is possible to specify the schema via the tableParams: {noformat} CREATE TABLE `testavro`( `test` string COMMENT 'from deserializer') ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat' TBLPROPERTIES ( 'avro.schema.url'='hdfs://namenode:8020/tmp/test.avsc', 'kite.compression.type'='snappy', 'transient_lastDdlTime'='1427996456') {noformat} Hence for those tables the 'metatool updateLocation' will not help. This is necessary in cases like upgrading the namenode to HA, where the absolute paths have changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10307) Support to use number literals in partition column
[ https://issues.apache.org/jira/browse/HIVE-10307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-10307: --- Attachment: HIVE-10307.1.patch Fixed failed tests. Support to use number literals in partition column -- Key: HIVE-10307 URL: https://issues.apache.org/jira/browse/HIVE-10307 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 1.0.0 Reporter: Chaoyu Tang Assignee: Chaoyu Tang Attachments: HIVE-10307.1.patch, HIVE-10307.patch Data types like TinyInt, SmallInt, BigInt or Decimal can be expressed as literals with a postfix like Y, S, L, or BD appended to the number. These literals work in most Hive queries, but do not when they are used as partition column values. For a partitioned table like: create table partcoltypenum (key int, value string) partitioned by (tint tinyint, sint smallint, bint bigint); insert into partcoltypenum partition (tint=100Y, sint=1S, bint=1000L) select key, value from src limit 30; Queries like select, describe and drop partition do not work. For example, select * from partcoltypenum where tint=100Y and sint=1S and bint=1000L; does not return any rows. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
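The partition-value handling described above boils down to normalizing the literal postfixes. This is not Hive's implementation; it is a hypothetical Python sketch of the suffix stripping (Y, S, L, BD) that partition-column comparison would need:

```python
# Hypothetical helper (not Hive code): strip Hive numeric literal postfixes
# so 100Y, 1S, 1000L, and 1.5BD compare equal to their plain forms.
def strip_numeric_postfix(token: str) -> str:
    if token.endswith("BD"):          # Decimal literal, e.g. 1.5BD
        return token[:-2]
    if token and token[-1] in "YSL":  # TinyInt / SmallInt / BigInt
        return token[:-1]
    return token
```

With this normalization, a partition spec written as `tint=100Y` and one stored as `tint=100` would match.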
[jira] [Commented] (HIVE-5672) Insert with custom separator not supported for non-local directory
[ https://issues.apache.org/jira/browse/HIVE-5672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496403#comment-14496403 ] Sushanth Sowmyan commented on HIVE-5672: Adding an additional doc note here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Writingdataintothefilesystemfromqueries needs to be updated to note that delimiters are not currently supported for non-LOCAL writes, and once this patch goes in, we should note which version fixed that in that doc. Insert with custom separator not supported for non-local directory -- Key: HIVE-5672 URL: https://issues.apache.org/jira/browse/HIVE-5672 Project: Hive Issue Type: Bug Affects Versions: 0.12.0, 1.0.0 Reporter: Romain Rigaux Assignee: Nemon Lou Attachments: HIVE-5672.1.patch, HIVE-5672.2.patch https://issues.apache.org/jira/browse/HIVE-3682 is great but non-local directories don't seem to be supported: {code} insert overwrite directory '/tmp/test-02' row format delimited FIELDS TERMINATED BY ':' select description FROM sample_07 {code} {code} Error while compiling statement: FAILED: ParseException line 2:0 cannot recognize input near 'row' 'format' 'delimited' in select clause {code} This works (with 'local'): {code} insert overwrite local directory '/tmp/test-02' row format delimited FIELDS TERMINATED BY ':' select code, description FROM sample_07 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10228) Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics
[ https://issues.apache.org/jira/browse/HIVE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-10228: Description: We need to update a couple of hive commands to support replication semantics. To wit, we need the following: EXPORT ... [FOR [METADATA] REPLICATION(“comment”)] Export will now support an extra optional clause to tell it that this export is being prepared for the purpose of replication. There is also an additional optional clause here, that allows for the export to be a metadata-only export, to handle cases of capturing the diff for alter statements, for example. Also, if done for replication, the non-presence of a table, or a table being a view/offline table/non-native table is not considered an error, and instead, will result in a successful no-op. IMPORT ... (as normal) – but handles new semantics No syntax changes for import, but import will have to change to be able to handle all the permutations of export dumps possible. Also, import will have to ensure that it should update the object only if the update being imported is not older than the state of the object. Also, import currently does not work with dbname.tablename kind of specification, this should be fixed to work. DROP TABLE ... FOR REPLICATION('eventid') Drop Table now has an additional clause, to specify that this drop table is being done for replication purposes, and that the drop should not actually drop the table if the table is newer than that event id specified. ALTER TABLE ... DROP PARTITION (...) FOR REPLICATION('eventid') Similarly, Drop Partition also has an equivalent change to Drop Table. = In addition, we introduce a new property repl.last.id, which when tagged on to table properties or partition properties on a replication-destination, holds the effective state identifier of the object. was: We need to update a couple of hive commands to support replication semantics. To wit, we need the following: EXPORT ... 
[FOR [METADATA] REPLICATION(“comment”)] Export will now support an extra optional clause to tell it that this export is being prepared for the purpose of replication. There is also an additional optional clause here, that allows for the export to be a metadata-only export, to handle cases of capturing the diff for alter statements, for example. Also, if done for replication, the non-presence of a table, or a table being a view/offline table/non-native table is not considered an error, and instead, will result in a successful no-op. IMPORT ... (as normal) – but handles new semantics No syntax changes for import, but import will have to change to be able to handle all the permutations of export dumps possible. Also, import will have to ensure that it should update the object only if the update being imported is not older than the state of the object. DROP TABLE ... FOR REPLICATION('eventid') Drop Table now has an additional clause, to specify that this drop table is being done for replication purposes, and that the drop should not actually drop the table if the table is newer than that event id specified. ALTER TABLE ... DROP PARTITION (...) FOR REPLICATION('eventid') Similarly, Drop Partition also has an equivalent change to Drop Table. = In addition, we introduce a new property repl.last.id, which when tagged on to table properties or partition properties on a replication-destination, holds the effective state identifier of the object. Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics -- Key: HIVE-10228 URL: https://issues.apache.org/jira/browse/HIVE-10228 Project: Hive Issue Type: Sub-task Components: Import/Export Affects Versions: 1.2.0 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-10228.2.patch, HIVE-10228.3.patch, HIVE-10228.patch We need to update a couple of hive commands to support replication semantics. To wit, we need the following: EXPORT ... 
[FOR [METADATA] REPLICATION(“comment”)] Export will now support an extra optional clause to tell it that this export is being prepared for the purpose of replication. There is also an additional optional clause here, that allows for the export to be a metadata-only export, to handle cases of capturing the diff for alter statements, for example. Also, if done for replication, the non-presence of a table, or a table being a view/offline table/non-native table is not considered an error, and instead, will result in a successful no-op. IMPORT ... (as normal) – but handles new semantics No syntax changes for import, but import will have to change to be able to handle all the permutations of export dumps possible. Also, import will have
[jira] [Updated] (HIVE-10310) Support GROUPING() in HIVE
[ https://issues.apache.org/jira/browse/HIVE-10310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-10310: -- Summary: Support GROUPING() in HIVE (was: Support GROUPING() and GROUP_ID() in HIVE) Support GROUPING() in HIVE -- Key: HIVE-10310 URL: https://issues.apache.org/jira/browse/HIVE-10310 Project: Hive Issue Type: New Feature Components: Parser, SQL Reporter: sanjiv singh Priority: Minor I have lots of queries using the GROUPING() function; they fail on Hive just because GROUPING() is not supported in Hive. See the query below: SELECT fact_1_id, fact_2_id, GROUPING(fact_1_id) AS f1g, GROUPING(fact_2_id) AS f2g FROM dimension_tab GROUP BY CUBE (fact_1_id, fact_2_id) ORDER BY fact_1_id, fact_2_id; To run all such queries in Hive, they need to be transformed to Hive syntax. See the transformed, Hive-compatible query below; the equivalent has been derived using a CASE statement. SELECT fact_1_id, fact_2_id, (case when (GROUPING__ID & 1) = 0 then 1 else 0 end) as f1g, (case when (GROUPING__ID & 2) = 0 then 1 else 0 end) as f2g FROM dimension_tab GROUP BY fact_1_id, fact_2_id WITH CUBE ORDER BY fact_1_id, fact_2_id; It would be great if GROUPING() were implemented in Hive. I see two ways to do it: 1) handle it at the parser level; 2) add a GROUPING() aggregate function to Hive (recommended). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
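The CASE workaround above relies on testing individual bits of GROUPING__ID. A minimal Python sketch of that bit test, assuming the convention the transformed query uses (a zero bit means the column was rolled up, so GROUPING() returns 1; the mask values 1 and 2 are the bit assignment implied by the query, not verified against Hive internals):

```python
# Emulates GROUPING(col) from GROUPING__ID, mirroring the CASE expressions:
# (case when (GROUPING__ID & mask) = 0 then 1 else 0 end)
def grouping(grouping_id: int, mask: int) -> int:
    return 1 if (grouping_id & mask) == 0 else 0

# For CUBE(fact_1_id, fact_2_id), fact_1_id uses mask 1 and
# fact_2_id uses mask 2, matching f1g and f2g above.
```

For the grand-total row (grouping_id 0) both calls return 1; for the fully grouped row (grouping_id 3) both return 0.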
[jira] [Commented] (HIVE-10288) Cannot call permanent UDFs
[ https://issues.apache.org/jira/browse/HIVE-10288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496468#comment-14496468 ] Nezih Yigitbasi commented on HIVE-10288: Thanks [~jdere] and [~chinnalalam] for the quick turnaround. I also verified the patch with several tests and it seems to solve this issue. Cannot call permanent UDFs -- Key: HIVE-10288 URL: https://issues.apache.org/jira/browse/HIVE-10288 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Nezih Yigitbasi Assignee: Chinna Rao Lalam Attachments: HIVE-10288.1.patch, HIVE-10288.patch Just pulled the trunk and built the hive binary. If I create a permanent udf and exit the cli, and then open the cli and try calling the udf it fails with the exception below. However, the call succeeds if I call the udf right after registering the permanent udf (without exiting the cli). The call also succeeds with the apache-hive-1.0.0 release. {code} 15-04-13 17:04:54,004 INFO org.apache.hadoop.hive.ql.log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - /PERFLOG method=parse start=1428969893115 end=1428969894004 duration=889 from=org.apache.hadoop.hive.ql.Driver 2015-04-13 17:04:54,007 DEBUG org.apache.hadoop.hive.ql.Driver (Driver.java:recordValidTxns(939)) - Encoding valid txns info 9223372036854775807: 2015-04-13 17:04:54,007 INFO org.apache.hadoop.hive.ql.log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - PERFLOG method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver 2015-04-13 17:04:54,052 INFO org.apache.hadoop.hive.ql.parse.CalcitePlanner (SemanticAnalyzer.java:analyzeInternal(9997)) - Starting Semantic Analysis 2015-04-13 17:04:54,053 DEBUG org.apache.hadoop.hive.ql.exec.FunctionRegistry (FunctionRegistry.java:getGenericUDAFResolver(942)) - Looking up GenericUDAF: hour_now 2015-04-13 17:04:54,053 INFO org.apache.hadoop.hive.ql.parse.CalcitePlanner (SemanticAnalyzer.java:genResolvedParseTree(9980)) - Completed phase 1 of Semantic Analysis 2015-04-13 
17:04:54,053 INFO org.apache.hadoop.hive.ql.parse.CalcitePlanner (SemanticAnalyzer.java:getMetaData(1530)) - Get metadata for source tables 2015-04-13 17:04:54,054 INFO org.apache.hadoop.hive.metastore.HiveMetaStore (HiveMetaStore.java:logInfo(744)) - 0: get_table : db=default tbl=test_table 2015-04-13 17:04:54,054 INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(369)) - ugi=nyigitbasi ip=unknown-ip-addr cmd=get_table : db=default tbl=test_table 2015-04-13 17:04:54,054 DEBUG org.apache.hadoop.hive.metastore.ObjectStore (ObjectStore.java:debugLog(6776)) - Open transaction: count = 1, isActive = true at: org.apache.hadoop.hive.metastore.ObjectStore.getTable(ObjectStore.java:927) 2015-04-13 17:04:54,054 DEBUG org.apache.hadoop.hive.metastore.ObjectStore (ObjectStore.java:debugLog(6776)) - Open transaction: count = 2, isActive = true at: org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:990) 2015-04-13 17:04:54,104 DEBUG org.apache.hadoop.hive.metastore.ObjectStore (ObjectStore.java:debugLog(6776)) - Commit transaction: count = 1, isactive true at: org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:998) 2015-04-13 17:04:54,232 DEBUG org.apache.hadoop.hive.metastore.ObjectStore (ObjectStore.java:debugLog(6776)) - Commit transaction: count = 0, isactive true at: org.apache.hadoop.hive.metastore.ObjectStore.getTable(ObjectStore.java:929) 2015-04-13 17:04:54,242 INFO org.apache.hadoop.hive.ql.parse.CalcitePlanner (SemanticAnalyzer.java:getMetaData(1682)) - Get metadata for subqueries 2015-04-13 17:04:54,247 INFO org.apache.hadoop.hive.ql.parse.CalcitePlanner (SemanticAnalyzer.java:getMetaData(1706)) - Get metadata for destination tables 2015-04-13 17:04:54,256 INFO org.apache.hadoop.hive.ql.parse.CalcitePlanner (SemanticAnalyzer.java:genResolvedParseTree(9984)) - Completed getting MetaData in Semantic Analysis 2015-04-13 17:04:54,259 INFO 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer (CalcitePlanner.java:canHandleAstForCbo(369)) - Not invoking CBO because the statement has too few joins 2015-04-13 17:04:54,344 DEBUG org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe (LazySimpleSerDe.java:initialize(135)) - org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe initialized with: columnNames=[_c0, _c1] columnTypes=[int, int] separator=[[B@6e6d4780] nullstring=\N lastColumnTakesRest=false timestampFormats=null 2015-04-13 17:04:54,406 DEBUG org.apache.hadoop.hive.ql.parse.CalcitePlanner (SemanticAnalyzer.java:genTablePlan(9458)) - Created Table Plan for test_table TS[0] 2015-04-13 17:04:54,410 DEBUG
[jira] [Updated] (HIVE-10239) Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and PostgreSQL
[ https://issues.apache.org/jira/browse/HIVE-10239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-10239: --- Attachment: HIVE-10239.0.patch Re-uploading patch to start jenkins tests. Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and PostgreSQL Key: HIVE-10239 URL: https://issues.apache.org/jira/browse/HIVE-10239 Project: Hive Issue Type: Improvement Affects Versions: 1.1.0 Reporter: Naveen Gangam Assignee: Naveen Gangam Attachments: HIVE-10239-donotcommit.patch, HIVE-10239.0.patch, HIVE-10239.0.patch, HIVE-10239.patch Need to create DB-implementation specific scripts to use the framework introduced in HIVE-9800 to have any metastore schema changes tested across all supported databases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10319) Hive CLI startup takes a long time with a large number of databases
[ https://issues.apache.org/jira/browse/HIVE-10319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nezih Yigitbasi updated HIVE-10319: --- Attachment: HIVE-10319.patch Hive CLI startup takes a long time with a large number of databases --- Key: HIVE-10319 URL: https://issues.apache.org/jira/browse/HIVE-10319 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 1.0.0 Reporter: Nezih Yigitbasi Assignee: Nezih Yigitbasi Attachments: HIVE-10319.patch The Hive CLI takes a long time to start when there is a large number of databases in the DW. I think the root cause is the way permanent UDFs are loaded from the metastore. When I looked at the logs and the source code I saw that at startup Hive first gets all the databases from the metastore and then, for each database, makes a metastore call to get the permanent functions for that database [see Hive.java | https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L162-185]. So the number of metastore calls made is on the order of the number of databases. In production we have several hundred databases, so Hive makes several hundred RPC calls during startup, taking 30+ seconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
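The N-calls-at-startup pattern described above can be sketched with a toy client. The names below are illustrative, not the actual Hive metastore API; the point is that the round-trip count grows linearly with the number of databases:

```python
class FakeMetastoreClient:
    """Counts round-trips; stands in for the real metastore client."""
    def __init__(self, functions_by_db):
        self.functions_by_db = functions_by_db
        self.rpc_count = 0

    def get_all_databases(self):
        self.rpc_count += 1
        return list(self.functions_by_db)

    def get_functions(self, db, pattern):
        self.rpc_count += 1
        return self.functions_by_db[db]

def load_permanent_functions(client):
    # One RPC for the database list, then one RPC per database --
    # the startup behavior the report describes.
    funcs = {}
    for db in client.get_all_databases():
        funcs[db] = client.get_functions(db, "*")
    return funcs
```

With N databases this issues N+1 round-trips; a single batched call returning all permanent functions would make startup cost constant in the number of databases.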
[jira] [Resolved] (HIVE-10341) CBO (Calcite Return Path): TraitSets not correctly propagated in HiveSortExchange causes Assertion error
[ https://issues.apache.org/jira/browse/HIVE-10341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-10341. - Resolution: Fixed Committed to branch. Thanks, Jesus! CBO (Calcite Return Path): TraitSets not correctly propagated in HiveSortExchange causes Assertion error Key: HIVE-10341 URL: https://issues.apache.org/jira/browse/HIVE-10341 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: cbo-branch Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: cbo-branch Attachments: HIVE-10341.cbo.patch When return path is on ({{hive.cbo.returnpath.hiveop=true}}), the TraitSets are not correctly set up by HiveSortExchange. For instance, correlationoptimizer14.q produces the following exception: {noformat} Unexpected exception java.lang.AssertionError: traits=NONE.[], collation=[0] at org.apache.calcite.rel.core.SortExchange.init(SortExchange.java:63) at org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveSortExchange.init(HiveSortExchange.java:18) at org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveSortExchange.create(HiveSortExchange.java:39) at org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveInsertExchange4JoinRule.onMatch(HiveInsertExchange4JoinRule.java:95) at org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:326) at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:515) at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:392) at org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:255) at org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:125) at org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:207) at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:194) ... {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10270) Cannot use Decimal constants less than 0.1BD
[ https://issues.apache.org/jira/browse/HIVE-10270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496529#comment-14496529 ] Hive QA commented on HIVE-10270: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12725506/HIVE-10270.4.patch {color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8690 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml 
file TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_context_ngrams org.apache.hadoop.hive.serde2.binarysortable.TestBinarySortableFast.testBinarySortableFast {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3445/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3445/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3445/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 15 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12725506 - PreCommit-HIVE-TRUNK-Build Cannot use Decimal constants less than 0.1BD Key: HIVE-10270 URL: https://issues.apache.org/jira/browse/HIVE-10270 Project: Hive Issue Type: Bug Components: Types Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-10270.1.patch, HIVE-10270.2.patch, HIVE-10270.3.patch, HIVE-10270.4.patch {noformat} hive> select 0.09765625BD; FAILED: IllegalArgumentException Decimal scale must be less than or equal to precision {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
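The failure mode above is easy to reproduce outside Hive: for constants below 0.1, counting only the significant digits gives a precision smaller than the scale, which trips a scale-must-not-exceed-precision check. A Python illustration using the standard decimal module:

```python
from decimal import Decimal

# 0.09765625 = 9765625 * 10^-8: seven significant digits, scale of eight.
sign, digits, exponent = Decimal("0.09765625").as_tuple()
precision = len(digits)   # 7 -- the leading zero after the point is not a digit
scale = -exponent         # 8
# A check requiring scale <= precision rejects this perfectly valid constant,
# matching the IllegalArgumentException in the report.
```

The usual fix for such a check is to bump the effective precision up to the scale when the value has leading fractional zeros.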
[jira] [Commented] (HIVE-9923) No clear message when from is missing
[ https://issues.apache.org/jira/browse/HIVE-9923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496451#comment-14496451 ] Yongzhi Chen commented on HIVE-9923: The NullPointerException stack is: {noformat} Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:40882) at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:40059) at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:39929) at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1574) at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1093) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:202) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:396) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1116) at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:110) ... 27 more {noformat} It is from HiveParser.java: {noformat} if ( state.backtracking==0 ) {(s!=null?((CommonTree)s.tree):null).getChild(1).replaceChildren(0, 0, (i!=null?((CommonTree)i.tree):null));} {noformat} When there is no FROM keyword, getChild(1) will be null and the exception is thrown. For an INSERT with a SELECT statement, a FROM should be required, not optional. Change the parser to error out before reaching getChild(1). No clear message when from is missing --- Key: HIVE-9923 URL: https://issues.apache.org/jira/browse/HIVE-9923 Project: Hive Issue Type: Bug Affects Versions: 1.0.0 Reporter: Jeff Zhang Assignee: Yongzhi Chen Attachments: HIVE-9923.1.patch For the following SQL, the FROM is missing but it throws an NPE, which is not clear to the user. 
{code} hive> insert overwrite directory '/tmp/hive-3' select sb1.name, sb2.age student_bucketed sb1 join student_bucketed sb2 on sb1.name=sb2.name; FAILED: NullPointerException null {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9923) No clear message when from is missing
[ https://issues.apache.org/jira/browse/HIVE-9923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-9923: --- Attachment: HIVE-9923.1.patch No clear message when from is missing --- Key: HIVE-9923 URL: https://issues.apache.org/jira/browse/HIVE-9923 Project: Hive Issue Type: Bug Affects Versions: 1.0.0 Reporter: Jeff Zhang Assignee: Yongzhi Chen Attachments: HIVE-9923.1.patch For the following SQL, the FROM is missing but it throws an NPE, which is not clear to the user. {code} hive> insert overwrite directory '/tmp/hive-3' select sb1.name, sb2.age student_bucketed sb1 join student_bucketed sb2 on sb1.name=sb2.name; FAILED: NullPointerException null {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8136) Reduce table locking
[ https://issues.apache.org/jira/browse/HIVE-8136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496651#comment-14496651 ] Hive QA commented on HIVE-8136: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12725511/HIVE-8136.1.patch {color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 8689 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml 
file TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3446/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3446/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3446/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 13 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12725511 - PreCommit-HIVE-TRUNK-Build Reduce table locking Key: HIVE-8136 URL: https://issues.apache.org/jira/browse/HIVE-8136 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Ferdinand Xu Attachments: HIVE-8136.1.patch, HIVE-8136.patch When using ZK for concurrency control, some statements require an exclusive table lock when they are atomic. Such as setting a tables location. This JIRA is to analyze the scope of statements like ALTER TABLE and see if we can reduce the locking required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10329) Hadoop reflectionutils has issues
[ https://issues.apache.org/jira/browse/HIVE-10329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-10329: Summary: Hadoop reflectionutils has issues (was: LLAP: Hadoop reflectionutils has issues) Hadoop reflectionutils has issues - Key: HIVE-10329 URL: https://issues.apache.org/jira/browse/HIVE-10329 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-10329.patch 1) Constructor cache leaks classes and their attendant static overhead forever. 2) Class cache inside conf used when getting JobConfigurable classes has an epic lock. Both bugs are filed in Hadoop but will hardly ever be fixed at this rate. This version avoids both problems. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10331) ORC : Is null SARG filters out all row groups written in old ORC format
[ https://issues.apache.org/jira/browse/HIVE-10331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496571#comment-14496571 ] Prasanth Jayachandran commented on HIVE-10331: -- Actually there is more to this issue: you might need to set the hasNull default back to false, as setNull() explicitly changes it to true whenever a null is encountered for a column, which is correct. The wrong part is not the initialization but the condition when hasNull is missing. Can you change the initialization of hasNull back to the old one and add an else branch with a hasHasNull() check that returns true when the hasNull protobuf field is missing? ORC : Is null SARG filters out all row groups written in old ORC format --- Key: HIVE-10331 URL: https://issues.apache.org/jira/browse/HIVE-10331 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.1.0 Reporter: Mostafa Mokhtar Assignee: Mostafa Mokhtar Fix For: 1.2.0 Attachments: HIVE-10331.01.patch, HIVE-10331.02.patch Queries are returning wrong results as all row groups get filtered out and no rows get scanned. {code} SELECT count(*) FROM store_sales WHERE ss_addr_sk IS NULL {code} With hive.optimize.index.filter disabled we get the correct results. In pickRowGroups, stats show that hasNull_ is false, while the row group actually has nulls. Same query runs fine for newly loaded ORC tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
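The fix being discussed boils down to: only trust a recorded hasNull flag, and treat a missing flag (old ORC writers never wrote it) as "may contain nulls" so an IS NULL predicate cannot prune the row group. A hedged Python sketch of that decision, not the actual ORC reader code:

```python
# stats is a dict standing in for the protobuf ColumnStatistics message;
# an absent "hasNull" key models hasHasNull() returning false.
def may_contain_nulls(stats: dict) -> bool:
    if "hasNull" in stats:      # new format: the writer recorded the flag
        return stats["hasNull"]
    return True                 # old format: be conservative, do not prune
```

Initializing hasNull to false and never checking field presence is exactly what made the old-format row groups look null-free and get pruned.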
[jira] [Commented] (HIVE-10340) Enable ORC test for timezone reading from old format
[ https://issues.apache.org/jira/browse/HIVE-10340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496632#comment-14496632 ] Sergey Shelukhin commented on HIVE-10340: - +1 Enable ORC test for timezone reading from old format Key: HIVE-10340 URL: https://issues.apache.org/jira/browse/HIVE-10340 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Priority: Trivial Attachments: HIVE-10340.1.patch As a part of HIVE-8746 I added a test for reading timezone data from old ORC format that was unintentionally disabled. Re-enable the test. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9710) HiveServer2 should support cookie based authentication, when using HTTP transport.
[ https://issues.apache.org/jira/browse/HIVE-9710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496690#comment-14496690 ] Vaibhav Gumashta commented on HIVE-9710: +1. Thanks for patiently iterating [~hsubramaniyan]. HiveServer2 should support cookie based authentication, when using HTTP transport. -- Key: HIVE-9710 URL: https://issues.apache.org/jira/browse/HIVE-9710 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 1.2.0 Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9710.1.patch, HIVE-9710.2.patch, HIVE-9710.3.patch, HIVE-9710.4.patch, HIVE-9710.5.patch, HIVE-9710.6.patch, HIVE-9710.7.patch, HIVE-9710.8.patch HiveServer2 should generate cookies and validate the client cookie sent to it, so that it need not perform User/Password or Kerberos based authentication on each HTTP request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
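The idea in the description can be sketched with an HMAC-signed cookie: the server signs the cookie once after a full authentication, then validates the signature on later requests instead of re-running Kerberos/LDAP. This is a sketch of the general pattern only; HiveServer2's actual cookie format and signer are different, and the names here are invented:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class CookieAuthSketch {
    // server-side secret; in practice generated per server instance
    static final byte[] KEY = "server-secret".getBytes(StandardCharsets.UTF_8);

    // issue a cookie after a successful full authentication
    public static String sign(String user) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(KEY, "HmacSHA256"));
        byte[] sig = mac.doFinal(user.getBytes(StandardCharsets.UTF_8));
        return user + "&s=" + Base64.getEncoder().encodeToString(sig);
    }

    // on later requests: a valid signature means the expensive auth is skipped
    public static boolean validate(String cookie) throws Exception {
        int i = cookie.lastIndexOf("&s=");
        if (i < 0) return false;
        return cookie.equals(sign(cookie.substring(0, i)));
    }

    public static void main(String[] args) throws Exception {
        String c = sign("hive-user");
        System.out.println(validate(c));       // true: skip re-authentication
        System.out.println(validate(c + "x")); // false: fall back to full auth
    }
}
```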
[jira] [Commented] (HIVE-9580) Server returns incorrect result from JOIN ON VARCHAR columns
[ https://issues.apache.org/jira/browse/HIVE-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496681#comment-14496681 ] Szehon Ho commented on HIVE-9580: - Hi Aihua, it looks like this works: you are making all the varchars (and even chars) in the join comparison use the maximum length to avoid this issue. But I'm not too familiar with this code; I think [~jdere] is the varchar expert, so forwarding to him to take a look as well. Server returns incorrect result from JOIN ON VARCHAR columns Key: HIVE-9580 URL: https://issues.apache.org/jira/browse/HIVE-9580 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0, 0.13.0, 0.14.0 Reporter: Mike Assignee: Aihua Xu Attachments: HIVE-9580.patch The database erroneously returns rows when joining two tables which each contain a VARCHAR column and the join's ON condition uses the equality operator on the VARCHAR columns. The following JDBC method exhibits the problem:
{code}
static void joinIssue() throws SQLException {
  String sql;
  int rowsAffected;
  ResultSet rs;
  Statement stmt = con.createStatement();
  String table1_Name = "blahtab1";
  String table1A_Name = "blahtab1A";
  String table1B_Name = "blahtab1B";
  String table2_Name = "blahtab2";
  try {
    sql = "drop table " + table1_Name;
    System.out.println("\nsql=" + sql);
    rowsAffected = stmt.executeUpdate(sql);
  } catch (SQLException se) {
    println("Drop table error: " + se.getMessage());
  }
  try {
    sql = "CREATE TABLE " + table1_Name + " (" + " VCHARCOL VARCHAR(10)" + " ,INTEGERCOL INT" + " )";
    System.out.println("\nsql=" + sql);
    rowsAffected = stmt.executeUpdate(sql);
  } catch (SQLException se) {
    println("create table error: " + se.getMessage());
  }
  sql = "insert into " + table1_Name + " values ('jklmnopqrs', 99)";
  System.out.println("\nsql=" + sql);
  stmt.executeUpdate(sql);
  System.out.println("===");
  try {
    sql = "drop table " + table1A_Name;
    System.out.println("\nsql=" + sql);
    rowsAffected = stmt.executeUpdate(sql);
  } catch (SQLException se) {
    println("Drop table error: " + se.getMessage());
  }
  try {
    sql = "CREATE TABLE " + table1A_Name + " (" + " VCHARCOL VARCHAR(10)" + " )";
    System.out.println("\nsql=" + sql);
    rowsAffected = stmt.executeUpdate(sql);
  } catch (SQLException se) {
    println("create table error: " + se.getMessage());
  }
  sql = "insert into " + table1A_Name + " values ('jklmnopqrs')";
  System.out.println("\nsql=" + sql);
  stmt.executeUpdate(sql);
  System.out.println("===");
  try {
    sql = "drop table " + table1B_Name;
    System.out.println("\nsql=" + sql);
    rowsAffected = stmt.executeUpdate(sql);
  } catch (SQLException se) {
    println("Drop table error: " + se.getMessage());
  }
  try {
    sql = "CREATE TABLE " + table1B_Name + " (" + " VCHARCOL VARCHAR(11)" + " ,INTEGERCOL INT" + " )";
    System.out.println("\nsql=" + sql);
    rowsAffected = stmt.executeUpdate(sql);
  } catch (SQLException se) {
{code}
[jira] [Updated] (HIVE-10269) HiveMetaStore.java:[6089,29] cannot find symbol class JvmPauseMonitor
[ https://issues.apache.org/jira/browse/HIVE-10269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-10269: Fix Version/s: 1.2.0 HiveMetaStore.java:[6089,29] cannot find symbol class JvmPauseMonitor - Key: HIVE-10269 URL: https://issues.apache.org/jira/browse/HIVE-10269 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 1.2.0 Reporter: Gabor Liptak Assignee: Ferdinand Xu Fix For: 1.2.0 Attachments: HIVE-10269.patch Compiling trunk fails when building based on instructions in https://cwiki.apache.org/confluence/display/Hive/HowToContribute $ git status On branch trunk Your branch is up-to-date with 'origin/trunk'. nothing to commit, working directory clean $ mvn clean install -DskipTests -Phadoop-1 ...[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project hive-metastore: Compilation failure: Compilation failure: [ERROR] /tmp/hive/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java:[6089,29] cannot find symbol [ERROR] symbol: class JvmPauseMonitor [ERROR] location: package org.apache.hadoop.util [ERROR] /tmp/hive/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java:[6090,35] cannot find symbol [ERROR] symbol: class JvmPauseMonitor [ERROR] location: package org.apache.hadoop.util [ERROR] - [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. [ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn goals -rf :hive-metastore -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10269) HiveMetaStore.java:[6089,29] cannot find symbol class JvmPauseMonitor
[ https://issues.apache.org/jira/browse/HIVE-10269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-10269: Affects Version/s: 1.2.0 HiveMetaStore.java:[6089,29] cannot find symbol class JvmPauseMonitor - Key: HIVE-10269 URL: https://issues.apache.org/jira/browse/HIVE-10269 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 1.2.0 Reporter: Gabor Liptak Assignee: Ferdinand Xu Fix For: 1.2.0 Attachments: HIVE-10269.patch Compiling trunk fails when building based on instructions in https://cwiki.apache.org/confluence/display/Hive/HowToContribute $ git status On branch trunk Your branch is up-to-date with 'origin/trunk'. nothing to commit, working directory clean $ mvn clean install -DskipTests -Phadoop-1 ...[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project hive-metastore: Compilation failure: Compilation failure: [ERROR] /tmp/hive/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java:[6089,29] cannot find symbol [ERROR] symbol: class JvmPauseMonitor [ERROR] location: package org.apache.hadoop.util [ERROR] /tmp/hive/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java:[6090,35] cannot find symbol [ERROR] symbol: class JvmPauseMonitor [ERROR] location: package org.apache.hadoop.util [ERROR] - [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. [ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn goals -rf :hive-metastore -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10344) CBO (Calcite Return Path): Use newInstance to create ExprNodeGenericFuncDesc rather than construction function
[ https://issues.apache.org/jira/browse/HIVE-10344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-10344: --- Summary: CBO (Calcite Return Path): Use newInstance to create ExprNodeGenericFuncDesc rather than construction function (was: Use newInstance to create ExprNodeGenericFuncDesc rather than construction function) CBO (Calcite Return Path): Use newInstance to create ExprNodeGenericFuncDesc rather than construction function -- Key: HIVE-10344 URL: https://issues.apache.org/jira/browse/HIVE-10344 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Fix For: 1.2.0 ExprNodeGenericFuncDesc is currently created using a constructor, which skips the initialization step genericUDF.initializeAndFoldConstants that the newInstance method performs. If the initialization step is skipped, some configuration parameters are not included in the serialization, which generates wrong results/errors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10332) CBO (Calcite Return Path): Use SortExchange rather than LogicalExchange for HiveOpConverter
[ https://issues.apache.org/jira/browse/HIVE-10332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-10332: --- Summary: CBO (Calcite Return Path): Use SortExchange rather than LogicalExchange for HiveOpConverter (was: Use SortExchange rather than LogicalExchange for HiveOpConverter) CBO (Calcite Return Path): Use SortExchange rather than LogicalExchange for HiveOpConverter --- Key: HIVE-10332 URL: https://issues.apache.org/jira/browse/HIVE-10332 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Fix For: cbo-branch Attachments: HIVE-10332.01.patch Right now HiveSortExchange extends SortExchange extends Exchange. LogicalExchange extends Exchange. LogicalExchange is expected in HiveOpConverter but HiveSortExchange is created. After discussion, we plan to change LogicalExchange to HiveSortExchange. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10344) CBO (Calcite Return Path): Use newInstance to create ExprNodeGenericFuncDesc rather than construction function
[ https://issues.apache.org/jira/browse/HIVE-10344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496884#comment-14496884 ] Pengcheng Xiong commented on HIVE-10344: [~jpullokkaran], after this patch, cbo_simple_select passes when the return path is turned on. CBO (Calcite Return Path): Use newInstance to create ExprNodeGenericFuncDesc rather than construction function -- Key: HIVE-10344 URL: https://issues.apache.org/jira/browse/HIVE-10344 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Fix For: 1.2.0 Attachments: HIVE-10344.01.patch ExprNodeGenericFuncDesc is currently created using a constructor, which skips the initialization step genericUDF.initializeAndFoldConstants that the newInstance method performs. If the initialization step is skipped, some configuration parameters are not included in the serialization, which generates wrong results/errors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10270) Cannot use Decimal constants less than 0.1BD
[ https://issues.apache.org/jira/browse/HIVE-10270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-10270: -- Attachment: HIVE-10270.5.patch The TestBinarySortableFast failure was due to HIVE-9937 having duplicated some BinarySortableSerDe serialization logic, including for decimals. For patch v5 I have refactored it so that BinarySortableSerDe and BinarySortableSerializeWrite call into the same common logic, and updated the tests for TestBinarySortableFast, similar to TestBinarySortableSerDe. Cannot use Decimal constants less than 0.1BD Key: HIVE-10270 URL: https://issues.apache.org/jira/browse/HIVE-10270 Project: Hive Issue Type: Bug Components: Types Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-10270.1.patch, HIVE-10270.2.patch, HIVE-10270.3.patch, HIVE-10270.4.patch, HIVE-10270.5.patch {noformat} hive select 0.09765625BD; FAILED: IllegalArgumentException Decimal scale must be less than or equal to precision {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
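The failing constant illustrates why "scale must be less than or equal to precision" rejects values below 0.1: with leading zeros after the decimal point, the number of significant digits (precision) is smaller than the scale. A quick `java.math.BigDecimal` illustration of the same constant (standard-library behavior, not Hive's HiveDecimal code):

```java
import java.math.BigDecimal;

public class DecimalScaleDemo {
    public static void main(String[] args) {
        BigDecimal d = new BigDecimal("0.09765625");
        // unscaled value is 9765625 (7 significant digits), shifted 8 places right
        System.out.println(d.precision()); // 7
        System.out.println(d.scale());     // 8
        // scale (8) > precision (7): exactly the case the error message rejects
    }
}
```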
[jira] [Updated] (HIVE-9617) UDF from_utc_timestamp throws NPE if the second argument is null
[ https://issues.apache.org/jira/browse/HIVE-9617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-9617: - Fix Version/s: 1.2.0 UDF from_utc_timestamp throws NPE if the second argument is null Key: HIVE-9617 URL: https://issues.apache.org/jira/browse/HIVE-9617 Project: Hive Issue Type: Bug Components: UDF Reporter: Alexander Pivovarov Assignee: Alexander Pivovarov Priority: Minor Fix For: 1.2.0 Attachments: HIVE-9617.1.patch, HIVE-9617.2.patch UDF from_utc_timestamp throws NPE if the second argument is null {code} select from_utc_timestamp('2015-02-06 10:30:00', cast(null as string)); FAILED: NullPointerException null {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
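The usual fix for this class of NPE is to propagate null before doing any timezone work. A minimal sketch of null-safe evaluation for a two-argument UDF (illustrative only, not the actual GenericUDF code; the offset arithmetic is simplified):

```java
import java.sql.Timestamp;
import java.util.TimeZone;

public class NullSafeUdfSketch {
    // Sketch of from_utc_timestamp-style evaluation: any null argument
    // yields null (SQL semantics) instead of throwing an NPE.
    public static Timestamp fromUtc(Timestamp ts, String tz) {
        if (ts == null || tz == null) {
            return null; // null in, null out
        }
        long offset = TimeZone.getTimeZone(tz).getOffset(ts.getTime());
        return new Timestamp(ts.getTime() + offset);
    }

    public static void main(String[] args) {
        System.out.println(fromUtc(null, "PST")); // null, no exception
        System.out.println(fromUtc(Timestamp.valueOf("2015-02-06 10:30:00"), null)); // null
    }
}
```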
[jira] [Commented] (HIVE-10344) CBO (Calcite Return Path): Use newInstance to create ExprNodeGenericFuncDesc rather than construction function
[ https://issues.apache.org/jira/browse/HIVE-10344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496924#comment-14496924 ] Laljo John Pullokkaran commented on HIVE-10344: --- [~ashutoshc] Could you review and check this in to the CBO branch? CBO (Calcite Return Path): Use newInstance to create ExprNodeGenericFuncDesc rather than construction function -- Key: HIVE-10344 URL: https://issues.apache.org/jira/browse/HIVE-10344 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Fix For: 1.2.0 Attachments: HIVE-10344.01.patch ExprNodeGenericFuncDesc is currently created using a constructor, which skips the initialization step genericUDF.initializeAndFoldConstants that the newInstance method performs. If the initialization step is skipped, some configuration parameters are not included in the serialization, which generates wrong results/errors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9917) After HIVE-3454 is done, make int to timestamp conversion configurable
[ https://issues.apache.org/jira/browse/HIVE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496784#comment-14496784 ] Hive QA commented on HIVE-9917: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12725574/HIVE-9917.patch {color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 8692 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file 
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_between_in org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_between_in org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_between_in org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3447/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3447/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3447/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 18 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12725574 - PreCommit-HIVE-TRUNK-Build After HIVE-3454 is done, make int to timestamp conversion configurable -- Key: HIVE-9917 URL: https://issues.apache.org/jira/browse/HIVE-9917 Project: Hive Issue Type: Improvement Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-9917.patch After HIVE-3454 is fixed, we will have the correct behavior when converting int to timestamp. Since customers have relied on the incorrect behavior for so long, it is better to make it configurable: one release defaults to the old/inconsistent way, the next release defaults to the new/consistent way, and then we will deprecate the setting. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
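For context, the inconsistency being made configurable is whether an integral value cast to a timestamp is interpreted as seconds or as milliseconds since the epoch. A sketch of the two interpretations behind such a flag; the flag name is hypothetical and which interpretation is "legacy" is governed by the actual config, not asserted here:

```java
import java.sql.Timestamp;

public class IntToTimestampSketch {
    // hypothetical flag: true = interpret the integral value as epoch seconds,
    // false = interpret it as epoch milliseconds
    public static Timestamp toTimestamp(long value, boolean asSeconds) {
        long millis = asSeconds ? value * 1000L : value;
        return new Timestamp(millis);
    }

    public static void main(String[] args) {
        long v = 1428969600L;
        // the same stored integer yields timestamps three orders of magnitude apart
        System.out.println(toTimestamp(v, true).getTime());  // 1428969600000
        System.out.println(toTimestamp(v, false).getTime()); // 1428969600
    }
}
```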
[jira] [Commented] (HIVE-9580) Server returns incorrect result from JOIN ON VARCHAR columns
[ https://issues.apache.org/jira/browse/HIVE-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496785#comment-14496785 ] Aihua Xu commented on HIVE-9580: That's right. For the key comparison, a UDF performs the key conversion when the types differ; for data types that do not otherwise need conversion, including char or varchar with different lengths, I think we should pick the common type as the key type. Server returns incorrect result from JOIN ON VARCHAR columns Key: HIVE-9580 URL: https://issues.apache.org/jira/browse/HIVE-9580 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0, 0.13.0, 0.14.0 Reporter: Mike Assignee: Aihua Xu Attachments: HIVE-9580.patch The database erroneously returns rows when joining two tables which each contain a VARCHAR column and the join's ON condition uses the equality operator on the VARCHAR columns. The following JDBC method exhibits the problem:
{code}
static void joinIssue() throws SQLException {
  String sql;
  int rowsAffected;
  ResultSet rs;
  Statement stmt = con.createStatement();
  String table1_Name = "blahtab1";
  String table1A_Name = "blahtab1A";
  String table1B_Name = "blahtab1B";
  String table2_Name = "blahtab2";
  try {
    sql = "drop table " + table1_Name;
    System.out.println("\nsql=" + sql);
    rowsAffected = stmt.executeUpdate(sql);
  } catch (SQLException se) {
    println("Drop table error: " + se.getMessage());
  }
  try {
    sql = "CREATE TABLE " + table1_Name + " (" + " VCHARCOL VARCHAR(10)" + " ,INTEGERCOL INT" + " )";
    System.out.println("\nsql=" + sql);
    rowsAffected = stmt.executeUpdate(sql);
  } catch (SQLException se) {
    println("create table error: " + se.getMessage());
  }
  sql = "insert into " + table1_Name + " values ('jklmnopqrs', 99)";
  System.out.println("\nsql=" + sql);
  stmt.executeUpdate(sql);
  System.out.println("===");
  try {
    sql = "drop table " + table1A_Name;
    System.out.println("\nsql=" + sql);
    rowsAffected = stmt.executeUpdate(sql);
  } catch (SQLException se) {
    println("Drop table error: " + se.getMessage());
  }
  try {
    sql = "CREATE TABLE " + table1A_Name + " (" + " VCHARCOL VARCHAR(10)" + " )";
    System.out.println("\nsql=" + sql);
    rowsAffected = stmt.executeUpdate(sql);
  } catch (SQLException se) {
    println("create table error: " + se.getMessage());
  }
  sql = "insert into " + table1A_Name + " values ('jklmnopqrs')";
  System.out.println("\nsql=" + sql);
  stmt.executeUpdate(sql);
  System.out.println("===");
  try {
    sql = "drop table " + table1B_Name;
    System.out.println("\nsql=" + sql);
    rowsAffected = stmt.executeUpdate(sql);
  } catch (SQLException se) {
    println("Drop table error: " + se.getMessage());
  }
  try {
    sql = "CREATE TABLE " + table1B_Name + " (" + " VCHARCOL VARCHAR(11)" + " ,INTEGERCOL INT" + " )";
    System.out.println("\nsql=" + sql);
    rowsAffected = stmt.executeUpdate(sql);
  } catch (SQLException se) {
{code}
[jira] [Updated] (HIVE-10343) CBO (Calcite Return Path): Parameterize algorithm cost model
[ https://issues.apache.org/jira/browse/HIVE-10343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-10343: -- Attachment: HIVE-10343.patch CBO (Calcite Return Path): Parameterize algorithm cost model Key: HIVE-10343 URL: https://issues.apache.org/jira/browse/HIVE-10343 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran Fix For: 1.2.0 Attachments: HIVE-10343.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10284) enable container reuse for grace hash join
[ https://issues.apache.org/jira/browse/HIVE-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496750#comment-14496750 ] Matt McCline commented on HIVE-10284: - I'm not sure what is going on here. Probably we are forming the vector expression writers incorrectly in the new code we added. I need to go study the code and think. enable container reuse for grace hash join --- Key: HIVE-10284 URL: https://issues.apache.org/jira/browse/HIVE-10284 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Wei Zheng Attachments: HIVE-10284.1.patch, HIVE-10284.2.patch, HIVE-10284.3.patch, HIVE-10284.4.patch, HIVE-10284.5.patch, HIVE-10284.6.patch, HIVE-10284.7.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10324) Hive metatool should take table_param_key to allow for changes to avro serde's schema url key
[ https://issues.apache.org/jira/browse/HIVE-10324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496694#comment-14496694 ] Szehon Ho commented on HIVE-10324: -- Thanks Ferdinand for taking care of this. Can we keep the update of any property that matches a StorageDescriptor property, and just add another method for table properties? I am afraid that somebody might be using this, unless we can confirm that the StorageDescriptor property is never used. Hive metatool should take table_param_key to allow for changes to avro serde's schema url key - Key: HIVE-10324 URL: https://issues.apache.org/jira/browse/HIVE-10324 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 1.1.0 Reporter: Szehon Ho Assignee: Ferdinand Xu Attachments: HIVE-10324.patch, HIVE-10324.patch.WIP HIVE-3443 added support to change the serdeParams from the 'metatool updateLocation' command. However, in avro it is possible to specify the schema via the tableParams: {noformat} CREATE TABLE `testavro`( `test` string COMMENT 'from deserializer') ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat' TBLPROPERTIES ( 'avro.schema.url'='hdfs://namenode:8020/tmp/test.avsc', 'kite.compression.type'='snappy', 'transient_lastDdlTime'='1427996456') {noformat} Hence for those tables 'metatool updateLocation' will not help. This is necessary in cases like upgrading the namenode to HA, where the absolute paths have changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
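The metatool use case above boils down to rewriting an HDFS URI prefix stored inside a table property such as avro.schema.url. A minimal sketch of that rewrite (hypothetical helper names, not the actual metatool code):

```java
public class UpdateLocationSketch {
    // Replace the namenode authority in an absolute HDFS URI, as needed when
    // migrating to HA, where hdfs://namenode:8020 becomes e.g. hdfs://nameservice1.
    public static String updateLocation(String value, String oldPrefix, String newPrefix) {
        if (value != null && value.startsWith(oldPrefix)) {
            return newPrefix + value.substring(oldPrefix.length());
        }
        return value; // property does not point at the old namenode: leave untouched
    }

    public static void main(String[] args) {
        String url = "hdfs://namenode:8020/tmp/test.avsc";
        System.out.println(updateLocation(url, "hdfs://namenode:8020", "hdfs://nameservice1"));
        // hdfs://nameservice1/tmp/test.avsc
    }
}
```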
[jira] [Updated] (HIVE-10273) Union with partition tables which have no data fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-10273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-10273: -- Attachment: HIVE-10273.6.patch Updated to latest trunk. Test failures unrelated. Union with partition tables which have no data fails with NPE - Key: HIVE-10273 URL: https://issues.apache.org/jira/browse/HIVE-10273 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.2.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-10273.1.patch, HIVE-10273.2.patch, HIVE-10273.3.patch, HIVE-10273.4.patch, HIVE-10273.5.patch, HIVE-10273.6.patch As shown in the test case in the patch below, when we have partitioned tables which have no data, we fail with an NPE with the following stack trace: {code} NullPointerException null java.lang.NullPointerException at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:128) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.validateMapWork(Vectorizer.java:357) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:321) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:307) at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111) at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:194) at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:139) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:847) at org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeTaskPlan(TezCompiler.java:468) at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:223) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10170) at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10233) Hive on LLAP: Memory manager
[ https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-10233: -- Attachment: HIVE-10233-WIP-2.patch Hive on LLAP: Memory manager Key: HIVE-10233 URL: https://issues.apache.org/jira/browse/HIVE-10233 Project: Hive Issue Type: Bug Components: Tez Affects Versions: llap Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP.patch We need a memory manager in llap/tez to manage the usage of memory across threads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10350) CBO: With hive.cbo.costmodel.extended enabled IO cost is negative
[ https://issues.apache.org/jira/browse/HIVE-10350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mostafa Mokhtar updated HIVE-10350: --- Issue Type: Sub-task (was: Bug) Parent: HIVE-9132 CBO: With hive.cbo.costmodel.extended enabled IO cost is negative - Key: HIVE-10350 URL: https://issues.apache.org/jira/browse/HIVE-10350 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: 1.2.0 Reporter: Mostafa Mokhtar Assignee: Mostafa Mokhtar Fix For: 1.2.0 Not an overflow but parallelism ends up being -1 as it uses number of buckets {code} final int parallelism = RelMetadataQuery.splitCount(join) == null ? 1 : RelMetadataQuery.splitCount(join); {code} {code} 2015-04-13 18:19:09,154 DEBUG [main]: cost.HiveCostModel (HiveCostModel.java:getJoinCost(62)) - COMMON_JOIN cost: {1600892.857142857 rows, 2.4463782008994658E7 cpu, 8.54445445875E10 io} 2015-04-13 18:19:09,155 DEBUG [main]: cost.HiveCostModel (HiveCostModel.java:getJoinCost(62)) - MAP_JOIN cost: {1600892.857142857 rows, 1601785.714285714 cpu, -1698787.48 io} 2015-04-13 18:19:09,155 DEBUG [main]: cost.HiveCostModel (HiveCostModel.java:getJoinCost(72)) - MAP_JOIN selected 2015-04-13 18:19:09,157 DEBUG [main]: parse.CalcitePlanner (CalcitePlanner.java:apply(862)) - Plan After Join Reordering: HiveSort(fetch=[100]): rowcount = 6006.726049749041, cumulative cost = {1.1468867492063493E8 rows, 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, id = 3000 HiveSort(sort0=[$0], dir0=[ASC]): rowcount = 6006.726049749041, cumulative cost = {1.1468867492063493E8 rows, 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, id = 2998 HiveProject(customer_id=[$4], customername=[concat($9, ', ', $8)]): rowcount = 6006.726049749041, cumulative cost = {1.1468867492063493E8 rows, 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, id = 3136 HiveJoin(condition=[=($1, $5)], joinType=[inner], joinAlgorithm=[map_join], cost=[{5.557820341269841E7 rows, 5.557840182539682E7 cpu, -4299694.122023809 io}]): rowcount 
= 6006.726049749041, cumulative cost = {1.1468867492063493E8 rows, 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, id = 3132 HiveJoin(condition=[=($0, $1)], joinType=[inner], joinAlgorithm=[map_join], cost=[{5.7498805E7 rows, 5.9419605E7 cpu, -1.15248E9 io}]): rowcount = 5.5578005E7, cumulative cost = {5.7498805E7 rows, 5.9419605E7 cpu, -1.15248E9 io}, id = 3100 HiveProject(sr_cdemo_sk=[$4]): rowcount = 5.5578005E7, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2992 HiveTableScan(table=[[tpcds_bin_orc_200.store_returns]]): rowcount = 5.5578005E7, cumulative cost = {0}, id = 2878 HiveProject(cd_demo_sk=[$0]): rowcount = 1920800.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2978 HiveTableScan(table=[[tpcds_bin_orc_200.customer_demographics]]): rowcount = 1920800.0, cumulative cost = {0}, id = 2868 HiveJoin(condition=[=($10, $1)], joinType=[inner], joinAlgorithm=[map_join], cost=[{1787.9365079365077 rows, 1790.15873015873 cpu, -8000.0 io}]): rowcount = 198.4126984126984, cumulative cost = {1611666.507936508 rows, 1619761.5873015872 cpu, -1.89867875E7 io}, id = 3130 HiveJoin(condition=[=($0, $4)], joinType=[inner], joinAlgorithm=[map_join], cost=[{8985.714285714286 rows, 16185.714285714286 cpu, -1.728E7 io}]): rowcount = 1785.7142857142856, cumulative cost = {1609878.5714285714 rows, 1617971.4285714284 cpu, -1.89787875E7 io}, id = 3128 HiveProject(hd_demo_sk=[$0], hd_income_band_sk=[$1]): rowcount = 7200.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2982 HiveTableScan(table=[[tpcds_bin_orc_200.household_demographics]]): rowcount = 7200.0, cumulative cost = {0}, id = 2871 HiveJoin(condition=[=($3, $6)], joinType=[inner], joinAlgorithm=[map_join], cost=[{1600892.857142857 rows, 1601785.714285714 cpu, -1698787.48 io}]): rowcount = 1785.7142857142856, cumulative cost = {1600892.857142857 rows, 1601785.714285714 cpu, -1698787.48 io}, id = 3105 HiveProject(c_customer_id=[$1], c_current_cdemo_sk=[$2], c_current_hdemo_sk=[$3], 
c_current_addr_sk=[$4], c_first_name=[$8], c_last_name=[$9]): rowcount = 160.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2970 HiveTableScan(table=[[tpcds_bin_orc_200.customer]]): rowcount = 160.0, cumulative cost = {0}, id = 2862 HiveProject(ca_address_sk=[$0], ca_city=[$6]): rowcount = 892.8571428571428, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2974 HiveFilter(condition=[=($6, 'Hopewell')]): rowcount = 892.8571428571428,
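The description above pins the negative IO cost on a parallelism of -1 flowing out of the split count. A minimal sketch of the kind of guard that prevents this (names and class are illustrative, not Hive's actual fix): clamp the value used as parallelism to at least 1, so null and non-positive split counts derived from bucket metadata can never make the cost negative.

```java
// Hypothetical sketch, not Hive's actual code: guard the split count used as
// join parallelism so a -1 coming from bucket counts cannot flip IO cost negative.
public class ParallelismGuard {

    // Returns a parallelism of at least 1; null or non-positive inputs
    // (e.g. -1 derived from bucket metadata) fall back to 1.
    static int safeParallelism(Integer splitCount) {
        return (splitCount == null || splitCount <= 0) ? 1 : splitCount;
    }

    public static void main(String[] args) {
        System.out.println(safeParallelism(null)); // no metadata -> 1
        System.out.println(safeParallelism(-1));   // bucket-count artifact -> 1
        System.out.println(safeParallelism(4));    // sane value passes through
    }
}
```

With this guard in place of the bare `RelMetadataQuery.splitCount(join)` ternary from the description, the IO term stays non-negative regardless of what the metadata provider returns.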
[jira] [Updated] (HIVE-10350) CBO: With hive.cbo.costmodel.extended enabled IO cost is negative
[ https://issues.apache.org/jira/browse/HIVE-10350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mostafa Mokhtar updated HIVE-10350: --- Attachment: HIVE-10331.01.patch CBO: With hive.cbo.costmodel.extended enabled IO cost is negative - Key: HIVE-10350 URL: https://issues.apache.org/jira/browse/HIVE-10350 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: 1.2.0 Reporter: Mostafa Mokhtar Assignee: Mostafa Mokhtar Fix For: 1.2.0 Attachments: HIVE-10331.01.patch
[jira] [Updated] (HIVE-10346) Tez on HBase has problems with settings again
[ https://issues.apache.org/jira/browse/HIVE-10346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-10346: Attachment: HIVE-10346.patch Tez on HBase has problems with settings again - Key: HIVE-10346 URL: https://issues.apache.org/jira/browse/HIVE-10346 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-10346.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10346) Tez on HBase has problems with settings again
[ https://issues.apache.org/jira/browse/HIVE-10346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497089#comment-14497089 ] Sergey Shelukhin commented on HIVE-10346: - [~hagleitn] can you please review? Tez on HBase has problems with settings again - Key: HIVE-10346 URL: https://issues.apache.org/jira/browse/HIVE-10346 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-10346.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10347) Merge spark to trunk 4/15/2015
[ https://issues.apache.org/jira/browse/HIVE-10347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-10347: - Attachment: HIVE-10347.patch Attaching to run precommit tests. Merge spark to trunk 4/15/2015 -- Key: HIVE-10347 URL: https://issues.apache.org/jira/browse/HIVE-10347 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-10347.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10028) LLAP: Create a fixed size execution queue for daemons
[ https://issues.apache.org/jira/browse/HIVE-10028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-10028: - Attachment: HIVE-10028.2.patch LLAP: Create a fixed size execution queue for daemons - Key: HIVE-10028 URL: https://issues.apache.org/jira/browse/HIVE-10028 Project: Hive Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Prasanth Jayachandran Fix For: llap Attachments: HIVE-10028.1.patch, HIVE-10028.2.patch Currently, this is unbounded. This should be a configurable size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10307) Support to use number literals in partition column
[ https://issues.apache.org/jira/browse/HIVE-10307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497157#comment-14497157 ] Chaoyu Tang commented on HIVE-10307: The failed tests seem unrelated to this patch. Thanks Support to use number literals in partition column -- Key: HIVE-10307 URL: https://issues.apache.org/jira/browse/HIVE-10307 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 1.0.0 Reporter: Chaoyu Tang Assignee: Chaoyu Tang Attachments: HIVE-10307.1.patch, HIVE-10307.patch Data types like TinyInt, SmallInt, BigInt or Decimal can be expressed as literals with a postfix like Y, S, L, or BD appended to the number. These literals work in most Hive queries, but not when they are used as partition column values. For a partitioned table like: create table partcoltypenum (key int, value string) partitioned by (tint tinyint, sint smallint, bint bigint); insert into partcoltypenum partition (tint=100Y, sint=1S, bint=1000L) select key, value from src limit 30; queries like select, describe and drop partition do not work. For example, select * from partcoltypenum where tint=100Y and sint=1S and bint=1000L; does not return any rows. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
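The mismatch in HIVE-10307 is that a partition predicate carries the typed literal text (`100Y`) while the stored partition value is the bare number (`100`). A hypothetical sketch of the normalization involved (the method and regex are illustrative, not Hive's parser): strip a trailing Y/S/L/BD postfix before comparing against stored partition values.

```java
// Hypothetical sketch, not Hive's actual code: normalize number literals with a
// type postfix (100Y, 1S, 1000L, 1.5BD) to the bare numeric string so a
// partition predicate like tint=100Y can match the stored value 100.
public class PartitionLiteral {

    static String stripNumericSuffix(String literal) {
        // Capture the numeric part; drop a trailing Y/S/L/BD (case-insensitive).
        // Non-matching strings are returned unchanged.
        return literal.replaceAll("(?i)^(-?\\d+(?:\\.\\d+)?)(Y|S|L|BD)$", "$1");
    }

    public static void main(String[] args) {
        System.out.println(stripNumericSuffix("100Y"));  // 100
        System.out.println(stripNumericSuffix("1000L")); // 1000
        System.out.println(stripNumericSuffix("1.5BD")); // 1.5
    }
}
```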
[jira] [Commented] (HIVE-10350) CBO: With hive.cbo.costmodel.extended enabled IO cost is negative
[ https://issues.apache.org/jira/browse/HIVE-10350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497228#comment-14497228 ] Mostafa Mokhtar commented on HIVE-10350: [~jcamachorodriguez] [~jpullokkaran] Can you please take a look? CBO: With hive.cbo.costmodel.extended enabled IO cost is negative - Key: HIVE-10350 URL: https://issues.apache.org/jira/browse/HIVE-10350 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: 1.2.0 Reporter: Mostafa Mokhtar Assignee: Mostafa Mokhtar Fix For: 1.2.0 Attachments: HIVE-10331.01.patch
[jira] [Commented] (HIVE-9923) No clear message when from is missing
[ https://issues.apache.org/jira/browse/HIVE-9923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497327#comment-14497327 ] Hive QA commented on HIVE-9923: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12725615/HIVE-9923.1.patch {color:red}ERROR:{color} -1 due to 57 failed/errored test(s), 8690 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml 
file TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_select_dummy_source org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_timestamp_literal org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_add_months org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_bitwise_shiftleft org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_bitwise_shiftright org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_bitwise_shiftrightunsigned org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_cbrt org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_current_database org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_date_add org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_date_sub org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_decode org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_factorial org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_format_number org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_from_utc_timestamp org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_get_json_object org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_last_day org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_levenshtein org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_months_between org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_soundex org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_to_utc_timestamp org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_trunc org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udtf_stack org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1 
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_select_dummy_source org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_ptf_negative_DistributeByOrderBy org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_ptf_negative_PartitionBySortBy org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_select_star_suffix org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_select_udtf_alias org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_subquery_missing_from org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_timestamp_literal org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_add_months_error_1 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_add_months_error_2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_last_day_error_1
[jira] [Commented] (HIVE-10290) Add negative test case to modify a non-existent config value when hive security authorization is enabled.
[ https://issues.apache.org/jira/browse/HIVE-10290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497107#comment-14497107 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-10290: -- [~thejas] Is it possible to get this in ? Thanks Hari Add negative test case to modify a non-existent config value when hive security authorization is enabled. - Key: HIVE-10290 URL: https://issues.apache.org/jira/browse/HIVE-10290 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-10290.1.patch We need to have a test case to cover the following scenario when hive security authorization is enabled: {code} set hive.exec.reduce.max=1; Query returned non-zero code: 1, cause: hive configuration hive.exec.reduce.max does not exists. {code} This is important for ease-of-use and we need to prevent future code change/regression which might convert the above test case to throw a permission denied error. i.e, the below output is not desirable : {code} set hive.exec.reduce.max=1; Error: Error while processing statement: Cannot modify hive.exec.reduce.max at runtime. It is not in list of params that are allowed to be modified at runtime (state=42000,code=1) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-10348) LLAP: merge trunk to branch 2015-04-15
[ https://issues.apache.org/jira/browse/HIVE-10348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HIVE-10348. - Resolution: Fixed Fix Version/s: llap LLAP: merge trunk to branch 2015-04-15 -- Key: HIVE-10348 URL: https://issues.apache.org/jira/browse/HIVE-10348 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: llap -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10331) ORC : Is null SARG filters out all row groups written in old ORC format
[ https://issues.apache.org/jira/browse/HIVE-10331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mostafa Mokhtar updated HIVE-10331: --- Issue Type: Bug (was: Sub-task) Parent: (was: HIVE-9132) ORC : Is null SARG filters out all row groups written in old ORC format --- Key: HIVE-10331 URL: https://issues.apache.org/jira/browse/HIVE-10331 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.1.0 Reporter: Mostafa Mokhtar Assignee: Mostafa Mokhtar Fix For: 1.2.0 Attachments: HIVE-10331.01.patch, HIVE-10331.02.patch Queries return wrong results because all row groups get filtered out and no rows are scanned. {code} SELECT count(*) FROM store_sales WHERE ss_addr_sk IS NULL {code} With hive.optimize.index.filter disabled we get the correct results. In pickRowGroups the stats show that hasNull_ is false, while the row group actually contains nulls. The same query runs fine for newly loaded ORC tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
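The hazard described in HIVE-10331 is pruning on a `hasNull` statistic that old ORC writers recorded incorrectly. A minimal sketch of the safe decision rule, loosely modeled on ORC's SearchArgument truth-value idea (the enum and method here are illustrative, not the actual ORC API): when the statistic cannot be trusted, an IS NULL predicate must keep the row group rather than skip it.

```java
// Hypothetical sketch of the IS NULL pruning decision. YES_NO means "might
// match, keep the row group"; NO means "provably no match, safe to skip".
public class IsNullPruning {
    enum TruthValue { YES_NO, NO }

    static TruthValue evalIsNull(boolean hasNullStat, boolean statsTrusted) {
        if (!statsTrusted) {
            // Old-format files: hasNull may be wrongly false, so never prune.
            return TruthValue.YES_NO;
        }
        return hasNullStat ? TruthValue.YES_NO : TruthValue.NO;
    }

    public static void main(String[] args) {
        // Old ORC file: hasNull recorded as false but untrusted -> keep rows.
        System.out.println(evalIsNull(false, false)); // YES_NO
        // Newly written file with reliable stats and no nulls -> safe to skip.
        System.out.println(evalIsNull(false, true));  // NO
    }
}
```

Skipping the row group in the untrusted case is exactly what produced the wrong `count(*)` result in the description.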
[jira] [Commented] (HIVE-10304) Add deprecation message to HiveCLI
[ https://issues.apache.org/jira/browse/HIVE-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497116#comment-14497116 ] Szehon Ho commented on HIVE-10304: -- Done editing these sections with new information or links on Beeline/HS2, as well as deprecation warnings of HiveCLI, feel free to check. Thanks Lefty for the links. Add deprecation message to HiveCLI -- Key: HIVE-10304 URL: https://issues.apache.org/jira/browse/HIVE-10304 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 1.1.0 Reporter: Szehon Ho Assignee: Szehon Ho Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-10304.2.patch, HIVE-10304.3.patch, HIVE-10304.patch As Beeline is now the recommended command line tool to Hive, we should add a message to HiveCLI to indicate that it is deprecated and redirect them to Beeline. This is not suggesting to remove HiveCLI for now, but just a helpful direction for user to know the direction to focus attention in Beeline. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-10335) LLAP: IndexOutOfBound in MapJoinOperator
[ https://issues.apache.org/jira/browse/HIVE-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HIVE-10335. - Resolution: Not A Problem LLAP: IndexOutOfBound in MapJoinOperator Key: HIVE-10335 URL: https://issues.apache.org/jira/browse/HIVE-10335 Project: Hive Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Sergey Shelukhin Fix For: llap {code} 2015-04-14 13:57:55,889 [TezTaskRunner_attempt_1428572510173_0173_2_03_14_0(container_1_0173_01_66_sseth_20150414135750_7a7c2f4f-5f2d-4645-b833-677621f087bd:2_Map 1_14_0)] ERROR org.apache.hadoop.hive.ql.exec.MapJoinOperator: Unexpected exception: Index: 0, Size: 0 java.lang.IndexOutOfBoundsException: Index: 0, Size: 0 at java.util.ArrayList.rangeCheck(ArrayList.java:653) at java.util.ArrayList.get(ArrayList.java:429) at org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.unwrap(UnwrapRowContainer.java:79) at org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.first(UnwrapRowContainer.java:62) at org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.first(UnwrapRowContainer.java:33) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:386) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.process(VectorMapJoinOperator.java:283) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.flushOutput(VectorMapJoinOperator.java:232) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.closeOp(VectorMapJoinOperator.java:240) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:616) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:630) at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:348) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:332) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-10335) LLAP: IndexOutOfBound in MapJoinOperator
[ https://issues.apache.org/jira/browse/HIVE-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HIVE-10335. - Resolution: Done LLAP: IndexOutOfBound in MapJoinOperator Key: HIVE-10335 URL: https://issues.apache.org/jira/browse/HIVE-10335 Project: Hive Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Sergey Shelukhin Fix For: llap
{code}
2015-04-14 13:57:55,889 [TezTaskRunner_attempt_1428572510173_0173_2_03_14_0(container_1_0173_01_66_sseth_20150414135750_7a7c2f4f-5f2d-4645-b833-677621f087bd:2_Map 1_14_0)] ERROR org.apache.hadoop.hive.ql.exec.MapJoinOperator: Unexpected exception: Index: 0, Size: 0
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
    at java.util.ArrayList.rangeCheck(ArrayList.java:653)
    at java.util.ArrayList.get(ArrayList.java:429)
    at org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.unwrap(UnwrapRowContainer.java:79)
    at org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.first(UnwrapRowContainer.java:62)
    at org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.first(UnwrapRowContainer.java:33)
    at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
    at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:386)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.process(VectorMapJoinOperator.java:283)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.flushOutput(VectorMapJoinOperator.java:232)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.closeOp(VectorMapJoinOperator.java:240)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:616)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:630)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:348)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:332)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
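The failure above is an unguarded `ArrayList.get(0)` on an empty list inside `UnwrapRowContainer.unwrap`. A minimal sketch of the kind of defensive bounds check involved, with hypothetical names (this is illustrative, not Hive's actual fix):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class UnwrapSketch {
    // Hypothetical stand-in for the unwrap lookup: guard the positional
    // access instead of assuming the container is populated, which is
    // what throws IndexOutOfBoundsException in the trace above.
    static Object unwrap(List<Object> converted, int index) {
        if (index >= converted.size()) {
            return null; // empty container: nothing to unwrap for this row
        }
        return converted.get(index);
    }

    public static void main(String[] args) {
        List<Object> empty = new ArrayList<>();
        // Returns null instead of throwing on an empty container.
        System.out.println(unwrap(empty, 0));
        System.out.println(unwrap(Arrays.asList("row"), 0));
    }
}
```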
[jira] [Reopened] (HIVE-10335) LLAP: IndexOutOfBound in MapJoinOperator
[ https://issues.apache.org/jira/browse/HIVE-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reopened HIVE-10335: - LLAP: IndexOutOfBound in MapJoinOperator Key: HIVE-10335 URL: https://issues.apache.org/jira/browse/HIVE-10335 Project: Hive Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Sergey Shelukhin Fix For: llap
{code}
2015-04-14 13:57:55,889 [TezTaskRunner_attempt_1428572510173_0173_2_03_14_0(container_1_0173_01_66_sseth_20150414135750_7a7c2f4f-5f2d-4645-b833-677621f087bd:2_Map 1_14_0)] ERROR org.apache.hadoop.hive.ql.exec.MapJoinOperator: Unexpected exception: Index: 0, Size: 0
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
    at java.util.ArrayList.rangeCheck(ArrayList.java:653)
    at java.util.ArrayList.get(ArrayList.java:429)
    at org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.unwrap(UnwrapRowContainer.java:79)
    at org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.first(UnwrapRowContainer.java:62)
    at org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.first(UnwrapRowContainer.java:33)
    at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
    at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:386)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.process(VectorMapJoinOperator.java:283)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.flushOutput(VectorMapJoinOperator.java:232)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.closeOp(VectorMapJoinOperator.java:240)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:616)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:630)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:348)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:332)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10306) We need to print tez summary when hive.server2.logging.level = PERFORMANCE.
[ https://issues.apache.org/jira/browse/HIVE-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-10306: - Attachment: HIVE-10306.4.patch We need to print tez summary when hive.server2.logging.level = PERFORMANCE. - Key: HIVE-10306 URL: https://issues.apache.org/jira/browse/HIVE-10306 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-10306.1.patch, HIVE-10306.2.patch, HIVE-10306.3.patch, HIVE-10306.4.patch We need to print tez summary when hive.server2.logging.level = PERFORMANCE. We introduced this parameter via HIVE-10119. The logging param for levels is only relevant to HS2, so for hive-cli users the hive.tez.exec.print.summary still makes sense. We can check for log-level param as well, in places we are checking value of hive.tez.exec.print.summary. Ie, consider hive.tez.exec.print.summary=true if log.level = PERFORMANCE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
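The proposed rule — treat hive.tez.exec.print.summary as true when the HS2 log level is PERFORMANCE — can be sketched as a small helper. The method and parameter names are hypothetical, not Hive's actual API:

```java
public class TezSummarySketch {
    // Hypothetical helper for the check described above: honor the
    // explicit hive.tez.exec.print.summary flag, and also treat an HS2
    // logging level of PERFORMANCE as implying the summary.
    static boolean shouldPrintTezSummary(boolean printSummaryFlag, String hs2LogLevel) {
        return printSummaryFlag || "PERFORMANCE".equalsIgnoreCase(hs2LogLevel);
    }

    public static void main(String[] args) {
        System.out.println(shouldPrintTezSummary(false, "PERFORMANCE")); // true
        System.out.println(shouldPrintTezSummary(true, "EXECUTION"));    // true
        System.out.println(shouldPrintTezSummary(false, "NONE"));        // false
    }
}
```

For hive-cli users, only the first argument would ever be set, which is why the explicit flag still makes sense there.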
[jira] [Commented] (HIVE-10344) CBO (Calcite Return Path): Use newInstance to create ExprNodeGenericFuncDesc rather than construction function
[ https://issues.apache.org/jira/browse/HIVE-10344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497137#comment-14497137 ] Ashutosh Chauhan commented on HIVE-10344: - +1 CBO (Calcite Return Path): Use newInstance to create ExprNodeGenericFuncDesc rather than construction function -- Key: HIVE-10344 URL: https://issues.apache.org/jira/browse/HIVE-10344 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Fix For: 1.2.0 Attachments: HIVE-10344.01.patch ExprNodeGenericFuncDesc is now created using a constructor, which skips the initialization step (genericUDF.initializeAndFoldConstants) that the newInstance method performs. If the initialization step is skipped, some configuration parameters are not included in the serialization, which produces wrong results or errors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
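The constructor-vs-factory distinction this issue hinges on can be illustrated with a hypothetical descriptor class (not Hive's actual classes): a bare constructor leaves initialization-derived state unset, while a newInstance-style factory runs initialization before handing the object out:

```java
public class DescriptorSketch {
    static class FuncDesc {
        String initializedConfig; // populated only during initialization

        FuncDesc() { } // bare constructor: initialization never runs

        // Factory in the style of newInstance: construct, then initialize,
        // so later serialization sees the derived configuration.
        static FuncDesc newInstance(String config) {
            FuncDesc d = new FuncDesc();
            d.initialize(config);
            return d;
        }

        void initialize(String config) { this.initializedConfig = config; }
    }

    public static void main(String[] args) {
        FuncDesc viaCtor = new FuncDesc();
        FuncDesc viaFactory = FuncDesc.newInstance("timezone=UTC");
        System.out.println(viaCtor.initializedConfig);    // null: state missing
        System.out.println(viaFactory.initializedConfig); // timezone=UTC
    }
}
```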
[jira] [Updated] (HIVE-10306) We need to print tez summary when hive.server2.logging.level = PERFORMANCE.
[ https://issues.apache.org/jira/browse/HIVE-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-10306: - Attachment: (was: HIVE-10306.4.patch) We need to print tez summary when hive.server2.logging.level = PERFORMANCE. - Key: HIVE-10306 URL: https://issues.apache.org/jira/browse/HIVE-10306 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-10306.1.patch, HIVE-10306.2.patch, HIVE-10306.3.patch We need to print tez summary when hive.server2.logging.level = PERFORMANCE. We introduced this parameter via HIVE-10119. The logging param for levels is only relevant to HS2, so for hive-cli users the hive.tez.exec.print.summary still makes sense. We can check for log-level param as well, in places we are checking value of hive.tez.exec.print.summary. Ie, consider hive.tez.exec.print.summary=true if log.level = PERFORMANCE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10029) LLAP: Scheduling of work from different queries within the daemon
[ https://issues.apache.org/jira/browse/HIVE-10029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497279#comment-14497279 ] Prasanth Jayachandran commented on HIVE-10029: -- [~seth.siddha...@gmail.com] This should be covered by the HIVE-10028 patch, right? LLAP: Scheduling of work from different queries within the daemon - Key: HIVE-10029 URL: https://issues.apache.org/jira/browse/HIVE-10029 Project: Hive Issue Type: Sub-task Reporter: Siddharth Seth Fix For: llap The current implementation is a simple queue - whichever query wins the race to submit work to a daemon will execute first. A policy around this may be useful - potentially a fair share, or a first-query-in-gets-all-slots approach. Also, priority associated with work within a query should be considered. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
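One possible shape for such a policy — first query in wins ties, then within-query priority — can be sketched with a priority queue. This is a sketch of one candidate policy, not the scheduler that was implemented:

```java
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;

public class WorkSchedulingSketch {
    // Illustrative work item: earlier-arriving queries win ties, and
    // within a query, lower priority values run first.
    static class Work {
        final long queryArrivalOrder;
        final int withinQueryPriority;
        Work(long queryArrivalOrder, int withinQueryPriority) {
            this.queryArrivalOrder = queryArrivalOrder;
            this.withinQueryPriority = withinQueryPriority;
        }
    }

    static PriorityBlockingQueue<Work> newQueue() {
        return new PriorityBlockingQueue<>(16,
            Comparator.comparingLong((Work w) -> w.queryArrivalOrder)
                      .thenComparingInt(w -> w.withinQueryPriority));
    }

    public static void main(String[] args) {
        PriorityBlockingQueue<Work> queue = newQueue();
        queue.add(new Work(2, 0));
        queue.add(new Work(1, 5));
        queue.add(new Work(1, 1));
        Work first = queue.poll();
        // First query to arrive, and its highest-priority work item, runs first.
        System.out.println(first.queryArrivalOrder + " " + first.withinQueryPriority);
    }
}
```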
[jira] [Updated] (HIVE-10349) overflow in stats
[ https://issues.apache.org/jira/browse/HIVE-10349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-10349: Description: Discovered while running q17 in LLAP. {noformat} Reducer 2 Execution mode: llap Reduce Operator Tree: Merge Join Operator condition map: Inner Join 0 to 1 keys: 0 _col28 (type: int), _col27 (type: int) 1 cs_bill_customer_sk (type: int), cs_item_sk (type: int) outputColumnNames: _col1, _col2, _col6, _col8, _col9, _col22, _col27, _col28, _col34, _col35, _col45, _col51, _col63, _col66, _col82 Statistics: Num rows: 1047651367827495040 Data size: 9223372036854775807 Basic stats: COMPLETE Column stats: PARTIAL Map Join Operator condition map: Inner Join 0 to 1 keys: 0 _col22 (type: int) 1 d_date_sk (type: int) outputColumnNames: _col1, _col2, _col6, _col8, _col9, _col22, _col27, _col28, _col34, _col35, _col45, _col51, _col63, _col66, _col82, _col86 input vertices: 1 Map 7 Statistics: Num rows: 1152416529588199552 Data size: 9223372036854775807 Basic stats: COMPLETE Column stats: NONE {noformat} Data size overflows and row count also looks wrong. I wonder if this is why it generates 1009 reducers for this stage on 6 containers was: Discovered while running q17 in LLAP. 
{noformat} Reducer 2 Execution mode: llap Reduce Operator Tree: Merge Join Operator condition map: Inner Join 0 to 1 keys: 0 _col28 (type: int), _col27 (type: int) 1 cs_bill_customer_sk (type: int), cs_item_sk (type: int) outputColumnNames: _col1, _col2, _col6, _col8, _col9, _col22, _col27, _col28, _col34, _col35, _col45, _col51, _col63, _col66, _col82 Statistics: Num rows: 1047651367827495040 Data size: 9223372036854775807 Basic stats: COMPLETE Column stats: PARTIAL Map Join Operator condition map: Inner Join 0 to 1 keys: 0 _col22 (type: int) 1 d_date_sk (type: int) outputColumnNames: _col1, _col2, _col6, _col8, _col9, _col22, _col27, _col28, _col34, _col35, _col45, _col51, _col63, _col66, _col82, _col86 input vertices: 1 Map 7 Statistics: Num rows: 1152416529588199552 Data size: 9223372036854775807 Basic stats: COMPLETE Column stats: NONE {noformat} Data size overflows and row count also looks wrong. I wonder if this is why it generates 1009 reducers for this stage overflow in stats - Key: HIVE-10349 URL: https://issues.apache.org/jira/browse/HIVE-10349 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Prasanth Jayachandran Discovered while running q17 in LLAP. 
{noformat} Reducer 2 Execution mode: llap Reduce Operator Tree: Merge Join Operator condition map: Inner Join 0 to 1 keys: 0 _col28 (type: int), _col27 (type: int) 1 cs_bill_customer_sk (type: int), cs_item_sk (type: int) outputColumnNames: _col1, _col2, _col6, _col8, _col9, _col22, _col27, _col28, _col34, _col35, _col45, _col51, _col63, _col66, _col82 Statistics: Num rows: 1047651367827495040 Data size: 9223372036854775807 Basic stats: COMPLETE Column stats: PARTIAL Map Join Operator condition map: Inner Join 0 to 1 keys: 0 _col22 (type: int) 1 d_date_sk (type: int) outputColumnNames: _col1, _col2, _col6, _col8, _col9, _col22, _col27, _col28, _col34, _col35, _col45, _col51, _col63, _col66, _col82, _col86 input vertices: 1 Map 7 Statistics: Num rows: 1152416529588199552 Data size: 9223372036854775807 Basic stats: COMPLETE Column stats: NONE {noformat} Data size overflows and row count also looks wrong. I wonder if this is why it generates 1009 reducers for this stage on 6 containers -- This message was sent by Atlassian JIRA (v6.3.4#6332)
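The Data size of 9223372036854775807 in the plan above is exactly Long.MAX_VALUE, i.e. the size estimate saturated (or wrapped) during multiplication. One common guard — a sketch, not the actual Hive fix — is to saturate explicitly rather than let the product wrap:

```java
public class StatsOverflowSketch {
    // Multiply two non-negative stat estimates, clamping to Long.MAX_VALUE
    // on overflow instead of wrapping to a negative or garbage value.
    static long saturatingMultiply(long a, long b) {
        try {
            return Math.multiplyExact(a, b);
        } catch (ArithmeticException overflow) {
            return Long.MAX_VALUE;
        }
    }

    public static void main(String[] args) {
        System.out.println(saturatingMultiply(3, 4)); // 12
        // Overflowing product clamps instead of wrapping.
        System.out.println(saturatingMultiply(Long.MAX_VALUE / 2, 4));
    }
}
```

A clamped estimate is still wrong, of course — downstream consumers (like reducer parallelism) still need sane row counts — but at least it fails in a recognizable way.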
[jira] [Commented] (HIVE-10307) Support to use number literals in partition column
[ https://issues.apache.org/jira/browse/HIVE-10307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497132#comment-14497132 ] Hive QA commented on HIVE-10307: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12725594/HIVE-10307.1.patch {color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 8687 tests executed *Failed tests:* {noformat} TestHBaseNegativeCliDriver - did not produce a TEST-*.xml file TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file 
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3449/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3449/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3449/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 18 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12725594 - PreCommit-HIVE-TRUNK-Build Support to use number literals in partition column -- Key: HIVE-10307 URL: https://issues.apache.org/jira/browse/HIVE-10307 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 1.0.0 Reporter: Chaoyu Tang Assignee: Chaoyu Tang Attachments: HIVE-10307.1.patch, HIVE-10307.patch Data types like TinyInt, SmallInt, BigInt or Decimal can be expressed as literals with postfix like Y, S, L, or BD appended to the number. These literals work in most Hive queries, but do not when they are used as partition column value. 
For a partitioned table like: create table partcoltypenum (key int, value string) partitioned by (tint tinyint, sint smallint, bint bigint); insert into partcoltypenum partition (tint=100Y, sint=1S, bint=1000L) select key, value from src limit 30; Queries like select, describe and drop partition do not work. For an example select * from partcoltypenum where tint=100Y and sint=1S and bint=1000L; does not return any rows. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
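The literals at issue carry type postfixes (Y = tinyint, S = smallint, L = bigint, BD = decimal) that the partition-value path must normalize to the column's type. A hypothetical normalizer for the integral suffixes, purely to illustrate the mapping (Hive resolves these in its parser, not like this):

```java
public class NumberLiteralSketch {
    // Strip a Hive-style integral literal suffix (Y, S, or L) and parse the
    // remaining digits. BD (decimal) is deliberately not handled here.
    static long parseSuffixedLiteral(String literal) {
        char last = Character.toUpperCase(literal.charAt(literal.length() - 1));
        if (last == 'Y' || last == 'S' || last == 'L') {
            return Long.parseLong(literal.substring(0, literal.length() - 1));
        }
        return Long.parseLong(literal);
    }

    public static void main(String[] args) {
        System.out.println(parseSuffixedLiteral("100Y"));  // 100
        System.out.println(parseSuffixedLiteral("1S"));    // 1
        System.out.println(parseSuffixedLiteral("1000L")); // 1000
    }
}
```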
[jira] [Updated] (HIVE-10350) CBO: With hive.cbo.costmodel.extended enabled IO cost is negative
[ https://issues.apache.org/jira/browse/HIVE-10350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mostafa Mokhtar updated HIVE-10350: --- Description: Not an overflow but parallelism ends up being -1 as it uses number of buckets {code} final int parallelism = RelMetadataQuery.splitCount(join) == null ? 1 : RelMetadataQuery.splitCount(join); {code} {code} 2015-04-13 18:19:09,154 DEBUG [main]: cost.HiveCostModel (HiveCostModel.java:getJoinCost(62)) - COMMON_JOIN cost: {1600892.857142857 rows, 2.4463782008994658E7 cpu, 8.54445445875E10 io} 2015-04-13 18:19:09,155 DEBUG [main]: cost.HiveCostModel (HiveCostModel.java:getJoinCost(62)) - MAP_JOIN cost: {1600892.857142857 rows, 1601785.714285714 cpu, -1698787.48 io} 2015-04-13 18:19:09,155 DEBUG [main]: cost.HiveCostModel (HiveCostModel.java:getJoinCost(72)) - MAP_JOIN selected 2015-04-13 18:19:09,157 DEBUG [main]: parse.CalcitePlanner (CalcitePlanner.java:apply(862)) - Plan After Join Reordering: HiveSort(fetch=[100]): rowcount = 6006.726049749041, cumulative cost = {1.1468867492063493E8 rows, 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, id = 3000 HiveSort(sort0=[$0], dir0=[ASC]): rowcount = 6006.726049749041, cumulative cost = {1.1468867492063493E8 rows, 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, id = 2998 HiveProject(customer_id=[$4], customername=[concat($9, ', ', $8)]): rowcount = 6006.726049749041, cumulative cost = {1.1468867492063493E8 rows, 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, id = 3136 HiveJoin(condition=[=($1, $5)], joinType=[inner], joinAlgorithm=[map_join], cost=[{5.557820341269841E7 rows, 5.557840182539682E7 cpu, -4299694.122023809 io}]): rowcount = 6006.726049749041, cumulative cost = {1.1468867492063493E8 rows, 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, id = 3132 HiveJoin(condition=[=($0, $1)], joinType=[inner], joinAlgorithm=[map_join], cost=[{5.7498805E7 rows, 5.9419605E7 cpu, -1.15248E9 io}]): rowcount = 5.5578005E7, cumulative cost = 
{5.7498805E7 rows, 5.9419605E7 cpu, -1.15248E9 io}, id = 3100 HiveProject(sr_cdemo_sk=[$4]): rowcount = 5.5578005E7, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2992 HiveTableScan(table=[[tpcds_bin_orc_200.store_returns]]): rowcount = 5.5578005E7, cumulative cost = {0}, id = 2878 HiveProject(cd_demo_sk=[$0]): rowcount = 1920800.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2978 HiveTableScan(table=[[tpcds_bin_orc_200.customer_demographics]]): rowcount = 1920800.0, cumulative cost = {0}, id = 2868 HiveJoin(condition=[=($10, $1)], joinType=[inner], joinAlgorithm=[map_join], cost=[{1787.9365079365077 rows, 1790.15873015873 cpu, -8000.0 io}]): rowcount = 198.4126984126984, cumulative cost = {1611666.507936508 rows, 1619761.5873015872 cpu, -1.89867875E7 io}, id = 3130 HiveJoin(condition=[=($0, $4)], joinType=[inner], joinAlgorithm=[map_join], cost=[{8985.714285714286 rows, 16185.714285714286 cpu, -1.728E7 io}]): rowcount = 1785.7142857142856, cumulative cost = {1609878.5714285714 rows, 1617971.4285714284 cpu, -1.89787875E7 io}, id = 3128 HiveProject(hd_demo_sk=[$0], hd_income_band_sk=[$1]): rowcount = 7200.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2982 HiveTableScan(table=[[tpcds_bin_orc_200.household_demographics]]): rowcount = 7200.0, cumulative cost = {0}, id = 2871 HiveJoin(condition=[=($3, $6)], joinType=[inner], joinAlgorithm=[map_join], cost=[{1600892.857142857 rows, 1601785.714285714 cpu, -1698787.48 io}]): rowcount = 1785.7142857142856, cumulative cost = {1600892.857142857 rows, 1601785.714285714 cpu, -1698787.48 io}, id = 3105 HiveProject(c_customer_id=[$1], c_current_cdemo_sk=[$2], c_current_hdemo_sk=[$3], c_current_addr_sk=[$4], c_first_name=[$8], c_last_name=[$9]): rowcount = 160.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2970 HiveTableScan(table=[[tpcds_bin_orc_200.customer]]): rowcount = 160.0, cumulative cost = {0}, id = 2862 HiveProject(ca_address_sk=[$0], ca_city=[$6]): rowcount = 892.8571428571428, cumulative 
cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2974 HiveFilter(condition=[=($6, 'Hopewell')]): rowcount = 892.8571428571428, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2972 HiveTableScan(table=[[tpcds_bin_orc_200.customer_address]]): rowcount = 80.0, cumulative cost = {0}, id = 2864 HiveProject(ib_income_band_sk=[$0], ib_lower_bound=[$1], ib_upper_bound=[$2]): rowcount = 2.2223, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2988 HiveFilter(condition=[AND(=($1, 32287), =($2, +(32287, 5)))]): rowcount = 2.2223, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2986
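The quoted code only guards against a null split count, so a -1 from bucket metadata flows straight into the cost formula and makes IO cost negative. A defensive clamp would also reject non-positive values; this is a sketch of the guard, not the committed fix:

```java
public class ParallelismSketch {
    // A split count of null, zero, or a negative sentinel (e.g. -1 from
    // bucket metadata) must not feed the cost model, or IO cost goes
    // negative as in the DEBUG log above.
    static int effectiveParallelism(Integer splitCount) {
        return (splitCount == null || splitCount < 1) ? 1 : splitCount;
    }

    public static void main(String[] args) {
        System.out.println(effectiveParallelism(null)); // 1
        System.out.println(effectiveParallelism(-1));   // 1
        System.out.println(effectiveParallelism(8));    // 8
    }
}
```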
[jira] [Commented] (HIVE-10306) We need to print tez summary when hive.server2.logging.level = PERFORMANCE.
[ https://issues.apache.org/jira/browse/HIVE-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497148#comment-14497148 ] Thejas M Nair commented on HIVE-10306: -- It might be ptest2 that is expecting the file to be present. Try changing the name to TestOperationLoggingAPITestBase. We need to print tez summary when hive.server2.logging.level = PERFORMANCE. - Key: HIVE-10306 URL: https://issues.apache.org/jira/browse/HIVE-10306 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-10306.1.patch, HIVE-10306.2.patch, HIVE-10306.3.patch, HIVE-10306.4.patch We need to print tez summary when hive.server2.logging.level = PERFORMANCE. We introduced this parameter via HIVE-10119. The logging param for levels is only relevant to HS2, so for hive-cli users the hive.tez.exec.print.summary still makes sense. We can check for the log-level param as well, in places where we check the value of hive.tez.exec.print.summary. I.e., consider hive.tez.exec.print.summary=true if log.level = PERFORMANCE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10331) ORC : Is null SARG filters out all row groups written in old ORC format
[ https://issues.apache.org/jira/browse/HIVE-10331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mostafa Mokhtar updated HIVE-10331: --- Issue Type: Sub-task (was: Bug) Parent: HIVE-9132 ORC : Is null SARG filters out all row groups written in old ORC format --- Key: HIVE-10331 URL: https://issues.apache.org/jira/browse/HIVE-10331 Project: Hive Issue Type: Sub-task Components: Hive Affects Versions: 1.1.0 Reporter: Mostafa Mokhtar Assignee: Mostafa Mokhtar Fix For: 1.2.0 Attachments: HIVE-10331.01.patch, HIVE-10331.02.patch Queries are returning wrong results as all row groups get filtered out and no rows get scanned. {code} SELECT count(*) FROM store_sales WHERE ss_addr_sk IS NULL {code} With hive.optimize.index.filter disabled we get the correct results. In pickRowGroups, the stats show that hasNull_ is false, while the row group actually contains nulls. The same query runs fine for newly loaded ORC tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
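The safe evaluation rule for an IS NULL search argument against row-group stats is three-valued: a hasNull stat that is absent or untrustworthy (as in old-format ORC files) must yield "maybe" so the row group is kept, never "no". A sketch of that rule, with illustrative names:

```java
public class SargNullCheckSketch {
    enum Truth { NO, MAYBE }

    // Evaluate "column IS NULL" against a row group's hasNull statistic.
    // A missing stat (null) means the file predates reliable hasNull
    // tracking, so the row group must not be pruned.
    static Truth isNullPossible(Boolean hasNullStat) {
        if (hasNullStat == null) {
            return Truth.MAYBE; // stat absent: keep the row group
        }
        return hasNullStat ? Truth.MAYBE : Truth.NO;
    }

    public static void main(String[] args) {
        System.out.println(isNullPossible(null));  // MAYBE
        System.out.println(isNullPossible(true));  // MAYBE
        System.out.println(isNullPossible(false)); // NO
    }
}
```

The bug described above is the case where the stat claims false but the data contains nulls, so even this rule prunes incorrectly; the stat itself has to be distrusted for old files.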
[jira] [Commented] (HIVE-10028) LLAP: Create a fixed size execution queue for daemons
[ https://issues.apache.org/jira/browse/HIVE-10028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497266#comment-14497266 ] Prasanth Jayachandran commented on HIVE-10028: -- [~seth.siddha...@gmail.com] Very useful comments! Fixed them all in the new patch. can you take a look again? LLAP: Create a fixed size execution queue for daemons - Key: HIVE-10028 URL: https://issues.apache.org/jira/browse/HIVE-10028 Project: Hive Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Prasanth Jayachandran Fix For: llap Attachments: HIVE-10028.1.patch, HIVE-10028.2.patch Currently, this is unbounded. This should be a configurable size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
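The change described — replacing an unbounded work queue with a fixed, configurable capacity — can be sketched with a standard bounded queue; the capacity value and rejection behavior here are illustrative, not the patch's actual choices:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BoundedQueueSketch {
    public static void main(String[] args) {
        // Fixed-capacity queue: submissions beyond the configured size are
        // rejected, so callers must back off instead of growing the queue
        // without bound.
        int configuredCapacity = 2; // would come from a config key
        BlockingQueue<Runnable> workQueue = new ArrayBlockingQueue<>(configuredCapacity);

        System.out.println(workQueue.offer(() -> { })); // true
        System.out.println(workQueue.offer(() -> { })); // true
        System.out.println(workQueue.offer(() -> { })); // false: queue full
    }
}
```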
[jira] [Updated] (HIVE-10284) enable container reuse for grace hash join
[ https://issues.apache.org/jira/browse/HIVE-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-10284: - Attachment: HIVE-10284.8.patch Upload patch 8 for testing enable container reuse for grace hash join --- Key: HIVE-10284 URL: https://issues.apache.org/jira/browse/HIVE-10284 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Wei Zheng Attachments: HIVE-10284.1.patch, HIVE-10284.2.patch, HIVE-10284.3.patch, HIVE-10284.4.patch, HIVE-10284.5.patch, HIVE-10284.6.patch, HIVE-10284.7.patch, HIVE-10284.8.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-8306) Map join sizing done by auto.convert.join.noconditionaltask.size doesn't take into account Hash table overhead and results in OOM
[ https://issues.apache.org/jira/browse/HIVE-8306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mostafa Mokhtar resolved HIVE-8306. --- Resolution: Fixed Resolving since there is now the Hybrid Hybrid grace hash table, which should handle underestimates gracefully Map join sizing done by auto.convert.join.noconditionaltask.size doesn't take into account Hash table overhead and results in OOM - Key: HIVE-8306 URL: https://issues.apache.org/jira/browse/HIVE-8306 Project: Hive Issue Type: Bug Components: Physical Optimizer Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Prasanth Jayachandran Priority: Minor Attachments: query64_oom_trim.txt When hive.auto.convert.join.noconditionaltask = true we check noconditionaltask.size, and if the sum of table sizes in the map join is less than noconditionaltask.size the plan generates a map join. The issue is that this calculation doesn't take into account the overhead introduced by the different HashTable implementations; as a result, if the sum of input sizes is smaller than the noconditionaltask size by only a small margin, queries will hit OOM. TPC-DS query 64 is a good example of this issue: the noconditionaltask size is set to 1,280,000,000 while the sum of inputs is 1,012,379,321, which is about 20% smaller than the threshold.
Vertex {code} Map 28 - Map 11 (BROADCAST_EDGE), Map 12 (BROADCAST_EDGE), Map 14 (BROADCAST_EDGE), Map 15 (BROADCAST_EDGE), Map 16 (BROADCAST_EDGE), Map 24 (BROADCAST_EDGE), Map 26 (BROADCAST_EDGE), Map 30 (BROADCAST_EDGE), Map 31 (BROADCAST_EDGE), Map 32 (BROADCAST_EDGE), Map 39 (BROADCAST_EDGE), Map 40 (BROADCAST_EDGE), Map 43 (BROADCAST_EDGE), Map 45 (BROADCAST_EDGE), Map 5 (BROADCAST_EDGE) {code} Exception {code} , TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:169) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:172) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:167) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: java.lang.OutOfMemoryError: Java heap space at org.apache.hadoop.hive.serde2.WriteBuffers.nextBufferToWrite(WriteBuffers.java:206) at org.apache.hadoop.hive.serde2.WriteBuffers.write(WriteBuffers.java:182) at org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$LazyBinaryKvWriter.writeKey(MapJoinBytesTableContainer.java:189) at 
org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.put(BytesBytesMultiHashMap.java:200) at org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer.putRow(MapJoinBytesTableContainer.java:267) at org.apache.hadoop.hive.ql.exec.tez.HashTableLoader.load(HashTableLoader.java:114) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:184) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:210) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1036) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1040) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1040) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1040) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:37) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.processRow(MapRecordProcessor.java:186) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:164) at
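The sizing problem can be sketched as follows: apply an assumed hash-table overhead factor to the raw input sizes before comparing against noconditionaltask.size. The 2.0 factor below is illustrative only — the right value depends on the hash table implementation — but it shows how query 64's inputs, which fit under the threshold raw, fail the check once overhead is accounted for:

```java
public class MapJoinSizingSketch {
    // Decide map-join conversion using an assumed overhead multiplier on
    // the raw serialized input sizes (hypothetical helper, not Hive code).
    static boolean shouldConvertToMapJoin(long sumInputSizes, long threshold,
                                          double overheadFactor) {
        return (long) (sumInputSizes * overheadFactor) <= threshold;
    }

    public static void main(String[] args) {
        long inputs = 1_012_379_321L;   // sum of inputs from query 64
        long limit  = 1_280_000_000L;   // noconditionaltask.size
        // Raw comparison converts (and then OOMs at runtime)...
        System.out.println(shouldConvertToMapJoin(inputs, limit, 1.0)); // true
        // ...while an assumed 2x in-memory overhead rejects the conversion.
        System.out.println(shouldConvertToMapJoin(inputs, limit, 2.0)); // false
    }
}
```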
[jira] [Commented] (HIVE-10343) CBO (Calcite Return Path): Parameterize algorithm cost model
[ https://issues.apache.org/jira/browse/HIVE-10343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497467#comment-14497467 ] Lefty Leverenz commented on HIVE-10343: --- Doc note: I added two labels (TODOC-CBO and TODOC1.2) because commit r1673948 went to the cbo branch but this jira is marked with Fix Version 1.2.0. The patch adds seven configuration parameters to HiveConf.java, so they need to be documented in the wiki for release 1.2.0 or whenever the cbo branch gets merged to trunk. Another parameter is removed (*hive.cbo.costmodel.extended* which came from HIVE-10040).
* hive.cbo.costmodel.extended
* hive.cbo.costmodel.cpu
* hive.cbo.costmodel.network
* hive.cbo.costmodel.local.fs.write
* hive.cbo.costmodel.local.fs.read
* hive.cbo.costmodel.hdfs.write
* hive.cbo.costmodel.hdfs.read
CBO (Calcite Return Path): Parameterize algorithm cost model Key: HIVE-10343 URL: https://issues.apache.org/jira/browse/HIVE-10343 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran Labels: TODOC-CBO, TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-10343.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10349) overflow in stats
[ https://issues.apache.org/jira/browse/HIVE-10349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497412#comment-14497412 ] Sergey Shelukhin commented on HIVE-10349: - [~hagleitn] [~prasanth_j] [~mmokhtar] [~gopalv] many queries in TPCDS suffer from the problem where there are 1000s of reducers (sometimes, 3 stages of 600-700 reducers each); this is running on 6 nodes with ~6 slots each. Not sure if this is caused just by the stats problem or there are other problems with physical optimizer, but it seems like a big perf issue that we should address soon. overflow in stats - Key: HIVE-10349 URL: https://issues.apache.org/jira/browse/HIVE-10349 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Prasanth Jayachandran Discovered while running q17 in LLAP. {noformat} Reducer 2 Execution mode: llap Reduce Operator Tree: Merge Join Operator condition map: Inner Join 0 to 1 keys: 0 _col28 (type: int), _col27 (type: int) 1 cs_bill_customer_sk (type: int), cs_item_sk (type: int) outputColumnNames: _col1, _col2, _col6, _col8, _col9, _col22, _col27, _col28, _col34, _col35, _col45, _col51, _col63, _col66, _col82 Statistics: Num rows: 1047651367827495040 Data size: 9223372036854775807 Basic stats: COMPLETE Column stats: PARTIAL Map Join Operator condition map: Inner Join 0 to 1 keys: 0 _col22 (type: int) 1 d_date_sk (type: int) outputColumnNames: _col1, _col2, _col6, _col8, _col9, _col22, _col27, _col28, _col34, _col35, _col45, _col51, _col63, _col66, _col82, _col86 input vertices: 1 Map 7 Statistics: Num rows: 1152416529588199552 Data size: 9223372036854775807 Basic stats: COMPLETE Column stats: NONE {noformat} Data size overflows and row count also looks wrong. I wonder if this is why it generates 1009 reducers for this stage on 6 machines -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10319) Hive CLI startup takes a long time with a large number of databases
[ https://issues.apache.org/jira/browse/HIVE-10319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497423#comment-14497423 ] Hive QA commented on HIVE-10319: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12725625/HIVE-10319.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3452/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3452/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3452/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-3452/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d 
apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . Reverted 'metastore/scripts/upgrade/derby/hive-schema-1.2.0.derby.sql' Reverted 'metastore/scripts/upgrade/derby/upgrade-1.1.0-to-1.2.0.derby.sql' Reverted 'metastore/scripts/upgrade/oracle/hive-schema-1.2.0.oracle.sql' Reverted 'metastore/scripts/upgrade/oracle/upgrade-1.1.0-to-1.2.0.oracle.sql' Reverted 'metastore/scripts/upgrade/postgres/upgrade-1.1.0-to-1.2.0.postgres.sql' Reverted 'metastore/scripts/upgrade/postgres/hive-schema-1.2.0.postgres.sql' ++ awk '{print $2}' ++ egrep -v '^X|^Performing status on external' ++ svn status --no-ignore + rm -rf target datanucleus.log ant/target shims/target shims/0.20S/target shims/0.23/target shims/aggregator/target shims/common/target shims/scheduler/target packaging/target hbase-handler/target testutils/target testutils/metastore/dbs/derby testutils/metastore/dbs/oracle testutils/metastore/dbs/postgres jdbc/target metastore/target metastore/scripts/upgrade/derby/022-HIVE-10239.derby.sql metastore/scripts/upgrade/oracle/022-HIVE-10239.oracle.sql metastore/scripts/upgrade/postgres/022-HIVE-10239.postgres.sql itests/target itests/thirdparty itests/hcatalog-unit/target itests/test-serde/target itests/qtest/target itests/hive-unit-hadoop2/target itests/hive-minikdc/target itests/hive-jmh/target itests/hive-unit/target itests/custom-serde/target itests/util/target itests/qtest-spark/target hcatalog/target hcatalog/core/target hcatalog/streaming/target hcatalog/server-extensions/target hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target hcatalog/hcatalog-pig-adapter/target accumulo-handler/target hwi/target common/target common/src/gen spark-client/target service/target contrib/target serde/target beeline/target odbc/target cli/target ql/dependency-reduced-pom.xml ql/target + svn update Uql/src/test/results/clientnegative/udf_next_day_error_1.q.out 
Uql/src/test/results/clientnegative/udf_add_months_error_1.q.out Uql/src/test/results/clientnegative/udf_next_day_error_2.q.out Uql/src/test/results/clientnegative/udf_last_day_error_1.q.out Uql/src/test/results/clientpositive/spark/vector_elt.q.out Uql/src/test/results/clientpositive/spark/load_dyn_part14.q.out Uql/src/test/results/clientpositive/spark/join8.q.out Uql/src/test/results/clientpositive/spark/optimize_nullscan.q.out Uql/src/test/results/clientpositive/spark/auto_join8.q.out Uql/src/test/results/clientpositive/annotate_stats_select.q.out Uql/src/test/results/clientpositive/udf4.q.out Uql/src/test/results/clientpositive/udf_isnull_isnotnull.q.out Uql/src/test/results/clientpositive/decimal_udf.q.out Uql/src/test/results/clientpositive/udf_hour.q.out Uql/src/test/results/clientpositive/udf_if.q.out Uql/src/test/results/clientpositive/input8.q.out U
[jira] [Commented] (HIVE-10239) Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and PostgreSQL
[ https://issues.apache.org/jira/browse/HIVE-10239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497422#comment-14497422 ] Hive QA commented on HIVE-10239: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12725621/HIVE-10239.0.patch {color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8690 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml 
file TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3451/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3451/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3451/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 15 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12725621 - PreCommit-HIVE-TRUNK-Build Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and PostgreSQL Key: HIVE-10239 URL: https://issues.apache.org/jira/browse/HIVE-10239 Project: Hive Issue Type: Improvement Affects Versions: 1.1.0 Reporter: Naveen Gangam Assignee: Naveen Gangam Attachments: HIVE-10239-donotcommit.patch, HIVE-10239.0.patch, HIVE-10239.0.patch, HIVE-10239.patch Need to create DB-implementation specific scripts to use the framework introduced in HIVE-9800 to have any metastore schema changes tested across all supported databases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9015) Constant Folding optimizer doesn't handle expressions involving null
[ https://issues.apache.org/jira/browse/HIVE-9015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-9015: --- Component/s: Logical Optimizer Constant Folding optimizer doesn't handle expressions involving null Key: HIVE-9015 URL: https://issues.apache.org/jira/browse/HIVE-9015 Project: Hive Issue Type: Task Components: Logical Optimizer Affects Versions: 0.14.0, 1.0.0, 1.1.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 1.2.0 Expressions which are guaranteed to evaluate to {{null}} aren't folded by optimizer yet. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
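The folding the issue asks for rests on null propagation: if an operator propagates SQL NULL and any operand is a NULL literal, the whole expression is guaranteed NULL and can be replaced at compile time. A minimal sketch of that rule over a toy expression tree (the classes and operator set here are illustrative, not Hive's optimizer API):

```java
import java.util.List;
import java.util.Set;

// Toy expression tree for illustrating null-propagation constant folding.
interface Expr {}
record Lit(Object value) implements Expr {}           // value == null models SQL NULL
record Call(String op, List<Expr> args) implements Expr {}

final class NullFolder {
    // Operators through which SQL NULL propagates unconditionally (assumed set).
    private static final Set<String> NULL_PROPAGATING =
        Set.of("+", "-", "*", "/", "concat");

    static Expr fold(Expr e) {
        if (!(e instanceof Call c)) return e;
        List<Expr> args = c.args().stream().map(NullFolder::fold).toList();
        boolean hasNull = args.stream()
            .anyMatch(a -> a instanceof Lit l && l.value() == null);
        if (NULL_PROPAGATING.contains(c.op()) && hasNull) {
            return new Lit(null);                     // e.g. 1 + NULL folds to NULL
        }
        return new Call(c.op(), args);
    }
}
```

Note that operators like IS NULL or COALESCE must stay out of the propagating set, since they are defined precisely to observe NULL rather than propagate it.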
[jira] [Updated] (HIVE-10302) Cache small tables in memory [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HIVE-10302: --- Attachment: HIVE-10302.spark-1.patch Cache small tables in memory [Spark Branch] --- Key: HIVE-10302 URL: https://issues.apache.org/jira/browse/HIVE-10302 Project: Hive Issue Type: Improvement Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: spark-branch Attachments: HIVE-10302.spark-1.patch If we can cache small tables in executor memory, we could save some time in loading them from HDFS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
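One way to realize this is a per-executor cache keyed by table path, using soft references so the JVM can reclaim entries under memory pressure. The sketch below shows the shape of the idea only; it is not the attached patch, and it deliberately ignores concurrent duplicate loads:

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Sketch: a per-executor small-table cache so a hash table loaded once
// from HDFS can be reused by later tasks on the same executor.
final class SmallTableCache {
    private static final Map<String, SoftReference<Object>> CACHE =
        new ConcurrentHashMap<>();

    static Object get(String path, Supplier<Object> loader) {
        SoftReference<Object> ref = CACHE.get(path);
        Object table = ref == null ? null : ref.get();
        if (table == null) {
            table = loader.get();                     // load from HDFS on miss
            CACHE.put(path, new SoftReference<>(table));
        }
        return table;
    }
}
```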
[jira] [Commented] (HIVE-9917) After HIVE-3454 is done, make int to timestamp conversion configurable
[ https://issues.apache.org/jira/browse/HIVE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497393#comment-14497393 ] Aihua Xu commented on HIVE-9917: Somehow the vector_between_in test case baselines were not updated. Uploading a new patch to fix the test cases. After HIVE-3454 is done, make int to timestamp conversion configurable -- Key: HIVE-9917 URL: https://issues.apache.org/jira/browse/HIVE-9917 Project: Hive Issue Type: Improvement Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-9917.patch After HIVE-3454 is fixed, we will have correct behavior when converting int to timestamp. Since customers have relied on the incorrect behavior for so long, it is better to make it configurable: one release will default to the old/inconsistent way and the next release will default to the new/consistent way, after which we can deprecate the old behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9917) After HIVE-3454 is done, make int to timestamp conversion configurable
[ https://issues.apache.org/jira/browse/HIVE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-9917: --- Attachment: HIVE-9917.patch After HIVE-3454 is done, make int to timestamp conversion configurable -- Key: HIVE-9917 URL: https://issues.apache.org/jira/browse/HIVE-9917 Project: Hive Issue Type: Improvement Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-9917.patch After HIVE-3454 is fixed, we will have correct behavior when converting int to timestamp. Since customers have relied on the incorrect behavior for so long, it is better to make it configurable: one release will default to the old/inconsistent way and the next release will default to the new/consistent way, after which we can deprecate the old behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9917) After HIVE-3454 is done, make int to timestamp conversion configurable
[ https://issues.apache.org/jira/browse/HIVE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-9917: --- Attachment: (was: HIVE-9917.patch) After HIVE-3454 is done, make int to timestamp conversion configurable -- Key: HIVE-9917 URL: https://issues.apache.org/jira/browse/HIVE-9917 Project: Hive Issue Type: Improvement Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-9917.patch After HIVE-3454 is fixed, we will have correct behavior when converting int to timestamp. Since customers have relied on the incorrect behavior for so long, it is better to make it configurable: one release will default to the old/inconsistent way and the next release will default to the new/consistent way, after which we can deprecate the old behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
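The HIVE-9917 messages above describe a toggle between two integer-to-timestamp interpretations. A minimal sketch of such a config-gated conversion follows; the method name, the boolean flag, and the choice of milliseconds as the legacy path are assumptions for illustration, not the actual patch:

```java
import java.sql.Timestamp;

// Sketch: interpret an integral value as epoch seconds (the new,
// consistent behavior) or epoch milliseconds (assumed legacy behavior),
// selected by a flag that a Hive configuration property would drive.
final class IntToTimestamp {
    static Timestamp convert(long value, boolean asSeconds) {
        return new Timestamp(asSeconds ? value * 1000L : value);
    }
}
```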
[jira] [Updated] (HIVE-10350) CBO: Use total size instead of bucket count to determine number of splits parallelism
[ https://issues.apache.org/jira/browse/HIVE-10350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-10350: -- Attachment: HIVE-10350.2.patch CBO: Use total size instead of bucket count to determine number of splits parallelism Key: HIVE-10350 URL: https://issues.apache.org/jira/browse/HIVE-10350 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: 1.2.0 Reporter: Mostafa Mokhtar Assignee: Mostafa Mokhtar Fix For: 1.2.0 Attachments: HIVE-10331.01.patch, HIVE-10350.2.patch Not an overflow but parallelism ends up being -1 as it uses number of buckets {code} final int parallelism = RelMetadataQuery.splitCount(join) == null ? 1 : RelMetadataQuery.splitCount(join); {code} {code} 2015-04-13 18:19:09,154 DEBUG [main]: cost.HiveCostModel (HiveCostModel.java:getJoinCost(62)) - COMMON_JOIN cost: {1600892.857142857 rows, 2.4463782008994658E7 cpu, 8.54445445875E10 io} 2015-04-13 18:19:09,155 DEBUG [main]: cost.HiveCostModel (HiveCostModel.java:getJoinCost(62)) - MAP_JOIN cost: {1600892.857142857 rows, 1601785.714285714 cpu, -1698787.48 io} 2015-04-13 18:19:09,155 DEBUG [main]: cost.HiveCostModel (HiveCostModel.java:getJoinCost(72)) - MAP_JOIN selected 2015-04-13 18:19:09,157 DEBUG [main]: parse.CalcitePlanner (CalcitePlanner.java:apply(862)) - Plan After Join Reordering: HiveSort(fetch=[100]): rowcount = 6006.726049749041, cumulative cost = {1.1468867492063493E8 rows, 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, id = 3000 HiveSort(sort0=[$0], dir0=[ASC]): rowcount = 6006.726049749041, cumulative cost = {1.1468867492063493E8 rows, 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, id = 2998 HiveProject(customer_id=[$4], customername=[concat($9, ', ', $8)]): rowcount = 6006.726049749041, cumulative cost = {1.1468867492063493E8 rows, 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, id = 3136 HiveJoin(condition=[=($1, $5)], joinType=[inner], joinAlgorithm=[map_join], cost=[{5.557820341269841E7 rows, 
5.557840182539682E7 cpu, -4299694.122023809 io}]): rowcount = 6006.726049749041, cumulative cost = {1.1468867492063493E8 rows, 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, id = 3132 HiveJoin(condition=[=($0, $1)], joinType=[inner], joinAlgorithm=[map_join], cost=[{5.7498805E7 rows, 5.9419605E7 cpu, -1.15248E9 io}]): rowcount = 5.5578005E7, cumulative cost = {5.7498805E7 rows, 5.9419605E7 cpu, -1.15248E9 io}, id = 3100 HiveProject(sr_cdemo_sk=[$4]): rowcount = 5.5578005E7, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2992 HiveTableScan(table=[[tpcds_bin_orc_200.store_returns]]): rowcount = 5.5578005E7, cumulative cost = {0}, id = 2878 HiveProject(cd_demo_sk=[$0]): rowcount = 1920800.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2978 HiveTableScan(table=[[tpcds_bin_orc_200.customer_demographics]]): rowcount = 1920800.0, cumulative cost = {0}, id = 2868 HiveJoin(condition=[=($10, $1)], joinType=[inner], joinAlgorithm=[map_join], cost=[{1787.9365079365077 rows, 1790.15873015873 cpu, -8000.0 io}]): rowcount = 198.4126984126984, cumulative cost = {1611666.507936508 rows, 1619761.5873015872 cpu, -1.89867875E7 io}, id = 3130 HiveJoin(condition=[=($0, $4)], joinType=[inner], joinAlgorithm=[map_join], cost=[{8985.714285714286 rows, 16185.714285714286 cpu, -1.728E7 io}]): rowcount = 1785.7142857142856, cumulative cost = {1609878.5714285714 rows, 1617971.4285714284 cpu, -1.89787875E7 io}, id = 3128 HiveProject(hd_demo_sk=[$0], hd_income_band_sk=[$1]): rowcount = 7200.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2982 HiveTableScan(table=[[tpcds_bin_orc_200.household_demographics]]): rowcount = 7200.0, cumulative cost = {0}, id = 2871 HiveJoin(condition=[=($3, $6)], joinType=[inner], joinAlgorithm=[map_join], cost=[{1600892.857142857 rows, 1601785.714285714 cpu, -1698787.48 io}]): rowcount = 1785.7142857142856, cumulative cost = {1600892.857142857 rows, 1601785.714285714 cpu, -1698787.48 io}, id = 3105 HiveProject(c_customer_id=[$1], 
c_current_cdemo_sk=[$2], c_current_hdemo_sk=[$3], c_current_addr_sk=[$4], c_first_name=[$8], c_last_name=[$9]): rowcount = 160.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2970 HiveTableScan(table=[[tpcds_bin_orc_200.customer]]): rowcount = 160.0, cumulative cost = {0}, id = 2862 HiveProject(ca_address_sk=[$0], ca_city=[$6]): rowcount = 892.8571428571428, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id =
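The code snippet in the description shows parallelism driven by splitCount (i.e. bucket count), which is how it ends up as -1. A size-driven estimate, clamped to at least 1, can be sketched as follows; the split target size is an assumed constant, not a value from Hive's cost model:

```java
// Sketch: derive split parallelism from total data size rather than
// bucket count, so it can never come out negative.
final class SplitEstimator {
    static final long BYTES_PER_SPLIT = 256L * 1024 * 1024;  // assumed target

    static int parallelism(long totalSizeBytes) {
        if (totalSizeBytes <= 0) {
            return 1;                      // degenerate stats: fall back to 1
        }
        long splits = (totalSizeBytes + BYTES_PER_SPLIT - 1) / BYTES_PER_SPLIT;
        return (int) Math.min(splits, Integer.MAX_VALUE);
    }
}
```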
[jira] [Commented] (HIVE-10350) CBO: Use total size instead of bucket count to determine number of splits parallelism
[ https://issues.apache.org/jira/browse/HIVE-10350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497444#comment-14497444 ] Laljo John Pullokkaran commented on HIVE-10350: --- [~mmokhtar] I have uploaded a refined patch. Try it out. CBO: Use total size instead of bucket count to determine number of splits parallelism Key: HIVE-10350 URL: https://issues.apache.org/jira/browse/HIVE-10350 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: 1.2.0 Reporter: Mostafa Mokhtar Assignee: Mostafa Mokhtar Fix For: 1.2.0 Attachments: HIVE-10331.01.patch, HIVE-10350.2.patch Not an overflow but parallelism ends up being -1 as it uses number of buckets {code} final int parallelism = RelMetadataQuery.splitCount(join) == null ? 1 : RelMetadataQuery.splitCount(join); {code} {code} 2015-04-13 18:19:09,154 DEBUG [main]: cost.HiveCostModel (HiveCostModel.java:getJoinCost(62)) - COMMON_JOIN cost: {1600892.857142857 rows, 2.4463782008994658E7 cpu, 8.54445445875E10 io} 2015-04-13 18:19:09,155 DEBUG [main]: cost.HiveCostModel (HiveCostModel.java:getJoinCost(62)) - MAP_JOIN cost: {1600892.857142857 rows, 1601785.714285714 cpu, -1698787.48 io} 2015-04-13 18:19:09,155 DEBUG [main]: cost.HiveCostModel (HiveCostModel.java:getJoinCost(72)) - MAP_JOIN selected 2015-04-13 18:19:09,157 DEBUG [main]: parse.CalcitePlanner (CalcitePlanner.java:apply(862)) - Plan After Join Reordering: HiveSort(fetch=[100]): rowcount = 6006.726049749041, cumulative cost = {1.1468867492063493E8 rows, 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, id = 3000 HiveSort(sort0=[$0], dir0=[ASC]): rowcount = 6006.726049749041, cumulative cost = {1.1468867492063493E8 rows, 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, id = 2998 HiveProject(customer_id=[$4], customername=[concat($9, ', ', $8)]): rowcount = 6006.726049749041, cumulative cost = {1.1468867492063493E8 rows, 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, id = 3136 HiveJoin(condition=[=($1, $5)], 
joinType=[inner], joinAlgorithm=[map_join], cost=[{5.557820341269841E7 rows, 5.557840182539682E7 cpu, -4299694.122023809 io}]): rowcount = 6006.726049749041, cumulative cost = {1.1468867492063493E8 rows, 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, id = 3132 HiveJoin(condition=[=($0, $1)], joinType=[inner], joinAlgorithm=[map_join], cost=[{5.7498805E7 rows, 5.9419605E7 cpu, -1.15248E9 io}]): rowcount = 5.5578005E7, cumulative cost = {5.7498805E7 rows, 5.9419605E7 cpu, -1.15248E9 io}, id = 3100 HiveProject(sr_cdemo_sk=[$4]): rowcount = 5.5578005E7, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2992 HiveTableScan(table=[[tpcds_bin_orc_200.store_returns]]): rowcount = 5.5578005E7, cumulative cost = {0}, id = 2878 HiveProject(cd_demo_sk=[$0]): rowcount = 1920800.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2978 HiveTableScan(table=[[tpcds_bin_orc_200.customer_demographics]]): rowcount = 1920800.0, cumulative cost = {0}, id = 2868 HiveJoin(condition=[=($10, $1)], joinType=[inner], joinAlgorithm=[map_join], cost=[{1787.9365079365077 rows, 1790.15873015873 cpu, -8000.0 io}]): rowcount = 198.4126984126984, cumulative cost = {1611666.507936508 rows, 1619761.5873015872 cpu, -1.89867875E7 io}, id = 3130 HiveJoin(condition=[=($0, $4)], joinType=[inner], joinAlgorithm=[map_join], cost=[{8985.714285714286 rows, 16185.714285714286 cpu, -1.728E7 io}]): rowcount = 1785.7142857142856, cumulative cost = {1609878.5714285714 rows, 1617971.4285714284 cpu, -1.89787875E7 io}, id = 3128 HiveProject(hd_demo_sk=[$0], hd_income_band_sk=[$1]): rowcount = 7200.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2982 HiveTableScan(table=[[tpcds_bin_orc_200.household_demographics]]): rowcount = 7200.0, cumulative cost = {0}, id = 2871 HiveJoin(condition=[=($3, $6)], joinType=[inner], joinAlgorithm=[map_join], cost=[{1600892.857142857 rows, 1601785.714285714 cpu, -1698787.48 io}]): rowcount = 1785.7142857142856, cumulative cost = {1600892.857142857 rows, 1601785.714285714 
cpu, -1698787.48 io}, id = 3105 HiveProject(c_customer_id=[$1], c_current_cdemo_sk=[$2], c_current_hdemo_sk=[$3], c_current_addr_sk=[$4], c_first_name=[$8], c_last_name=[$9]): rowcount = 160.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2970 HiveTableScan(table=[[tpcds_bin_orc_200.customer]]): rowcount = 160.0, cumulative cost = {0}, id = 2862 HiveProject(ca_address_sk=[$0], ca_city=[$6]): rowcount
[jira] [Commented] (HIVE-10356) LLAP: query80 fails with vectorization cast issue
[ https://issues.apache.org/jira/browse/HIVE-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497442#comment-14497442 ] Matt McCline commented on HIVE-10356: - Looks like HIVE-10244. LLAP: query80 fails with vectorization cast issue -- Key: HIVE-10356 URL: https://issues.apache.org/jira/browse/HIVE-10356 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Matt McCline Reducer 6 fails: {noformat} Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0) \N\N09.285817653506076E84.639990363237801E7-1.1814318134524737E8 \N\N01.2847032699693155E96.41569738480791E7-5.956161019898126E8 \N\N04.682909323885761E82.288924051203157E7-5.995957665973593E7 at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:332) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0) \N\N09.285817653506076E84.639990363237801E7-1.1814318134524737E8 \N\N01.2847032699693155E96.41569738480791E7-5.956161019898126E8 \N\N04.682909323885761E82.288924051203157E7-5.995957665973593E7 at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:267) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:254) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148) ... 14 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0) \N\N09.285817653506076E84.639990363237801E7-1.1814318134524737E8 \N\N01.2847032699693155E96.41569738480791E7-5.956161019898126E8 \N\N04.682909323885761E82.288924051203157E7-5.995957665973593E7 at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:394) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:252) ... 16 more Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector cannot be cast to org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector at org.apache.hadoop.hive.ql.exec.vector.VectorGroupKeyHelper.copyGroupKey(VectorGroupKeyHelper.java:94) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeGroupBatches.processBatch(VectorGroupByOperator.java:729) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.process(VectorGroupByOperator.java:878) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:378) ... 17 more ]], Vertex failed as one or more tasks failed. 
failedTasks:1, Vertex vertex_1428572510173_0231_1_24 [Reducer 5] killed/failed due to:null]Vertex killed, vertexName=Reducer 6, vertexId=vertex_1428572510173_0231_1_25, diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as other vertex failed. failedTasks:0, Vertex vertex_1428572510173_0231_1_25 [Reducer 6] killed/failed due to:null]DAG failed due to vertex failure. failedVertices:1 killedVertices:1 {noformat} How to repro: run query80 on scale factor 200. I might look tomorrow to see if this is specific to LLAP or not -- This message was sent by Atlassian JIRA (v6.3.4#6332)
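The root cause in the trace above is an unconditional cast of a column vector (DoubleColumnVector cast to BytesColumnVector). A defensive pattern checks the runtime type first and fails with an error naming the column and the mismatched types; the helper below is hypothetical, not Hive's VectorGroupKeyHelper:

```java
// Sketch: type-checked column-vector cast that turns a bare
// ClassCastException into a descriptive error.
final class VectorCasts {
    static <T> T castColumn(Object columnVector, Class<T> expected, int columnIndex) {
        if (!expected.isInstance(columnVector)) {
            throw new IllegalStateException("Column " + columnIndex + ": expected "
                + expected.getSimpleName() + " but got "
                + (columnVector == null ? "null"
                                        : columnVector.getClass().getSimpleName()));
        }
        return expected.cast(columnVector);
    }
}
```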
[jira] [Updated] (HIVE-10324) Hive metatool should take table_param_key to allow for changes to avro serde's schema url key
[ https://issues.apache.org/jira/browse/HIVE-10324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-10324: Attachment: HIVE-10324.1.patch Thanks [~szehon] for your review. Update patch addressing backwards compatibility Hive metatool should take table_param_key to allow for changes to avro serde's schema url key - Key: HIVE-10324 URL: https://issues.apache.org/jira/browse/HIVE-10324 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 1.1.0 Reporter: Szehon Ho Assignee: Ferdinand Xu Attachments: HIVE-10324.1.patch, HIVE-10324.patch, HIVE-10324.patch.WIP HIVE-3443 added support to change the serdeParams from 'metatool updateLocation' command. However, in avro it is possible to specify the schema via the tableParams: {noformat} CREATE TABLE `testavro`( `test` string COMMENT 'from deserializer') ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat' TBLPROPERTIES ( 'avro.schema.url'='hdfs://namenode:8020/tmp/test.avsc', 'kite.compression.type'='snappy', 'transient_lastDdlTime'='1427996456') {noformat} Hence for those tables the 'metatool updateLocation' will not help. This is necessary in case like upgrade the namenode to HA where the absolute paths have changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
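The update the issue calls for is essentially a prefix rewrite on a table property such as avro.schema.url when a NameNode address changes (e.g. migrating to HA). A sketch of that rewrite; the helper is hypothetical, not metatool itself:

```java
// Sketch: rewrite a URI's prefix, leaving unrelated URIs untouched --
// the kind of update 'metatool updateLocation' with a table_param_key
// would apply to avro.schema.url values.
final class LocationUpdater {
    static String updateLocation(String uri, String oldPrefix, String newPrefix) {
        return uri.startsWith(oldPrefix)
            ? newPrefix + uri.substring(oldPrefix.length())
            : uri;
    }
}
```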
[jira] [Commented] (HIVE-9252) Linking custom SerDe jar to table definition.
[ https://issues.apache.org/jira/browse/HIVE-9252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497585#comment-14497585 ] Ferdinand Xu commented on HIVE-9252: Sorry, I don't have cycles to work on this jira currently. It's on my TODO list. I will work on it soon. Thank you! Linking custom SerDe jar to table definition. - Key: HIVE-9252 URL: https://issues.apache.org/jira/browse/HIVE-9252 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Reporter: Niels Basjes Assignee: Ferdinand Xu Attachments: HIVE-9252.1.patch In HIVE-6047 the option was created that a jar file can be hooked to the definition of a function. (See: [Language Manual DDL: Permanent Functions|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-PermanentFunctions] ) I propose to add something similar that can be used when defining an external table that relies on a custom SerDe (I expect to usually only have the Deserializer). Something like this: {code} CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name ... STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)] [USING JAR|FILE|ARCHIVE 'file_uri' [, JAR|FILE|ARCHIVE 'file_uri'] ]; {code} Using this you can define (and share!) a Hive table on top of a custom file format without needing IT operations to deploy a custom SerDe jar file on all nodes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10302) Cache small tables in memory [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497390#comment-14497390 ] Jimmy Xiang commented on HIVE-10302: The patch is on RB: https://reviews.apache.org/r/33251/ Cache small tables in memory [Spark Branch] --- Key: HIVE-10302 URL: https://issues.apache.org/jira/browse/HIVE-10302 Project: Hive Issue Type: Improvement Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: spark-branch Attachments: HIVE-10302.spark-1.patch If we can cache small tables in executor memory, we could save some time in loading them from HDFS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-9015) Constant Folding optimizer doesn't handle expressions involving null
[ https://issues.apache.org/jira/browse/HIVE-9015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-9015. Resolution: Fixed Fix Version/s: 1.2.0 Fixed via HIVE-9645 Constant Folding optimizer doesn't handle expressions involving null Key: HIVE-9015 URL: https://issues.apache.org/jira/browse/HIVE-9015 Project: Hive Issue Type: Task Affects Versions: 0.14.0, 1.0.0, 1.1.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 1.2.0 Expressions which are guaranteed to evaluate to {{null}} aren't folded by optimizer yet. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10270) Cannot use Decimal constants less than 0.1BD
[ https://issues.apache.org/jira/browse/HIVE-10270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497481#comment-14497481 ] Hive QA commented on HIVE-10270:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12725676/HIVE-10270.5.patch

{color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 8691 tests executed

*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3454/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3454/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3454/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 16 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12725676 - PreCommit-HIVE-TRUNK-Build

Cannot use Decimal constants less than 0.1BD
Key: HIVE-10270
URL: https://issues.apache.org/jira/browse/HIVE-10270
Project: Hive
Issue Type: Bug
Components: Types
Reporter: Jason Dere
Assignee: Jason Dere
Attachments: HIVE-10270.1.patch, HIVE-10270.2.patch, HIVE-10270.3.patch, HIVE-10270.4.patch, HIVE-10270.5.patch

{noformat}
hive> select 0.09765625BD;
FAILED: IllegalArgumentException Decimal scale must be less than or equal to precision
{noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
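The rejected literal can be understood with plain java.math.BigDecimal, which derives precision from the significant digits of the unscaled value: 0.09765625 is 9765625 × 10^-8, so it has scale 8 but precision only 7, violating any "scale must be less than or equal to precision" check. A minimal illustration (not Hive's actual code):

```java
import java.math.BigDecimal;

public class DecimalLiteralSketch {
    // Derive (precision, scale) from a decimal literal the way BigDecimal does:
    // scale counts fractional digits; precision counts significant digits of
    // the unscaled value, so leading fractional zeros are not counted.
    static int[] precisionAndScale(String literal) {
        BigDecimal d = new BigDecimal(literal);
        return new int[] { d.precision(), d.scale() };
    }

    public static void main(String[] args) {
        int[] ps = precisionAndScale("0.09765625");
        // 0.09765625 = 9765625 * 10^-8: precision 7, scale 8.
        System.out.println("precision=" + ps[0] + " scale=" + ps[1]);
    }
}
```

Any decimal constant smaller than 0.1 has this shape, which matches the issue's title.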
[jira] [Commented] (HIVE-10324) Hive metatool should take table_param_key to allow for changes to avro serde's schema url key
[ https://issues.apache.org/jira/browse/HIVE-10324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497557#comment-14497557 ] Szehon Ho commented on HIVE-10324:

Thanks! +1

Hive metatool should take table_param_key to allow for changes to avro serde's schema url key
Key: HIVE-10324
URL: https://issues.apache.org/jira/browse/HIVE-10324
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 1.1.0
Reporter: Szehon Ho
Assignee: Ferdinand Xu
Attachments: HIVE-10324.1.patch, HIVE-10324.patch, HIVE-10324.patch.WIP

HIVE-3443 added support for changing the serdeParams via the 'metatool updateLocation' command. However, in Avro it is possible to specify the schema via the tableParams instead:
{noformat}
CREATE TABLE `testavro`(
  `test` string COMMENT 'from deserializer')
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES (
  'avro.schema.url'='hdfs://namenode:8020/tmp/test.avsc',
  'kite.compression.type'='snappy',
  'transient_lastDdlTime'='1427996456')
{noformat}
Hence, for those tables 'metatool updateLocation' will not help. This is necessary in cases like upgrading the NameNode to HA, where the absolute paths have changed.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
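What 'metatool updateLocation' does for location-bearing parameters, and what this issue asks to extend to table parameters such as avro.schema.url, amounts to rewriting the NameNode authority inside stored URIs. A minimal, self-contained sketch of that rewrite (the hdfs://nameservice1 target is a hypothetical HA nameservice, and this is not Hive's actual code):

```java
import java.util.HashMap;
import java.util.Map;

public class UpdateLocationSketch {
    // Replace the old NameNode authority with the new one in every parameter
    // value; keys whose values don't contain the old URI are left untouched.
    static Map<String, String> rewriteAuthority(Map<String, String> params,
                                                String oldUri, String newUri) {
        Map<String, String> out = new HashMap<>();
        for (Map.Entry<String, String> e : params.entrySet()) {
            out.put(e.getKey(), e.getValue().replace(oldUri, newUri));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> tblProps = new HashMap<>();
        tblProps.put("avro.schema.url", "hdfs://namenode:8020/tmp/test.avsc");
        Map<String, String> fixed = rewriteAuthority(
            tblProps, "hdfs://namenode:8020", "hdfs://nameservice1");
        System.out.println(fixed.get("avro.schema.url"));
    }
}
```

The point of the JIRA is that this rewrite must also be applied to the tableParams map, not only to serdeParams and storage-descriptor locations.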
[jira] [Updated] (HIVE-10306) We need to print tez summary when hive.server2.logging.level = PERFORMANCE.
[ https://issues.apache.org/jira/browse/HIVE-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-10306:

Attachment: HIVE-10306.4.patch

We need to print tez summary when hive.server2.logging.level = PERFORMANCE.
Key: HIVE-10306
URL: https://issues.apache.org/jira/browse/HIVE-10306
Project: Hive
Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
Attachments: HIVE-10306.1.patch, HIVE-10306.2.patch, HIVE-10306.3.patch, HIVE-10306.4.patch

We need to print the Tez summary when hive.server2.logging.level = PERFORMANCE. We introduced this parameter via HIVE-10119. The logging-level param is only relevant to HS2, so for hive-cli users hive.tez.exec.print.summary still makes sense. We can check the log-level param as well in the places where we check the value of hive.tez.exec.print.summary, i.e., treat hive.tez.exec.print.summary as true when the log level is PERFORMANCE.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
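The check being proposed collapses into a single predicate; the method and parameter names below are illustrative, not Hive's actual API:

```java
public class SummaryFlagSketch {
    // Print the Tez summary if the user asked for it explicitly
    // (hive.tez.exec.print.summary=true), or if the HS2 operation
    // logging level implies it (PERFORMANCE).
    static boolean shouldPrintTezSummary(boolean printSummaryConf,
                                         String hs2LogLevel) {
        return printSummaryConf || "PERFORMANCE".equalsIgnoreCase(hs2LogLevel);
    }

    public static void main(String[] args) {
        System.out.println(shouldPrintTezSummary(false, "PERFORMANCE")); // true
        System.out.println(shouldPrintTezSummary(true, "EXECUTION"));    // true
        System.out.println(shouldPrintTezSummary(false, "EXECUTION"));   // false
    }
}
```

In hive-cli, where the HS2 logging level does not apply, the first operand alone would decide.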
[jira] [Updated] (HIVE-10306) We need to print tez summary when hive.server2.logging.level = PERFORMANCE.
[ https://issues.apache.org/jira/browse/HIVE-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-10306:

Attachment: (was: HIVE-10306.4.patch)

We need to print tez summary when hive.server2.logging.level = PERFORMANCE.
Key: HIVE-10306
URL: https://issues.apache.org/jira/browse/HIVE-10306
Project: Hive
Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
Attachments: HIVE-10306.1.patch, HIVE-10306.2.patch, HIVE-10306.3.patch

We need to print the Tez summary when hive.server2.logging.level = PERFORMANCE. We introduced this parameter via HIVE-10119. The logging-level param is only relevant to HS2, so for hive-cli users hive.tez.exec.print.summary still makes sense. We can check the log-level param as well in the places where we check the value of hive.tez.exec.print.summary, i.e., treat hive.tez.exec.print.summary as true when the log level is PERFORMANCE.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10040) CBO (Calcite Return Path): Pluggable cost modules [CBO branch]
[ https://issues.apache.org/jira/browse/HIVE-10040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-10040: -- Labels: (was: TODOC-CBO) CBO (Calcite Return Path): Pluggable cost modules [CBO branch] -- Key: HIVE-10040 URL: https://issues.apache.org/jira/browse/HIVE-10040 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: cbo-branch Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: cbo-branch Attachments: HIVE-10040.01.cbo.patch, HIVE-10040.02.cbo.patch, HIVE-10040.03.cbo.patch, HIVE-10040.cbo.patch We should be able to deal with cost models in a modular way. Thus, the cost model should be integrated within a Calcite MD provider that is pluggable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10040) CBO (Calcite Return Path): Pluggable cost modules [CBO branch]
[ https://issues.apache.org/jira/browse/HIVE-10040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497469#comment-14497469 ] Lefty Leverenz commented on HIVE-10040: --- No doc needed: HIVE-10343 removed *hive.cbo.costmodel.extended* so I'm removing the TODOC-CBO label. CBO (Calcite Return Path): Pluggable cost modules [CBO branch] -- Key: HIVE-10040 URL: https://issues.apache.org/jira/browse/HIVE-10040 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: cbo-branch Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: cbo-branch Attachments: HIVE-10040.01.cbo.patch, HIVE-10040.02.cbo.patch, HIVE-10040.03.cbo.patch, HIVE-10040.cbo.patch We should be able to deal with cost models in a modular way. Thus, the cost model should be integrated within a Calcite MD provider that is pluggable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
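The modularity described in the issue, a cost model as a pluggable strategy behind a metadata provider, can be illustrated with a tiny interface; the names below are hypothetical and are not Hive's or Calcite's actual classes:

```java
import java.util.Map;

public class CostModelSketch {
    // A cost model is a strategy object the planner consults; swapping
    // implementations changes costing without touching the planner.
    interface CostModel {
        double joinCost(double leftRows, double rightRows);
    }

    // Interchangeable models selected by a configuration key at runtime.
    static final Map<String, CostModel> MODELS = Map.of(
        "default",  (l, r) -> l + r,          // simple cardinality-sum model
        "extended", (l, r) -> l + r + l * r   // adds a cross-product term
    );

    public static void main(String[] args) {
        CostModel m = MODELS.get("default");
        System.out.println(m.joinCost(100, 10)); // 110.0
    }
}
```

The lookup-by-key step is the "pluggable" part: registering a new model requires no change to callers.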
[jira] [Commented] (HIVE-10346) Tez on HBase has problems with settings again
[ https://issues.apache.org/jira/browse/HIVE-10346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497533#comment-14497533 ] Hive QA commented on HIVE-10346:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12725700/HIVE-10346.patch

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 8690 tests executed

*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3455/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3455/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3455/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12725700 - PreCommit-HIVE-TRUNK-Build

Tez on HBase has problems with settings again
Key: HIVE-10346
URL: https://issues.apache.org/jira/browse/HIVE-10346
Project: Hive
Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Attachments: HIVE-10346.patch

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9710) HiveServer2 should support cookie based authentication, when using HTTP transport.
[ https://issues.apache.org/jira/browse/HIVE-9710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496983#comment-14496983 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-9710:

[~vgumashta] Thanks for reviewing the change. I am adding the follow-up JIRA HIVE-10345 to cover the test case you raised. Thanks, Hari

HiveServer2 should support cookie based authentication, when using HTTP transport.
Key: HIVE-9710
URL: https://issues.apache.org/jira/browse/HIVE-9710
Project: Hive
Issue Type: Improvement
Components: HiveServer2
Affects Versions: 1.2.0
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
Labels: TODOC1.2
Fix For: 1.2.0
Attachments: HIVE-9710.1.patch, HIVE-9710.2.patch, HIVE-9710.3.patch, HIVE-9710.4.patch, HIVE-9710.5.patch, HIVE-9710.6.patch, HIVE-9710.7.patch, HIVE-9710.8.patch

HiveServer2 should generate cookies and validate the client cookie sent to it, so that it need not perform user/password or Kerberos-based authentication on each HTTP request.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
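The idea in the description, validate a server-issued cookie so later requests can skip full authentication, can be sketched with a plain HMAC-signed cookie. This is an illustrative toy, not HS2's actual cookie format or code:

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class CookieAuthSketch {
    // Sign the user name into the cookie so the server can later verify it
    // without re-running password/Kerberos authentication.
    static String sign(String user, byte[] key) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(key, "HmacSHA256"));
        byte[] sig = mac.doFinal(user.getBytes(StandardCharsets.UTF_8));
        return user + "&s=" + Base64.getUrlEncoder().encodeToString(sig);
    }

    // Returns the user if the cookie's signature checks out, else null
    // (meaning the server must fall back to full authentication).
    static String validate(String cookie, byte[] key) throws Exception {
        int i = cookie.lastIndexOf("&s=");
        if (i < 0) return null;
        String user = cookie.substring(0, i);
        return sign(user, key).equals(cookie) ? user : null;
    }

    public static void main(String[] args) throws Exception {
        byte[] key = "server-secret".getBytes(StandardCharsets.UTF_8);
        String cookie = sign("alice", key);
        System.out.println(validate(cookie, key));       // alice
        System.out.println(validate(cookie + "x", key)); // null (tampered)
    }
}
```

A real server would also bind an expiry and a random nonce into the signed payload; the sketch only shows the validate-or-reauthenticate decision.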
[jira] [Updated] (HIVE-10345) Add test case to ensure client sends credentials in non-ssl mode when HS2 sends a secure cookie
[ https://issues.apache.org/jira/browse/HIVE-10345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-10345:

Description: We need to add test cases to cover these scenarios.
{noformat}
Client | HS2 Cookie | Expected Behavior
-------+------------+------------------
SSL    | Secured    | Client replays, server validates the cookie.
SSL    | Unsecured  | Client replays, server validates the cookie.
No SSL | Unsecured  | Client replays, server validates the cookie.
No SSL | Secured    | Client should send back credentials since the cookie replay will not be transmitted back to the server.
{noformat}

was: We need to add test cases to cover these scenarios.
{noformat}
Client | HS2 Cookie | Expected Behavior
-------+------------+------------------
SSL    | Secured    | Client replays, server validates the cookie.
SSL    | Unsecured  | Client replays, server validates the cookie.
No SSL | Unsecured  | Client replays, server validates the cookie.
No SSL | Secured    | Client should send back credentials since the cookie replay will not be transmitted back to the server.
{noformat}

Add test case to ensure client sends credentials in non-ssl mode when HS2 sends a secure cookie
Key: HIVE-10345
URL: https://issues.apache.org/jira/browse/HIVE-10345
Project: Hive
Issue Type: Bug
Components: HiveServer2
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan

We need to add test cases to cover these scenarios.
{noformat}
Client | HS2 Cookie | Expected Behavior
-------+------------+------------------
SSL    | Secured    | Client replays, server validates the cookie.
SSL    | Unsecured  | Client replays, server validates the cookie.
No SSL | Unsecured  | Client replays, server validates the cookie.
No SSL | Secured    | Client should send back credentials since the cookie replay will not be transmitted back to the server.
{noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10345) Add test case to ensure client sends credentials in non-ssl mode when HS2 sends a secure cookie
[ https://issues.apache.org/jira/browse/HIVE-10345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-10345:

Description: We need to add test cases to cover these scenarios.
{noformat}
Client | HS2 Cookie | Expected Behavior
-------+------------+------------------
SSL    | Secured    | Client replays, server validates the cookie.
SSL    | Unsecured  | Client replays, server validates the cookie.
No SSL | Unsecured  | Client replays, server validates the cookie.
No SSL | Secured    | Client should send back credentials since the cookie replay will not be transmitted back to the server.
{noformat}

was: We need to add test cases to cover these scenarios.
{noformat}
Client | HS2 Cookie | Expected Behavior
-------+------------+------------------
SSL    | Secured    | Client replays, server validates the cookie.
SSL    | Unsecured  | Client replays, server validates the cookie.
No SSL | Unsecured  | Client replays, server validates the cookie.
No SSL | Secured    | Client should send back credentials since the cookie replay will not be transmitted back to the server.
{noformat}

Add test case to ensure client sends credentials in non-ssl mode when HS2 sends a secure cookie
Key: HIVE-10345
URL: https://issues.apache.org/jira/browse/HIVE-10345
Project: Hive
Issue Type: Bug
Components: HiveServer2
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan

We need to add test cases to cover these scenarios.
{noformat}
Client | HS2 Cookie | Expected Behavior
-------+------------+------------------
SSL    | Secured    | Client replays, server validates the cookie.
SSL    | Unsecured  | Client replays, server validates the cookie.
No SSL | Unsecured  | Client replays, server validates the cookie.
No SSL | Secured    | Client should send back credentials since the cookie replay will not be transmitted back to the server.
{noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
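The four test scenarios collapse into one client-side predicate: a cookie flagged Secure may only travel over SSL, and in every other combination the client can replay it. A hedged sketch with illustrative names, not the JDBC driver's actual code:

```java
public class CookieReplaySketch {
    // True  -> client replays the cookie and the server validates it.
    // False -> client must send credentials again (non-SSL client, secure
    //          cookie), since the cookie will not be transmitted back.
    static boolean canReplayCookie(boolean clientUsesSsl, boolean cookieIsSecure) {
        return clientUsesSsl || !cookieIsSecure;
    }

    public static void main(String[] args) {
        System.out.println(canReplayCookie(true, true));   // true
        System.out.println(canReplayCookie(true, false));  // true
        System.out.println(canReplayCookie(false, false)); // true
        System.out.println(canReplayCookie(false, true));  // false
    }
}
```

The fourth case is exactly the one this JIRA's title asks to cover with a test.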
[jira] [Commented] (HIVE-9580) Server returns incorrect result from JOIN ON VARCHAR columns
[ https://issues.apache.org/jira/browse/HIVE-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497008#comment-14497008 ] Jason Dere commented on HIVE-9580:

I think this looks fine. I would just say to make sure there are tests to cover the types that would get affected by this change (char/varchar/decimal joins), which it looks like there already are.

Server returns incorrect result from JOIN ON VARCHAR columns
Key: HIVE-9580
URL: https://issues.apache.org/jira/browse/HIVE-9580
Project: Hive
Issue Type: Bug
Components: HiveServer2
Affects Versions: 0.12.0, 0.13.0, 0.14.0
Reporter: Mike
Assignee: Aihua Xu
Attachments: HIVE-9580.patch

The database erroneously returns rows when joining two tables which each contain a VARCHAR column and the join's ON condition uses the equality operator on the VARCHAR columns. The following JDBC method exhibits the problem:
{code}
static void joinIssue() throws SQLException {
    String sql;
    int rowsAffected;
    ResultSet rs;
    Statement stmt = con.createStatement();
    String table1_Name = "blahtab1";
    String table1A_Name = "blahtab1A";
    String table1B_Name = "blahtab1B";
    String table2_Name = "blahtab2";

    try {
        sql = "drop table " + table1_Name;
        System.out.println("\nsql= " + sql);
        rowsAffected = stmt.executeUpdate(sql);
    } catch (SQLException se) {
        println("Drop table error: " + se.getMessage());
    }
    try {
        sql = "CREATE TABLE " + table1_Name + " ("
            + " VCHARCOL VARCHAR(10)"
            + " ,INTEGERCOL INT"
            + " )";
        System.out.println("\nsql= " + sql);
        rowsAffected = stmt.executeUpdate(sql);
    } catch (SQLException se) {
        println("create table error: " + se.getMessage());
    }
    sql = "insert into " + table1_Name + " values ('jklmnopqrs', 99)";
    System.out.println("\nsql= " + sql);
    stmt.executeUpdate(sql);
    System.out.println("===");

    try {
        sql = "drop table " + table1A_Name;
        System.out.println("\nsql= " + sql);
        rowsAffected = stmt.executeUpdate(sql);
    } catch (SQLException se) {
        println("Drop table error: " + se.getMessage());
    }
    try {
        sql = "CREATE TABLE " + table1A_Name + " ("
            + " VCHARCOL VARCHAR(10)"
            + " )";
        System.out.println("\nsql= " + sql);
        rowsAffected = stmt.executeUpdate(sql);
    } catch (SQLException se) {
        println("create table error: " + se.getMessage());
    }
    sql = "insert into " + table1A_Name + " values ('jklmnopqrs')";
    System.out.println("\nsql= " + sql);
    stmt.executeUpdate(sql);
    System.out.println("===");

    try {
        sql = "drop table " + table1B_Name;
        System.out.println("\nsql= " + sql);
        rowsAffected = stmt.executeUpdate(sql);
    } catch (SQLException se) {
        println("Drop table error: " + se.getMessage());
    }
    try {
        sql = "CREATE TABLE " + table1B_Name + " ("
            + " VCHARCOL VARCHAR(11)"
            + " ,INTEGERCOL INT"
            + " )";
        System.out.println("\nsql= " + sql);
        rowsAffected = stmt.executeUpdate(sql);
    } catch (SQLException se) {
        println("create table error: " + se.getMessage());
    }
{code}
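The repro builds tables whose VCHARCOL columns differ only in declared length (VARCHAR(10) vs. VARCHAR(11)) while holding the same ten-character value; that is precisely the situation where join equality must be defined over the value alone. A hypothetical illustration of one way such joins can mis-compare (this is not Hive's actual join code):

```java
import java.util.Objects;

public class VarcharJoinSketch {
    // A varchar value carries its declared maximum length alongside the text.
    record Varchar(String value, int maxLength) {}

    // Flawed key: folds the declared length into equality, so identical text
    // typed VARCHAR(10) and VARCHAR(11) no longer compares as equal.
    static boolean flawedEquals(Varchar a, Varchar b) {
        return a.value().equals(b.value()) && a.maxLength() == b.maxLength();
    }

    // Correct key: compare the text only.
    static boolean correctEquals(Varchar a, Varchar b) {
        return Objects.equals(a.value(), b.value());
    }

    public static void main(String[] args) {
        Varchar left = new Varchar("jklmnopqrs", 10);
        Varchar right = new Varchar("jklmnopqrs", 11);
        System.out.println(flawedEquals(left, right));  // false: mis-compared
        System.out.println(correctEquals(left, right)); // true
    }
}
```

Whether the symptom is extra rows or missing rows, the root cause class is the same: join keys built from anything other than the normalized value.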
[jira] [Assigned] (HIVE-10335) LLAP: IndexOutOfBound in MapJoinOperator
[ https://issues.apache.org/jira/browse/HIVE-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-10335:

Assignee: Sergey Shelukhin

LLAP: IndexOutOfBound in MapJoinOperator
Key: HIVE-10335
URL: https://issues.apache.org/jira/browse/HIVE-10335
Project: Hive
Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Sergey Shelukhin
Fix For: llap

{code}
2015-04-14 13:57:55,889 [TezTaskRunner_attempt_1428572510173_0173_2_03_14_0(container_1_0173_01_66_sseth_20150414135750_7a7c2f4f-5f2d-4645-b833-677621f087bd:2_Map 1_14_0)] ERROR org.apache.hadoop.hive.ql.exec.MapJoinOperator: Unexpected exception: Index: 0, Size: 0
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
  at java.util.ArrayList.rangeCheck(ArrayList.java:653)
  at java.util.ArrayList.get(ArrayList.java:429)
  at org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.unwrap(UnwrapRowContainer.java:79)
  at org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.first(UnwrapRowContainer.java:62)
  at org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.first(UnwrapRowContainer.java:33)
  at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
  at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:386)
  at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.process(VectorMapJoinOperator.java:283)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
  at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.flushOutput(VectorMapJoinOperator.java:232)
  at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.closeOp(VectorMapJoinOperator.java:240)
  at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:616)
  at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:630)
  at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:348)
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162)
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
  at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:332)
  at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180)
  at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
  at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172)
  at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168)
  at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)
{code}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
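The top of the trace is an unchecked get(0) on an empty ArrayList inside UnwrapRowContainer.unwrap. The failure pattern can be reproduced in isolation (an illustration of the failure mode only, not the Hive code path; the exact exception message varies by JDK version, with Java 8 printing "Index: 0, Size: 0" as in the trace):

```java
import java.util.ArrayList;
import java.util.List;

public class EmptyContainerSketch {
    public static void main(String[] args) {
        List<Object[]> rows = new ArrayList<>(); // container buffered no rows
        try {
            Object[] first = rows.get(0); // same unchecked access pattern
            System.out.println(first);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

The fix direction implied by the trace is a guard (or a correct row-count accounting) before the container's first() dereferences index 0.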