[jira] [Updated] (HIVE-15796) HoS: poor reducer parallelism when operator stats are not accurate
[ https://issues.apache.org/jira/browse/HIVE-15796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-15796: Attachment: (was: HIVE-15796.1.patch) > HoS: poor reducer parallelism when operator stats are not accurate > -- > > Key: HIVE-15796 > URL: https://issues.apache.org/jira/browse/HIVE-15796 > Project: Hive > Issue Type: Improvement > Components: Statistics >Affects Versions: 2.2.0 >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-15796.1.patch, HIVE-15796.wip.1.patch, > HIVE-15796.wip.2.patch, HIVE-15796.wip.patch > > > In HoS we currently use operator stats to determine reducer parallelism. > However, it is often the case that operator stats are not accurate, > especially if column stats are not available. This sometimes generates > extremely poor reducer parallelism, and causes an HoS query to run forever. > This JIRA tries to offer an alternative way to compute reducer parallelism, > similar to how MR does it. Here's the approach we are suggesting: > 1. when computing the parallelism for a MapWork, use stats associated with > the TableScan operator; > 2. when computing the parallelism for a ReduceWork, use the *maximum* > parallelism from all its parents. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
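The two rules in the description can be sketched as a small heuristic. This is not Hive's actual implementation; the class, method names, and the bytes-per-reducer constant are illustrative assumptions.

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the suggested heuristic: size a MapWork from the TableScan's
// input bytes (which are reliable even without column stats), and size a
// ReduceWork as the maximum parallelism among its parent works.
public class ReducerParallelismSketch {
    // Hypothetical default; the real knob would be a Hive configuration value.
    static final long BYTES_PER_REDUCER = 256L * 1024 * 1024;

    // MapWork: derive parallelism from TableScan input size, rounding up.
    static int mapWorkParallelism(long tableScanInputBytes) {
        return (int) Math.max(1,
            (tableScanInputBytes + BYTES_PER_REDUCER - 1) / BYTES_PER_REDUCER);
    }

    // ReduceWork: take the maximum parallelism over all parents, so a
    // badly underestimated branch cannot drag the whole stage down.
    static int reduceWorkParallelism(List<Integer> parentParallelisms) {
        return parentParallelisms.stream().max(Integer::compare).orElse(1);
    }

    public static void main(String[] args) {
        System.out.println(mapWorkParallelism(1L << 30));                     // 1 GiB scan -> 4
        System.out.println(reduceWorkParallelism(Arrays.asList(4, 7, 2)));    // -> 7
    }
}
```

The "maximum of parents" choice mirrors how MapReduce sizes reducers from upstream work rather than from potentially inaccurate operator statistics.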
[jira] [Updated] (HIVE-15796) HoS: poor reducer parallelism when operator stats are not accurate
[ https://issues.apache.org/jira/browse/HIVE-15796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-15796: Attachment: HIVE-15796.1.patch
[jira] [Commented] (HIVE-14754) Track the queries execution lifecycle times
[ https://issues.apache.org/jira/browse/HIVE-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857583#comment-15857583 ] Lefty Leverenz commented on HIVE-14754: --- Doc note: The new metrics need to be documented in the wiki. * [Hive Metrics | https://cwiki.apache.org/confluence/display/Hive/Hive+Metrics] Added a TODOC2.2 label. > Track the queries execution lifecycle times > --- > > Key: HIVE-14754 > URL: https://issues.apache.org/jira/browse/HIVE-14754 > Project: Hive > Issue Type: Sub-task > Components: Hive, HiveServer2 >Affects Versions: 2.2.0 >Reporter: Barna Zsombor Klara >Assignee: Barna Zsombor Klara > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-14754.1.patch, HIVE-14754.2.patch, HIVE-14754.patch > > > We should be able to track the number of queries being compiled/executed at any > given time, as well as the duration of the execution and compilation phases.
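The bookkeeping the issue asks for (queries currently in a phase, plus cumulative phase duration) can be sketched as below. The real patch plugs into Hive's metrics subsystem; these class and method names are illustrative, not the actual metric names.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Minimal sketch of query-lifecycle tracking: a gauge for queries currently
// compiling and a counter for total time spent in the compilation phase.
// The execution phase would mirror this with a second pair of counters.
public class QueryLifecycleMetrics {
    private final AtomicInteger compilingNow = new AtomicInteger();
    private final AtomicLong totalCompileNanos = new AtomicLong();

    // Call when compilation starts; returns a start timestamp token.
    public long compileStarted() {
        compilingNow.incrementAndGet();
        return System.nanoTime();
    }

    // Call when compilation ends, passing the token back.
    public void compileFinished(long startNanos) {
        totalCompileNanos.addAndGet(System.nanoTime() - startNanos);
        compilingNow.decrementAndGet();
    }

    public int activeCompilations() { return compilingNow.get(); }
    public long totalCompileNanos() { return totalCompileNanos.get(); }

    public static void main(String[] args) {
        QueryLifecycleMetrics m = new QueryLifecycleMetrics();
        long t = m.compileStarted();
        System.out.println(m.activeCompilations()); // 1 while compiling
        m.compileFinished(t);
        System.out.println(m.activeCompilations()); // 0 afterwards
    }
}
```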
[jira] [Updated] (HIVE-14754) Track the queries execution lifecycle times
[ https://issues.apache.org/jira/browse/HIVE-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-14754: -- Labels: TODOC2.2 (was: )
[jira] [Commented] (HIVE-15803) msck can hang when nested partitions are present
[ https://issues.apache.org/jira/browse/HIVE-15803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857579#comment-15857579 ] Hive QA commented on HIVE-15803: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12851547/HIVE-15803.2.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10241 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=223) org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropTable (batchId=210) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3432/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3432/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3432/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12851547 - PreCommit-HIVE-Build > msck can hang when nested partitions are present > > > Key: HIVE-15803 > URL: https://issues.apache.org/jira/browse/HIVE-15803 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-15803.1.patch, HIVE-15803.2.patch, HIVE-15803.patch > > > Steps to reproduce. 
> {noformat} > CREATE TABLE `repairtable`( `col` string) PARTITIONED BY ( `p1` string, > `p2` string) > hive> dfs -mkdir -p /apps/hive/warehouse/test.db/repairtable/p1=c/p2=a/p3=b; > hive> dfs -touchz > /apps/hive/warehouse/test.db/repairtable/p1=c/p2=a/p3=b/datafile; > hive> set hive.mv.files.thread; > hive.mv.files.thread=15 > hive> set hive.mv.files.thread=1; > hive> MSCK TABLE repairtable; > {noformat}
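The hang pattern behind this reproduction can be illustrated in isolation: a task running in a bounded pool submits a subtask for a nested partition directory to the *same* pool and then blocks on its result, so a pool of size 1 (`hive.mv.files.thread=1`) can never schedule the subtask. This sketch is not Hive's actual code; it uses a bounded `get()` so the demo times out instead of hanging forever, which is what msck does without one.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Demonstrates thread-pool self-starvation: the only worker thread blocks
// waiting for a subtask that is queued behind it in the same pool.
public class NestedPoolStarvation {
    static boolean starves(long timeoutMs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1); // like hive.mv.files.thread=1
        try {
            Future<String> outer = pool.submit(() -> {
                // Subtask for the nested level is queued behind us...
                Future<String> nested = pool.submit(() -> "p1=c/p2=a/p3=b");
                return nested.get(); // ...and we block the only worker waiting for it
            });
            outer.get(timeoutMs, TimeUnit.MILLISECONDS);
            return false; // completed: no starvation
        } catch (TimeoutException e) {
            return true;  // nested submit starved the single-thread pool
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(starves(500) ? "hang reproduced" : "completed");
    }
}
```

Typical fixes for this class of bug are to process nested levels on the submitting thread, or to use a work-stealing scheme (e.g. a `ForkJoinPool` with `join()`) instead of blocking a fixed pool on its own queue.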
[jira] [Updated] (HIVE-15832) Hplsql UDF doesn't work in Hplsql
[ https://issues.apache.org/jira/browse/HIVE-15832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fei Hui updated HIVE-15832: --- Target Version/s: 2.2.0 (was: 1.2.1) Status: Patch Available (was: Open) > Hplsql UDF doesn't work in Hplsql > - > > Key: HIVE-15832 > URL: https://issues.apache.org/jira/browse/HIVE-15832 > Project: Hive > Issue Type: Bug > Components: hpl/sql >Affects Versions: 1.2.1 > Environment: HDP : 2.4.2.0-258 > Hive lib : hive-XXX-1.2.1000.2.4.2.0-258.jar > Hplsql : hplsql-0.3.17.jar >Reporter: Sungwoon Ma >Assignee: Fei Hui > Labels: test > Fix For: 1.2.1 > > Attachments: HIVE-15832.patch > > > ※ http://www.hplsql.org/udf > 1) UDF Test > [root@node2 /apps/hplsql]#./hplsql -e "SELECT hello(name) FROM USERS;" -trace > ... > Ln:1 SELECT completed successfully > Ln:1 Standalone SELECT executed: 1 columns in the result set > Unhandled exception in HPL/SQL > ... > 2) Add 'Exception' (org.apache.hive.hplsql.Select) > >> 123 line > - before : > else if ((ctx.parent instanceof HplsqlParser.StmtContext)) { > int cols = rm.getColumnCount(); > if (this.trace) { > trace(ctx, "Standalone SELECT executed: " + cols + " columns in the > result set"); > } > while (rs.next()) { > - after : > try { > while (rs.next()) { > ... 
> } > catch (Exception e) { > e.printStackTrace(); > } > - Error Log > [root@node2 /apps/hplsql]#./hplsql -e "SELECT hello(1) FROM USERS;" -trace > Configuration file: file:/apps/hplsql-0.3.17/hplsql-site.xml > Parser tree: (program (block (stmt (select_stmt (fullselect_stmt > (fullselect_stmt_item (subselect_stmt SELECT (select_list (select_list_item > (expr (expr_func (ident hello) ( (expr_func_params (func_param (expr > (expr_atom (int_number 1) ))) (select_list_alias AS (ident A > (from_clause FROM (from_table_clause (from_table_name_clause (table_name > (ident USERS)) (stmt (semicolon_stmt ; > INLCUDE CONTENT hplsqlrc (non-empty) > Ln:1 CREATE FUNCTION hello > Ln:1 SELECT > >>registerUdf begin :false > >>registerUdf end :true > Ln:1 SELECT hplsql('hello(:1)', 1) AS A FROM USERS > 17/02/06 20:28:13 INFO jdbc.Utils: Supplied authorities: node3:1 > 17/02/06 20:28:13 INFO jdbc.Utils: Resolved authority: node3:1 > Open connection: jdbc:hive2://node3:1 (225 ms) > Starting pre-SQL statement > Starting pre-SQL statement > Starting pre-SQL statement > Starting pre-SQL statement > Starting pre-SQL statement > Starting query > Query executed successfully (84 ms) > Ln:1 SELECT completed successfully > Ln:1 Standalone SELECT executed: 1 columns in the result set > cols:1 > org.apache.hive.service.cli.HiveSQLException: java.io.IOException: > org.apache.hadoop.hive.ql.metadata.HiveException: java.util.zip.ZipException: > invalid code lengths set > at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:258) > at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:244) > at org.apache.hive.jdbc.HiveQueryResultSet.next(HiveQueryResultSet.java:364) > at org.apache.hive.hplsql.Select.select(Select.java:116) > at org.apache.hive.hplsql.Exec.visitSelect_stmt(Exec.java:870) > at org.apache.hive.hplsql.Exec.visitSelect_stmt(Exec.java:1) > at > org.apache.hive.hplsql.HplsqlParser$Select_stmtContext.accept(HplsqlParser.java:14249) > at > 
org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70) > at org.apache.hive.hplsql.Exec.visitStmt(Exec.java:865) > at org.apache.hive.hplsql.Exec.visitStmt(Exec.java:1) > at > org.apache.hive.hplsql.HplsqlParser$StmtContext.accept(HplsqlParser.java:998) > at > org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70) > at > org.apache.hive.hplsql.HplsqlBaseVisitor.visitBlock(HplsqlBaseVisitor.java:28) > at > org.apache.hive.hplsql.HplsqlParser$BlockContext.accept(HplsqlParser.java:438) > at > org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70) > at org.apache.hive.hplsql.Exec.visitProgram(Exec.java:780) > at org.apache.hive.hplsql.Exec.visitProgram(Exec.java:1) > at > org.apache.hive.hplsql.HplsqlParser$ProgramContext.accept(HplsqlParser.java:381) > at > org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visit(AbstractParseTreeVisitor.java:42) > at org.apache.hive.hplsql.Exec.run(Exec.java:652) > at org.apache.hive.hplsql.Exec.run(Exec.java:630) > at org.apache.hive.hplsql.Hplsql.main(Hplsql.java:23) > Caused by: org.apache.hive.service.cli.HiveSQLException: java.io.IOException: > org.apache.hadoop.hive.ql.metadata.HiveException: java.util.zip.ZipException: > invalid code lengths set > at >
[jira] [Commented] (HIVE-15832) Hplsql UDF doesn't work in Hplsql
[ https://issues.apache.org/jira/browse/HIVE-15832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857548#comment-15857548 ] Fei Hui commented on HIVE-15832: CC [~dmtolpeko]
[jira] [Updated] (HIVE-15832) Hplsql UDF doesn't work in Hplsql
[ https://issues.apache.org/jira/browse/HIVE-15832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fei Hui updated HIVE-15832: --- Attachment: HIVE-15832.patch Patch uploaded.
[jira] [Commented] (HIVE-15388) HiveParser spends lots of time in parsing queries with lots of "("
[ https://issues.apache.org/jira/browse/HIVE-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857532#comment-15857532 ] Pengcheng Xiong commented on HIVE-15388: [~kgyrtkirk], thanks for your attention. I have tried hard to make expressions work but could not succeed. It does not matter whether dt is deterministic or not; whenever we have an expression, things become complicated. I think your idea is worth trying. I will submit a new patch soon. Thanks. > HiveParser spends lots of time in parsing queries with lots of "(" > -- > > Key: HIVE-15388 > URL: https://issues.apache.org/jira/browse/HIVE-15388 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Rajesh Balamohan >Assignee: Pengcheng Xiong > Attachments: HIVE-15388.01.patch, HIVE-15388.02.patch, > HIVE-15388.03.patch, HIVE-15388.04.patch, HIVE-15388.05.patch, > HIVE-15388.06.patch, hive-15388.stacktrace.txt > > > Branch: apache-master (applicable with previous releases as well) > Queries generated via tools can have lots of "(" for "AND/OR" conditions. > This causes huge delays in the parsing phase when the number of expressions is > high.
> e.g > {noformat} > SELECT `iata`, >`airport`, >`city`, >`state`, >`country`, >`lat`, >`lon` > FROM airports > WHERE > ((`airports`.`airport` > = "Thigpen" > > OR `airports`.`airport` = "Astoria Regional") > > OR `airports`.`airport` = "Warsaw Municipal") > > OR `airports`.`airport` = "John F Kennedy Memorial") > > OR `airports`.`airport` = "Hall-Miller Municipal") > > OR `airports`.`airport` = "Atqasuk") >OR > `airports`.`airport` = "William B Hartsfield-Atlanta Intl") > OR > `airports`.`airport` = "Artesia Municipal") > OR > `airports`.`airport` = "Outagamie County Regional") > OR > `airports`.`airport` = "Watertown Municipal") >OR > `airports`.`airport` = "Augusta State") > OR > `airports`.`airport` = "Aurora Municipal") > OR > `airports`.`airport` = "Alakanuk") > OR > `airports`.`airport` = "Austin Municipal") >OR > `airports`.`airport` = "Auburn Municipal") > OR > `airports`.`airport` = "Auburn-Opelik") > OR > `airports`.`airport` = "Austin-Bergstrom International") > OR > `airports`.`airport` = "Wausau Municipal") >OR > `airports`.`airport` = "Mecklenburg-Brunswick Regional") > OR > `airports`.`airport` = "Alva Regional") > OR > `airports`.`airport` = "Asheville Regional") > OR > `airports`.`airport` = "Avon Park Municipal") >OR > `airports`.`airport` = "Wilkes-Barre/Scranton Intl") > OR > `airports`.`airport` = "Marana Northwest Regional") > OR > `airports`.`airport` = "Catalina") > OR > `airports`.`airport` = "Washington Municipal") >OR > `airports`.`airport` = "Wainwright") > OR `airports`.`airport` > = "West Memphis Municipal") > OR `airports`.`airport` > = "Arlington
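The pathological WHERE clause quoted above opens one "(" per OR term and closes them all along the chain. A small generator (illustrative, not part of the patch) reproduces such a query at any depth, so the parsing slowdown can be measured as the nesting grows:

```java
// Builds a tool-generated-style query with `terms` levels of left-nested
// parentheses, one per extra OR condition, matching the shape in the report.
public class NestedOrQueryGen {
    static String nestedOrQuery(int terms) {
        StringBuilder sb = new StringBuilder("SELECT `iata` FROM airports WHERE ");
        for (int i = 0; i < terms; i++) {
            sb.append('('); // one level of nesting per extra OR term
        }
        sb.append("`airports`.`airport` = \"airport0\"");
        for (int i = 1; i <= terms; i++) {
            sb.append(" OR `airports`.`airport` = \"airport").append(i).append("\")");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // e.g. WHERE (((... = "airport0" OR ...) OR ...) OR ...)
        System.out.println(nestedOrQuery(3));
    }
}
```

Feeding the output of `nestedOrQuery(n)` to the parser at increasing `n` makes the nonlinear cost of the nested parentheses visible without hand-writing the query.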
[jira] [Commented] (HIVE-15388) HiveParser spends lots of time in parsing queries with lots of "("
[ https://issues.apache.org/jira/browse/HIVE-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857510#comment-15857510 ] Zoltan Haindrich commented on HIVE-15388: - [~pxiong] I think that dropping the interval-related UDF makes it harder to later re-enable the feature; is there any reason to go beyond just the parser changes (disabling the (dt*dt) feature)? Because I assume that deterministic UDF usage is not affected by this problem. I have an idea which might be worth a try: making the interval keyword mandatory for '(dt*dt)'-like queries may simplify this problem for the parser, and could possibly leave columns as interval arguments alive.
[jira] [Commented] (HIVE-15832) Hplsql UDF doesn't work in Hplsql
[ https://issues.apache.org/jira/browse/HIVE-15832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857499#comment-15857499 ] Fei Hui commented on HIVE-15832: Here is the error log I get; it is different from Hive 1.2.1: Caused by: java.lang.NullPointerException at org.apache.hive.hplsql.Exec.setVariable(Exec.java:148) ~[hive-hplsql-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hive.hplsql.Exec.setVariable(Exec.java:158) ~[hive-hplsql-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hive.hplsql.Udf.setParameters(Udf.java:92) ~[hive-hplsql-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hive.hplsql.Udf.evaluate(Udf.java:74) ~[hive-hplsql-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:68) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:438) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:430) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:147) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2209) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at 
org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:492) ~[hive-service-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
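The NPE at Exec.setVariable(Exec.java:148) suggests that assigning a UDF parameter dereferences a variable scope that has not been set up when the UDF runs server-side. This is a purely illustrative sketch of that failure mode and a defensive guard; the class, method names, and fix are hypothetical, not the actual HPL/SQL code or patch.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Sketch: a stack of variable scopes, as an interpreter might keep. If no
// scope has been pushed when setVariable runs (e.g. a UDF invoked outside a
// normal session), peek() returns null and an unguarded put() would NPE.
// Lazily creating a root scope avoids the crash.
public class ScopeSketch {
    private final Deque<Map<String, String>> scopes = new ArrayDeque<>();

    void setVariable(String name, String value) {
        Map<String, String> current = scopes.peek();
        if (current == null) {           // no scope yet: create a root scope
            current = new HashMap<>();   // instead of dereferencing null
            scopes.push(current);
        }
        current.put(name, value);
    }

    String getVariable(String name) {
        Map<String, String> current = scopes.peek();
        return current == null ? null : current.get(name);
    }

    public static void main(String[] args) {
        ScopeSketch exec = new ScopeSketch();
        exec.setVariable(":1", "hello"); // would NPE without the guard
        System.out.println(exec.getVariable(":1"));
    }
}
```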
[jira] [Assigned] (HIVE-15832) Hplsql UDF doesn't work in Hplsql
[ https://issues.apache.org/jira/browse/HIVE-15832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fei Hui reassigned HIVE-15832: -- Assignee: Fei Hui > Hplsql UDF doesn't work in Hplsql > - > > Key: HIVE-15832 > URL: https://issues.apache.org/jira/browse/HIVE-15832 > Project: Hive > Issue Type: Bug > Components: hpl/sql >Affects Versions: 1.2.1 > Environment: HDP : 2.4.2.0-258 > Hive lib : hive-XXX-1.2.1000.2.4.2.0-258.jar > Hplsql : hplsql-0.3.17.jar >Reporter: Sungwoon Ma >Assignee: Fei Hui > Labels: test > Fix For: 1.2.1 > > > ※ http://www.hplsql.org/udf > 1) UDF Test > [root@node2 /apps/hplsql]#./hplsql -e "SELECT hello(name) FROM USERS;" -trace > ... > Ln:1 SELECT completed successfully > Ln:1 Standalone SELECT executed: 1 columns in the result set > Unhandled exception in HPL/SQL > ... > 2) Add 'Exception' (org.apache.hive.hplsql.Select) > >> 123 line > - before : > else if ((ctx.parent instanceof HplsqlParser.StmtContext)) { > int cols = rm.getColumnCount(); > if (this.trace) { > trace(ctx, "Standalone SELECT executed: " + cols + " columns in the > result set"); > } > while (rs.next()) { > - after : > try { > while (rs.next()) { > ... 
> } > catch (Exception e) { > e.printStackTrace(); > } > - Error Log > [root@node2 /apps/hplsql]#./hplsql -e "SELECT hello(1) FROM USERS;" -trace > Configuration file: file:/apps/hplsql-0.3.17/hplsql-site.xml > Parser tree: (program (block (stmt (select_stmt (fullselect_stmt > (fullselect_stmt_item (subselect_stmt SELECT (select_list (select_list_item > (expr (expr_func (ident hello) ( (expr_func_params (func_param (expr > (expr_atom (int_number 1) ))) (select_list_alias AS (ident A > (from_clause FROM (from_table_clause (from_table_name_clause (table_name > (ident USERS)) (stmt (semicolon_stmt ; > INLCUDE CONTENT hplsqlrc (non-empty) > Ln:1 CREATE FUNCTION hello > Ln:1 SELECT > >>registerUdf begin :false > >>registerUdf end :true > Ln:1 SELECT hplsql('hello(:1)', 1) AS A FROM USERS > 17/02/06 20:28:13 INFO jdbc.Utils: Supplied authorities: node3:1 > 17/02/06 20:28:13 INFO jdbc.Utils: Resolved authority: node3:1 > Open connection: jdbc:hive2://node3:1 (225 ms) > Starting pre-SQL statement > Starting pre-SQL statement > Starting pre-SQL statement > Starting pre-SQL statement > Starting pre-SQL statement > Starting query > Query executed successfully (84 ms) > Ln:1 SELECT completed successfully > Ln:1 Standalone SELECT executed: 1 columns in the result set > cols:1 > org.apache.hive.service.cli.HiveSQLException: java.io.IOException: > org.apache.hadoop.hive.ql.metadata.HiveException: java.util.zip.ZipException: > invalid code lengths set > at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:258) > at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:244) > at org.apache.hive.jdbc.HiveQueryResultSet.next(HiveQueryResultSet.java:364) > at org.apache.hive.hplsql.Select.select(Select.java:116) > at org.apache.hive.hplsql.Exec.visitSelect_stmt(Exec.java:870) > at org.apache.hive.hplsql.Exec.visitSelect_stmt(Exec.java:1) > at > org.apache.hive.hplsql.HplsqlParser$Select_stmtContext.accept(HplsqlParser.java:14249) > at > 
org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70) > at org.apache.hive.hplsql.Exec.visitStmt(Exec.java:865) > at org.apache.hive.hplsql.Exec.visitStmt(Exec.java:1) > at > org.apache.hive.hplsql.HplsqlParser$StmtContext.accept(HplsqlParser.java:998) > at > org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70) > at > org.apache.hive.hplsql.HplsqlBaseVisitor.visitBlock(HplsqlBaseVisitor.java:28) > at > org.apache.hive.hplsql.HplsqlParser$BlockContext.accept(HplsqlParser.java:438) > at > org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70) > at org.apache.hive.hplsql.Exec.visitProgram(Exec.java:780) > at org.apache.hive.hplsql.Exec.visitProgram(Exec.java:1) > at > org.apache.hive.hplsql.HplsqlParser$ProgramContext.accept(HplsqlParser.java:381) > at > org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visit(AbstractParseTreeVisitor.java:42) > at org.apache.hive.hplsql.Exec.run(Exec.java:652) > at org.apache.hive.hplsql.Exec.run(Exec.java:630) > at org.apache.hive.hplsql.Hplsql.main(Hplsql.java:23) > Caused by: org.apache.hive.service.cli.HiveSQLException: java.io.IOException: > org.apache.hadoop.hive.ql.metadata.HiveException: java.util.zip.ZipException: > invalid code lengths set > at > org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:352) > at >
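The before/after snippet in the report above is truncated. For illustration only, here is a minimal, self-contained sketch of the proposed change — drain the result set inside try/catch so a failure thrown mid-iteration (in this report, the HiveSQLException raised by the UDF call) is reported instead of escaping as "Unhandled exception in HPL/SQL". The `SafeSelect` class and `printRows` method are hypothetical stand-ins, not the actual `org.apache.hive.hplsql.Select` code.

```java
import java.util.Iterator;
import java.util.List;

public class SafeSelect {
    // Drains rows the way Select.java drains rs.next(), but inside try/catch
    // so an exception thrown partway through fetching is caught and reported.
    // Returns the number of rows printed before the failure, if any.
    static int printRows(Iterator<String> rows) {
        int printed = 0;
        try {
            while (rows.hasNext()) {
                System.out.println(rows.next());
                printed++;
            }
        } catch (RuntimeException e) {
            // In Select.java this would be the driver-side HiveSQLException
            System.err.println("row fetch failed: " + e.getMessage());
        }
        return printed;
    }

    public static void main(String[] args) {
        printRows(List.of("row1", "row2").iterator());
    }
}
```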
[jira] [Commented] (HIVE-15832) Hplsql UDF doesn't work in Hplsql
[ https://issues.apache.org/jira/browse/HIVE-15832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857497#comment-15857497 ] Fei Hui commented on HIVE-15832: I have reproduced it on Hive 2.2.0. I will work on it. > Hplsql UDF doesn't work in Hplsql > - > > Key: HIVE-15832 > URL: https://issues.apache.org/jira/browse/HIVE-15832 > Project: Hive > Issue Type: Bug > Components: hpl/sql >Affects Versions: 1.2.1 > Environment: HDP : 2.4.2.0-258 > Hive lib : hive-XXX-1.2.1000.2.4.2.0-258.jar > Hplsql : hplsql-0.3.17.jar >Reporter: Sungwoon Ma > Labels: test > Fix For: 1.2.1 > > > ※ http://www.hplsql.org/udf > 1) UDF Test > [root@node2 /apps/hplsql]#./hplsql -e "SELECT hello(name) FROM USERS;" -trace > ... > Ln:1 SELECT completed successfully > Ln:1 Standalone SELECT executed: 1 columns in the result set > Unhandled exception in HPL/SQL > ... > 2) Add 'Exception' (org.apache.hive.hplsql.Select) > >> 123 line > - before : > else if ((ctx.parent instanceof HplsqlParser.StmtContext)) { > int cols = rm.getColumnCount(); > if (this.trace) { > trace(ctx, "Standalone SELECT executed: " + cols + " columns in the > result set"); > } > while (rs.next()) { > - after : > try { > while (rs.next()) { > ... 
> } > catch (Exception e) { > e.printStackTrace(); > } > - Error Log > [root@node2 /apps/hplsql]#./hplsql -e "SELECT hello(1) FROM USERS;" -trace > Configuration file: file:/apps/hplsql-0.3.17/hplsql-site.xml > Parser tree: (program (block (stmt (select_stmt (fullselect_stmt > (fullselect_stmt_item (subselect_stmt SELECT (select_list (select_list_item > (expr (expr_func (ident hello) ( (expr_func_params (func_param (expr > (expr_atom (int_number 1) ))) (select_list_alias AS (ident A > (from_clause FROM (from_table_clause (from_table_name_clause (table_name > (ident USERS)) (stmt (semicolon_stmt ; > INLCUDE CONTENT hplsqlrc (non-empty) > Ln:1 CREATE FUNCTION hello > Ln:1 SELECT > >>registerUdf begin :false > >>registerUdf end :true > Ln:1 SELECT hplsql('hello(:1)', 1) AS A FROM USERS > 17/02/06 20:28:13 INFO jdbc.Utils: Supplied authorities: node3:1 > 17/02/06 20:28:13 INFO jdbc.Utils: Resolved authority: node3:1 > Open connection: jdbc:hive2://node3:1 (225 ms) > Starting pre-SQL statement > Starting pre-SQL statement > Starting pre-SQL statement > Starting pre-SQL statement > Starting pre-SQL statement > Starting query > Query executed successfully (84 ms) > Ln:1 SELECT completed successfully > Ln:1 Standalone SELECT executed: 1 columns in the result set > cols:1 > org.apache.hive.service.cli.HiveSQLException: java.io.IOException: > org.apache.hadoop.hive.ql.metadata.HiveException: java.util.zip.ZipException: > invalid code lengths set > at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:258) > at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:244) > at org.apache.hive.jdbc.HiveQueryResultSet.next(HiveQueryResultSet.java:364) > at org.apache.hive.hplsql.Select.select(Select.java:116) > at org.apache.hive.hplsql.Exec.visitSelect_stmt(Exec.java:870) > at org.apache.hive.hplsql.Exec.visitSelect_stmt(Exec.java:1) > at > org.apache.hive.hplsql.HplsqlParser$Select_stmtContext.accept(HplsqlParser.java:14249) > at > 
org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70) > at org.apache.hive.hplsql.Exec.visitStmt(Exec.java:865) > at org.apache.hive.hplsql.Exec.visitStmt(Exec.java:1) > at > org.apache.hive.hplsql.HplsqlParser$StmtContext.accept(HplsqlParser.java:998) > at > org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70) > at > org.apache.hive.hplsql.HplsqlBaseVisitor.visitBlock(HplsqlBaseVisitor.java:28) > at > org.apache.hive.hplsql.HplsqlParser$BlockContext.accept(HplsqlParser.java:438) > at > org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70) > at org.apache.hive.hplsql.Exec.visitProgram(Exec.java:780) > at org.apache.hive.hplsql.Exec.visitProgram(Exec.java:1) > at > org.apache.hive.hplsql.HplsqlParser$ProgramContext.accept(HplsqlParser.java:381) > at > org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visit(AbstractParseTreeVisitor.java:42) > at org.apache.hive.hplsql.Exec.run(Exec.java:652) > at org.apache.hive.hplsql.Exec.run(Exec.java:630) > at org.apache.hive.hplsql.Hplsql.main(Hplsql.java:23) > Caused by: org.apache.hive.service.cli.HiveSQLException: java.io.IOException: > org.apache.hadoop.hive.ql.metadata.HiveException: java.util.zip.ZipException: > invalid code lengths set > at > org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:352) >
[jira] [Commented] (HIVE-15222) replace org.json usage in ExplainTask/TezTask related classes with some alternative
[ https://issues.apache.org/jira/browse/HIVE-15222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857495#comment-15857495 ] Hive QA commented on HIVE-15222: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12851538/HIVE-15222.3.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10241 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=223) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3431/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3431/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3431/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12851538 - PreCommit-HIVE-Build > replace org.json usage in ExplainTask/TezTask related classes with some > alternative > --- > > Key: HIVE-15222 > URL: https://issues.apache.org/jira/browse/HIVE-15222 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Teddy Choi > Fix For: 2.2.0 > > Attachments: HIVE-15222.1.patch, HIVE-15222.2.patch, > HIVE-15222.3.patch > > > Replace org.json usage in these classes. 
> It seems to me that json is probably only used to write some information - > but the application never reads it back. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15789) Vectorization: limit reduce vectorization to 32Mb chunks
[ https://issues.apache.org/jira/browse/HIVE-15789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857479#comment-15857479 ] Gopal V commented on HIVE-15789: LGTM - +1. > Vectorization: limit reduce vectorization to 32Mb chunks > > > Key: HIVE-15789 > URL: https://issues.apache.org/jira/browse/HIVE-15789 > Project: Hive > Issue Type: Bug > Components: Vectorization >Reporter: Gopal V >Assignee: Teddy Choi > Attachments: HIVE-15789.1.patch, HIVE-15789.2.patch > > > Reduce vectorization accumulates 1024 rows before forwarding it into the > reduce processor. > Add a safety limit for 32Mb of writables, so that shorter sequences can be > forwarded into the operator trees. > {code} > rowIdx++; > if (rowIdx >= BATCH_SIZE) { > VectorizedBatchUtil.setBatchSize(batch, rowIdx); > reducer.process(batch, tag); > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
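The snippet quoted in the issue only shows the existing row-count flush. A rough sketch of what adding a byte budget on top of the 1024-row limit looks like follows; the class, method, and the way writable bytes are counted are illustrative assumptions, not the patch's actual code.

```java
import java.util.Arrays;

public class BatchFlushSketch {
    static final int BATCH_SIZE = 1024;                // existing row limit
    static final long MAX_BYTES = 32L * 1024 * 1024;   // proposed 32 MB safety limit

    // Returns how many times a stream of rows with the given serialized sizes
    // is forwarded when the batch flushes at 1024 rows OR at 32 MB of writables,
    // whichever comes first. The flush stands in for reducer.process(batch, tag).
    static int countFlushes(long[] rowSizes) {
        int rowIdx = 0, flushes = 0;
        long batchBytes = 0;
        for (long size : rowSizes) {
            rowIdx++;
            batchBytes += size;
            if (rowIdx >= BATCH_SIZE || batchBytes >= MAX_BYTES) {
                flushes++;
                rowIdx = 0;
                batchBytes = 0;
            }
        }
        if (rowIdx > 0) flushes++;  // forward the final partial batch
        return flushes;
    }

    public static void main(String[] args) {
        long[] wide = new long[4];
        Arrays.fill(wide, 16L * 1024 * 1024); // four 16 MB rows
        System.out.println(countFlushes(wide));
    }
}
```

With four 16 MB rows the byte limit triggers after every second row, so short sequences reach the operator tree long before 1024 rows accumulate.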
[jira] [Updated] (HIVE-15792) Hive should raise SemanticException when LPAD/RPAD pad character's length is 0
[ https://issues.apache.org/jira/browse/HIVE-15792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandakumar updated HIVE-15792: -- Status: Patch Available (was: Open) > Hive should raise SemanticException when LPAD/RPAD pad character's length is 0 > -- > > Key: HIVE-15792 > URL: https://issues.apache.org/jira/browse/HIVE-15792 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Chovan >Assignee: Nandakumar >Priority: Minor > Attachments: HIVE-15792.000.patch > > > For example SELECT LPAD('A', 2, ''); will cause an infinite loop and the > running query will hang without any error. > It would be great if this could be prevented by checking the pad character's > length and if it's 0 then throw a SemanticException. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
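The infinite loop is easy to see in a toy implementation: with an empty pad string the buffer never grows, so the loop condition never changes. The guard below rejects a zero-length pad up front, analogous to the SemanticException the patch proposes; the `lpad` helper is hypothetical, not Hive's actual GenericUDFLpad code.

```java
public class LpadSketch {
    // Left-pads s to length len using pad. An empty pad would loop forever,
    // so reject it before entering the loop.
    static String lpad(String s, int len, String pad) {
        if (pad.isEmpty()) {
            throw new IllegalArgumentException("pad string must not be empty");
        }
        StringBuilder sb = new StringBuilder();
        while (sb.length() + s.length() < len) {
            sb.append(pad); // with pad = "" this condition would never change
        }
        return sb.substring(0, len - s.length()) + s;
    }

    public static void main(String[] args) {
        System.out.println(lpad("A", 2, "x")); // xA
    }
}
```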
[jira] [Commented] (HIVE-6009) Add from_unixtime UDF that has controllable Timezone
[ https://issues.apache.org/jira/browse/HIVE-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857460#comment-15857460 ] Alexander Pivovarov commented on HIVE-6009: --- You can convert the bigint to a UTC timestamp and then convert the UTC timestamp to a GMT-5 timestamp (EST): {code} select from_unixtime(129384); 2010-12-31 16:00:00 // in Greenwich select from_utc_timestamp(from_unixtime(129384), 'GMT-5'); 2010-12-31 11:00:00 // in NYC {code} > Add from_unixtime UDF that has controllable Timezone > > > Key: HIVE-6009 > URL: https://issues.apache.org/jira/browse/HIVE-6009 > Project: Hive > Issue Type: Improvement > Components: CLI >Affects Versions: 0.10.0 > Environment: CDH4.4 >Reporter: Johndee Burks >Priority: Trivial > > Currently the from_unixtime UDF takes into account the timezone of the system > doing the transformation. I think that implementation is good, but it would > be nice to include or change the current UDF to have a configurable timezone. > It would be useful for looking at timestamp data from different regions in > the native region's timezone. > Example: > from_unixtime(unix_time, format, timezone) > from_unixtime(129384, dd MMM , GMT-5) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-6009) Add from_unixtime UDF that has controllable Timezone
[ https://issues.apache.org/jira/browse/HIVE-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857460#comment-15857460 ] Alexander Pivovarov edited comment on HIVE-6009 at 2/8/17 5:57 AM: --- you can convert bigint to UTC timestamp and then convert UTC timestamp to GMT-5 timestamp (EST) {code} select from_unixtime(129384); 2010-12-31 16:00:00 // in Greenwich select from_utc_timestamp(from_unixtime(129384), 'GMT-5'); 2010-12-31 11:00:00 // in NYC {code} was (Author: apivovarov): you can convert bigint to UTC timestamp and then convert UTC timestamp to GMT-5 timestamp (EST) {code} select from_unixtime(129384); 2010-12-31 16:00:00 // in Greenwich select from_utc_timestamp(from_unixtime(129384), 'GMT-5'); 2010-12-31 11:00:00 // in NYC > Add from_unixtime UDF that has controllable Timezone > > > Key: HIVE-6009 > URL: https://issues.apache.org/jira/browse/HIVE-6009 > Project: Hive > Issue Type: Improvement > Components: CLI >Affects Versions: 0.10.0 > Environment: CDH4.4 >Reporter: Johndee Burks >Priority: Trivial > > Currently the from_unixtime UDF takes into a account timezone of the system > doing the transformation. I think that implementation is good, but it would > be nice to include or change the current UDF to have a configurable timezone. > It would be useful for looking at timestamp data from different regions in > the native region's timezone. > Example: > from_unixtime(unix_time, format, timezone) > from_unixtime(129384, dd MMM , GMT-5) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
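The two-step SQL workaround above can be mirrored in Java with java.time, which is essentially what the requested from_unixtime(unix_time, format, timezone) overload would do. The `EpochToZone` helper is illustrative, not Hive code, and note the sample outputs quoted in the comment depend on the session's time zone and appear truncated in this archive.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class EpochToZone {
    // Formats an epoch-seconds value in an arbitrary zone offset, instead of
    // the JVM's (or HiveServer's) default zone.
    static String format(long epochSeconds, String pattern, String offset) {
        ZonedDateTime zdt = Instant.ofEpochSecond(epochSeconds)
                .atZone(ZoneOffset.of(offset));
        return zdt.format(DateTimeFormatter.ofPattern(pattern));
    }

    public static void main(String[] args) {
        // 129384 seconds after the epoch, rendered in UTC and in GMT-5
        System.out.println(format(129384L, "yyyy-MM-dd HH:mm:ss", "Z"));
        System.out.println(format(129384L, "yyyy-MM-dd HH:mm:ss", "-05:00"));
    }
}
```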
[jira] [Commented] (HIVE-15682) Eliminate per-row based dummy iterator creation
[ https://issues.apache.org/jira/browse/HIVE-15682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857457#comment-15857457 ] Dapeng Sun commented on HIVE-15682: --- Hi [~xuefuz], I will use TPCx-BB to run a 1TB test comparing HIVE-15580, HIVE-15682, and the unpatched package; I will attach the results when I have them. > Eliminate per-row based dummy iterator creation > --- > > Key: HIVE-15682 > URL: https://issues.apache.org/jira/browse/HIVE-15682 > Project: Hive > Issue Type: Improvement > Components: Spark >Affects Versions: 2.2.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Fix For: 2.2.0 > > Attachments: HIVE-15682.patch > > > HIVE-15580 introduced a dummy iterator per input row which can be eliminated. > This is because {{SparkReduceRecordHandler}} is able to handle single key > value pairs. We can refactor this part of the code 1. to remove the need for an > iterator and 2. to optimize the code path for per (key, value) based (instead > of (key, value iterator)) processing. It would also be great if we could > measure the performance after the optimizations and compare it to the performance > prior to HIVE-15580. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (HIVE-3558) UDF LEFT(string,position) to HIVE
[ https://issues.apache.org/jira/browse/HIVE-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov resolved HIVE-3558. --- Resolution: Won't Fix > UDF LEFT(string,position) to HIVE > -- > > Key: HIVE-3558 > URL: https://issues.apache.org/jira/browse/HIVE-3558 > Project: Hive > Issue Type: New Feature > Components: UDF >Affects Versions: 0.9.0 >Reporter: Aruna Babu >Priority: Minor > Attachments: HIVE-3558.1.patch.txt, udf_left.q, udf_left.q.out > > > Introduction > UDF (User Defined Function) to obtain the leftmost 'n' characters from > a string in HIVE. > Relevance > Current releases of Hive lack a function that returns the > leftmost len characters from the string str, or NULL if any argument is NULL. > > The function LEFT(string,length) would return the leftmost 'n' characters > from the string, or NULL if any argument is NULL, which would be useful in > HiveQL. This would find its use in all the technical areas where > strings are used. > Functionality :- > Function Name: LEFT(string,length) > > Returns the leftmost length characters from the string or NULL if any > argument is NULL. 
> Example: hive>SELECT LEFT('https://www.irctc.co.in',5); > -> 'https' > Usage :- > Case 1: To query a table to find details based on an https request > Table :-Transaction > Request_id|date|period_id|url_name > 0001|01/07/2012|110001|https://www.irctc.co.in > 0002|02/07/2012|110001|https://nextstep.tcs.com > 0003|03/07/2012|110001|https://www.hdfcbank.com > 0005|01/07/2012|110001|http://www.lmnm.co.in > 0006|08/07/2012|110001|http://nextstart.com > 0007|10/07/2012|110001|https://netbanking.icicibank.com > 0012|21/07/2012|110001|http://www.people.co.in > 0026|08/07/2012|110001|http://nextprobs.com > 00023|25/07/2012|110001|https://netbanking.canarabank.com > Query : select * from transaction where LEFT(url_name,5)='https'; > Result :- > 0001|01/07/2012|110001|https://www.irctc.com > 0002|02/07/2012|110001|https://nextstep.tcs.com > 0003|03/07/2012|110001|https://www.hdfcbank.com > 0007|10/07/2012|110001|https://netbanking.icicibank.com > 00023|25/07/2012|110001|https://netbanking.canarabank.com -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-3558) UDF LEFT(string,position) to HIVE
[ https://issues.apache.org/jira/browse/HIVE-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857451#comment-15857451 ] Alexander Pivovarov commented on HIVE-3558: --- You can use substr to get LEFT and RIGHT {code} // get characters from 1st to 5th included SELECT substr('https://www.irctc.co.in', 1, 5); https // all RIGHT characters starting from 6th SELECT substr('https://www.irctc.co.in', 6); ://www.irctc.co.in {code} > UDF LEFT(string,position) to HIVE > -- > > Key: HIVE-3558 > URL: https://issues.apache.org/jira/browse/HIVE-3558 > Project: Hive > Issue Type: New Feature > Components: UDF >Affects Versions: 0.9.0 >Reporter: Aruna Babu >Priority: Minor > Attachments: HIVE-3558.1.patch.txt, udf_left.q, udf_left.q.out > > > Introduction > UDF (User Defined Function) to obtain the left most 'n' characters from > a string in HIVE. > Relevance > Current releases of Hive lacks a function which would returns the > leftmost len characters from the string str, or NULL if any argument is NULL. > > The function LEFT(string,length) would return the leftmost 'n' characters > from the string , or NULL if any argument is NULL which would be useful while > using HiveQL. This would find its use in all the technical aspects where the > concept of strings are used. > Functionality :- > Function Name: LEFT(string,length) > > Returns the leftmost length characters from the string or NULL if any > argument is NULL. 
> Example: hive>SELECT LEFT('https://www.irctc.co.in',5); > -> 'https' > Usage :- > Case 1: To query a table to find details based on an https request > Table :-Transaction > Request_id|date|period_id|url_name > 0001|01/07/2012|110001|https://www.irctc.co.in > 0002|02/07/2012|110001|https://nextstep.tcs.com > 0003|03/07/2012|110001|https://www.hdfcbank.com > 0005|01/07/2012|110001|http://www.lmnm.co.in > 0006|08/07/2012|110001|http://nextstart.com > 0007|10/07/2012|110001|https://netbanking.icicibank.com > 0012|21/07/2012|110001|http://www.people.co.in > 0026|08/07/2012|110001|http://nextprobs.com > 00023|25/07/2012|110001|https://netbanking.canarabank.com > Query : select * from transaction where LEFT(url_name,5)='https'; > Result :- > 0001|01/07/2012|110001|https://www.irctc.com > 0002|02/07/2012|110001|https://nextstep.tcs.com > 0003|03/07/2012|110001|https://www.hdfcbank.com > 0007|10/07/2012|110001|https://netbanking.icicibank.com > 00023|25/07/2012|110001|https://netbanking.canarabank.com -- This message was sent by Atlassian JIRA (v6.3.15#6346)
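The substr workaround above maps directly onto Java's 0-based substring, which makes the off-by-one between Hive's 1-based substr and a LEFT/RIGHT pair easy to check. The `left`/`right` helpers below are illustrative, not Hive UDF code.

```java
public class LeftRight {
    // Hive's substr(str, 1, n) — the suggested stand-in for LEFT — maps to
    // Java's 0-based substring(0, n); substr(str, n + 1) maps to substring(n).
    static String left(String s, int n) {
        return s.substring(0, Math.min(n, s.length()));
    }

    static String right(String s, int n) {
        return s.substring(Math.max(0, s.length() - n));
    }

    public static void main(String[] args) {
        String url = "https://www.irctc.co.in";
        System.out.println(left(url, 5));   // the scheme prefix
        System.out.println(right(url, 5));  // the domain suffix
    }
}
```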
[jira] [Commented] (HIVE-15789) Vectorization: limit reduce vectorization to 32Mb chunks
[ https://issues.apache.org/jira/browse/HIVE-15789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857448#comment-15857448 ] Hive QA commented on HIVE-15789: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12851534/HIVE-15789.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10237 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223) org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver (batchId=230) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3430/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3430/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3430/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12851534 - PreCommit-HIVE-Build > Vectorization: limit reduce vectorization to 32Mb chunks > > > Key: HIVE-15789 > URL: https://issues.apache.org/jira/browse/HIVE-15789 > Project: Hive > Issue Type: Bug > Components: Vectorization >Reporter: Gopal V >Assignee: Teddy Choi > Attachments: HIVE-15789.1.patch, HIVE-15789.2.patch > > > Reduce vectorization accumulates 1024 rows before forwarding it into the > reduce processor. 
> Add a safety limit for 32Mb of writables, so that shorter sequences can be > forwarded into the operator trees. > {code} > rowIdx++; > if (rowIdx >= BATCH_SIZE) { > VectorizedBatchUtil.setBatchSize(batch, rowIdx); > reducer.process(batch, tag); > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15388) HiveParser spends lots of time in parsing queries with lots of "("
[ https://issues.apache.org/jira/browse/HIVE-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-15388: --- Status: Open (was: Patch Available) > HiveParser spends lots of time in parsing queries with lots of "(" > -- > > Key: HIVE-15388 > URL: https://issues.apache.org/jira/browse/HIVE-15388 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Rajesh Balamohan >Assignee: Pengcheng Xiong > Attachments: HIVE-15388.01.patch, HIVE-15388.02.patch, > HIVE-15388.03.patch, HIVE-15388.04.patch, HIVE-15388.05.patch, > HIVE-15388.06.patch, hive-15388.stacktrace.txt > > > Branch: apache-master (applicable with previous releases as well) > Queries generated via tools can have lots of "(" for "AND/OR" conditions. > This causes huge delays in parsing phase when the number of expressions are > high. > e.g > {noformat} > SELECT `iata`, >`airport`, >`city`, >`state`, >`country`, >`lat`, >`lon` > FROM airports > WHERE > ((`airports`.`airport` > = "Thigpen" > > OR `airports`.`airport` = "Astoria Regional") > > OR `airports`.`airport` = "Warsaw Municipal") > > OR `airports`.`airport` = "John F Kennedy Memorial") > > OR `airports`.`airport` = "Hall-Miller Municipal") > > OR `airports`.`airport` = "Atqasuk") >OR > `airports`.`airport` = "William B Hartsfield-Atlanta Intl") > OR > `airports`.`airport` = "Artesia Municipal") > OR > `airports`.`airport` = "Outagamie County Regional") > OR > `airports`.`airport` = "Watertown Municipal") >OR > `airports`.`airport` = "Augusta State") > OR > `airports`.`airport` = "Aurora Municipal") > OR > `airports`.`airport` = "Alakanuk") > OR > `airports`.`airport` = "Austin Municipal") >OR > `airports`.`airport` = "Auburn Municipal") > OR > `airports`.`airport` = "Auburn-Opelik") > OR > `airports`.`airport` = "Austin-Bergstrom International") > OR > `airports`.`airport` = "Wausau Municipal") >OR > `airports`.`airport` = "Mecklenburg-Brunswick Regional") > OR > `airports`.`airport` 
= "Alva Regional") > OR > `airports`.`airport` = "Asheville Regional") > OR > `airports`.`airport` = "Avon Park Municipal") >OR > `airports`.`airport` = "Wilkes-Barre/Scranton Intl") > OR > `airports`.`airport` = "Marana Northwest Regional") > OR > `airports`.`airport` = "Catalina") > OR > `airports`.`airport` = "Washington Municipal") >OR > `airports`.`airport` = "Wainwright") > OR `airports`.`airport` > = "West Memphis Municipal") > OR `airports`.`airport` > = "Arlington Municipal") > OR `airports`.`airport` = > "Algona Municipal") >OR `airports`.`airport` = > "Chandler") > OR `airports`.`airport` = > "Altus
[jira] [Commented] (HIVE-15847) In Progress update refreshes seem slow
[ https://issues.apache.org/jira/browse/HIVE-15847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857447#comment-15857447 ] anishek commented on HIVE-15847: Additionally, there are more columns being printed in the task summary, as shown by: after patch HIVE-15473: https://issues.apache.org/jira/secure/attachment/12851509/summary_after_patch.png before patch HIVE-15473: https://issues.apache.org/jira/secure/attachment/12851510/summary_before_patch.png > In Progress update refreshes seem slow > -- > > Key: HIVE-15847 > URL: https://issues.apache.org/jira/browse/HIVE-15847 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.2.0 >Reporter: anishek > > After HIVE-15473, the refresh rate for the in-place progress bar seems to be > slow on the Hive CLI. > As pointed out by [~prasanth_j] > {quote} > The refresh rate is slow. The following videos show it > before patch: https://asciinema.org/a/2fgcncxg5gjavcpxt6lfb8jg9 > after patch: https://asciinema.org/a/2tht5jf6l9b2dc3ylt5gtztqg > {quote} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15388) HiveParser spends lots of time in parsing queries with lots of "("
[ https://issues.apache.org/jira/browse/HIVE-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-15388: --- Status: Patch Available (was: Open) > HiveParser spends lots of time in parsing queries with lots of "(" > -- > > Key: HIVE-15388 > URL: https://issues.apache.org/jira/browse/HIVE-15388 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Rajesh Balamohan >Assignee: Pengcheng Xiong > Attachments: HIVE-15388.01.patch, HIVE-15388.02.patch, > HIVE-15388.03.patch, HIVE-15388.04.patch, HIVE-15388.05.patch, > HIVE-15388.06.patch, hive-15388.stacktrace.txt > > > Branch: apache-master (applicable with previous releases as well) > Queries generated via tools can have lots of "(" for "AND/OR" conditions. > This causes huge delays in parsing phase when the number of expressions are > high. > e.g > {noformat} > SELECT `iata`, >`airport`, >`city`, >`state`, >`country`, >`lat`, >`lon` > FROM airports > WHERE > ((`airports`.`airport` > = "Thigpen" > > OR `airports`.`airport` = "Astoria Regional") > > OR `airports`.`airport` = "Warsaw Municipal") > > OR `airports`.`airport` = "John F Kennedy Memorial") > > OR `airports`.`airport` = "Hall-Miller Municipal") > > OR `airports`.`airport` = "Atqasuk") >OR > `airports`.`airport` = "William B Hartsfield-Atlanta Intl") > OR > `airports`.`airport` = "Artesia Municipal") > OR > `airports`.`airport` = "Outagamie County Regional") > OR > `airports`.`airport` = "Watertown Municipal") >OR > `airports`.`airport` = "Augusta State") > OR > `airports`.`airport` = "Aurora Municipal") > OR > `airports`.`airport` = "Alakanuk") > OR > `airports`.`airport` = "Austin Municipal") >OR > `airports`.`airport` = "Auburn Municipal") > OR > `airports`.`airport` = "Auburn-Opelik") > OR > `airports`.`airport` = "Austin-Bergstrom International") > OR > `airports`.`airport` = "Wausau Municipal") >OR > `airports`.`airport` = "Mecklenburg-Brunswick Regional") > OR > `airports`.`airport` 
= "Alva Regional") > OR > `airports`.`airport` = "Asheville Regional") > OR > `airports`.`airport` = "Avon Park Municipal") >OR > `airports`.`airport` = "Wilkes-Barre/Scranton Intl") > OR > `airports`.`airport` = "Marana Northwest Regional") > OR > `airports`.`airport` = "Catalina") > OR > `airports`.`airport` = "Washington Municipal") >OR > `airports`.`airport` = "Wainwright") > OR `airports`.`airport` > = "West Memphis Municipal") > OR `airports`.`airport` > = "Arlington Municipal") > OR `airports`.`airport` = > "Algona Municipal") >OR `airports`.`airport` = > "Chandler") > OR `airports`.`airport` = > "Altus
[jira] [Commented] (HIVE-15388) HiveParser spends lots of time in parsing queries with lots of "("
[ https://issues.apache.org/jira/browse/HIVE-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857445#comment-15857445 ] Pengcheng Xiong commented on HIVE-15388: [~hagleitn], I have added back some of the tests in interval_alt.q. Please take a look.
> HiveParser spends lots of time in parsing queries with lots of "("
> --
>
> Key: HIVE-15388
> URL: https://issues.apache.org/jira/browse/HIVE-15388
> Project: Hive
> Issue Type: Improvement
> Affects Versions: 2.2.0
> Reporter: Rajesh Balamohan
> Assignee: Pengcheng Xiong
> Attachments: HIVE-15388.01.patch, HIVE-15388.02.patch, HIVE-15388.03.patch, HIVE-15388.04.patch, HIVE-15388.05.patch, HIVE-15388.06.patch, hive-15388.stacktrace.txt
>
> Branch: apache-master (applicable with previous releases as well)
> Queries generated via tools can have lots of "(" for "AND/OR" conditions. This causes huge delays in the parsing phase when the number of expressions is high.
> e.g.
> {noformat}
> SELECT `iata`, `airport`, `city`, `state`, `country`, `lat`, `lon`
> FROM airports
> WHERE ((`airports`.`airport` = "Thigpen"
> OR `airports`.`airport` = "Astoria Regional")
> OR `airports`.`airport` = "Warsaw Municipal")
> OR `airports`.`airport` = "John F Kennedy Memorial")
> OR `airports`.`airport` = "Hall-Miller Municipal")
> OR `airports`.`airport` = "Atqasuk")
> OR `airports`.`airport` = "William B Hartsfield-Atlanta Intl")
> OR `airports`.`airport` = "Artesia Municipal")
> OR `airports`.`airport` = "Outagamie County Regional")
> OR `airports`.`airport` = "Watertown Municipal")
> OR `airports`.`airport` = "Augusta State")
> OR `airports`.`airport` = "Aurora Municipal")
> OR `airports`.`airport` = "Alakanuk")
> OR `airports`.`airport` = "Austin Municipal")
> OR `airports`.`airport` = "Auburn Municipal")
> OR `airports`.`airport` = "Auburn-Opelik")
> OR `airports`.`airport` = "Austin-Bergstrom International")
> OR `airports`.`airport` = "Wausau Municipal")
> OR `airports`.`airport` = "Mecklenburg-Brunswick Regional")
> OR `airports`.`airport` = "Alva Regional")
> OR `airports`.`airport` = "Asheville Regional")
> OR `airports`.`airport` = "Avon Park Municipal")
> OR `airports`.`airport` = "Wilkes-Barre/Scranton Intl")
> OR `airports`.`airport` = "Marana Northwest Regional")
> OR `airports`.`airport` = "Catalina")
> OR `airports`.`airport` = "Washington Municipal")
> OR `airports`.`airport` = "Wainwright")
> OR `airports`.`airport` = "West Memphis Municipal")
> OR `airports`.`airport` = "Arlington Municipal")
> OR `airports`.`airport` = "Algona Municipal")
> OR `airports`.`airport` = "Chandler")
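A query like the one above nests every OR in its own parentheses, which is what stresses the parser. A hedged sketch (plain Python, not Hive's ANTLR parser or any actual Hive rewrite) of the workaround available to query generators: flatten the left-nested OR chain over a single column into one IN list, removing the deep nesting entirely.

```python
def flatten_or_chain(expr):
    """Collect the leaves of a left-nested ("or", lhs, rhs) tuple tree in order."""
    leaves = []
    def walk(e):
        if isinstance(e, tuple) and e[0] == "or":
            walk(e[1])
            walk(e[2])
        else:
            leaves.append(e)
    walk(expr)
    return leaves

def rewrite_as_in(column, values):
    """Render the flattened predicate as a single IN list."""
    return "%s IN (%s)" % (column, ", ".join('"%s"' % v for v in values))

# Build a left-nested OR chain like the tool-generated query above.
airports = ["Thigpen", "Astoria Regional", "Warsaw Municipal"]
expr = airports[0]
for name in airports[1:]:
    expr = ("or", expr, name)

values = flatten_or_chain(expr)
rewritten = rewrite_as_in("`airports`.`airport`", values)
print(rewritten)
# `airports`.`airport` IN ("Thigpen", "Astoria Regional", "Warsaw Municipal")
```

The IN list is semantically equivalent for equality chains on one column and parses in linear time regardless of how many values it holds.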
[jira] [Updated] (HIVE-15388) HiveParser spends lots of time in parsing queries with lots of "("
[ https://issues.apache.org/jira/browse/HIVE-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-15388: --- Attachment: HIVE-15388.06.patch
[jira] [Commented] (HIVE-15473) Progress Bar on Beeline client
[ https://issues.apache.org/jira/browse/HIVE-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857441#comment-15857441 ] anishek commented on HIVE-15473: There seem to be 3 additional columns printed now; maybe that is the problem. I will add this to be checked as part of HIVE-15847 > Progress Bar on Beeline client > -- > > Key: HIVE-15473 > URL: https://issues.apache.org/jira/browse/HIVE-15473 > Project: Hive > Issue Type: Improvement > Components: Beeline, HiveServer2 >Affects Versions: 2.1.1 >Reporter: anishek >Assignee: anishek >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-15473.10.patch, HIVE-15473.11.patch, > HIVE-15473.2.patch, HIVE-15473.3.patch, HIVE-15473.4.patch, > HIVE-15473.5.patch, HIVE-15473.6.patch, HIVE-15473.7.patch, > HIVE-15473.8.patch, HIVE-15473.9.patch, io_summary_after_patch.png, > io_summary_before_patch.png, screen_shot_beeline.jpg, status_after_patch.png, > status_before_patch.png, summary_after_patch.png, summary_before_patch.png > > > Hive Cli allows showing progress bar for tez execution engine as shown in > https://issues.apache.org/jira/secure/attachment/12678767/ux-demo.gif > it would be great to have similar progress bar displayed when user is > connecting via beeline command line client as well. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-15473) Progress Bar on Beeline client
[ https://issues.apache.org/jira/browse/HIVE-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857413#comment-15857413 ] anishek edited comment on HIVE-15473 at 2/8/17 5:39 AM: [~prasanth_j] the summary sections are no longer printed via the jline renderer that retained the color scheme; the reason is that the report goes to a log file for Beeline, while for Hive CLI it is shown on stdout, so the color scheme had to be removed for that report. I hope that is OK? I have created HIVE-15847 for the slow refresh rates on Hive CLI and will look into it. There is no inherent change to the way the progress bar is printed for Hive CLI. Thanks for your inputs! was (Author: anishek): [~prasanth_j] the summary sections are no longer printed via the jline rendered to retain the color scheme, the reason being the report goes to log file for beeline and for hive cli its shown on the stdout, hence had to remove the color scheme for same report . I hope that should be ok ? I have created HIVE-15847 for the slow refresh rates on hive cli, will look into it. Thanks for your inputs! 
[jira] [Resolved] (HIVE-2710) row_sequence UDF is not documented
[ https://issues.apache.org/jira/browse/HIVE-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov resolved HIVE-2710. --- Resolution: Won't Fix > row_sequence UDF is not documented > -- > > Key: HIVE-2710 > URL: https://issues.apache.org/jira/browse/HIVE-2710 > Project: Hive > Issue Type: Bug >Reporter: Sho Shimauchi >Priority: Minor > > row_sequence UDF was implemented in HIVE-1304, however the function is not > documented on hive wiki. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-2710) row_sequence UDF is not documented
[ https://issues.apache.org/jira/browse/HIVE-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857440#comment-15857440 ] Alexander Pivovarov commented on HIVE-2710: --- The row_sequence UDF was moved to the contrib package. Usually we do not document contrib-package UDFs in the LanguageManual UDF page.
[jira] [Updated] (HIVE-6046) add UDF for converting date time from one presentation to another
[ https://issues.apache.org/jira/browse/HIVE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-6046: -- Resolution: Duplicate Status: Resolved (was: Patch Available) > add UDF for converting date time from one presentation to another > -- > > Key: HIVE-6046 > URL: https://issues.apache.org/jira/browse/HIVE-6046 > Project: Hive > Issue Type: New Feature > Components: UDF >Affects Versions: 0.13.0 >Reporter: Kostiantyn Kudriavtsev >Assignee: Kostiantyn Kudriavtsev > Attachments: Hive-6046-Feb15.patch, Hive-6046.patch, HIVE-6046.patch > > > it'd be nice to have a function for converting datetime to different formats, > for example: > format_date('2013-12-12 00:00:00.0', 'yyyy-MM-dd HH:mm:ss.S', 'yyyy/MM/dd') > There are two signatures to facilitate further use: > format_date(datetime, fromFormat, toFormat) > format_date(timestamp, toFormat) > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (HIVE-6214) Need a UDF to convert a Date String from any standard format to another. Should be able to provide the Date String, current format and to the format into which it need to
[ https://issues.apache.org/jira/browse/HIVE-6214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov resolved HIVE-6214. --- Resolution: Duplicate > Need a UDF to convert a Date String from any standard format to another. > Should be able to provide the Date String, current format and the format > into which it needs to be converted, returned as the String output of the UDF > > > Key: HIVE-6214 > URL: https://issues.apache.org/jira/browse/HIVE-6214 > Project: Hive > Issue Type: New Feature > Components: UDF > Environment: Software >Reporter: Rony Pius Manakkal >Priority: Minor > Labels: features > > Need a UDF to convert a Date String from any standard format to another. > Should be able to provide the Date String, current format and the format > into which it needs to be converted, returned as the String output of the UDF > Example : String convertDateFormat(String dateString, String > currentDateFormat, String requiredFormat); -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-9988) Evaluating UDF before query is run
[ https://issues.apache.org/jira/browse/HIVE-9988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857435#comment-15857435 ] Alexander Pivovarov commented on HIVE-9988: --- You can assign the expression to a variable before the query is evaluated and then use the variable in WHERE {code} set dt=from_unixtime(unix_timestamp(),'yyyyMMdd'); select * from A where dt=${hiveconf:dt}; {code} > Evaluating UDF before query is run > -- > > Key: HIVE-9988 > URL: https://issues.apache.org/jira/browse/HIVE-9988 > Project: Hive > Issue Type: Improvement >Reporter: Ådne Brunborg > > When using UDFs on a partition column in Hive, all partitions are scanned > before the UDF is resolved. > If the UDF could be evaluated before the query is run, this would greatly improve > performance in cases like this. > Example - the table has a partition by datestamp (bigint): > The following where clause touches all 82 partitions: > {{WHERE datestamp=cast(from_unixtime(unix_timestamp(),'yyyyMMdd') as bigint)}} > {{15/03/16 09:21:53 INFO mapred.FileInputFormat: Total input paths to process > : 82}} > …whereas the following only touches one partition: > {{WHERE datestamp=20150316}} > {{15/03/16 09:23:06 INFO input.FileInputFormat: Total input paths to process > : 1}} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
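A hedged sketch of why the hiveconf workaround above helps (plain Python, not Hive internals; the partition names are illustrative): substituting the variable before planning turns the predicate into a literal, so the planner can compare it against partition names directly and prune to a single input path.

```python
import re
from datetime import datetime

def substitute_hiveconf(sql, conf):
    """Replace ${hiveconf:name} references with their configured values,
    mimicking the variable substitution Hive performs before planning."""
    return re.sub(r"\$\{hiveconf:(\w+)\}", lambda m: conf[m.group(1)], sql)

conf = {"dt": datetime(2015, 3, 16).strftime("%Y%m%d")}
query = "select * from A where dt=${hiveconf:dt}"
substituted = substitute_hiveconf(query, conf)
print(substituted)  # select * from A where dt=20150316

# With a literal in the predicate, pruning keeps one partition instead of all.
partitions = ["20150314", "20150315", "20150316"]
scanned = [p for p in partitions if p == conf["dt"]]
print(len(scanned))  # 1
```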
[jira] [Commented] (HIVE-15473) Progress Bar on Beeline client
[ https://issues.apache.org/jira/browse/HIVE-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857430#comment-15857430 ] Prasanth Jayachandran commented on HIVE-15473: -- Yeah. That should not be a problem. More concerned about task summary exceeding the column width.
[jira] [Commented] (HIVE-15473) Progress Bar on Beeline client
[ https://issues.apache.org/jira/browse/HIVE-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857413#comment-15857413 ] anishek commented on HIVE-15473: [~prasanth_j] the summary sections are no longer printed via the jline renderer that retained the color scheme; the reason is that the report goes to a log file for Beeline, while for Hive CLI it is shown on stdout, so the color scheme had to be removed for that report. I hope that is OK? I have created HIVE-15847 for the slow refresh rates on Hive CLI and will look into it. Thanks for your inputs!
[jira] [Commented] (HIVE-15796) HoS: poor reducer parallelism when operator stats are not accurate
[ https://issues.apache.org/jira/browse/HIVE-15796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857394#comment-15857394 ] Hive QA commented on HIVE-15796: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12851528/HIVE-15796.1.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3429/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3429/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3429/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2017-02-08 04:50:23.401 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-3429/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2017-02-08 04:50:23.404 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at aa62dad HIVE-15840: Webhcat test TestPig_5 failing with Pig on Tez at check for percent complete of job (Daniel Dai, reviewed by Thejas Nair) + git clean -f -d Removing ql/src/test/queries/clientpositive/view_cbo.q Removing ql/src/test/results/clientpositive/view_cbo.q.out + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at aa62dad HIVE-15840: Webhcat test TestPig_5 failing with Pig on Tez at check for percent complete of job (Daniel Dai, reviewed by Thejas Nair) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2017-02-08 04:50:24.414 + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch error: patch failed: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:2886 error: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java: patch does not apply error: patch failed: ql/src/test/results/clientpositive/spark/subquery_in.q.out:6260 error: ql/src/test/results/clientpositive/spark/subquery_in.q.out: patch does not apply The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. 
ATTACHMENT ID: 12851528 - PreCommit-HIVE-Build > HoS: poor reducer parallelism when operator stats are not accurate > -- > > Key: HIVE-15796 > URL: https://issues.apache.org/jira/browse/HIVE-15796 > Project: Hive > Issue Type: Improvement > Components: Statistics >Affects Versions: 2.2.0 >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-15796.1.patch, HIVE-15796.wip.1.patch, > HIVE-15796.wip.2.patch, HIVE-15796.wip.patch > > > In HoS we currently use operator stats to determine reducer parallelism. > However, operator stats are often inaccurate, > especially if column stats are not available. This can produce > extremely poor reducer parallelism and cause the HoS query to run forever. > This JIRA tries to offer an alternative way to compute reducer parallelism, > similar to how MR does it. Here's the approach we are suggesting: > 1. when computing the parallelism for a MapWork, use the stats associated with > the TableScan operator; > 2. when computing the parallelism for a ReduceWork, use the *maximum* > parallelism from all its parents. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
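The two rules in the description can be sketched as follows. This is a hedged illustration, not the actual patch: the function names and the bytes-per-reducer threshold are made up for the example, and real MapWork parallelism also depends on configured min/max reducer limits.

```python
def map_parallelism(table_scan_bytes, bytes_per_reducer):
    # Rule 1: size a MapWork from its TableScan stats.
    # ceil(table_scan_bytes / bytes_per_reducer), at least 1.
    return max(1, -(-table_scan_bytes // bytes_per_reducer))

def reduce_parallelism(parent_parallelisms):
    # Rule 2: a ReduceWork takes the *maximum* parallelism of its parents,
    # so one tiny (or badly estimated) parent cannot collapse parallelism.
    return max(parent_parallelisms)

big_scan = map_parallelism(10_000_000_000, 256_000_000)  # 10 GB scan -> 40
tiny_scan = map_parallelism(1_000_000, 256_000_000)      # 1 MB scan  -> 1
print(reduce_parallelism([big_scan, tiny_scan]))         # 40
```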
[jira] [Commented] (HIVE-15769) Support view creation in CBO
[ https://issues.apache.org/jira/browse/HIVE-15769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857391#comment-15857391 ] Hive QA commented on HIVE-15769: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12851531/HIVE-15769.02.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 240 failed/errored test(s), 10242 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=219) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join_without_localtask] (batchId=1) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_SortUnionTransposeRule] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_input26] (batchId=2) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[constant_prop_3] (batchId=40) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[correlationoptimizer13] (batchId=10) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[correlationoptimizer15] (batchId=25) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[decimal_join2] (batchId=36) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[decimal_udf] (batchId=8) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_topn] (batchId=3) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dynamic_rdd_cache] (batchId=50) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[explain_rearrange] (batchId=11) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby2_limit] (batchId=9) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_grouping_sets_grouping] (batchId=3) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_position] (batchId=36) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input26] (batchId=76) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join_vc] (batchId=4) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[limit_pushdown2] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mergejoin] (batchId=55) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[nested_column_pruning] (batchId=31) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[pcr] (batchId=55) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[pcs] (batchId=45) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[pointlookup2] (batchId=74) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[pointlookup3] (batchId=6) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[pointlookup4] (batchId=68) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_udf_case] (batchId=40) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_vc] (batchId=76) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[reduce_deduplicate_extended2] (batchId=55) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[regex_col] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[skewjoin_noskew] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_case_column_pruning] (batchId=72) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udtf_json_tuple] (batchId=72) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udtf_parse_url_tuple] (batchId=67) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_cast_constant] (batchId=8) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_char_2] (batchId=63) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_char_mapjoin1] (batchId=30) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_date_1] (batchId=20) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_round] (batchId=33) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_mapjoin] (batchId=69) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_reduce] (batchId=51) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_if_expr] (batchId=10) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_interval_1] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_interval_arithmetic] (batchId=4) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_mr_diff_schema_alias] (batchId=59) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join1] (batchId=41) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join2] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join3] (batchId=31) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join4] (batchId=78) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_string_concat] (batchId=30) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_varchar_mapjoin1] (batchId=24)
[jira] [Updated] (HIVE-15683) Measure performance impact on group by by HIVE-15580
[ https://issues.apache.org/jira/browse/HIVE-15683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-15683: --- Status: Patch Available (was: Open) > Measure performance impact on group by by HIVE-15580 > > > Key: HIVE-15683 > URL: https://issues.apache.org/jira/browse/HIVE-15683 > Project: Hive > Issue Type: Improvement > Components: Spark >Affects Versions: 2.2.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-15683.patch > > > HIVE-15580 changed the way the data is shuffled for order by: instead of > using Spark's groupByKey to shuffle data, Hive on Spark now uses > repartitionAndSortWithinPartitions(), which generates (key, value) pairs > instead of the original (key, value iterator). This might have some performance > implications, but it's needed to get rid of unbounded memory usage by > {{groupByKey}}. > Here we'd like to compare group by performance with or w/o HIVE-15580. If the > impact is significant, we can provide a configuration that allows the user to > switch back to the original way of shuffling. > This work should ideally be done after HIVE-15682 as the optimization there > should help the performance here as well. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15683) Measure performance impact on group by by HIVE-15580
[ https://issues.apache.org/jira/browse/HIVE-15683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-15683: --- Attachment: HIVE-15683.patch The patch brings back the old implementation and provides a configuration to switch on the new implementation.
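The difference between the two shuffle shapes being benchmarked can be illustrated in plain Python (this is an analogy, not the Spark APIs): a groupByKey-style shuffle hands downstream code a fully materialized value collection per key, while repartitionAndSortWithinPartitions-style shuffling yields a sorted stream of individual (key, value) pairs, which bounds memory per record rather than per key.

```python
from itertools import groupby
from operator import itemgetter

pairs = [("b", 2), ("a", 1), ("b", 3), ("a", 4)]

# groupByKey-style: every value for a key is held in memory at once.
grouped = {k: [v for _, v in g]
           for k, g in groupby(sorted(pairs), key=itemgetter(0))}
print(grouped)  # {'a': [1, 4], 'b': [2, 3]}

# repartitionAndSortWithinPartitions-style: a sorted stream of (key, value)
# pairs; consumers see one pair at a time instead of a whole group.
stream = sorted(pairs)
print(stream)   # [('a', 1), ('a', 4), ('b', 2), ('b', 3)]
```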
[jira] [Updated] (HIVE-15840) Webhcat test TestPig_5 failing with Pig on Tez at check for percent complete of job
[ https://issues.apache.org/jira/browse/HIVE-15840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-15840: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.2.0 1.3.0 Target Version/s: 1.3.0, 2.2.0 (was: 2.2.0) Status: Resolved (was: Patch Available) UT failures are not related. Patch pushed to both master and branch-1. > Webhcat test TestPig_5 failing with Pig on Tez at check for percent complete > of job > --- > > Key: HIVE-15840 > URL: https://issues.apache.org/jira/browse/HIVE-15840 > Project: Hive > Issue Type: Bug > Components: HCatalog >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 1.3.0, 2.2.0 > > Attachments: HIVE-15840.1.patch > > > TestPig_5 is failing at percentage check if the job is Pig on Tez: > check_job_percent_complete failed. got percentComplete , expected 100% > complete > Test command: > curl -d user.name=daijy -d arg=-p -d arg=INPDIR=/tmp/templeton_test_data -d > arg=-p -d arg=OUTDIR=/tmp/output -d file=loadstore.pig -X POST > http://localhost:50111/templeton/v1/pig > curl > http://localhost:50111/templeton/v1/jobs/job_1486502484681_0003?user.name=daijy > This is similar to HIVE-9351, which fixes Hive on Tez. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15672) LLAP text cache: improve first query perf II
[ https://issues.apache.org/jira/browse/HIVE-15672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857329#comment-15857329 ] Hive QA commented on HIVE-15672: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12851520/HIVE-15672.08.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10241 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=223) org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteDate2 (batchId=173) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3427/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3427/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3427/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12851520 - PreCommit-HIVE-Build > LLAP text cache: improve first query perf II > > > Key: HIVE-15672 > URL: https://issues.apache.org/jira/browse/HIVE-15672 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-15672.01.patch, HIVE-15672.02.patch, > HIVE-15672.03.patch, HIVE-15672.04.patch, HIVE-15672.05.patch, > HIVE-15672.06.patch, HIVE-15672.07.patch, HIVE-15672.08.patch > > > 4) Send VRB to the pipeline and write ORC in parallel (in background). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15803) msck can hang when nested partitions are present
[ https://issues.apache.org/jira/browse/HIVE-15803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HIVE-15803: Attachment: HIVE-15803.2.patch Modified debug logs in the .2 version. > msck can hang when nested partitions are present > > > Key: HIVE-15803 > URL: https://issues.apache.org/jira/browse/HIVE-15803 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-15803.1.patch, HIVE-15803.2.patch, HIVE-15803.patch > > > Steps to reproduce. > {noformat} > CREATE TABLE `repairtable`( `col` string) PARTITIONED BY ( `p1` string, > `p2` string) > hive> dfs -mkdir -p /apps/hive/warehouse/test.db/repairtable/p1=c/p2=a/p3=b; > hive> dfs -touchz > /apps/hive/warehouse/test.db/repairtable/p1=c/p2=a/p3=b/datafile; > hive> set hive.mv.files.thread; > hive.mv.files.thread=15 > hive> set hive.mv.files.thread=1; > hive> MSCK TABLE repairtable; > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15803) msck can hang when nested partitions are present
[ https://issues.apache.org/jira/browse/HIVE-15803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HIVE-15803: Attachment: HIVE-15803.1.patch Modified patch, which checks for thread pool's usage. Similar to the suggestion by [~pattipaka] > msck can hang when nested partitions are present > > > Key: HIVE-15803 > URL: https://issues.apache.org/jira/browse/HIVE-15803 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-15803.1.patch, HIVE-15803.patch > > > Steps to reproduce. > {noformat} > CREATE TABLE `repairtable`( `col` string) PARTITIONED BY ( `p1` string, > `p2` string) > hive> dfs -mkdir -p /apps/hive/warehouse/test.db/repairtable/p1=c/p2=a/p3=b; > hive> dfs -touchz > /apps/hive/warehouse/test.db/repairtable/p1=c/p2=a/p3=b/datafile; > hive> set hive.mv.files.thread; > hive.mv.files.thread=15 > hive> set hive.mv.files.thread=1; > hive> MSCK TABLE repairtable; > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15803) msck can hang when nested partitions are present
[ https://issues.apache.org/jira/browse/HIVE-15803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857305#comment-15857305 ] Rajesh Balamohan commented on HIVE-15803: - Thank you for sharing the patch. The deadlock can still happen when multiple paths are present. For instance, the following would deadlock even with the patch. {noformat} DROP table repairtable; CREATE TABLE repairtable(col STRING) PARTITIONED BY (p1 STRING, p2 STRING); dfs -mkdir -p /apps/hive/warehouse/test.db/repairtable/p1=c/p2=a/p3=b; dfs -mkdir -p /apps/hive/warehouse/test.db/repairtable/p1=c/p2=a/p3=b/; dfs -mkdir -p /apps/hive/warehouse/test.db/repairtable/p1=cc/p2=aa/p3=bb/; dfs -mkdir -p /apps/hive/warehouse/test.db/repairtable/p1=ccc/p2=aaa/p3=bbb/; dfs -mkdir -p /apps/hive/warehouse/test.db/repairtable/p1=/p2=/p3=/; dfs -mkdir -p /apps/hive/warehouse/test.db/repairtable/p1=c/p2=a/p3=b/; dfs -mkdir -p /apps/hive/warehouse/test.db/repairtable/p1=cc/p2=aa/p3=bb/; dfs -mkdir -p /apps/hive/warehouse/test.db/repairtable/p1=ccc/p2=/p3=/; dfs -touchz /apps/hive/warehouse/test.db/repairtable/p1=c/p2=a/p3=b/datafile; dfs -touchz /apps/hive/warehouse/test.db/repairtable/p1=cc/p2=aa/p3=bb/datafile; dfs -touchz /apps/hive/warehouse/test.db/repairtable/p1=ccc/p2=aaa/p3=bbb/datafile; dfs -touchz /apps/hive/warehouse/test.db/repairtable/p1=/p2=/p3=/datafile; dfs -touchz /apps/hive/warehouse/test.db/repairtable/p1=c/p2=a/p3=b/datafile; dfs -touchz /apps/hive/warehouse/test.db/repairtable/p1=cc/p2=aa/p3=bb/datafile; dfs -touchz /apps/hive/warehouse/test.db/repairtable/p1=ccc/p2=/p3=/datafile; set hive.mv.files.thread=1; MSCK TABLE repairtable; {noformat} > msck can hang when nested partitions are present > > > Key: HIVE-15803 > URL: https://issues.apache.org/jira/browse/HIVE-15803 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-15803.patch > > > Steps to reproduce. 
> {noformat} > CREATE TABLE `repairtable`( `col` string) PARTITIONED BY ( `p1` string, > `p2` string) > hive> dfs -mkdir -p /apps/hive/warehouse/test.db/repairtable/p1=c/p2=a/p3=b; > hive> dfs -touchz > /apps/hive/warehouse/test.db/repairtable/p1=c/p2=a/p3=b/datafile; > hive> set hive.mv.files.thread; > hive.mv.files.thread=15 > hive> set hive.mv.files.thread=1; > hive> MSCK TABLE repairtable; > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
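The hang pattern discussed above — a pooled task blocking on subtasks submitted to the same bounded pool — can be sketched as follows. This is an illustrative Java sketch, not Hive's actual MSCK code: directory trees are modeled as nested Maps, and the Semaphore-based fallback mirrors the idea of checking the thread pool's usage before submitting, recursing inline when no spare worker is available.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;

// With hive.mv.files.thread=1, a pooled task that waits on subtasks submitted
// to the same single-thread pool can never finish. The guard below only
// submits asynchronously when a spare worker is available (tracked by a
// Semaphore sized to the pool); otherwise it recurses on the current thread.
public class NestedScan {
  static int countLeaves(Map<String, Object> dir, ExecutorService pool, Semaphore spare) {
    int count = 0;
    List<Future<Integer>> futures = new ArrayList<>();
    for (Object child : dir.values()) {
      if (!(child instanceof Map)) { count++; continue; }
      @SuppressWarnings("unchecked")
      Map<String, Object> sub = (Map<String, Object>) child;
      if (spare.tryAcquire()) {
        // A worker is free: safe to recurse asynchronously.
        futures.add(pool.submit(() -> {
          try { return countLeaves(sub, pool, spare); } finally { spare.release(); }
        }));
      } else {
        // No free worker: blocking on pool.submit(...).get() here would
        // deadlock the pool, so recurse on the current thread instead.
        count += countLeaves(sub, pool, spare);
      }
    }
    try {
      for (Future<Integer> f : futures) count += f.get();
    } catch (Exception e) {
      throw new RuntimeException(e);
    }
    return count;
  }
}
```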
[jira] [Updated] (HIVE-15222) replace org.json usage in ExplainTask/TezTask related classes with some alternative
[ https://issues.apache.org/jira/browse/HIVE-15222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Teddy Choi updated HIVE-15222: -- Attachment: HIVE-15222.3.patch This patch replaced "o.getClass() == Map.class" with "o instanceof Map" so that any Map implementation is accepted as a valid argument. > replace org.json usage in ExplainTask/TezTask related classes with some > alternative > --- > > Key: HIVE-15222 > URL: https://issues.apache.org/jira/browse/HIVE-15222 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Teddy Choi > Fix For: 2.2.0 > > Attachments: HIVE-15222.1.patch, HIVE-15222.2.patch, > HIVE-15222.3.patch > > > Replace org.json usage in these classes. > It seems to me that json is probably only used to write some information - > but the application never reads it back. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
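The reason for that replacement fits in a few lines: Map is an interface, so no runtime object ever has Map.class as its exact class, and the exact-class comparison rejects every concrete map. A minimal sketch (the helper names below are illustrative, not from the patch):

```java
import java.util.Map;

// Map.class is an interface literal; getClass() on any concrete map returns
// HashMap.class, LinkedHashMap.class, etc., so the == comparison is always
// false, while instanceof accepts every Map implementation.
public class MapCheck {
  static boolean exactClassCheck(Object o) { return o.getClass() == Map.class; }
  static boolean instanceofCheck(Object o) { return o instanceof Map; }
}
```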
[jira] [Commented] (HIVE-15791) Remove unused ant files
[ https://issues.apache.org/jira/browse/HIVE-15791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857280#comment-15857280 ] Hive QA commented on HIVE-15791: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12851519/HIVE-15791.2.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3426/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3426/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3426/ Messages: {noformat} This message was trimmed, see log for full details [loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-common/2.7.2/hadoop-common-2.7.2.jar(org/apache/hadoop/util/StringUtils.class)]] [loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-common/2.7.2/hadoop-common-2.7.2.jar(org/apache/hadoop/util/VersionInfo.class)]] [loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/Iterable.class)]] [loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-common/2.7.2/hadoop-common-2.7.2.jar(org/apache/hadoop/io/Writable.class)]] [loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/String.class)]] [loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/aggregate/jetty-all-server/7.6.0.v20120127/jetty-all-server-7.6.0.v20120127.jar(org/eclipse/jetty/http/HttpStatus.class)]] [loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/util/HashMap.class)]] [loading ZipFileIndexFileObject[/data/hiveptest/working/maven/com/sun/jersey/jersey-core/1.14/jersey-core-1.14.jar(javax/ws/rs/core/MediaType.class)]] [loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/com/sun/jersey/jersey-core/1.14/jersey-core-1.14.jar(javax/ws/rs/core/Response.class)]] [loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/codehaus/jackson/jackson-mapper-asl/1.9.13/jackson-mapper-asl-1.9.13.jar(org/codehaus/jackson/map/ObjectMapper.class)]] [loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/Exception.class)]] [loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/Throwable.class)]] [loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/io/Serializable.class)]] [loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/Enum.class)]] [loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/Comparable.class)]] [loading ZipFileIndexFileObject[/data/hiveptest/working/maven/com/sun/jersey/jersey-server/1.14/jersey-server-1.14.jar(com/sun/jersey/api/core/PackagesResourceConfig.class)]] [loading ZipFileIndexFileObject[/data/hiveptest/working/maven/com/sun/jersey/jersey-servlet/1.14/jersey-servlet-1.14.jar(com/sun/jersey/spi/container/servlet/ServletContainer.class)]] [loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/io/FileInputStream.class)]] [loading ZipFileIndexFileObject[/data/hiveptest/working/apache-github-source-source/ql/target/hive-exec-2.2.0-SNAPSHOT.jar(org/apache/commons/lang3/StringUtils.class)]] [loading ZipFileIndexFileObject[/data/hiveptest/working/apache-github-source-source/ql/target/hive-exec-2.2.0-SNAPSHOT.jar(org/apache/commons/lang3/ArrayUtils.class)]] [loading ZipFileIndexFileObject[/data/hiveptest/working/apache-github-source-source/common/target/hive-common-2.2.0-SNAPSHOT.jar(org/apache/hadoop/hive/common/classification/InterfaceStability.class)]] [loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-hdfs/2.7.2/hadoop-hdfs-2.7.2.jar(org/apache/hadoop/hdfs/web/AuthFilter.class)]] [loading ZipFileIndexFileObject[/data/hiveptest/working/apache-github-source-source/shims/common/target/hive-shims-common-2.2.0-SNAPSHOT.jar(org/apache/hadoop/hive/shims/Utils.class)]] [loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-common/2.7.2/hadoop-common-2.7.2.jar(org/apache/hadoop/security/UserGroupInformation.class)]] [loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-auth/2.7.2/hadoop-auth-2.7.2.jar(org/apache/hadoop/security/authentication/client/PseudoAuthenticator.class)]] [loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-auth/2.7.2/hadoop-auth-2.7.2.jar(org/apache/hadoop/security/authentication/server/PseudoAuthenticationHandler.class)]] [loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-common/2.7.2/hadoop-common-2.7.2.jar(org/apache/hadoop/util/GenericOptionsParser.class)]] [loading
[jira] [Commented] (HIVE-15803) msck can hang when nested partitions are present
[ https://issues.apache.org/jira/browse/HIVE-15803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857271#comment-15857271 ] Hive QA commented on HIVE-15803: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12851512/HIVE-15803.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10241 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=223) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3425/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3425/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3425/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12851512 - PreCommit-HIVE-Build > msck can hang when nested partitions are present > > > Key: HIVE-15803 > URL: https://issues.apache.org/jira/browse/HIVE-15803 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-15803.patch > > > Steps to reproduce. 
> {noformat} > CREATE TABLE `repairtable`( `col` string) PARTITIONED BY ( `p1` string, > `p2` string) > hive> dfs -mkdir -p /apps/hive/warehouse/test.db/repairtable/p1=c/p2=a/p3=b; > hive> dfs -touchz > /apps/hive/warehouse/test.db/repairtable/p1=c/p2=a/p3=b/datafile; > hive> set hive.mv.files.thread; > hive.mv.files.thread=15 > hive> set hive.mv.files.thread=1; > hive> MSCK TABLE repairtable; > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15789) Vectorization: limit reduce vectorization to 32Mb chunks
[ https://issues.apache.org/jira/browse/HIVE-15789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Teddy Choi updated HIVE-15789: -- Attachment: HIVE-15789.2.patch > Vectorization: limit reduce vectorization to 32Mb chunks > > > Key: HIVE-15789 > URL: https://issues.apache.org/jira/browse/HIVE-15789 > Project: Hive > Issue Type: Bug > Components: Vectorization >Reporter: Gopal V >Assignee: Teddy Choi > Attachments: HIVE-15789.1.patch, HIVE-15789.2.patch > > > Reduce vectorization accumulates 1024 rows before forwarding it into the > reduce processor. > Add a safety limit for 32Mb of writables, so that shorter sequences can be > forwarded into the operator trees. > {code} > rowIdx++; > if (rowIdx >= BATCH_SIZE) { > VectorizedBatchUtil.setBatchSize(batch, rowIdx); > reducer.process(batch, tag); > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15789) Vectorization: limit reduce vectorization to 32Mb chunks
[ https://issues.apache.org/jira/browse/HIVE-15789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857258#comment-15857258 ] Teddy Choi commented on HIVE-15789: --- This patch applies the HIVE-15745 change and uses the key length as the default value for batchBytes. > Vectorization: limit reduce vectorization to 32Mb chunks > > > Key: HIVE-15789 > URL: https://issues.apache.org/jira/browse/HIVE-15789 > Project: Hive > Issue Type: Bug > Components: Vectorization >Reporter: Gopal V >Assignee: Teddy Choi > Attachments: HIVE-15789.1.patch, HIVE-15789.2.patch > > > Reduce vectorization accumulates 1024 rows before forwarding it into the > reduce processor. > Add a safety limit for 32Mb of writables, so that shorter sequences can be > forwarded into the operator trees. > {code} > rowIdx++; > if (rowIdx >= BATCH_SIZE) { > VectorizedBatchUtil.setBatchSize(batch, rowIdx); > reducer.process(batch, tag); > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
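The safety limit described in this issue can be sketched as a two-condition flush: forward the batch when either the row count reaches BATCH_SIZE or the accumulated writable bytes reach a cap. The class, the byte cap, and the flush counter below are illustrative assumptions, not Hive's actual reduce-side code; only the dual row/byte trigger mirrors the patch's idea.

```java
import java.util.ArrayList;
import java.util.List;

// Flush on whichever limit is hit first, so short sequences of very wide
// rows reach the operator tree without buffering BATCH_SIZE of them.
public class ByteLimitedBatcher {
  static final int BATCH_SIZE = 1024;
  final long maxBatchBytes;          // e.g. 32 * 1024 * 1024 in the real patch
  final List<byte[]> batch = new ArrayList<>();
  long batchBytes = 0;
  int flushes = 0;                   // stands in for reducer.process(batch, tag)

  ByteLimitedBatcher(long maxBatchBytes) { this.maxBatchBytes = maxBatchBytes; }

  void add(byte[] row) {
    batch.add(row);
    batchBytes += row.length;
    if (batch.size() >= BATCH_SIZE || batchBytes >= maxBatchBytes) {
      flush();
    }
  }

  void flush() {
    if (batch.isEmpty()) return;
    flushes++;                       // real code would forward to the reducer here
    batch.clear();
    batchBytes = 0;
  }
}
```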
[jira] [Commented] (HIVE-15803) msck can hang when nested partitions are present
[ https://issues.apache.org/jira/browse/HIVE-15803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857234#comment-15857234 ] Pengcheng Xiong commented on HIVE-15803: LGTM +1. > msck can hang when nested partitions are present > > > Key: HIVE-15803 > URL: https://issues.apache.org/jira/browse/HIVE-15803 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-15803.patch > > > Steps to reproduce. > {noformat} > CREATE TABLE `repairtable`( `col` string) PARTITIONED BY ( `p1` string, > `p2` string) > hive> dfs -mkdir -p /apps/hive/warehouse/test.db/repairtable/p1=c/p2=a/p3=b; > hive> dfs -touchz > /apps/hive/warehouse/test.db/repairtable/p1=c/p2=a/p3=b/datafile; > hive> set hive.mv.files.thread; > hive.mv.files.thread=15 > hive> set hive.mv.files.thread=1; > hive> MSCK TABLE repairtable; > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15769) Support view creation in CBO
[ https://issues.apache.org/jira/browse/HIVE-15769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-15769: --- Status: Open (was: Patch Available) > Support view creation in CBO > > > Key: HIVE-15769 > URL: https://issues.apache.org/jira/browse/HIVE-15769 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-15769.01.patch, HIVE-15769.02.patch > > > Right now, set operator needs to run in CBO. If a view contains a set op, it > will throw exception. We need to support view creation in CBO. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15769) Support view creation in CBO
[ https://issues.apache.org/jira/browse/HIVE-15769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-15769: --- Status: Patch Available (was: Open) > Support view creation in CBO > > > Key: HIVE-15769 > URL: https://issues.apache.org/jira/browse/HIVE-15769 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-15769.01.patch, HIVE-15769.02.patch > > > Right now, set operator needs to run in CBO. If a view contains a set op, it > will throw exception. We need to support view creation in CBO. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15769) Support view creation in CBO
[ https://issues.apache.org/jira/browse/HIVE-15769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-15769: --- Attachment: HIVE-15769.02.patch > Support view creation in CBO > > > Key: HIVE-15769 > URL: https://issues.apache.org/jira/browse/HIVE-15769 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-15769.01.patch, HIVE-15769.02.patch > > > Right now, set operator needs to run in CBO. If a view contains a set op, it > will throw exception. We need to support view creation in CBO. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15796) HoS: poor reducer parallelism when operator stats are not accurate
[ https://issues.apache.org/jira/browse/HIVE-15796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-15796: Attachment: HIVE-15796.1.patch > HoS: poor reducer parallelism when operator stats are not accurate > -- > > Key: HIVE-15796 > URL: https://issues.apache.org/jira/browse/HIVE-15796 > Project: Hive > Issue Type: Improvement > Components: Statistics >Affects Versions: 2.2.0 >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-15796.1.patch, HIVE-15796.wip.1.patch, > HIVE-15796.wip.2.patch, HIVE-15796.wip.patch > > > In HoS we currently use operator stats to determine reducer parallelism. > However, it is often the case that operator stats are not accurate, > especially if column stats are not available. This sometimes will generate > extremely poor reducer parallelism, and cause HoS queries to run forever. > This JIRA tries to offer an alternative way to compute reducer parallelism, > similar to how MR does. Here's the approach we are suggesting: > 1. when computing the parallelism for a MapWork, use stats associated with > the TableScan operator; > 2. when computing the parallelism for a ReduceWork, use the *maximum* > parallelism from all its parents. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
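The two-rule approach suggested in this issue can be sketched in a few lines: a MapWork derives its parallelism from the TableScan input size, and a ReduceWork takes the maximum parallelism of its parents instead of trusting possibly inaccurate operator stats. The method names and the bytes-per-reducer parameter below are illustrative assumptions, not Hive's actual API.

```java
import java.util.List;

// Rule 1: size the MapWork from TableScan bytes (ceiling division, min 1).
// Rule 2: size the ReduceWork as the max over its parents' parallelism.
public class ParallelismSketch {
  static int mapWorkParallelism(long tableScanBytes, long bytesPerReducer) {
    return (int) Math.max(1, (tableScanBytes + bytesPerReducer - 1) / bytesPerReducer);
  }

  static int reduceWorkParallelism(List<Integer> parentParallelism) {
    int max = 1;
    for (int p : parentParallelism) max = Math.max(max, p);
    return max;
  }
}
```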
[jira] [Commented] (HIVE-15843) disable slider YARN resource normalization for LLAP
[ https://issues.apache.org/jira/browse/HIVE-15843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857217#comment-15857217 ] Hive QA commented on HIVE-15843: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12851503/HIVE-15843.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10240 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223) org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testMapNullKey[0] (batchId=173) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3424/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3424/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3424/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12851503 - PreCommit-HIVE-Build > disable slider YARN resource normalization for LLAP > --- > > Key: HIVE-15843 > URL: https://issues.apache.org/jira/browse/HIVE-15843 > Project: Hive > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Sergey Shelukhin > Attachments: HIVE-15843.patch > > > The normalization can lead to LLAP starting with invalid configuration with > regard to cache size, jmx and container size. If the memory configuration is > invalid, it should fail immediately. 
-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15843) disable slider YARN resource normalization for LLAP
[ https://issues.apache.org/jira/browse/HIVE-15843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857214#comment-15857214 ] Siddharth Seth commented on HIVE-15843: --- The patch looks good to me. Slider build is fairly straightforward from their "develop" branch. > disable slider YARN resource normalization for LLAP > --- > > Key: HIVE-15843 > URL: https://issues.apache.org/jira/browse/HIVE-15843 > Project: Hive > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Sergey Shelukhin > Attachments: HIVE-15843.patch > > > The normalization can lead to LLAP starting with invalid configuration with > regard to cache size, jmx and container size. If the memory configuration is > invalid, it should fail immediately. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-15846) Relocate more dependencies (e.g. org.apache.zookeeper) for JDBC uber jar
[ https://issues.apache.org/jira/browse/HIVE-15846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Li reassigned HIVE-15846: - > Relocate more dependencies (e.g. org.apache.zookeeper) for JDBC uber jar > > > Key: HIVE-15846 > URL: https://issues.apache.org/jira/browse/HIVE-15846 > Project: Hive > Issue Type: Bug >Reporter: Tao Li >Assignee: Tao Li > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15718) Fix the NullPointer problem caused by split phase
[ https://issues.apache.org/jira/browse/HIVE-15718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-15718: Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Committed to the master. Thanks [~colin_mjj] for the contribution. > Fix the NullPointer problem caused by split phase > - > > Key: HIVE-15718 > URL: https://issues.apache.org/jira/browse/HIVE-15718 > Project: Hive > Issue Type: Sub-task >Reporter: Colin Ma >Assignee: Colin Ma >Priority: Critical > Fix For: 2.2.0 > > Attachments: HIVE-15718.001.patch, HIVE-15718.002.patch, > HIVE-15718.003.patch > > > VectorizedParquetRecordReader.initialize() will throw NullPointer Exception > because the input split is null. This split should be ignored. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-15682) Eliminate per-row based dummy iterator creation
[ https://issues.apache.org/jira/browse/HIVE-15682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857172#comment-15857172 ] Xuefu Zhang edited comment on HIVE-15682 at 2/8/17 1:23 AM: Hi [~Ferd], when I ran the query, I had two day's data which is about 25m rows. I just ran the query again, with about 10 day's data, the runtime is about 600s with 130m rows. I have 32 executors, each having 4 cores. The query spends most of the time on the second stage where sorting via a single reducer occurs. I don't think the scale matters much as long as the query runs for sometime (in minutes at least). Thus, you should be able to use TPC-DS (or its alternatives) data for this exercise. was (Author: xuefuz): Hi [~Ferd], when I ran the query, I had two day's data which is about 25m rows. I just ran the query again, with about 10 day's data, the runtime is about 600s with 130m rows. I have 32 executors, each having 4 cores. The query spends most of the time on the second stage where sorting via a single reducer occurs. I don't think the scale matters much as long as the query runs for sometime (in minutes at least). Thus, you should be able to use TPC-DS data for this exercise. > Eliminate per-row based dummy iterator creation > --- > > Key: HIVE-15682 > URL: https://issues.apache.org/jira/browse/HIVE-15682 > Project: Hive > Issue Type: Improvement > Components: Spark >Affects Versions: 2.2.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Fix For: 2.2.0 > > Attachments: HIVE-15682.patch > > > HIVE-15580 introduced a dummy iterator per input row which can be eliminated. > This is because {{SparkReduceRecordHandler}} is able to handle single key > value pairs. We can refactor this part of code 1. to remove the need for a > iterator and 2. to optimize the code path for per (key, value) based (instead > of (key, value iterator)) processing. 
It would be also great if we can > measure the performance after the optimizations and compare to performance > prior to HIVE-15580. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15682) Eliminate per-row based dummy iterator creation
[ https://issues.apache.org/jira/browse/HIVE-15682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857172#comment-15857172 ] Xuefu Zhang commented on HIVE-15682: Hi [~Ferd], when I ran the query, I had two day's data which is about 25m rows. I just ran the query again, with about 10 day's data, the runtime is about 600s with 130m rows. I have 32 executors, each having 4 cores. The query spends most of the time on the second stage where sorting via a single reducer occurs. I don't think the scale matters much as long as the query runs for sometime (in minutes at least). Thus, you should be able to use TPC-DS data for this exercise. > Eliminate per-row based dummy iterator creation > --- > > Key: HIVE-15682 > URL: https://issues.apache.org/jira/browse/HIVE-15682 > Project: Hive > Issue Type: Improvement > Components: Spark >Affects Versions: 2.2.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Fix For: 2.2.0 > > Attachments: HIVE-15682.patch > > > HIVE-15580 introduced a dummy iterator per input row which can be eliminated. > This is because {{SparkReduceRecordHandler}} is able to handle single key > value pairs. We can refactor this part of code 1. to remove the need for a > iterator and 2. to optimize the code path for per (key, value) based (instead > of (key, value iterator)) processing. It would be also great if we can > measure the performance after the optimizations and compare to performance > prior to HIVE-15580. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
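The refactor this issue describes — dropping the per-row dummy iterator — can be sketched as follows. The handler below is illustrative, not Hive's SparkReduceRecordHandler: the point is that wrapping every (key, value) pair in a single-element iterator allocates an object per row purely to reuse the (key, value-iterator) entry point, while dispatching the pair directly produces the same result with no wrapper.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Two per-row code paths that must behave identically; only the second
// avoids allocating a dummy iterator for each input row.
public class ReduceDispatch {
  final List<String> processed = new ArrayList<>();

  // Existing entry point: consumes a key plus an iterator of values.
  void processKeyValues(String key, Iterator<String> values) {
    while (values.hasNext()) processed.add(key + "=" + values.next());
  }

  // Before the refactor: one singleton-iterator allocation per row.
  void processRowViaDummyIterator(String key, String value) {
    processKeyValues(key, Collections.singletonList(value).iterator());
  }

  // After the refactor: dispatch the single pair directly.
  void processRowDirect(String key, String value) {
    processed.add(key + "=" + value);
  }
}
```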
[jira] [Comment Edited] (HIVE-14990) run all tests for MM tables and fix the issues that are found
[ https://issues.apache.org/jira/browse/HIVE-14990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668996#comment-15668996 ] Sergey Shelukhin edited comment on HIVE-14990 at 2/8/17 1:12 AM: - Updated test list to fix/declare irrelevant before closing this. Only updated the CliDriver list actually, haven't made my way thru it yet {panel} TestCliDriver: -stats_list_bucket- -show_tablestatus- -vector_udf2- -list_bucket_dml_14- autoColumnStats_9 stats_noscan_2 symlink_text_input_format temp_table_precedence offset_limit_global_optimizer rand_partitionpruner2 materialized_view_authorization_sqlstd,materialized_* merge_dynamic_partition, merge_dynamic_partition* orc_vectorization_ppd parquet_join2 repl_3_exim_metadata sample6 sample_islocalmode_hook smb_mapjoin_2,smb_mapjoin_3,smb_mapjoin_7 orc_createas1 exim_16_part_external,exim_17_part_managed, TestEncryptedHDFSCliDriver: encryption_ctas encryption_drop_partition encryption_insert_values encryption_join_unencrypted_tbl encryption_load_data_to_encrypted_tables MiniLlapLocal: exchgpartition2lel cbo_rp_lineage2 create_merge_compressed deleteAnalyze delete_where_no_match delete_where_non_partitioned dynpart_sort_optimization escape2 insert1 lineage2 lineage3 orc_llap schema_evol_orc_nonvec_part schema_evol_orc_vec_part schema_evol_text_nonvec_part schema_evol_text_vec_part schema_evol_text_vecrow_part smb_mapjoin_6 tez_dml union_fast_stats update_all_types update_tmp_table update_where_no_match update_where_non_partitioned vector_outer_join1 vector_outer_join4 MiniLlap: load_fs2 orc_ppd_basic external_table_with_space_in_location_path file_with_header_footer import_exported_table schemeAuthority,schemeAuthority2 table_nonprintable Minimr: infer_bucket_sort_map_operators infer_bucket_sort_merge infer_bucket_sort_reducers_power_two root_dir_external_table scriptfile1 TestSymlinkTextInputFormat#testCombine TestJdbcWithLocalClusterSpark, etc. 
{panel} was (Author: sershe): Updated test list to fix/declare irrelevant before closing this. Only updated the CliDriver list actually, haven't made my way thru it yet {panel} TestCliDriver: stats_list_bucket show_tablestatus -vector_udf2- list_bucket_dml_14 autoColumnStats_9 stats_noscan_2 symlink_text_input_format temp_table_precedence offset_limit_global_optimizer rand_partitionpruner2 materialized_view_authorization_sqlstd,materialized_* merge_dynamic_partition, merge_dynamic_partition* orc_vectorization_ppd parquet_join2 repl_3_exim_metadata sample6 sample_islocalmode_hook smb_mapjoin_2,smb_mapjoin_3,smb_mapjoin_7 orc_createas1 exim_16_part_external,exim_17_part_managed, TestEncryptedHDFSCliDriver: encryption_ctas encryption_drop_partition encryption_insert_values encryption_join_unencrypted_tbl encryption_load_data_to_encrypted_tables MiniLlapLocal: exchgpartition2lel cbo_rp_lineage2 create_merge_compressed deleteAnalyze delete_where_no_match delete_where_non_partitioned dynpart_sort_optimization escape2 insert1 lineage2 lineage3 orc_llap schema_evol_orc_nonvec_part schema_evol_orc_vec_part schema_evol_text_nonvec_part schema_evol_text_vec_part schema_evol_text_vecrow_part smb_mapjoin_6 tez_dml union_fast_stats update_all_types update_tmp_table update_where_no_match update_where_non_partitioned vector_outer_join1 vector_outer_join4 MiniLlap: load_fs2 orc_ppd_basic external_table_with_space_in_location_path file_with_header_footer import_exported_table schemeAuthority,schemeAuthority2 table_nonprintable Minimr: infer_bucket_sort_map_operators infer_bucket_sort_merge infer_bucket_sort_reducers_power_two root_dir_external_table scriptfile1 TestSymlinkTextInputFormat#testCombine TestJdbcWithLocalClusterSpark, etc. 
{panel} > run all tests for MM tables and fix the issues that are found > - > > Key: HIVE-14990 > URL: https://issues.apache.org/jira/browse/HIVE-14990 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14990.01.patch, HIVE-14990.02.patch, > HIVE-14990.03.patch, HIVE-14990.04.patch, HIVE-14990.04.patch, > HIVE-14990.05.patch, HIVE-14990.05.patch, HIVE-14990.06.patch, > HIVE-14990.06.patch, HIVE-14990.07.patch, HIVE-14990.08.patch, > HIVE-14990.09.patch, HIVE-14990.10.patch, HIVE-14990.10.patch, > HIVE-14990.10.patch, HIVE-14990.12.patch, HIVE-14990.patch > > > Expected failures > 1) All HCat tests (cannot write MM tables via the HCat writer) > 2) Almost all merge tests (alter .. concat is not supported). > 3) Tests that run dfs commands with specific paths (path changes). > 4) Truncate column (not supported). > 5) Describe formatted will have the new table fields in the output (before > merging MM with ACID). > 6) Many tests w/explain extended - diff in partition "base file name" (path
[jira] [Updated] (HIVE-15844) Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator
[ https://issues.apache.org/jira/browse/HIVE-15844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-15844: -- Description: # both FileSinkDesc and ReduceSinkDesc have a special code path for Update/Delete operations. The write type is not always set correctly for ReduceSink; ReduceSinkDeDuplication is one place where it gets lost. Even when it isn't set correctly, elsewhere we set ROW_ID to be the partition column of the ReduceSinkOperator and UDFToInteger special-cases it to extract the bucketId from ROW_ID. We need to modify Explain Plan to record the Write Type (i.e. insert/update/delete) to make sure we have tests that can catch errors here. # Add some validation at the end of the plan to make sure that RSO/FSO which represent the end of the pipeline and write to an acid table have WriteType set (to something other than the default). # We don't seem to have any tests where the number of buckets is > the number of reducers. Add those. was:both FileSinkDesc and ReduceSinkDesc have a special code path for Update/Delete operations. The write type is not always set correctly for ReduceSink; ReduceSinkDeDuplication is one place where it gets lost. Even when it isn't set correctly, elsewhere we set ROW_ID to be the partition column of the ReduceSinkOperator and UDFToInteger special-cases it to extract the bucketId from ROW_ID. We need to modify Explain Plan to record the Write Type (i.e. insert/update/delete) to make sure we have tests that can catch errors here. > Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator > > > Key: HIVE-15844 > URL: https://issues.apache.org/jira/browse/HIVE-15844 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Fix For: 1.0.0 > > > # both FileSinkDesc and ReduceSinkDesc have a special code path for > Update/Delete operations. The write type is not always set correctly for > ReduceSink; ReduceSinkDeDuplication is one place where it gets lost. 
Even when it isn't > set correctly, elsewhere we set ROW_ID to be the partition column of the > ReduceSinkOperator and UDFToInteger special-cases it to extract the bucketId from > ROW_ID. We need to modify Explain Plan to record the Write Type (i.e. > insert/update/delete) to make sure we have tests that can catch errors here. > # Add some validation at the end of the plan to make sure that RSO/FSO which > represent the end of the pipeline and write to an acid table have WriteType set > (to something other than the default). > # We don't seem to have any tests where the number of buckets is > the number of > reducers. Add those. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
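Item 2 above proposes a plan-level sanity check. A minimal sketch of that idea follows, with entirely hypothetical class and field names (Hive's actual operator and WriteType types differ):

```python
# Hypothetical sketch of the proposed validation: walk the operator DAG and
# check that every terminal sink over an ACID table carries a non-default
# WriteType. Names here are illustrative, not Hive's internals.
from enum import Enum

class WriteType(Enum):
    NOT_ACID = 0   # the default, which should never reach an ACID sink
    INSERT = 1
    UPDATE = 2
    DELETE = 3

class Operator:
    def __init__(self, name, children=None, write_type=None, acid_table=False):
        self.name = name
        self.children = children or []
        self.write_type = write_type
        self.acid_table = acid_table

def validate_write_types(root):
    """Return names of terminal sinks over ACID tables whose WriteType was
    left at the default -- i.e. a write type lost somewhere in the plan
    (e.g. dropped by an optimization such as ReduceSinkDeDuplication)."""
    bad = []
    stack = [root]
    while stack:
        op = stack.pop()
        if not op.children:  # end of the pipeline
            if op.acid_table and op.write_type in (None, WriteType.NOT_ACID):
                bad.append(op.name)
        stack.extend(op.children)
    return bad

# A tiny plan TS -> RSO -> FSO where the FSO lost its WriteType:
fso = Operator("FSO", write_type=WriteType.NOT_ACID, acid_table=True)
rso = Operator("RSO", children=[fso])
ts = Operator("TS", children=[rso])
print(validate_write_types(ts))  # ['FSO']
```

Running such a check at plan-finalization time would turn a silently wrong write type into an immediate failure.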
[jira] [Updated] (HIVE-15672) LLAP text cache: improve first query perf II
[ https://issues.apache.org/jira/browse/HIVE-15672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-15672: Attachment: HIVE-15672.08.patch Small update based on RB > LLAP text cache: improve first query perf II > > > Key: HIVE-15672 > URL: https://issues.apache.org/jira/browse/HIVE-15672 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-15672.01.patch, HIVE-15672.02.patch, > HIVE-15672.03.patch, HIVE-15672.04.patch, HIVE-15672.05.patch, > HIVE-15672.06.patch, HIVE-15672.07.patch, HIVE-15672.08.patch > > > 4) Send VRB to the pipeline and write ORC in parallel (in background). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15791) Remove unused ant files
[ https://issues.apache.org/jira/browse/HIVE-15791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-15791: -- Attachment: HIVE-15791.2.patch > Remove unused ant files > --- > > Key: HIVE-15791 > URL: https://issues.apache.org/jira/browse/HIVE-15791 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > Attachments: HIVE-15791.1.patch, HIVE-15791.2.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15791) Remove unused ant files
[ https://issues.apache.org/jira/browse/HIVE-15791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-15791: -- Status: Patch Available (was: Open) > Remove unused ant files > --- > > Key: HIVE-15791 > URL: https://issues.apache.org/jira/browse/HIVE-15791 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > Attachments: HIVE-15791.1.patch, HIVE-15791.2.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15802) Changes to expected entries for dynamic bloomfilter runtime filtering
[ https://issues.apache.org/jira/browse/HIVE-15802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857153#comment-15857153 ] Jason Dere commented on HIVE-15802: --- Looks like the golden file for TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] needs to be updated. > Changes to expected entries for dynamic bloomfilter runtime filtering > - > > Key: HIVE-15802 > URL: https://issues.apache.org/jira/browse/HIVE-15802 > Project: Hive > Issue Type: Improvement >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-15802.1.patch, HIVE-15802.2.patch > > > - Estimate bloom filter size based on distinct values from column stats if > available > - Cap the bloom filter expected entries size to > hive.tez.max.bloom.filter.entries if the estimated size from stats exceeds > that amount. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
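The sizing rule described in the issue can be sketched as follows; the function and parameter names are illustrative, not Hive's actual API:

```python
# Sketch of the proposed bloom filter sizing: start from the distinct value
# count (NDV) in column stats when available, otherwise fall back to the row
# count, and cap the result at hive.tez.max.bloom.filter.entries.
def expected_bloom_entries(ndv, row_count, max_entries):
    base = ndv if ndv is not None and ndv > 0 else row_count
    return min(base, max_entries)

# NDV available and below the cap: use it directly.
print(expected_bloom_entries(ndv=50_000, row_count=10_000_000,
                             max_entries=1_000_000))   # 50000
# Estimate exceeds the cap: clamp to hive.tez.max.bloom.filter.entries.
print(expected_bloom_entries(ndv=25_000_000, row_count=30_000_000,
                             max_entries=1_000_000))   # 1000000
```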
[jira] [Assigned] (HIVE-15844) Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator
[ https://issues.apache.org/jira/browse/HIVE-15844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman reassigned HIVE-15844: - > Add WriteType to Explain Plan of ReduceSinkOperator and FileSinkOperator > > > Key: HIVE-15844 > URL: https://issues.apache.org/jira/browse/HIVE-15844 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Fix For: 1.0.0 > > > both FileSinkDesc and ReduceSinkDesc have a special code path for Update/Delete > operations. The write type is not always set correctly for ReduceSink; > ReduceSinkDeDuplication is one place where it gets lost. Even when it isn't > set correctly, elsewhere we set ROW_ID to be the partition column of the > ReduceSinkOperator and UDFToInteger special-cases it to extract the bucketId from > ROW_ID. We need to modify Explain Plan to record the Write Type (i.e. > insert/update/delete) to make sure we have tests that can catch errors here. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15840) Webhcat test TestPig_5 failing with Pig on Tez at check for percent complete of job
[ https://issues.apache.org/jira/browse/HIVE-15840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857143#comment-15857143 ] Hive QA commented on HIVE-15840: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12851494/HIVE-15840.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10236 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223) org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver (batchId=230) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3423/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3423/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3423/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12851494 - PreCommit-HIVE-Build > Webhcat test TestPig_5 failing with Pig on Tez at check for percent complete > of job > --- > > Key: HIVE-15840 > URL: https://issues.apache.org/jira/browse/HIVE-15840 > Project: Hive > Issue Type: Bug > Components: HCatalog >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-15840.1.patch > > > TestPig_5 is failing at percentage check if the job is Pig on Tez: > check_job_percent_complete failed. 
got percentComplete , expected 100% > complete > Test command: > curl -d user.name=daijy -d arg=-p -d arg=INPDIR=/tmp/templeton_test_data -d > arg=-p -d arg=OUTDIR=/tmp/output -d file=loadstore.pig -X POST > http://localhost:50111/templeton/v1/pig > curl > http://localhost:50111/templeton/v1/jobs/job_1486502484681_0003?user.name=daijy > This is similar to HIVE-9351, which fixes Hive on Tez. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15473) Progress Bar on Beeline client
[ https://issues.apache.org/jira/browse/HIVE-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857132#comment-15857132 ] Thejas M Nair commented on HIVE-15473: -- [~anishek] Can you please create a follow up jira to address these concerns ? > Progress Bar on Beeline client > -- > > Key: HIVE-15473 > URL: https://issues.apache.org/jira/browse/HIVE-15473 > Project: Hive > Issue Type: Improvement > Components: Beeline, HiveServer2 >Affects Versions: 2.1.1 >Reporter: anishek >Assignee: anishek >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-15473.10.patch, HIVE-15473.11.patch, > HIVE-15473.2.patch, HIVE-15473.3.patch, HIVE-15473.4.patch, > HIVE-15473.5.patch, HIVE-15473.6.patch, HIVE-15473.7.patch, > HIVE-15473.8.patch, HIVE-15473.9.patch, io_summary_after_patch.png, > io_summary_before_patch.png, screen_shot_beeline.jpg, status_after_patch.png, > status_before_patch.png, summary_after_patch.png, summary_before_patch.png > > > Hive Cli allows showing progress bar for tez execution engine as shown in > https://issues.apache.org/jira/secure/attachment/12678767/ux-demo.gif > it would be great to have similar progress bar displayed when user is > connecting via beeline command line client as well. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15683) Measure performance impact on group by by HIVE-15580
[ https://issues.apache.org/jira/browse/HIVE-15683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857123#comment-15857123 ] Xuefu Zhang commented on HIVE-15683: The following measurements were taken with static allocation on our prod cluster, for a query designed to measure group by performance. They offer a comparison between performance w/ and w/o HIVE-15580. {code} Query: select count(*) from (select driver_uuid, avg(base_fare_usd) from dwh.fact_trip where datestr > '2017-01-01' group by driver_uuid) x; Origin: 55.1, 42.1, 39.6, 39.1, 39.1, 33.06, 61.6 AVG: 44.24 Patch: 59.1, 65.2, 58.3, 35.1, 45.1, 39.4, 47.3 AVG: 49.93 => 1.13X slower {code} The performance degradation is noticeable but not significant. Given this, we plan to add a configuration that lets the user switch between the two implementations. However, our cluster is notorious for large performance variations, so it would be great if others could also run some tests to confirm. FYI, [~Ferd]/[~dapengsun]. > Measure performance impact on group by by HIVE-15580 > > > Key: HIVE-15683 > URL: https://issues.apache.org/jira/browse/HIVE-15683 > Project: Hive > Issue Type: Improvement > Components: Spark >Affects Versions: 2.2.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > > HIVE-15580 changed the way the data is shuffled for order by: instead of > using Spark's groupByKey to shuffle data, Hive on Spark now uses > repartitionAndSortWithinPartitions(), which generates (key, value) pairs > instead of the original (key, value iterator). This might have some performance > implications, but it's needed to get rid of unbound memory usage by > {{groupByKey}}. > Here we'd like to compare group by performance with or w/o HIVE-15580. If the > impact is significant, we can provide a configuration that allows the user to > switch back to the original way of shuffling. > This work should ideally be done after HIVE-15682 as the optimization there > should help the performance here as well. 
-- This message was sent by Atlassian JIRA (v6.3.15#6346)
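The averages and the ~1.13X slowdown quoted in the comment above can be reproduced directly from the listed runtimes:

```python
# Runtimes in seconds, as quoted in the comment on HIVE-15683.
origin = [55.1, 42.1, 39.6, 39.1, 39.1, 33.06, 61.6]
patch = [59.1, 65.2, 58.3, 35.1, 45.1, 39.4, 47.3]

avg_origin = sum(origin) / len(origin)
avg_patch = sum(patch) / len(patch)

print(round(avg_origin, 2))              # 44.24
print(round(avg_patch, 2))               # 49.93
print(round(avg_patch / avg_origin, 2))  # 1.13
```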
[jira] [Comment Edited] (HIVE-15691) Create StrictRegexWriter to work with RegexSerializer for Flume Hive Sink
[ https://issues.apache.org/jira/browse/HIVE-15691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857108#comment-15857108 ] Eugene Koifman edited comment on HIVE-15691 at 2/8/17 12:36 AM: What is main use case this is designed for that DelimitedWriter can't handle? Using different delimiters in the same row? Maybe the unit tests should be more elaborate to illustrate that. Why are there so many StrictRegexWriter() c'tors? It seems like a StrictRegexWriter w/o a regex or EndPoint or Connection, etc is not useful - it will just NPE in various places. [~roshan_naik] do you have any comments on this patch? was (Author: ekoifman): What is main use case this is designed for that DelimitedWriter can't handle? Using different delimiters in the same row? Maybe the unit tests should be more elaborate to illustrate that. Why are there so many StrictRegexWriter() c'tors? It seems like a StrictRegexWriter w/o a regex or EndPoint or Connection, etc is not useful - it will just NPE in various places. [~roshan_naik] do you have any comments on this? > Create StrictRegexWriter to work with RegexSerializer for Flume Hive Sink > - > > Key: HIVE-15691 > URL: https://issues.apache.org/jira/browse/HIVE-15691 > Project: Hive > Issue Type: New Feature > Components: HCatalog, Transactions >Reporter: Kalyan >Assignee: Kalyan > Attachments: HIVE-15691.1.patch, HIVE-15691.patch, > HIVE-15691-updated.patch > > > Create StrictRegexWriter to work with RegexSerializer for Flume Hive Sink. > It is similar to StrictJsonWriter available in hive. > Dependency is there in flume to commit. > FLUME-3036 : Create a RegexSerializer for Hive Sink. > Patch is available for Flume, Please verify the below link > https://github.com/kalyanhadooptraining/flume/commit/1c651e81395404321f9964c8d9d2af6f4a2aaef9 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15682) Eliminate per-row based dummy iterator creation
[ https://issues.apache.org/jira/browse/HIVE-15682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857110#comment-15857110 ] Ferdinand Xu commented on HIVE-15682: - Hi [~xuefuz] {noformat} select count(*) from (select request_lat from dwh.fact_trip where datestr > '2017-01-27' order by request_lat) x; Origin: 246.56, 342.78, 216.40, 216.587, 270.805, 449.232, 233.406 AVG: 282.25 Patch: 125.21, 123.22, 166.31, 168.30, 120.428, 119.21, 120.385 AVG: 134.72 {noformat} What data scale did you use to evaluate the performance? We can evaluate this patch using TPC-DS and TPCx-BB. > Eliminate per-row based dummy iterator creation > --- > > Key: HIVE-15682 > URL: https://issues.apache.org/jira/browse/HIVE-15682 > Project: Hive > Issue Type: Improvement > Components: Spark >Affects Versions: 2.2.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Fix For: 2.2.0 > > Attachments: HIVE-15682.patch > > > HIVE-15580 introduced a dummy iterator per input row which can be eliminated. > This is because {{SparkReduceRecordHandler}} is able to handle single key > value pairs. We can refactor this part of the code 1. to remove the need for an > iterator and 2. to optimize the code path for per (key, value) based (instead > of (key, value iterator)) processing. It would also be great if we can > measure the performance after the optimizations and compare it to the performance > prior to HIVE-15580. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-15691) Create StrictRegexWriter to work with RegexSerializer for Flume Hive Sink
[ https://issues.apache.org/jira/browse/HIVE-15691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857108#comment-15857108 ] Eugene Koifman edited comment on HIVE-15691 at 2/8/17 12:36 AM: What is main use case this is designed for that DelimitedWriter can't handle? Using different delimiters in the same row? Maybe the unit tests should be more elaborate to illustrate that. Why are there so many StrictRegexWriter() c'tors? It seems like a StrictRegexWriter w/o a regex or EndPoint or Connection, etc is not useful - it will just NPE in various places. [~roshan_naik] do you have any comments on this? was (Author: ekoifman): What is main use case this is designed for that DelimitedWriter can't handle? Using different delimiters in the same row? Maybe the unit tests should be more elaborate to illustrate that. Why are there so many StrictRegexWriter() c'tors? It seems like a StrictRegexWriter w/o a regex or EndPoint or Connection, etc is not useful - it will just NPE in various places. > Create StrictRegexWriter to work with RegexSerializer for Flume Hive Sink > - > > Key: HIVE-15691 > URL: https://issues.apache.org/jira/browse/HIVE-15691 > Project: Hive > Issue Type: New Feature > Components: HCatalog, Transactions >Reporter: Kalyan >Assignee: Kalyan > Attachments: HIVE-15691.1.patch, HIVE-15691.patch, > HIVE-15691-updated.patch > > > Create StrictRegexWriter to work with RegexSerializer for Flume Hive Sink. > It is similar to StrictJsonWriter available in hive. > Dependency is there in flume to commit. > FLUME-3036 : Create a RegexSerializer for Hive Sink. > Patch is available for Flume, Please verify the below link > https://github.com/kalyanhadooptraining/flume/commit/1c651e81395404321f9964c8d9d2af6f4a2aaef9 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15691) Create StrictRegexWriter to work with RegexSerializer for Flume Hive Sink
[ https://issues.apache.org/jira/browse/HIVE-15691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857108#comment-15857108 ] Eugene Koifman commented on HIVE-15691: --- What is the main use case this is designed for that DelimitedWriter can't handle? Using different delimiters in the same row? Maybe the unit tests should be more elaborate to illustrate that. Why are there so many StrictRegexWriter() c'tors? It seems like a StrictRegexWriter w/o a regex or EndPoint or Connection, etc. is not useful - it will just NPE in various places. > Create StrictRegexWriter to work with RegexSerializer for Flume Hive Sink > - > > Key: HIVE-15691 > URL: https://issues.apache.org/jira/browse/HIVE-15691 > Project: Hive > Issue Type: New Feature > Components: HCatalog, Transactions >Reporter: Kalyan >Assignee: Kalyan > Attachments: HIVE-15691.1.patch, HIVE-15691.patch, > HIVE-15691-updated.patch > > > Create StrictRegexWriter to work with RegexSerializer for Flume Hive Sink. > It is similar to the StrictJsonWriter available in Hive. > There is a dependent change in Flume to commit: > FLUME-3036 : Create a RegexSerializer for Hive Sink. > A patch is available for Flume; please verify the link below: > https://github.com/kalyanhadooptraining/flume/commit/1c651e81395404321f9964c8d9d2af6f4a2aaef9 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15682) Eliminate per-row based dummy iterator creation
[ https://issues.apache.org/jira/browse/HIVE-15682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857098#comment-15857098 ] Xuefu Zhang commented on HIVE-15682: Hi [~Ferd]/[~dapengsun], it would be great if you guys could also run the test and confirm the conclusion drawn here. Thanks. > Eliminate per-row based dummy iterator creation > --- > > Key: HIVE-15682 > URL: https://issues.apache.org/jira/browse/HIVE-15682 > Project: Hive > Issue Type: Improvement > Components: Spark >Affects Versions: 2.2.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Fix For: 2.2.0 > > Attachments: HIVE-15682.patch > > > HIVE-15580 introduced a dummy iterator per input row which can be eliminated. > This is because {{SparkReduceRecordHandler}} is able to handle single key > value pairs. We can refactor this part of the code 1. to remove the need for an > iterator and 2. to optimize the code path for per (key, value) based (instead > of (key, value iterator)) processing. It would also be great if we can > measure the performance after the optimizations and compare it to the performance > prior to HIVE-15580. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15222) replace org.json usage in ExplainTask/TezTask related classes with some alternative
[ https://issues.apache.org/jira/browse/HIVE-15222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857087#comment-15857087 ] Gunther Hagleitner commented on HIVE-15222: --- cc [~pxiong]. I think you have made some changes to the json explain before? > replace org.json usage in ExplainTask/TezTask related classes with some > alternative > --- > > Key: HIVE-15222 > URL: https://issues.apache.org/jira/browse/HIVE-15222 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Teddy Choi > Fix For: 2.2.0 > > Attachments: HIVE-15222.1.patch, HIVE-15222.2.patch > > > Replace org.json usage in these classes. > It seems to me that json is probably only used to write some information - > but the application never reads it back. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15803) msck can hang when nested partitions are present
[ https://issues.apache.org/jira/browse/HIVE-15803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-15803: Status: Patch Available (was: Open) > msck can hang when nested partitions are present > > > Key: HIVE-15803 > URL: https://issues.apache.org/jira/browse/HIVE-15803 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-15803.patch > > > Steps to reproduce. > {noformat} > CREATE TABLE `repairtable`( `col` string) PARTITIONED BY ( `p1` string, > `p2` string) > hive> dfs -mkdir -p /apps/hive/warehouse/test.db/repairtable/p1=c/p2=a/p3=b; > hive> dfs -touchz > /apps/hive/warehouse/test.db/repairtable/p1=c/p2=a/p3=b/datafile; > hive> set hive.mv.files.thread; > hive.mv.files.thread=15 > hive> set hive.mv.files.thread=1; > hive> MSCK TABLE repairtable; > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15803) msck can hang when nested partitions are present
[ https://issues.apache.org/jira/browse/HIVE-15803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-15803: Attachment: HIVE-15803.patch > msck can hang when nested partitions are present > > > Key: HIVE-15803 > URL: https://issues.apache.org/jira/browse/HIVE-15803 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-15803.patch > > > Steps to reproduce. > {noformat} > CREATE TABLE `repairtable`( `col` string) PARTITIONED BY ( `p1` string, > `p2` string) > hive> dfs -mkdir -p /apps/hive/warehouse/test.db/repairtable/p1=c/p2=a/p3=b; > hive> dfs -touchz > /apps/hive/warehouse/test.db/repairtable/p1=c/p2=a/p3=b/datafile; > hive> set hive.mv.files.thread; > hive.mv.files.thread=15 > hive> set hive.mv.files.thread=1; > hive> MSCK TABLE repairtable; > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15843) disable slider YARN resource normalization for LLAP
[ https://issues.apache.org/jira/browse/HIVE-15843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857080#comment-15857080 ] Sergey Shelukhin commented on HIVE-15843: - Hmm, I don't have access to a new enough version of Slider, but I do see the setting applied in the package; and it doesn't fail due to an unknown setting on Slider 0.91, which is something else I wanted to test. > disable slider YARN resource normalization for LLAP > --- > > Key: HIVE-15843 > URL: https://issues.apache.org/jira/browse/HIVE-15843 > Project: Hive > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Sergey Shelukhin > Attachments: HIVE-15843.patch > > > The normalization can lead to LLAP starting with invalid configuration with > regard to cache size, jmx and container size. If the memory configuration is > invalid, it should fail immediately. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
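For context, YARN normalizes each container request up to a multiple of the scheduler's minimum allocation. The values below are illustrative, but they show how a normalized grant can diverge from the container size that LLAP's cache and heap settings were computed against:

```python
import math

# YARN-style resource normalization: a request is rounded up to a multiple
# of yarn.scheduler.minimum-allocation-mb. Numbers here are illustrative.
def normalize_mb(requested_mb, min_alloc_mb):
    return int(math.ceil(requested_mb / min_alloc_mb) * min_alloc_mb)

requested_mb = 6000   # the size LLAP's Xmx and cache were derived from
granted_mb = normalize_mb(requested_mb, min_alloc_mb=4096)
print(granted_mb)     # 8192 -- not the 6000 the daemon was configured for
```

With normalization in play, the daemon's memory layout (heap, off-heap cache, headroom) no longer matches the container it actually runs in, which is why the issue asks for normalization to be disabled or for an immediate failure on an invalid memory configuration.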
[jira] [Commented] (HIVE-15473) Progress Bar on Beeline client
[ https://issues.apache.org/jira/browse/HIVE-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857076#comment-15857076 ] Prasanth Jayachandran commented on HIVE-15473: -- The refresh rate is slow. Following video will show it before patch: https://asciinema.org/a/2fgcncxg5gjavcpxt6lfb8jg9 after patch: https://asciinema.org/a/2tht5jf6l9b2dc3ylt5gtztqg > Progress Bar on Beeline client > -- > > Key: HIVE-15473 > URL: https://issues.apache.org/jira/browse/HIVE-15473 > Project: Hive > Issue Type: Improvement > Components: Beeline, HiveServer2 >Affects Versions: 2.1.1 >Reporter: anishek >Assignee: anishek >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-15473.10.patch, HIVE-15473.11.patch, > HIVE-15473.2.patch, HIVE-15473.3.patch, HIVE-15473.4.patch, > HIVE-15473.5.patch, HIVE-15473.6.patch, HIVE-15473.7.patch, > HIVE-15473.8.patch, HIVE-15473.9.patch, io_summary_after_patch.png, > io_summary_before_patch.png, screen_shot_beeline.jpg, status_after_patch.png, > status_before_patch.png, summary_after_patch.png, summary_before_patch.png > > > Hive Cli allows showing progress bar for tez execution engine as shown in > https://issues.apache.org/jira/secure/attachment/12678767/ux-demo.gif > it would be great to have similar progress bar displayed when user is > connecting via beeline command line client as well. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-15473) Progress Bar on Beeline client
[ https://issues.apache.org/jira/browse/HIVE-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran reassigned HIVE-15473: Assignee: anishek (was: Prasanth Jayachandran) > Progress Bar on Beeline client > -- > > Key: HIVE-15473 > URL: https://issues.apache.org/jira/browse/HIVE-15473 > Project: Hive > Issue Type: Improvement > Components: Beeline, HiveServer2 >Affects Versions: 2.1.1 >Reporter: anishek >Assignee: anishek >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-15473.10.patch, HIVE-15473.11.patch, > HIVE-15473.2.patch, HIVE-15473.3.patch, HIVE-15473.4.patch, > HIVE-15473.5.patch, HIVE-15473.6.patch, HIVE-15473.7.patch, > HIVE-15473.8.patch, HIVE-15473.9.patch, io_summary_after_patch.png, > io_summary_before_patch.png, screen_shot_beeline.jpg, status_after_patch.png, > status_before_patch.png, summary_after_patch.png, summary_before_patch.png > > > Hive Cli allows showing progress bar for tez execution engine as shown in > https://issues.apache.org/jira/secure/attachment/12678767/ux-demo.gif > it would be great to have similar progress bar displayed when user is > connecting via beeline command line client as well. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15802) Changes to expected entries for dynamic bloomfilter runtime filtering
[ https://issues.apache.org/jira/browse/HIVE-15802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857074#comment-15857074 ] Hive QA commented on HIVE-15802: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12851479/HIVE-15802.2.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10206 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235) TestMiniSparkOnYarnCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=162) [scriptfile1.q,vector_outer_join5.q,file_with_header_footer.q,bucket4.q,input16_cc.q,bucket5.q,infer_bucket_sort_merge.q,constprog_partitioner.q,orc_merge2.q,reduce_deduplicate.q,schemeAuthority2.q,load_fs2.q,orc_merge8.q,orc_merge_incompat2.q,infer_bucket_sort_bucketed_table.q,vector_outer_join4.q,disable_merge_for_bucketing.q,vector_inner_join.q,orc_merge7.q] TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=106) [bucketsortoptimize_insert_4.q,multi_insert_mixed.q,vectorization_10.q,auto_join18_multi_distinct.q,join_cond_pushdown_3.q,custom_input_output_format.q,skewjoinopt5.q,vectorization_part_project.q,vector_count_distinct.q,skewjoinopt4.q,count.q,parallel.q,union33.q,union_lateralview.q,nullgroup4.q] org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] (batchId=150) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3422/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3422/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3422/ Messages: {noformat} 
Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12851479 - PreCommit-HIVE-Build > Changes to expected entries for dynamic bloomfilter runtime filtering > - > > Key: HIVE-15802 > URL: https://issues.apache.org/jira/browse/HIVE-15802 > Project: Hive > Issue Type: Improvement >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-15802.1.patch, HIVE-15802.2.patch > > > - Estimate bloom filter size based on distinct values from column stats if > available > - Cap the bloom filter expected entries size to > hive.tez.max.bloom.filter.entries if the estimated size from stats exceeds > that amount. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15473) Progress Bar on Beeline client
[ https://issues.apache.org/jira/browse/HIVE-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-15473: - Attachment: summary_before_patch.png summary_after_patch.png status_before_patch.png status_after_patch.png io_summary_before_patch.png io_summary_after_patch.png Attaching files with differences before and after this patch. > Progress Bar on Beeline client > -- > > Key: HIVE-15473 > URL: https://issues.apache.org/jira/browse/HIVE-15473 > Project: Hive > Issue Type: Improvement > Components: Beeline, HiveServer2 >Affects Versions: 2.1.1 >Reporter: anishek >Assignee: Prasanth Jayachandran >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-15473.10.patch, HIVE-15473.11.patch, > HIVE-15473.2.patch, HIVE-15473.3.patch, HIVE-15473.4.patch, > HIVE-15473.5.patch, HIVE-15473.6.patch, HIVE-15473.7.patch, > HIVE-15473.8.patch, HIVE-15473.9.patch, io_summary_after_patch.png, > io_summary_before_patch.png, screen_shot_beeline.jpg, status_after_patch.png, > status_before_patch.png, summary_after_patch.png, summary_before_patch.png > > > Hive Cli allows showing progress bar for tez execution engine as shown in > https://issues.apache.org/jira/secure/attachment/12678767/ux-demo.gif > it would be great to have similar progress bar displayed when user is > connecting via beeline command line client as well. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-15473) Progress Bar on Beeline client
[ https://issues.apache.org/jira/browse/HIVE-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran reassigned HIVE-15473: Assignee: Prasanth Jayachandran (was: anishek) > Progress Bar on Beeline client > -- > > Key: HIVE-15473 > URL: https://issues.apache.org/jira/browse/HIVE-15473 > Project: Hive > Issue Type: Improvement > Components: Beeline, HiveServer2 >Affects Versions: 2.1.1 >Reporter: anishek >Assignee: Prasanth Jayachandran >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-15473.10.patch, HIVE-15473.11.patch, > HIVE-15473.2.patch, HIVE-15473.3.patch, HIVE-15473.4.patch, > HIVE-15473.5.patch, HIVE-15473.6.patch, HIVE-15473.7.patch, > HIVE-15473.8.patch, HIVE-15473.9.patch, io_summary_after_patch.png, > io_summary_before_patch.png, screen_shot_beeline.jpg, status_after_patch.png, > status_before_patch.png, summary_after_patch.png, summary_before_patch.png > > > Hive Cli allows showing progress bar for tez execution engine as shown in > https://issues.apache.org/jira/secure/attachment/12678767/ux-demo.gif > it would be great to have similar progress bar displayed when user is > connecting via beeline command line client as well. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15843) disable slider YARN resource normalization for LLAP
[ https://issues.apache.org/jira/browse/HIVE-15843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-15843: Description: The normalization can lead to LLAP starting with invalid configuration with regard to cache size, jmx and container size. If the memory configuration is invalid, it should fail immediately. (was: This can lead to LLAP starting with an invalid config with regard to cache size, jmx and container size. If the memory configuration is invalid, it should fail immediately.) > disable slider YARN resource normalization for LLAP > --- > > Key: HIVE-15843 > URL: https://issues.apache.org/jira/browse/HIVE-15843 > Project: Hive > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Sergey Shelukhin > Attachments: HIVE-15843.patch > > > The normalization can lead to LLAP starting with invalid configuration with > regard to cache size, jmx and container size. If the memory configuration is > invalid, it should fail immediately. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15843) disable slider YARN resource normalization for LLAP
[ https://issues.apache.org/jira/browse/HIVE-15843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-15843: Status: Patch Available (was: Open) > disable slider YARN resource normalization for LLAP > --- > > Key: HIVE-15843 > URL: https://issues.apache.org/jira/browse/HIVE-15843 > Project: Hive > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Sergey Shelukhin > Attachments: HIVE-15843.patch > > > This can lead to LLAP starting with an invalid config with regard to cache > size, jmx and container size. If the memory configuration is invalid, it > should fail immediately. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15843) disable slider YARN resource normalization for LLAP
[ https://issues.apache.org/jira/browse/HIVE-15843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-15843: Attachment: HIVE-15843.patch The patch. Need to test in the cluster. cc [~sseth] > disable slider YARN resource normalization for LLAP > --- > > Key: HIVE-15843 > URL: https://issues.apache.org/jira/browse/HIVE-15843 > Project: Hive > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Sergey Shelukhin > Attachments: HIVE-15843.patch > > > This can lead to LLAP starting with an invalid config with regard to cache > size, jmx and container size. If the memory configuration is invalid, it > should fail immediately. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-15843) disable slider YARN resource normalization for LLAP
[ https://issues.apache.org/jira/browse/HIVE-15843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-15843: --- > disable slider YARN resource normalization for LLAP > --- > > Key: HIVE-15843 > URL: https://issues.apache.org/jira/browse/HIVE-15843 > Project: Hive > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Sergey Shelukhin > > This can lead to LLAP starting with an invalid config with regard to cache > size, jmx and container size. If the memory configuration is invalid, it > should fail immediately. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (HIVE-15842) disable slider YARN resource normalization for LLAP
[ https://issues.apache.org/jira/browse/HIVE-15842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HIVE-15842. - Resolution: Invalid > disable slider YARN resource normalization for LLAP > --- > > Key: HIVE-15842 > URL: https://issues.apache.org/jira/browse/HIVE-15842 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > > This can lead to LLAP starting with an invalid config with regard to cache > size, jmx and container size. If the memory configuration is invalid, it > should fail immediately. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-15842) disable slider YARN resource normalization for LLAP
[ https://issues.apache.org/jira/browse/HIVE-15842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-15842: --- > disable slider YARN resource normalization for LLAP > --- > > Key: HIVE-15842 > URL: https://issues.apache.org/jira/browse/HIVE-15842 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > > This can lead to LLAP starting with an invalid config with regard to cache > size, jmx and container size. If the memory configuration is invalid, it > should fail immediately. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-14007) Replace ORC module with ORC release
[ https://issues.apache.org/jira/browse/HIVE-14007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857046#comment-15857046 ] ASF GitHub Bot commented on HIVE-14007: --- Github user omalley closed the pull request at: https://github.com/apache/hive/pull/81 > Replace ORC module with ORC release > --- > > Key: HIVE-14007 > URL: https://issues.apache.org/jira/browse/HIVE-14007 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 2.2.0 >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-14007.patch, HIVE-14007.patch, HIVE-14007.patch, > HIVE-14007.patch, HIVE-14007.patch, HIVE-14007.patch, HIVE-14007.patch, > HIVE-14007.patch, HIVE-14007.patch, HIVE-14007.patch > > > This completes moving the core ORC reader & writer to the ORC project. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15688) LlapServiceDriver - an option to start the cluster immediately
[ https://issues.apache.org/jira/browse/HIVE-15688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857044#comment-15857044 ] Siddharth Seth commented on HIVE-15688: --- +1 for the latest patch. > LlapServiceDriver - an option to start the cluster immediately > -- > > Key: HIVE-15688 > URL: https://issues.apache.org/jira/browse/HIVE-15688 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-15688.01.patch, HIVE-15688.02.patch, > HIVE-15688.03.patch, HIVE-15688.04.patch, HIVE-15688.patch > > > run.sh is very slow because it makes 4 calls to slider, which means 4 JVMs, 4 > connections to the RM, and other overhead, at 2-5 sec. per call, > depending on the machine/cluster. > What we need is a mode for LlapServiceDriver that would not generate run.sh, > but would instead start the cluster immediately by calling the corresponding 4 > slider APIs. This should probably be the default, too. For compatibility with scripts we > might generate a blank run.sh for now. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15840) Webhcat test TestPig_5 failing with Pig on Tez at check for percent complete of job
[ https://issues.apache.org/jira/browse/HIVE-15840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857030#comment-15857030 ] Thejas M Nair commented on HIVE-15840: -- +1 > Webhcat test TestPig_5 failing with Pig on Tez at check for percent complete > of job > --- > > Key: HIVE-15840 > URL: https://issues.apache.org/jira/browse/HIVE-15840 > Project: Hive > Issue Type: Bug > Components: HCatalog >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-15840.1.patch > > > TestPig_5 is failing at percentage check if the job is Pig on Tez: > check_job_percent_complete failed. got percentComplete , expected 100% > complete > Test command: > curl -d user.name=daijy -d arg=-p -d arg=INPDIR=/tmp/templeton_test_data -d > arg=-p -d arg=OUTDIR=/tmp/output -d file=loadstore.pig -X POST > http://localhost:50111/templeton/v1/pig > curl > http://localhost:50111/templeton/v1/jobs/job_1486502484681_0003?user.name=daijy > This is similar to HIVE-9351, which fixes Hive on Tez. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15841) Upgrade Hive to ORC 1.3.2
[ https://issues.apache.org/jira/browse/HIVE-15841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857023#comment-15857023 ] ASF GitHub Bot commented on HIVE-15841: --- GitHub user omalley opened a pull request: https://github.com/apache/hive/pull/142 HIVE-15841. Upgrade to ORC 1.3.2. You can merge this pull request into a Git repository by running: $ git pull https://github.com/omalley/hive hive-15841 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/142.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #142 commit 3078b0f32fec97607b97fa31e29527da50631099 Author: Owen O'Malley Date: 2017-02-07T23:27:31Z HIVE-15841. Upgrade to ORC 1.3.2. > Upgrade Hive to ORC 1.3.2 > - > > Key: HIVE-15841 > URL: https://issues.apache.org/jira/browse/HIVE-15841 > Project: Hive > Issue Type: Bug >Reporter: Owen O'Malley > > Hive needs ORC-141 and ORC-135, so we should upgrade to ORC 1.3.2 once it > releases. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-10562) Add version column to NOTIFICATION_LOG table and DbNotificationListener
[ https://issues.apache.org/jira/browse/HIVE-10562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856998#comment-15856998 ] Daniel Dai commented on HIVE-10562: --- andFilter is a new feature, and we should add a test. Also, I notice this patch piggybacks a varchar -> clob change; we should mention this in the Jira title. Otherwise looks good. > Add version column to NOTIFICATION_LOG table and DbNotificationListener > --- > > Key: HIVE-10562 > URL: https://issues.apache.org/jira/browse/HIVE-10562 > Project: Hive > Issue Type: Sub-task > Components: Import/Export >Affects Versions: 1.2.0 >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan > Attachments: HIVE-10562.2.patch, HIVE-10562.3.patch, > HIVE-10562.4.patch, HIVE-10562.patch > > > Currently, we have a JSON encoded message being stored in the > NOTIFICATION_LOG table. > If we want to be future proof, we need to allow for versioning of this > message, since we might change what gets stored in the message. A prime > example of what we'd want to change is as in HIVE-10393. > MessageFactory already has stubs to allow for versioning of messages, and we > could expand on this further in the future. NotificationListener currently > encodes the message version into the header for the JMS message it sends, > which seems to be the right place for a message version (instead of being > contained in the message, for eg.). > So, we should have a similar ability for DbEventListener as well, and the > place this makes the most sense is to add a version column to the > NOTIFICATION_LOG table. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15754) exchange partition is not generating notifications
[ https://issues.apache.org/jira/browse/HIVE-15754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856991#comment-15856991 ] Sergio Peña commented on HIVE-15754: Test failures are not related, and flaky tests are already reported on HIVE-15058. +1 > exchange partition is not generating notifications > -- > > Key: HIVE-15754 > URL: https://issues.apache.org/jira/browse/HIVE-15754 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 2.2.0 >Reporter: Nachiket Vaidya >Assignee: Nachiket Vaidya >Priority: Critical > Fix For: 2.2.0 > > Attachments: HIVE-15754.0.patch, HIVE-15754.1.patch > > > The exchange partition event is not generating notifications in notification_log. > There should be multiple events generated: one add_partition event and several > drop_partition events. > For example: > {noformat} > ALTER TABLE tab1 EXCHANGE PARTITION (part=1) WITH TABLE tab2; > {noformat} > There should be the following events: > ADD_PARTITION on tab2 on partition (part=1) > DROP_PARTITION on tab1 on partition (part=1) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15841) Upgrade Hive to ORC 1.3.2
[ https://issues.apache.org/jira/browse/HIVE-15841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-15841: - Description: Hive needs ORC-141 and ORC-135, so we should upgrade to ORC 1.3.2 once it releases. (was: Hive needs ORC-141 and ORC-135, so we should upgrade to ORC-1.3.2 once it releases.) > Upgrade Hive to ORC 1.3.2 > - > > Key: HIVE-15841 > URL: https://issues.apache.org/jira/browse/HIVE-15841 > Project: Hive > Issue Type: Bug >Reporter: Owen O'Malley > > Hive needs ORC-141 and ORC-135, so we should upgrade to ORC 1.3.2 once it > releases. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15840) Webhcat test TestPig_5 failing with Pig on Tez at check for percent complete of job
[ https://issues.apache.org/jira/browse/HIVE-15840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-15840: -- Status: Patch Available (was: Open) > Webhcat test TestPig_5 failing with Pig on Tez at check for percent complete > of job > --- > > Key: HIVE-15840 > URL: https://issues.apache.org/jira/browse/HIVE-15840 > Project: Hive > Issue Type: Bug > Components: HCatalog >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-15840.1.patch > > > TestPig_5 is failing at percentage check if the job is Pig on Tez: > check_job_percent_complete failed. got percentComplete , expected 100% > complete > Test command: > curl -d user.name=daijy -d arg=-p -d arg=INPDIR=/tmp/templeton_test_data -d > arg=-p -d arg=OUTDIR=/tmp/output -d file=loadstore.pig -X POST > http://localhost:50111/templeton/v1/pig > curl > http://localhost:50111/templeton/v1/jobs/job_1486502484681_0003?user.name=daijy > This is similar to HIVE-9351, which fixes Hive on Tez. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15840) Webhcat test TestPig_5 failing with Pig on Tez at check for percent complete of job
[ https://issues.apache.org/jira/browse/HIVE-15840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-15840: -- Attachment: HIVE-15840.1.patch > Webhcat test TestPig_5 failing with Pig on Tez at check for percent complete > of job > --- > > Key: HIVE-15840 > URL: https://issues.apache.org/jira/browse/HIVE-15840 > Project: Hive > Issue Type: Bug > Components: HCatalog >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-15840.1.patch > > > TestPig_5 is failing at percentage check if the job is Pig on Tez: > check_job_percent_complete failed. got percentComplete , expected 100% > complete > Test command: > curl -d user.name=daijy -d arg=-p -d arg=INPDIR=/tmp/templeton_test_data -d > arg=-p -d arg=OUTDIR=/tmp/output -d file=loadstore.pig -X POST > http://localhost:50111/templeton/v1/pig > curl > http://localhost:50111/templeton/v1/jobs/job_1486502484681_0003?user.name=daijy > This is similar to HIVE-9351, which fixes Hive on Tez. -- This message was sent by Atlassian JIRA (v6.3.15#6346)