[jira] [Updated] (HIVE-16115) Stop printing progress info from operation logs with beeline progress bar
[ https://issues.apache.org/jira/browse/HIVE-16115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

anishek updated HIVE-16115:
---
    Attachment: HIVE-16115.4.patch

Fixing the test case.

> Stop printing progress info from operation logs with beeline progress bar
> -------------------------------------------------------------------------
>
>                 Key: HIVE-16115
>                 URL: https://issues.apache.org/jira/browse/HIVE-16115
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2
>    Affects Versions: 2.2.0
>            Reporter: anishek
>            Assignee: anishek
>            Priority: Minor
>             Fix For: 2.2.0
>
>         Attachments: HIVE-16115.1.patch, HIVE-16115.2.patch, HIVE-16115.3.patch, HIVE-16115.4.patch
>
> When the progress bar is enabled, we should not print progress information via the operation logs.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (HIVE-15912) Executor kill task and Failed to get spark memory/core info
[ https://issues.apache.org/jira/browse/HIVE-15912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904644#comment-15904644 ]

Yi Yao commented on HIVE-15912:
---
Hi [~lirui],
Thanks for your quick response. Below is the Hive log.

{noformat}
2017-03-09 17:23:29,761 WARN [main]: impl.RemoteSparkJobStatus (RemoteSparkJobStatus.java:getSparkStageInfo(162)) - Error getting stage info
java.util.concurrent.TimeoutException
    at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:49)
    at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getSparkStageInfo(RemoteSparkJobStatus.java:160)
    at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getSparkStageProgress(RemoteSparkJobStatus.java:96)
    at org.apache.hadoop.hive.ql.exec.spark.status.RemoteSparkJobMonitor.startMonitor(RemoteSparkJobMonitor.java:82)
    at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobRef.monitorJob(RemoteSparkJobRef.java:60)
    at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:109)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1976)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1689)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1421)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1205)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1195)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:220)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:172)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:383)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:318)
    at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:416)
    at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:432)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:726)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:693)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:628)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
2017-03-09 17:23:29,764 ERROR [main]: status.SparkJobMonitor (RemoteSparkJobMonitor.java:startMonitor(132)) - Failed to monitor Job[ 2] with exception 'java.lang.IllegalStateException(RPC channel is closed.)'
java.lang.IllegalStateException: RPC channel is closed.
    at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
    at org.apache.hive.spark.client.rpc.Rpc.call(Rpc.java:276)
    at org.apache.hive.spark.client.SparkClientImpl$ClientProtocol.run(SparkClientImpl.java:550)
    at org.apache.hive.spark.client.SparkClientImpl.run(SparkClientImpl.java:145)
    at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getSparkStageInfo(RemoteSparkJobStatus.java:158)
    at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getSparkStageProgress(RemoteSparkJobStatus.java:96)
    at org.apache.hadoop.hive.ql.exec.spark.status.RemoteSparkJobMonitor.startMonitor(RemoteSparkJobMonitor.java:82)
    at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobRef.monitorJob(RemoteSparkJobRef.java:60)
    at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:109)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1976)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1689)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1421)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1205)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1195)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:220)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:172)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:383)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:318)
    at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:416)
    at
{noformat}
[jira] [Updated] (HIVE-15849) hplsql should add enterGlobalScope func to UDF
[ https://issues.apache.org/jira/browse/HIVE-15849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fei Hui updated HIVE-15849:
---
    Status: Open  (was: Patch Available)

> hplsql should add enterGlobalScope func to UDF
> ----------------------------------------------
>
>                 Key: HIVE-15849
>                 URL: https://issues.apache.org/jira/browse/HIVE-15849
>             Project: Hive
>          Issue Type: Bug
>          Components: hpl/sql
>    Affects Versions: 2.2.0
>            Reporter: Fei Hui
>            Assignee: Fei Hui
>         Attachments: HIVE-15849.1.patch, HIVE-15849.patch
>
> Code in Udf.java:
> {code:title=Udf.java|borderStyle=solid}
> if (exec == null) {
>   exec = new Exec();
>   String query = queryOI.getPrimitiveJavaObject(arguments[0].get());
>   String[] args = { "-e", query, "-trace" };
>   try {
>     exec.setUdfRun(true);
>     exec.init(args);
>   } catch (Exception e) {
>     throw new HiveException(e.getMessage());
>   }
> }
> if (arguments.length > 1) {
>   setParameters(arguments);
> }
> Var result = exec.run();
> if (result != null) {
>   return result.toString();
> }
> {code}
> Here are my thoughts:
> {quote}
> We should add 'exec.enterGlobalScope();' between 'exec = new Exec();' and 'setParameters(arguments);'.
> If we do not call exec.enterGlobalScope(), setParameters(arguments) is useless: the vars are never added to a scope, yet exec.run() uses the vars we set. These vars are the parameters passed to the UDF (:1, :2, ... :n), as described in Udf.java.
> {quote}
> Before adding this call, we get the wrong result, because the output contains an empty string:
> {quote}
> Starting pre-SQL statement
> Starting pre-SQL statement
> Starting pre-SQL statement
> Starting pre-SQL statement
> Starting pre-SQL statement
> Starting query
> Query executed successfully (2.30 sec)
> Ln:8 SELECT completed successfully
> Ln:8 Standalone SELECT executed: 1 columns in the result set
> Hello, !
> Hello, !
> {quote}
> After adding this call, we get the right result:
> {quote}
> Starting pre-SQL statement
> Starting pre-SQL statement
> Starting pre-SQL statement
> Starting pre-SQL statement
> Starting pre-SQL statement
> Starting query
> Query executed successfully (2.35 sec)
> Ln:8 SELECT completed successfully
> Ln:8 Standalone SELECT executed: 1 columns in the result set
> Hello, fei!
> Hello, fei!
> {quote}
> Tests come from http://www.hplsql.org/udf
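The scoping issue the reporter describes can be illustrated with a small self-contained sketch. This is a toy analog, not hplsql's real Exec class; all names below are illustrative. It shows why a variable set before any scope is entered is silently dropped, so the UDF parameters never reach the running script.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Toy model of hplsql-style variable scopes (illustrative only).
class ScopeDemo {
    private final Deque<Map<String, String>> scopes = new ArrayDeque<>();

    void enterGlobalScope() {
        scopes.push(new HashMap<>());
    }

    void setVariable(String name, String value) {
        Map<String, String> top = scopes.peek();
        // If no scope has been entered yet, the assignment is silently lost,
        // which is the behavior HIVE-15849 reports for setParameters().
        if (top != null) {
            top.put(name, value);
        }
    }

    String getVariable(String name) {
        for (Map<String, String> scope : scopes) {
            String v = scope.get(name);
            if (v != null) {
                return v;
            }
        }
        return ""; // unresolved vars degrade to empty string, as in "Hello, !"
    }

    public static void main(String[] args) {
        ScopeDemo broken = new ScopeDemo();
        broken.setVariable(":1", "fei");      // no scope yet -> dropped
        broken.enterGlobalScope();
        System.out.println("Hello, " + broken.getVariable(":1") + "!");

        ScopeDemo fixed = new ScopeDemo();
        fixed.enterGlobalScope();             // the proposed fix: open scope first
        fixed.setVariable(":1", "fei");
        System.out.println("Hello, " + fixed.getVariable(":1") + "!");
    }
}
```

Running the sketch prints the empty greeting first and the correct "Hello, fei!" second, mirroring the before/after outputs quoted above.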
[jira] [Updated] (HIVE-15849) hplsql should add enterGlobalScope func to UDF
[ https://issues.apache.org/jira/browse/HIVE-15849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fei Hui updated HIVE-15849:
---
    Status: Patch Available  (was: Open)

> hplsql should add enterGlobalScope func to UDF
> ----------------------------------------------
>
>                 Key: HIVE-15849
>                 URL: https://issues.apache.org/jira/browse/HIVE-15849
[jira] [Updated] (HIVE-16171) Support truncate table for replication
[ https://issues.apache.org/jira/browse/HIVE-16171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sankar Hariappan updated HIVE-16171:
---
    Description:
Need to support truncate table for replication. Key points to note:
1. For a non-partitioned table, truncate table will remove all the rows from the table.
2. For partitioned tables, we need to consider how truncate behaves when a partition is dropped or renamed.

  was:
Need to support truncate table for replication. Key points to note:
1. For a non-partitioned table, truncate table doesn't make any sense, as only drop table shall delete the data.
2. For partitioned tables, we need to consider how truncate behaves when a partition is dropped or renamed.

> Support truncate table for replication
> --------------------------------------
>
>                 Key: HIVE-16171
>                 URL: https://issues.apache.org/jira/browse/HIVE-16171
>             Project: Hive
>          Issue Type: Sub-task
>          Components: repl
>            Reporter: Sankar Hariappan
>            Assignee: Sankar Hariappan
>
> Need to support truncate table for replication. Key points to note:
> 1. For a non-partitioned table, truncate table will remove all the rows from the table.
> 2. For partitioned tables, we need to consider how truncate behaves when a partition is dropped or renamed.
[jira] [Commented] (HIVE-16172) Switch to a fairness lock to synchronize HS2 thrift client
[ https://issues.apache.org/jira/browse/HIVE-16172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904627#comment-15904627 ]

Tao Li commented on HIVE-16172:
---
cc [~gopalv] in case there is any perf concern. Forcing fairness does reduce throughput compared with a non-fair lock, according to:
https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/locks/ReentrantLock.html

> Switch to a fairness lock to synchronize HS2 thrift client
> ----------------------------------------------------------
>
>                 Key: HIVE-16172
>                 URL: https://issues.apache.org/jira/browse/HIVE-16172
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Tao Li
>            Assignee: Tao Li
>         Attachments: HIVE-16172.1.patch
>
> A synchronized block is used in "org.apache.hive.jdbc.HiveConnection.SynchronizedHandler.invoke(Object, Method, Object[])" to synchronize the client invocations. The problem is that it does not guarantee any fairness. One issue we were seeing is that a cancellation request could not be issued to HS2 until all the getOperationStatus() calls issued from a while loop had finished, so the cancellation could not take effect.
[jira] [Commented] (HIVE-16172) Switch to a fairness lock to synchronize HS2 thrift client
[ https://issues.apache.org/jira/browse/HIVE-16172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904623#comment-15904623 ]

Tao Li commented on HIVE-16172:
---
I have verified the change by cancelling a SQL query while it was executing. Without the change, the query could not be cancelled (see details in the JIRA description). With the change, the cancellation worked. Note that we need to set fairness to true when creating the lock; otherwise the effect is similar to the original synchronized block. [~daijy] Can you please review this change?

> Switch to a fairness lock to synchronize HS2 thrift client
> ----------------------------------------------------------
>
>                 Key: HIVE-16172
>                 URL: https://issues.apache.org/jira/browse/HIVE-16172
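As a standalone illustration of the technique this patch uses (this is a sketch, not the actual HiveConnection.SynchronizedHandler code; the class and method names below are made up), a synchronized block can be replaced by a ReentrantLock constructed with fairness enabled, so waiting threads acquire the lock in roughly FIFO order and a cancel request is not starved by a tight status-polling loop:

```java
import java.util.concurrent.locks.ReentrantLock;

// Illustrative stand-in for a thrift-client wrapper guarded by a fair lock.
class FairClientHandler {
    // fairness = true: threads queued on the lock are granted it in
    // approximately arrival order, unlike a plain synchronized block,
    // which makes no ordering guarantee at all.
    private final ReentrantLock lock = new ReentrantLock(true);

    String call(String rpcName) {
        lock.lock();
        try {
            // stand-in for the real client invocation
            return "ok:" + rpcName;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        FairClientHandler handler = new FairClientHandler();
        System.out.println(handler.call("GetOperationStatus"));
        System.out.println(handler.call("CancelOperation"));
    }
}
```

The trade-off noted in the other comment applies: the fair variant has lower overall throughput than the default non-fair lock, which is why fairness is worth enabling only when starvation (like the un-cancellable query here) is the actual problem.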
[jira] [Updated] (HIVE-16172) Switch to a fairness lock to synchronize HS2 thrift client
[ https://issues.apache.org/jira/browse/HIVE-16172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Li updated HIVE-16172:
---
    Description:
A synchronized block is used in "org.apache.hive.jdbc.HiveConnection.SynchronizedHandler.invoke(Object, Method, Object[])" to synchronize the client invocations. The problem is that it does not guarantee any fairness. One issue we were seeing is that a cancellation request could not be issued to HS2 until all the getOperationStatus() calls issued from a while loop had finished, so the cancellation could not take effect.

> Switch to a fairness lock to synchronize HS2 thrift client
> ----------------------------------------------------------
>
>                 Key: HIVE-16172
>                 URL: https://issues.apache.org/jira/browse/HIVE-16172
[jira] [Updated] (HIVE-16172) Switch to a fairness lock to synchronize HS2 thrift client
[ https://issues.apache.org/jira/browse/HIVE-16172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Li updated HIVE-16172:
---
    Attachment: HIVE-16172.1.patch

> Switch to a fairness lock to synchronize HS2 thrift client
> ----------------------------------------------------------
>
>                 Key: HIVE-16172
>                 URL: https://issues.apache.org/jira/browse/HIVE-16172
[jira] [Updated] (HIVE-16172) Switch to a fairness lock to synchronize HS2 thrift client
[ https://issues.apache.org/jira/browse/HIVE-16172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Li updated HIVE-16172:
---
    Status: Patch Available  (was: Open)

> Switch to a fairness lock to synchronize HS2 thrift client
> ----------------------------------------------------------
>
>                 Key: HIVE-16172
>                 URL: https://issues.apache.org/jira/browse/HIVE-16172
[jira] [Assigned] (HIVE-16172) Switch to a fairness lock to synchronize HS2 thrift client
[ https://issues.apache.org/jira/browse/HIVE-16172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Li reassigned HIVE-16172:
---

> Switch to a fairness lock to synchronize HS2 thrift client
> ----------------------------------------------------------
>
>                 Key: HIVE-16172
>                 URL: https://issues.apache.org/jira/browse/HIVE-16172
[jira] [Commented] (HIVE-16161) Disable "packaging.minimizeJar" for JDBC build
[ https://issues.apache.org/jira/browse/HIVE-16161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904607#comment-15904607 ]

Tao Li commented on HIVE-16161:
---
[~vgumashta] Thanks! Can you please commit it?

> Disable "packaging.minimizeJar" for JDBC build
> ----------------------------------------------
>
>                 Key: HIVE-16161
>                 URL: https://issues.apache.org/jira/browse/HIVE-16161
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Tao Li
>            Assignee: Tao Li
>            Priority: Critical
>         Attachments: HIVE-16161.1.patch
>
> "packaging.minimizeJar" is set to true in jdbc/pom.xml, which causes the standalone JDBC jar to be missing some necessary classes, such as "org.apache.hive.org.apache.commons.logging.impl.LogFactoryImpl". We need to set it to false so that those classes are shaded into the JDBC jar as expected.
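As a sketch of what this fix amounts to (assuming the standard maven-shade-plugin configuration; the exact jdbc/pom.xml layout may differ from this hypothetical excerpt):

```xml
<!-- Hypothetical excerpt of jdbc/pom.xml. minimizeJar=true tells the shade
     plugin to drop classes it believes are unused; classes loaded only by
     reflection (e.g. LogFactoryImpl via commons-logging discovery) look
     unused to static analysis and vanish from the standalone jar. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <minimizeJar>false</minimizeJar>
  </configuration>
</plugin>
```

With minimization off, the shaded jar keeps every class of every bundled dependency, at the cost of a larger artifact.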
[jira] [Updated] (HIVE-15849) hplsql should add enterGlobalScope func to UDF
[ https://issues.apache.org/jira/browse/HIVE-15849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fei Hui updated HIVE-15849:
---
    Status: Patch Available  (was: Open)

> hplsql should add enterGlobalScope func to UDF
> ----------------------------------------------
>
>                 Key: HIVE-15849
>                 URL: https://issues.apache.org/jira/browse/HIVE-15849
[jira] [Updated] (HIVE-15849) hplsql should add enterGlobalScope func to UDF
[ https://issues.apache.org/jira/browse/HIVE-15849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fei Hui updated HIVE-15849:
---
    Status: Open  (was: Patch Available)

> hplsql should add enterGlobalScope func to UDF
> ----------------------------------------------
>
>                 Key: HIVE-15849
>                 URL: https://issues.apache.org/jira/browse/HIVE-15849
[jira] [Commented] (HIVE-15912) Executor kill task and Failed to get spark memory/core info
[ https://issues.apache.org/jira/browse/HIVE-15912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904593#comment-15904593 ]

Rui Li commented on HIVE-15912:
---
Hi [~yiyao], for yarn-client mode, you should check the hive log, which contains logs from both hive and RemoteDriver. Or you can upload the log here.

> Executor kill task and Failed to get spark memory/core info
> -----------------------------------------------------------
>
>                 Key: HIVE-15912
>                 URL: https://issues.apache.org/jira/browse/HIVE-15912
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, Spark
>    Affects Versions: 2.2.0
>         Environment: hadoop2.7.1
> spark2.0.2
> Hive2.2
>            Reporter: KaiXu
>
> Hive on Spark, failed with error:
> Starting Spark Job = 12a8cb8c-ed0d-4049-ae06-8d32d13fe285
> Failed to monitor Job[ 6] with exception 'java.lang.IllegalStateException(RPC channel is closed.)'
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
> Hive's log:
> 2017-02-14T19:03:09,147 INFO [stderr-redir-1] client.SparkClientImpl: 17/02/14 19:03:09 INFO yarn.Client: Application report for application_1486905599813_0403 (state: ACCEPTED)
> 2017-02-14T19:03:10,817 WARN [5bcf13e5-cb54-4cfe-a0d4-9a6556ab48b1 main] spark.SetSparkReducerParallelism: Failed to get spark memory/core info
> java.util.concurrent.TimeoutException
>     at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:49) ~[netty-all-4.0.29.Final.jar:4.0.29.Final]
>     at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.getExecutorCount(RemoteHiveSparkClient.java:155) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.getExecutorCount(RemoteHiveSparkClient.java:165) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.getMemoryAndCores(SparkSessionImpl.java:77) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.optimizer.spark.SetSparkReducerParallelism.process(SetSparkReducerParallelism.java:119) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:158) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.runJoinOptimizations(SparkCompiler.java:291) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:120) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:140) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11085) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:279) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:510) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1302) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1442) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1222) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1212) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184) ~[hive-cli-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at
[jira] [Updated] (HIVE-16170) Exclude relocation of org.apache.hadoop.security.* in the JDBC standalone jar
[ https://issues.apache.org/jira/browse/HIVE-16170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated HIVE-16170:
---
       Resolution: Fixed
     Hadoop Flags: Reviewed
    Fix Version/s: 2.2.0
           Status: Resolved  (was: Patch Available)

+1. Patch pushed to master. Thanks Tao!

> Exclude relocation of org.apache.hadoop.security.* in the JDBC standalone jar
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-16170
>                 URL: https://issues.apache.org/jira/browse/HIVE-16170
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Tao Li
>            Assignee: Tao Li
>             Fix For: 2.2.0
>
>         Attachments: HIVE-16170.1.patch
>
> There is a use case where a core-site.xml file is used along with the JDBC jar, and it sets "hadoop.security.group.mapping" to class names such as "org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback". This causes ClassNotFound errors due to the renaming, so we need to exclude those security-related classes from the relocation step.
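The shape of the fix can be sketched with a maven-shade-plugin relocation that carries an exclusion. This is a hypothetical excerpt, not the actual patch; the surrounding pom structure and the exact pattern names in Hive's jdbc/pom.xml may differ:

```xml
<!-- Hypothetical shade-plugin relocation excerpt. Classes that external
     config files reference by their original names (here, the group-mapping
     implementations named in hadoop.security.group.mapping) must be excluded
     from relocation, or the renamed class can no longer be found at runtime. -->
<relocation>
  <pattern>org.apache.hadoop</pattern>
  <shadedPattern>org.apache.hive.org.apache.hadoop</shadedPattern>
  <excludes>
    <exclude>org.apache.hadoop.security.*</exclude>
  </excludes>
</relocation>
```

The general rule: any class looked up by name from outside the jar (config values, reflection, SPI files) has to keep its original package when shading.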
[jira] [Commented] (HIVE-16091) Support subqueries in project/select
[ https://issues.apache.org/jira/browse/HIVE-16091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904454#comment-15904454 ]

Hive QA commented on HIVE-16091:
---
Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12857142/HIVE-16091.2.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10342 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[literal_decimal] (batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_offcbo] (batchId=42)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[subquery_in_select] (batchId=86)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4061/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4061/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4061/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12857142 - PreCommit-HIVE-Build

> Support subqueries in project/select
> ------------------------------------
>
>                 Key: HIVE-16091
>                 URL: https://issues.apache.org/jira/browse/HIVE-16091
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Logical Optimizer
>            Reporter: Vineet Garg
>            Assignee: Vineet Garg
>         Attachments: HIVE-16091.1.patch, HIVE-16091.2.patch
>
> Currently scalar subqueries are supported in filter only (WHERE/HAVING).
[jira] [Assigned] (HIVE-16171) Support truncate table for replication
[ https://issues.apache.org/jira/browse/HIVE-16171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sankar Hariappan reassigned HIVE-16171:
---

> Support truncate table for replication
> --------------------------------------
>
>                 Key: HIVE-16171
>                 URL: https://issues.apache.org/jira/browse/HIVE-16171
[jira] [Commented] (HIVE-16170) Exclude relocation of org.apache.hadoop.security.* in the JDBC standalone jar
[ https://issues.apache.org/jira/browse/HIVE-16170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904422#comment-15904422 ]

Tao Li commented on HIVE-16170:
---
[~daijy] Can you please take a look at this change?

> Exclude relocation of org.apache.hadoop.security.* in the JDBC standalone jar
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-16170
>                 URL: https://issues.apache.org/jira/browse/HIVE-16170
[jira] [Updated] (HIVE-16170) Exclude relocation of org.apache.hadoop.security.* in the JDBC standalone jar
[ https://issues.apache.org/jira/browse/HIVE-16170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Li updated HIVE-16170: -- Status: Patch Available (was: Open) > Exclude relocation of org.apache.hadoop.security.* in the JDBC standalone jar > - > > Key: HIVE-16170 > URL: https://issues.apache.org/jira/browse/HIVE-16170 > Project: Hive > Issue Type: Bug >Reporter: Tao Li >Assignee: Tao Li > Attachments: HIVE-16170.1.patch > > > There has been a use case that core-site.xml file is used along with the JDBC > jar, which sets "hadoop.security.group.mapping" using the class names such as > "org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback". This will > cause CNF errors due to the renaming. So we need to exclude those security > related classes in the relocation part. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16170) Exclude relocation of org.apache.hadoop.security.* in the JDBC standalone jar
[ https://issues.apache.org/jira/browse/HIVE-16170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Li updated HIVE-16170: -- Attachment: HIVE-16170.1.patch > Exclude relocation of org.apache.hadoop.security.* in the JDBC standalone jar > - > > Key: HIVE-16170 > URL: https://issues.apache.org/jira/browse/HIVE-16170 > Project: Hive > Issue Type: Bug >Reporter: Tao Li >Assignee: Tao Li > Attachments: HIVE-16170.1.patch > > > There has been a use case that core-site.xml file is used along with the JDBC > jar, which sets "hadoop.security.group.mapping" using the class names such as > "org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback". This will > cause CNF errors due to the renaming. So we need to exclude those security > related classes in the relocation part. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-15912) Executor kill task and Failed to get spark memory/core info
[ https://issues.apache.org/jira/browse/HIVE-15912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904417#comment-15904417 ] Yi Yao edited comment on HIVE-15912 at 3/10/17 4:26 AM: Hi [~lirui], I also encountered this issue. I'm using yarn-client mode. My workload failed in 20 mins. [Settings] set hive.spark.job.monitor.timeout=3600s; set spark.network.timeout=3600s; [Hive's log] 2017-03-09 17:22:29,771 WARN [RPC-Handler-3]: client.SparkClientImpl (SparkClientImpl.java:rpcClosed(130)) - Client RPC channel closed unexpectedly. [App master log] 17/03/09 17:22:30 INFO yarn.YarnAllocator: Driver requested a total number of 0 executor(s). 17/03/09 17:22:30 INFO yarn.YarnAllocator: Canceling requests for 51 executor containers 17/03/09 17:22:30 INFO yarn.ApplicationMaster$AMEndpoint: Driver terminated or disconnected! Shutting down. 10.54.5.129:48757 17/03/09 17:22:30 INFO yarn.ApplicationMaster$AMEndpoint: Driver terminated or disconnected! Shutting down. master.titan.cluster.gao:48757 17/03/09 17:22:30 INFO yarn.ApplicationMaster: Final app status: SUCCEEDED, exitCode: 0 17/03/09 17:22:30 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster with SUCCEEDED 17/03/09 17:22:30 INFO impl.AMRMClientImpl: Waiting for application to be successfully unregistered. 17/03/09 17:22:30 INFO yarn.ApplicationMaster: Deleting staging directory .sparkStaging/application_1488974634357_0022 17/03/09 17:22:30 INFO util.ShutdownHookManager: Shutdown hook calle was (Author: yiyao): I also encountered this issue. I'm using yarn-client mode. My workload failed in 20 mins. [~lirui] [Settings] set hive.spark.job.monitor.timeout=3600s; set spark.network.timeout=3600s; [Hive's log] 2017-03-09 17:22:29,771 WARN [RPC-Handler-3]: client.SparkClientImpl (SparkClientImpl.java:rpcClosed(130)) - Client RPC channel closed unexpectedly. [App master log] 17/03/09 17:22:30 INFO yarn.YarnAllocator: Driver requested a total number of 0 executor(s). 
17/03/09 17:22:30 INFO yarn.YarnAllocator: Canceling requests for 51 executor containers 17/03/09 17:22:30 INFO yarn.ApplicationMaster$AMEndpoint: Driver terminated or disconnected! Shutting down. 10.54.5.129:48757 17/03/09 17:22:30 INFO yarn.ApplicationMaster$AMEndpoint: Driver terminated or disconnected! Shutting down. master.titan.cluster.gao:48757 17/03/09 17:22:30 INFO yarn.ApplicationMaster: Final app status: SUCCEEDED, exitCode: 0 17/03/09 17:22:30 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster with SUCCEEDED 17/03/09 17:22:30 INFO impl.AMRMClientImpl: Waiting for application to be successfully unregistered. 17/03/09 17:22:30 INFO yarn.ApplicationMaster: Deleting staging directory .sparkStaging/application_1488974634357_0022 17/03/09 17:22:30 INFO util.ShutdownHookManager: Shutdown hook calle > Executor kill task and Failed to get spark memory/core info > --- > > Key: HIVE-15912 > URL: https://issues.apache.org/jira/browse/HIVE-15912 > Project: Hive > Issue Type: Bug > Components: Hive, Spark >Affects Versions: 2.2.0 > Environment: hadoop2.7.1 > spark2.0.2 > Hive2.2 >Reporter: KaiXu > > Hive on Spark, failed with error: > Starting Spark Job = 12a8cb8c-ed0d-4049-ae06-8d32d13fe285 > Failed to monitor Job[ 6] with exception 'java.lang.IllegalStateException(RPC > channel is closed.)' > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.spark.SparkTask > Hive's log: > 2017-02-14T19:03:09,147 INFO [stderr-redir-1] client.SparkClientImpl: > 17/02/14 19:03:09 INFO yarn.Client: Application report for > application_1486905599813_0403 (state: ACCEPTED) > 2017-02-14T19:03:10,817 WARN [5bcf13e5-cb54-4cfe-a0d4-9a6556ab48b1 main] > spark.SetSparkReducerParallelism: Failed to get spark memory/core info > java.util.concurrent.TimeoutException > at > io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:49) > ~[netty-all-4.0.29.Final.jar:4.0.29.Final] > at > 
org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.getExecutorCount(RemoteHiveSparkClient.java:155) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.getExecutorCount(RemoteHiveSparkClient.java:165) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.getMemoryAndCores(SparkSessionImpl.java:77) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.spark.SetSparkReducerParallelism.process(SetSparkReducerParallelism.java:119) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) >
[jira] [Commented] (HIVE-15912) Executor kill task and Failed to get spark memory/core info
[ https://issues.apache.org/jira/browse/HIVE-15912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904417#comment-15904417 ] Yi Yao commented on HIVE-15912: --- I also encountered this issue. I'm using yarn-client mode. My workload failed in 20 mins. [~lirui] [Settings] set hive.spark.job.monitor.timeout=3600s; set spark.network.timeout=3600s; [Hive's log] 2017-03-09 17:22:29,771 WARN [RPC-Handler-3]: client.SparkClientImpl (SparkClientImpl.java:rpcClosed(130)) - Client RPC channel closed unexpectedly. [App master log] 17/03/09 17:22:30 INFO yarn.YarnAllocator: Driver requested a total number of 0 executor(s). 17/03/09 17:22:30 INFO yarn.YarnAllocator: Canceling requests for 51 executor containers 17/03/09 17:22:30 INFO yarn.ApplicationMaster$AMEndpoint: Driver terminated or disconnected! Shutting down. 10.54.5.129:48757 17/03/09 17:22:30 INFO yarn.ApplicationMaster$AMEndpoint: Driver terminated or disconnected! Shutting down. master.titan.cluster.gao:48757 17/03/09 17:22:30 INFO yarn.ApplicationMaster: Final app status: SUCCEEDED, exitCode: 0 17/03/09 17:22:30 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster with SUCCEEDED 17/03/09 17:22:30 INFO impl.AMRMClientImpl: Waiting for application to be successfully unregistered. 
17/03/09 17:22:30 INFO yarn.ApplicationMaster: Deleting staging directory .sparkStaging/application_1488974634357_0022 17/03/09 17:22:30 INFO util.ShutdownHookManager: Shutdown hook calle > Executor kill task and Failed to get spark memory/core info > --- > > Key: HIVE-15912 > URL: https://issues.apache.org/jira/browse/HIVE-15912 > Project: Hive > Issue Type: Bug > Components: Hive, Spark >Affects Versions: 2.2.0 > Environment: hadoop2.7.1 > spark2.0.2 > Hive2.2 >Reporter: KaiXu > > Hive on Spark, failed with error: > Starting Spark Job = 12a8cb8c-ed0d-4049-ae06-8d32d13fe285 > Failed to monitor Job[ 6] with exception 'java.lang.IllegalStateException(RPC > channel is closed.)' > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.spark.SparkTask > Hive's log: > 2017-02-14T19:03:09,147 INFO [stderr-redir-1] client.SparkClientImpl: > 17/02/14 19:03:09 INFO yarn.Client: Application report for > application_1486905599813_0403 (state: ACCEPTED) > 2017-02-14T19:03:10,817 WARN [5bcf13e5-cb54-4cfe-a0d4-9a6556ab48b1 main] > spark.SetSparkReducerParallelism: Failed to get spark memory/core info > java.util.concurrent.TimeoutException > at > io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:49) > ~[netty-all-4.0.29.Final.jar:4.0.29.Final] > at > org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.getExecutorCount(RemoteHiveSparkClient.java:155) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.getExecutorCount(RemoteHiveSparkClient.java:165) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.getMemoryAndCores(SparkSessionImpl.java:77) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.spark.SetSparkReducerParallelism.process(SetSparkReducerParallelism.java:119) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > 
org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:158) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.runJoinOptimizations(SparkCompiler.java:291) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:120) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:140) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11085) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:279) >
[jira] [Commented] (HIVE-15849) hplsql should add enterGlobalScope func to UDF
[ https://issues.apache.org/jira/browse/HIVE-15849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904414#comment-15904414 ] Fei Hui commented on HIVE-15849: [~alangates] Could you please take a look? Added a unit test checking the var. We do not get the result because the vars are not added to the scope. > hplsql should add enterGlobalScope func to UDF > -- > > Key: HIVE-15849 > URL: https://issues.apache.org/jira/browse/HIVE-15849 > Project: Hive > Issue Type: Bug > Components: hpl/sql >Affects Versions: 2.2.0 >Reporter: Fei Hui >Assignee: Fei Hui > Attachments: HIVE-15849.1.patch, HIVE-15849.patch > > > code in Udf.java > {code:title=Udf.java|borderStyle=solid} > if (exec == null) { > exec = new Exec(); > String query = queryOI.getPrimitiveJavaObject(arguments[0].get()); > String[] args = { "-e", query, "-trace" }; > try { > exec.setUdfRun(true); > exec.init(args); > } catch (Exception e) { > throw new HiveException(e.getMessage()); > } > } > if (arguments.length > 1) { > setParameters(arguments); > } > Var result = exec.run(); > if (result != null) { > return result.toString(); > } > {code} > Here are my thoughts > {quote} > we should add 'exec.enterGlobalScope(); ' between 'exec = new Exec();' and > 'setParameters(arguments);' > Because if we do not call exec.enterGlobalScope(), setParameters(arguments) > will be useless. Vars are not added into the scope, but exec.run() will use the vars > which we set. The vars are parameters passed to the UDF, [, :1, :2, ...n] which > are described in Udf.java > {quote} > Before adding this function, the result is as follows. We get the wrong result > because the result contains empty strings > {quote} > Starting pre-SQL statement > Starting pre-SQL statement > Starting pre-SQL statement > Starting pre-SQL statement > Starting pre-SQL statement > Starting query > Query executed successfully (2.30 sec) > Ln:8 SELECT completed successfully > Ln:8 Standalone SELECT executed: 1 columns in the result set > Hello, ! > Hello, ! 
> {quote} > After adding this function, we get the right result > {quote} > Starting pre-SQL statement > Starting pre-SQL statement > Starting pre-SQL statement > Starting pre-SQL statement > Starting pre-SQL statement > Starting query > Query executed successfully (2.35 sec) > Ln:8 SELECT completed successfully > Ln:8 Standalone SELECT executed: 1 columns in the result set > Hello, fei! > Hello, fei! > {quote} > Tests come from http://www.hplsql.org/udf -- This message was sent by Atlassian JIRA (v6.3.15#6346)
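The scoping issue Fei Hui describes can be illustrated with a small self-contained sketch. This is a hypothetical toy (the class and method names below are illustrative, not the actual org.apache.hive.hplsql.Exec API): when a parameter is set before any scope has been entered, it is silently dropped, so a later lookup resolves it to an empty string — which matches the "Hello, !" output quoted above.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Toy model of a scope stack: setParameter writes into the current
// (innermost) scope, so if no scope exists yet the value is lost.
class ToyExec {
    private final Deque<Map<String, String>> scopes = new ArrayDeque<>();

    // Analogous to enterGlobalScope(): push a scope to hold variables.
    void enterGlobalScope() {
        scopes.push(new HashMap<>());
    }

    // Analogous to setParameters(): store a named parameter, if possible.
    void setParameter(String name, String value) {
        Map<String, String> current = scopes.peek();
        if (current != null) {       // without a scope, the value is dropped
            current.put(name, value);
        }
    }

    // Analogous to variable resolution during run(): search scopes
    // innermost-first; unresolved names become empty strings.
    String resolve(String name) {
        for (Map<String, String> scope : scopes) {
            String v = scope.get(name);
            if (v != null) {
                return v;
            }
        }
        return "";
    }
}
```

In this toy, calling `setParameter(":1", "fei")` before `enterGlobalScope()` yields `resolve(":1")` == `""`, while calling `enterGlobalScope()` first yields `"fei"` — the same before/after behavior shown in the quoted output.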
[jira] [Assigned] (HIVE-16170) Exclude relocation of org.apache.hadoop.security.* in the JDBC standalone jar
[ https://issues.apache.org/jira/browse/HIVE-16170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Li reassigned HIVE-16170: - > Exclude relocation of org.apache.hadoop.security.* in the JDBC standalone jar > - > > Key: HIVE-16170 > URL: https://issues.apache.org/jira/browse/HIVE-16170 > Project: Hive > Issue Type: Bug >Reporter: Tao Li >Assignee: Tao Li > > There has been a use case that core-site.xml file is used along with the JDBC > jar, which sets "hadoop.security.group.mapping" using the class names such as > "org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback". This will > cause CNF errors due to the renaming. So we need to exclude those security > related classes in the relocation part. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15849) hplsql should add enterGlobalScope func to UDF
[ https://issues.apache.org/jira/browse/HIVE-15849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fei Hui updated HIVE-15849: --- Attachment: HIVE-15849.1.patch Updated the patch with a unit test. After applying this patch, we get the right result: {quote} hive> add jar /usr/lib/hive-current/lib/antlr-runtime-4.5.jar; Added [/usr/lib/hive-current/lib/antlr-runtime-4.5.jar] to class path Added resources: [/usr/lib/hive-current/lib/antlr-runtime-4.5.jar] hive> add jar /usr/lib/hive-current/lib/hplsql.jar; Added [/usr/lib/hive-current/lib/hplsql.jar] to class path Added resources: [/usr/lib/hive-current/lib/hplsql.jar] hive> add file hplsqlrc; Added resources: [hplsqlrc] hive> CREATE TEMPORARY FUNCTION hplsql AS 'org.apache.hive.hplsql.Udf'; OK Time taken: 0.023 seconds hive> SELECT hplsql('hello(:1)', name) FROM users; OK Configuration file: file:/etc/emr/hive-conf-2.0.1/hplsql-site.xml Parser tree: (program (block (stmt (expr_stmt (expr (expr_func (ident hello) ( (expr_func_params (func_param (expr (expr_atom (ident :1) ))) INLCUDE CONTENT hplsqlrc (non-empty) Ln:1 CREATE FUNCTION hello Ln:1 EXEC FUNCTION hello Ln:1 SET PARAM text = fei Ln:4 RETURN Hello, fei! 
Time taken: 0.653 seconds, Fetched: 1 row(s) {quote} > hplsql should add enterGlobalScope func to UDF > -- > > Key: HIVE-15849 > URL: https://issues.apache.org/jira/browse/HIVE-15849 > Project: Hive > Issue Type: Bug > Components: hpl/sql >Affects Versions: 2.2.0 >Reporter: Fei Hui >Assignee: Fei Hui > Attachments: HIVE-15849.1.patch, HIVE-15849.patch > > > code in Udf.java > {code:title=Udf.java|borderStyle=solid} > if (exec == null) { > exec = new Exec(); > String query = queryOI.getPrimitiveJavaObject(arguments[0].get()); > String[] args = { "-e", query, "-trace" }; > try { > exec.setUdfRun(true); > exec.init(args); > } catch (Exception e) { > throw new HiveException(e.getMessage()); > } > } > if (arguments.length > 1) { > setParameters(arguments); > } > Var result = exec.run(); > if (result != null) { > return result.toString(); > } > {code} > Here are my thoughts > {quote} > we should add 'exec.enterGlobalScope(); ' between 'exec = new Exec();' and > 'setParameters(arguments);' > Because if we do not call exec.enterGlobalScope(), setParameters(arguments) > will be useless. Vars are not added into the scope, but exec.run() will use the vars > which we set. The vars are parameters passed to the UDF, [, :1, :2, ...n] which > are described in Udf.java > {quote} > Before adding this function, the result is as follows. We get the wrong result > because the result contains empty strings > {quote} > Starting pre-SQL statement > Starting pre-SQL statement > Starting pre-SQL statement > Starting pre-SQL statement > Starting pre-SQL statement > Starting query > Query executed successfully (2.30 sec) > Ln:8 SELECT completed successfully > Ln:8 Standalone SELECT executed: 1 columns in the result set > Hello, ! > Hello, ! 
> {quote} > After adding this function, we get the right result > {quote} > Starting pre-SQL statement > Starting pre-SQL statement > Starting pre-SQL statement > Starting pre-SQL statement > Starting pre-SQL statement > Starting query > Query executed successfully (2.35 sec) > Ln:8 SELECT completed successfully > Ln:8 Standalone SELECT executed: 1 columns in the result set > Hello, fei! > Hello, fei! > {quote} > Tests come from http://www.hplsql.org/udf -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-16115) Stop printing progress info from operation logs with beeline progress bar
[ https://issues.apache.org/jira/browse/HIVE-16115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904395#comment-15904395 ] anishek edited comment on HIVE-16115 at 3/10/17 4:12 AM: - [~vgumashta] Yes, the failure is related. I was trying to fix it by rebasing the patch on the latest apache master, and now it's failing in the @BeforeClass method; it seems to fail in some statistics calculation when dropping a table. Basically, the failure is different from what the run above reported. I am investigating. was (Author: anishek): [~vgumashta] yes the failure is related, though I was trying to fix it with rebasing the patch on latest apache master and now its failing in @BeforeClass method, it seems to fail in some statistics calculation when dropping a table. I am investigating the same. > Stop printing progress info from operation logs with beeline progress bar > - > > Key: HIVE-16115 > URL: https://issues.apache.org/jira/browse/HIVE-16115 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 2.2.0 >Reporter: anishek >Assignee: anishek >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-16115.1.patch, HIVE-16115.2.patch, > HIVE-16115.3.patch > > > when the progress bar is enabled, we should not print the progress information > via the operation logs. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15983) Support the named columns join
[ https://issues.apache.org/jira/browse/HIVE-15983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904401#comment-15904401 ] Hive QA commented on HIVE-15983: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12857137/HIVE-15983.02.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10337 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[named_column_join] (batchId=71) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[join_alt_syntax_comma_on] (batchId=86) org.apache.hive.jdbc.TestJdbcWithMiniHS2.testParallelCompilation (batchId=219) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4060/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4060/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4060/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12857137 - PreCommit-HIVE-Build > Support the named columns join > -- > > Key: HIVE-15983 > URL: https://issues.apache.org/jira/browse/HIVE-15983 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Carter Shanklin >Assignee: Pengcheng Xiong > Attachments: HIVE-15983.01.patch, HIVE-15983.02.patch > > > The named columns join is a common shortcut allowing joins on identically > named keys. Example: select * from t1 join t2 using c1 is equivalent to > select * from t1 join t2 on t1.c1 = t2.c1. SQL standard reference: Section 7.7 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16115) Stop printing progress info from operation logs with beeline progress bar
[ https://issues.apache.org/jira/browse/HIVE-16115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904395#comment-15904395 ] anishek commented on HIVE-16115: [~vgumashta] Yes, the failure is related. I was trying to fix it by rebasing the patch on the latest apache master, and now it's failing in the @BeforeClass method; it seems to fail in some statistics calculation when dropping a table. I am investigating. > Stop printing progress info from operation logs with beeline progress bar > - > > Key: HIVE-16115 > URL: https://issues.apache.org/jira/browse/HIVE-16115 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 2.2.0 >Reporter: anishek >Assignee: anishek >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-16115.1.patch, HIVE-16115.2.patch, > HIVE-16115.3.patch > > > when the progress bar is enabled, we should not print the progress information > via the operation logs. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16104) LLAP: preemption may be too aggressive if the pre-empted task doesn't die immediately
[ https://issues.apache.org/jira/browse/HIVE-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904375#comment-15904375 ] Siddharth Seth commented on HIVE-16104: --- bq. Sorry didn't get that. You mean for the waiting for completion? Wouldn't that block other ops like adding tasks? No — it's a wait on the giant lock, so no lock is held while waiting. When a new operation happens, it can signal this. (The same lock used in the main scheduling loop) - which gets a signal from all other operations. Also, I think the loop can be simplified. trySchedule does not need to be done twice. > LLAP: preemption may be too aggressive if the pre-empted task doesn't die > immediately > - > > Key: HIVE-16104 > URL: https://issues.apache.org/jira/browse/HIVE-16104 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16104.01.patch, HIVE-16104.02.patch, > HIVE-16104.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16158) Correct mistake in documentation for ALTER TABLE … ADD/REPLACE COLUMNS CASCADE
[ https://issues.apache.org/jira/browse/HIVE-16158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904347#comment-15904347 ] Lefty Leverenz commented on HIVE-16158: --- I also changed 0.15 to 1.1.0 in these docs: * [Permission Inheritance in Hive -- Behavior | https://cwiki.apache.org/confluence/display/Hive/Permission+Inheritance+in+Hive#PermissionInheritanceinHive-Behavior] * [ORC -- ORC File Dump Utility | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC#LanguageManualORC-ORCFileDumpUtility] * [Metastore Admin -- Additional Configuration Parameters -- hive.metastore.filter.hook | https://cwiki.apache.org/confluence/display/Hive/AdminManual+MetastoreAdmin#AdminManualMetastoreAdmin-AdditionalConfigurationParameters] But in this doc, 0.15 was wrong so I changed it to 1.0.0: * [HiveODBCDriver -- warning box after table of contents | https://cwiki.apache.org/confluence/display/Hive/HiveODBC#HiveODBC-HiveODBCDriver] > Correct mistake in documentation for ALTER TABLE … ADD/REPLACE COLUMNS CASCADE > -- > > Key: HIVE-16158 > URL: https://issues.apache.org/jira/browse/HIVE-16158 > Project: Hive > Issue Type: Bug > Components: Documentation >Affects Versions: 1.0.0 >Reporter: Illya Yalovyy >Assignee: Lefty Leverenz > > Current documentation says that key word CASCADE was introduced in Hive 0.15 > release. That information is incorrect and confuses users. The feature was > actually released in Hive 1.1.0. (HIVE-8839) > https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Add/ReplaceColumns -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15506) Problem with creating table in hive
[ https://issues.apache.org/jira/browse/HIVE-15506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904345#comment-15904345 ] Fan Yunbo commented on HIVE-15506: -- Could you attach the log file for further info? > Problem with creating table in hive > --- > > Key: HIVE-15506 > URL: https://issues.apache.org/jira/browse/HIVE-15506 > Project: Hive > Issue Type: Bug > Components: Database/Schema >Affects Versions: 2.1.0 > Environment: windows >Reporter: Maher Hattabi >Assignee: Maher Hattabi >Priority: Blocker > > Hi, > I am using Apache Hive 2.1.0 under Windows. I've deployed Hive under > Windows using MySQL as the metastore, and I am able to launch the Hive shell. Once done, > I am able to create databases and show tables, but I am not able to create a table, > though I've struggled to do it. > I've got this bug: > Error org.apache.hive.ql.exec.DDLTask.MetaException connections, we don't support retries at the client level> > Please find the hive-site.xml at this URL: > http://www.mediafire.com/file/9864d5o308h6pk4/hive-site.xml > I am not able to do anything; thanks for your help. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-16104) LLAP: preemption may be too aggressive if the pre-empted task doesn't die immediately
[ https://issues.apache.org/jira/browse/HIVE-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904326#comment-15904326 ] Sergey Shelukhin edited comment on HIVE-16104 at 3/10/17 3:15 AM: -- See https://docs.oracle.com/javase/7/docs/api/java/lang/System.html#nanoTime%28%29, it can be negative. Anyway there's no reason to have a magic value when it can be null. {quote} That's a style choice. There's nothing wrong with using an executor with a single thread. If anything, this should have been Executors.newSingleThreadExecutor {quote} That's complexity for no reason.. so not really a style choice. {quote} Think we're better off using the mega lock in the scheduler for the wait - that way it gets interrupted if some other task completes, another task asks to be scheduled, etc. (Doesn't need to wait for the specific fragment to end) {quote} Sorry didn't get that. You mean for the waiting for completion? Wouldn't that block other ops like adding tasks? {quote} Using isComplete as the wait mechanism has a race when the setIsCompleted() invocation happens in TaskRunnerCallable. {quote} What is the race? The wait is on the lock and so both wait and notify have to execute when the lock is taken. isComplete is also only examined under the same lock. So either the checker sees isComplete=true (after someone has set it and ran notify) and doesn't wait, or it sees false, so whoever notifies-All cannot notify before the checker releases the lock by going into wait, cause he needs the lock to notify; hence the waiting thread will get notified. Also isComplete is never unset, so that makes it even less problematic cause noone will set it to false. was (Author: sershe): See https://docs.oracle.com/javase/7/docs/api/java/lang/System.html#nanoTime%28%29, it can be negative. Anyway there's no reason to have a magic value when it can be null. {quote} That's a style choice. There's nothing wrong with using an executor with a single thread. 
If anything, this should have been Executors.newSingleThreadExecutor {quote} That's complexity for no reason.. so not really a style choice. {quote} Think we're better off using the mega lock in the scheduler for the wait - that way it gets interrupted if some other task completes, another task asks to be scheduled, etc. (Doesn't need to wait for the specific fragment to end) {quote} Sorry didn't get that. You mean for the waiting for completion? Wouldn't that block other ops like adding tasks? {quote} Using isComplete as the wait mechanism has a race when the setIsCompleted() invocation happens in TaskRunnerCallable. {quote} What is the race? The wait is on the lock and so both wait and notify have to execute when the lock is taken. isComplete is also only examined under the same lock. So either the checker sees isComplete=true (after someone has set it and ran notify) and doesn't wait, or it sees false, so whoever notifies-All cannot notify before the checker releases the lock, cause he needs the lock to notify. Even if he could, at worst checker would wait needlessly and return value of isComplete. Also isComplete is never unset, so that makes it even less problematic cause noone will set it to false. > LLAP: preemption may be too aggressive if the pre-empted task doesn't die > immediately > - > > Key: HIVE-16104 > URL: https://issues.apache.org/jira/browse/HIVE-16104 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16104.01.patch, HIVE-16104.02.patch, > HIVE-16104.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16133) Footer cache in Tez AM can take too much memory
[ https://issues.apache.org/jira/browse/HIVE-16133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904331#comment-15904331 ] Hive QA commented on HIVE-16133: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12857116/HIVE-16133.02.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10336 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_vector_dynpart_hashjoin_1] (batchId=153) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_join_nulls] (batchId=144) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_ptf] (batchId=152) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[hybridgrace_hashjoin_1] (batchId=94) org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgress (batchId=213) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4059/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4059/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4059/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12857116 - PreCommit-HIVE-Build > Footer cache in Tez AM can take too much memory > --- > > Key: HIVE-16133 > URL: https://issues.apache.org/jira/browse/HIVE-16133 > Project: Hive > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Sergey Shelukhin > Attachments: HIVE-16133.01.patch, HIVE-16133.02.patch, > HIVE-16133.02.patch, HIVE-16133.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16133) Footer cache in Tez AM can take too much memory
[ https://issues.apache.org/jira/browse/HIVE-16133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904330#comment-15904330 ] Sergey Shelukhin commented on HIVE-16133: - Looking at test failures > Footer cache in Tez AM can take too much memory > --- > > Key: HIVE-16133 > URL: https://issues.apache.org/jira/browse/HIVE-16133 > Project: Hive > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Sergey Shelukhin > Attachments: HIVE-16133.01.patch, HIVE-16133.02.patch, > HIVE-16133.02.patch, HIVE-16133.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16104) LLAP: preemption may be too aggressive if the pre-empted task doesn't die immediately
[ https://issues.apache.org/jira/browse/HIVE-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904326#comment-15904326 ] Sergey Shelukhin commented on HIVE-16104: - See https://docs.oracle.com/javase/7/docs/api/java/lang/System.html#nanoTime%28%29, it can be negative. Anyway, there's no reason to have a magic value when it can be null. {quote} That's a style choice. There's nothing wrong with using an executor with a single thread. If anything, this should have been Executors.newSingleThreadExecutor {quote} That's complexity for no reason, so it's not really a style choice. {quote} Think we're better off using the mega lock in the scheduler for the wait - that way it gets interrupted if some other task completes, another task asks to be scheduled, etc. (Doesn't need to wait for the specific fragment to end) {quote} Sorry, I didn't get that. You mean for the waiting for completion? Wouldn't that block other ops like adding tasks? {quote} Using isComplete as the wait mechanism has a race when the setIsCompleted() invocation happens in TaskRunnerCallable. {quote} What is the race? The wait is on the lock, so both wait and notify have to execute while the lock is held. isComplete is also only examined under the same lock. So either the checker sees isComplete=true (after someone has set it and called notify) and doesn't wait, or it sees false, in which case whoever calls notifyAll cannot notify before the checker releases the lock, because it needs the lock to notify. Even if it could, at worst the checker would wait needlessly and then return the value of isComplete. Also, isComplete is never unset, which makes this even less problematic because no one will ever set it back to false. 
> LLAP: preemption may be too aggressive if the pre-empted task doesn't die > immediately > - > > Key: HIVE-16104 > URL: https://issues.apache.org/jira/browse/HIVE-16104 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16104.01.patch, HIVE-16104.02.patch, > HIVE-16104.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
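The argument in the comment above is the classic guarded-wait pattern: a completion flag that is only read and written while holding one lock, with notifyAll issued under that same lock. The sketch below is a minimal, hypothetical illustration of that pattern (CompletionWaiter, markComplete, and awaitCompletion are invented names, not the actual LLAP scheduler code); it shows why the checker either skips the wait or is guaranteed to receive the notification, so no update can be missed.

```java
// Hedged sketch of the guarded-wait pattern discussed above (hypothetical
// names; not the real LLAP code). Both the waiter and the completer
// synchronize on the same lock, so the waiter either observes
// isComplete == true and skips the wait, or waits and is guaranteed to be
// woken by the notifyAll issued after the flag is set.
public class CompletionWaiter {
    private final Object lock = new Object();
    private boolean isComplete = false; // never unset once true

    // Called by the task when it finishes (analogous to setIsCompleted()).
    public void markComplete() {
        synchronized (lock) {
            isComplete = true;
            lock.notifyAll(); // wake any checker waiting under the same lock
        }
    }

    // Called by the checker; returns true once completion has been observed,
    // false if the timeout elapses first. The while loop guards against
    // spurious wakeups.
    public boolean awaitCompletion(long timeoutMs) throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        synchronized (lock) {
            while (!isComplete) {
                long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMs <= 0) {
                    return false; // timed out; flag still false
                }
                lock.wait(remainingMs);
            }
            return true;
        }
    }
}
```

Because the flag transitions only false-to-true and both sides hold the lock, the worst case is a needless (bounded) wait, never a lost notification.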
[jira] [Commented] (HIVE-8864) Fix permission inheritance with HDFS encryption
[ https://issues.apache.org/jira/browse/HIVE-8864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904319#comment-15904319 ] Lefty Leverenz commented on HIVE-8864: -- (Hive 0.15 became Hive 1.1.0.) > Fix permission inheritance with HDFS encryption > --- > > Key: HIVE-8864 > URL: https://issues.apache.org/jira/browse/HIVE-8864 > Project: Hive > Issue Type: Sub-task >Affects Versions: encryption-branch >Reporter: Szehon Ho >Assignee: Szehon Ho > Fix For: encryption-branch, 1.1.0 > > Attachments: HIVE-8864.patch > > > NO PRECOMMIT TESTS > Was trying the HDFS encryption patch and found some issues with permission > inheritance. This is to track keeping permission inheritance with new tmp > table scheme. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-8864) Fix permission inheritance with HDFS encryption
[ https://issues.apache.org/jira/browse/HIVE-8864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-8864: - Fix Version/s: 1.1.0 > Fix permission inheritance with HDFS encryption > --- > > Key: HIVE-8864 > URL: https://issues.apache.org/jira/browse/HIVE-8864 > Project: Hive > Issue Type: Sub-task >Affects Versions: encryption-branch >Reporter: Szehon Ho >Assignee: Szehon Ho > Fix For: encryption-branch, 1.1.0 > > Attachments: HIVE-8864.patch > > > NO PRECOMMIT TESTS > Was trying the HDFS encryption patch and found some issues with permission > inheritance. This is to track keeping permission inheritance with new tmp > table scheme. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-8065) Support HDFS encryption functionality on Hive
[ https://issues.apache.org/jira/browse/HIVE-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904317#comment-15904317 ] Lefty Leverenz commented on HIVE-8065: -- The encryption branch was merged to trunk for release 1.1.0 (formerly known as 0.15). See HIVE-9264. > Support HDFS encryption functionality on Hive > - > > Key: HIVE-8065 > URL: https://issues.apache.org/jira/browse/HIVE-8065 > Project: Hive > Issue Type: Improvement >Affects Versions: 0.13.1 >Reporter: Sergio Peña >Assignee: Sergio Peña > Labels: TODOC15 > Fix For: 1.1.0 > > > The new encryption support on HDFS makes Hive incompatible and unusable when > this feature is used. > HDFS encryption is designed so that a user can configure different > encryption zones (or directories) for multi-tenant environments. An > encryption zone has an exclusive encryption key, such as AES-128 or AES-256. > Because of security compliance, HDFS does not allow moving/renaming files > between encryption zones. Renames are allowed only inside the same encryption > zone. A copy is allowed between encryption zones. > See HDFS-6134 for more details about HDFS encryption design. > Hive currently uses a scratch directory (like /tmp/$user/$random). This > scratch directory is used for the output of intermediate data (between MR > jobs) and for the final output of the Hive query, which is later moved to the > table directory location. > If Hive tables are in different encryption zones than the scratch directory, > then Hive won't be able to rename those files/directories, and it will make > Hive unusable. > To handle this problem, we can change the scratch directory of the > query/statement to be inside the same encryption zone as the table directory > location. This way, the renaming process will be successful. > Also, for statements that move files between encryption zones (i.e. LOAD > DATA), a copy may be executed instead of a rename. 
This will cause an > overhead when copying large data files, but it won't break the encryption on > Hive. > Another security thing to consider is when using joins in selects. If Hive joins > different tables with different encryption key strengths, then the results of > the select might break the security compliance of the tables. Let's say two > tables with 128-bit and 256-bit encryption are joined; then the temporary > results might be stored in the 128-bit encryption zone. This temporarily > conflicts with the compliance of the table encrypted with 256 bits. > To fix this, Hive should be able to select the scratch directory that is more > secured/encrypted in order to save the intermediate data temporarily with no > compliance issues. > For instance: > {noformat} > SELECT * FROM table-aes128 t1 JOIN table-aes256 t2 WHERE t1.id == t2.id; > {noformat} > - This should use a scratch directory (or staging directory) inside the > table-aes256 table location. > {noformat} > INSERT OVERWRITE TABLE table-unencrypted SELECT * FROM table-aes1; > {noformat} > - This should use a scratch directory inside the table-aes1 location. > {noformat} > FROM table-unencrypted > INSERT OVERWRITE TABLE table-aes128 SELECT id, name > INSERT OVERWRITE TABLE table-aes256 SELECT id, name > {noformat} > - This should use a scratch directory on each of the table locations. > - The first SELECT will have its scratch directory in the table-aes128 directory. > - The second SELECT will have its scratch directory in the table-aes256 directory. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
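The "choose the most secure scratch location" rule described in the issue above can be sketched in a few lines. Everything here is a hypothetical stand-in (the Zone class and its key-bit field are invented for illustration; real zone information would come from the HDFS encryption APIs such as HdfsAdmin), but it captures the selection rule: intermediate data should land in the strongest encryption zone of any table involved in the query.

```java
// Hedged sketch of the scratch-directory selection rule described above.
// Zone is a hypothetical summary of an encryption zone; it is NOT a real
// Hive or HDFS type.
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ScratchDirChooser {

    // Hypothetical summary of an encryption zone: its root path and key bits.
    static final class Zone {
        final String path;
        final int keyBits; // 0 for an unencrypted location

        Zone(String path, int keyBits) {
            this.path = path;
            this.keyBits = keyBits;
        }
    }

    // Pick the zone with the strongest key, so intermediate data is never
    // written somewhere weaker than any table taking part in the query.
    static Zone pickScratchZone(List<Zone> zonesOfTables) {
        return zonesOfTables.stream()
                .max(Comparator.comparingInt(z -> z.keyBits))
                .orElseThrow(() -> new IllegalArgumentException("no tables"));
    }
}
```

For the `table-aes128 JOIN table-aes256` example in the description, this rule would place the scratch directory under the 256-bit zone.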
[jira] [Updated] (HIVE-8065) Support HDFS encryption functionality on Hive
[ https://issues.apache.org/jira/browse/HIVE-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-8065: - Fix Version/s: 1.1.0 > Support HDFS encryption functionality on Hive > - > > Key: HIVE-8065 > URL: https://issues.apache.org/jira/browse/HIVE-8065 > Project: Hive > Issue Type: Improvement >Affects Versions: 0.13.1 >Reporter: Sergio Peña >Assignee: Sergio Peña > Labels: TODOC15 > Fix For: 1.1.0 > > > The new encryption support on HDFS makes Hive incompatible and unusable when > this feature is used. > HDFS encryption is designed so that a user can configure different > encryption zones (or directories) for multi-tenant environments. An > encryption zone has an exclusive encryption key, such as AES-128 or AES-256. > Because of security compliance, HDFS does not allow moving/renaming files > between encryption zones. Renames are allowed only inside the same encryption > zone. A copy is allowed between encryption zones. > See HDFS-6134 for more details about HDFS encryption design. > Hive currently uses a scratch directory (like /tmp/$user/$random). This > scratch directory is used for the output of intermediate data (between MR > jobs) and for the final output of the Hive query, which is later moved to the > table directory location. > If Hive tables are in different encryption zones than the scratch directory, > then Hive won't be able to rename those files/directories, and it will make > Hive unusable. > To handle this problem, we can change the scratch directory of the > query/statement to be inside the same encryption zone as the table directory > location. This way, the renaming process will be successful. > Also, for statements that move files between encryption zones (i.e. LOAD > DATA), a copy may be executed instead of a rename. This will cause an > overhead when copying large data files, but it won't break the encryption on > Hive. > Another security thing to consider is when using joins in selects. 
If Hive joins > different tables with different encryption key strengths, then the results of > the select might break the security compliance of the tables. Let's say two > tables with 128-bit and 256-bit encryption are joined; then the temporary > results might be stored in the 128-bit encryption zone. This temporarily > conflicts with the compliance of the table encrypted with 256 bits. > To fix this, Hive should be able to select the scratch directory that is more > secured/encrypted in order to save the intermediate data temporarily with no > compliance issues. > For instance: > {noformat} > SELECT * FROM table-aes128 t1 JOIN table-aes256 t2 WHERE t1.id == t2.id; > {noformat} > - This should use a scratch directory (or staging directory) inside the > table-aes256 table location. > {noformat} > INSERT OVERWRITE TABLE table-unencrypted SELECT * FROM table-aes1; > {noformat} > - This should use a scratch directory inside the table-aes1 location. > {noformat} > FROM table-unencrypted > INSERT OVERWRITE TABLE table-aes128 SELECT id, name > INSERT OVERWRITE TABLE table-aes256 SELECT id, name > {noformat} > - This should use a scratch directory on each of the table locations. > - The first SELECT will have its scratch directory in the table-aes128 directory. > - The second SELECT will have its scratch directory in the table-aes256 directory. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16164) Provide mechanism for passing HMS notification ID between transactional and non-transactional listeners.
[ https://issues.apache.org/jira/browse/HIVE-16164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904294#comment-15904294 ] Alexander Kolbasov commented on HIVE-16164: --- I am also wondering why transactional listeners and non-transactional listeners do not reuse the same CreateTableEvent. > Provide mechanism for passing HMS notification ID between transactional and > non-transactional listeners. > > > Key: HIVE-16164 > URL: https://issues.apache.org/jira/browse/HIVE-16164 > Project: Hive > Issue Type: Improvement > Components: Metastore >Reporter: Sergio Peña >Assignee: Sergio Peña > > The HMS DB notification listener currently stores an event ID on the HMS > backend DB so that external applications (such as backup apps) can request > incremental notifications based on the last event ID requested. > The HMS DB notification and backup applications are asynchronous. However, > there are times when applications may be required to be in sync with the > latest HMS event in order to process an action. These applications will > provide a listener implementation that is called by the HMS after an HMS > transaction happened. > The problem is that the listener running after the transaction (or during the > non-transactional context) may need the DB event ID in order to sync all > events that happened prior to that event ID, but this ID is never passed to the > non-transactional listeners. > We can pass this event information through the EnvironmentContext found on > each ListenerEvent implementations (such as CreateTableEvent), and send the > EnvironmentContext to the non-transactional listeners to get the event ID. > The DbNotificactionListener already knows the event ID after calling the > ObjectStore.addNotificationEvent(). We just need to set this event ID to the > EnvironmentContext from each of the event notifications and make sure that > this EnvironmentContext is sent to the non-transactional listeners. 
> Here's the code example when creating a table on {{create_table_core}}: > {noformat} > ms.createTable(tbl); > if (transactionalListeners.size() > 0) { > CreateTableEvent createTableEvent = new CreateTableEvent(tbl, true, this); > createTableEvent.setEnvironmentContext(envContext); > for (MetaStoreEventListener transactionalListener : > transactionalListeners) { > transactionalListener.onCreateTable(createTableEvent); // <- > Here the notification ID is generated > } > } > success = ms.commitTransaction(); > } finally { > if (!success) { > ms.rollbackTransaction(); > if (madeDir) { > wh.deleteDir(tblPath, true); > } > } > for (MetaStoreEventListener listener : listeners) { > CreateTableEvent createTableEvent = > new CreateTableEvent(tbl, success, this); > createTableEvent.setEnvironmentContext(envContext); > listener.onCreateTable(createTableEvent);// <- > Here we would like to consume notification ID > } > {noformat} > We could use a specific key name that will be used on the EnvironmentContext, > such as DB_NOTIFICATION_EVENT_ID. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
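The handoff proposed in the issue above can be condensed into a small sketch. Note this is illustrative rather than the real metastore code: DB_NOTIFICATION_EVENT_ID is the key name suggested in the issue, and the EnvironmentContext class below is a simplified stand-in for the Thrift-generated one (which similarly carries a string-to-string properties map). The point is only the round trip: the transactional listener records the ID it learned from the backend, and a later non-transactional listener reads it back from the shared context.

```java
// Hedged sketch of passing the notification event ID through an
// EnvironmentContext, as proposed above. EnvironmentContext here is a
// simplified stand-in for the Thrift type, not the real HMS class.
import java.util.HashMap;
import java.util.Map;

public class EventIdHandoff {
    // Key name suggested in the issue description.
    static final String DB_NOTIFICATION_EVENT_ID = "DB_NOTIFICATION_EVENT_ID";

    // Stand-in for the Thrift EnvironmentContext (string-to-string properties).
    static class EnvironmentContext {
        private final Map<String, String> properties = new HashMap<>();
        void putToProperties(String key, String value) { properties.put(key, value); }
        String getProperty(String key) { return properties.get(key); }
    }

    // What the transactional listener would do once the backend has
    // assigned the ID (after ObjectStore.addNotificationEvent()).
    static void recordEventId(EnvironmentContext ctx, long eventId) {
        ctx.putToProperties(DB_NOTIFICATION_EVENT_ID, Long.toString(eventId));
    }

    // What a non-transactional listener would do to consume the ID;
    // returns -1 when no transactional listener recorded one.
    static long readEventId(EnvironmentContext ctx) {
        String value = ctx.getProperty(DB_NOTIFICATION_EVENT_ID);
        return value == null ? -1L : Long.parseLong(value);
    }
}
```

For this to work in the snippet above, the same EnvironmentContext (or the same event object) must be shared between the transactional and non-transactional listener invocations, which is exactly the reuse question raised in the comments.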
[jira] [Issue Comment Deleted] (HIVE-16164) Provide mechanism for passing HMS notification ID between transactional and non-transactional listeners.
[ https://issues.apache.org/jira/browse/HIVE-16164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Kolbasov updated HIVE-16164: -- Comment: was deleted (was: Note that in the code snippet above the non-transactional listener gets a new createTableEvent, so anything set in the environment context of the transactional listener is lost. Why is the event not reused between transactional and non-transactional listeners?) > Provide mechanism for passing HMS notification ID between transactional and > non-transactional listeners. > > > Key: HIVE-16164 > URL: https://issues.apache.org/jira/browse/HIVE-16164 > Project: Hive > Issue Type: Improvement > Components: Metastore >Reporter: Sergio Peña >Assignee: Sergio Peña > > The HMS DB notification listener currently stores an event ID on the HMS > backend DB so that external applications (such as backup apps) can request > incremental notifications based on the last event ID requested. > The HMS DB notification and backup applications are asynchronous. However, > there are times when applications may be required to be in sync with the > latest HMS event in order to process an action. These applications will > provide a listener implementation that is called by the HMS after an HMS > transaction happened. > The problem is that the listener running after the transaction (or during the > non-transactional context) may need the DB event ID in order to sync all > events that happened prior to that event ID, but this ID is never passed to the > non-transactional listeners. > We can pass this event information through the EnvironmentContext found on > each ListenerEvent implementations (such as CreateTableEvent), and send the > EnvironmentContext to the non-transactional listeners to get the event ID. > The DbNotificactionListener already knows the event ID after calling the > ObjectStore.addNotificationEvent(). 
We just need to set this event ID to the > EnvironmentContext from each of the event notifications and make sure that > this EnvironmentContext is sent to the non-transactional listeners. > Here's the code example when creating a table on {{create_table_core}}: > {noformat} > ms.createTable(tbl); > if (transactionalListeners.size() > 0) { > CreateTableEvent createTableEvent = new CreateTableEvent(tbl, true, this); > createTableEvent.setEnvironmentContext(envContext); > for (MetaStoreEventListener transactionalListener : > transactionalListeners) { > transactionalListener.onCreateTable(createTableEvent); // <- > Here the notification ID is generated > } > } > success = ms.commitTransaction(); > } finally { > if (!success) { > ms.rollbackTransaction(); > if (madeDir) { > wh.deleteDir(tblPath, true); > } > } > for (MetaStoreEventListener listener : listeners) { > CreateTableEvent createTableEvent = > new CreateTableEvent(tbl, success, this); > createTableEvent.setEnvironmentContext(envContext); > listener.onCreateTable(createTableEvent);// <- > Here we would like to consume notification ID > } > {noformat} > We could use a specific key name that will be used on the EnvironmentContext, > such as DB_NOTIFICATION_EVENT_ID. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-16164) Provide mechanism for passing HMS notification ID between transactional and non-transactional listeners.
[ https://issues.apache.org/jira/browse/HIVE-16164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904284#comment-15904284 ] Alexander Kolbasov edited comment on HIVE-16164 at 3/10/17 2:34 AM: Note that in the code snippet above the non-transactional listener gets a new createTableEvent, so anything set in the environment context of the transactional listener is lost. Why is the event not reused between transactional and non-transactional listeners? was (Author: akolb): Note that in the comment snippet above the non-transactional listener gets a new createTableEvent, so anything set in the environment context of the transactional listener is lost. Why is the event not reused between transactional and non-transactional listeners? > Provide mechanism for passing HMS notification ID between transactional and > non-transactional listeners. > > > Key: HIVE-16164 > URL: https://issues.apache.org/jira/browse/HIVE-16164 > Project: Hive > Issue Type: Improvement > Components: Metastore >Reporter: Sergio Peña >Assignee: Sergio Peña > > The HMS DB notification listener currently stores an event ID on the HMS > backend DB so that external applications (such as backup apps) can request > incremental notifications based on the last event ID requested. > The HMS DB notification and backup applications are asynchronous. However, > there are times when applications may be required to be in sync with the > latest HMS event in order to process an action. These applications will > provide a listener implementation that is called by the HMS after an HMS > transaction happened. > The problem is that the listener running after the transaction (or during the > non-transactional context) may need the DB event ID in order to sync all > events that happened prior to that event ID, but this ID is never passed to the > non-transactional listeners. 
> We can pass this event information through the EnvironmentContext found on > each ListenerEvent implementations (such as CreateTableEvent), and send the > EnvironmentContext to the non-transactional listeners to get the event ID. > The DbNotificactionListener already knows the event ID after calling the > ObjectStore.addNotificationEvent(). We just need to set this event ID to the > EnvironmentContext from each of the event notifications and make sure that > this EnvironmentContext is sent to the non-transactional listeners. > Here's the code example when creating a table on {{create_table_core}}: > {noformat} > ms.createTable(tbl); > if (transactionalListeners.size() > 0) { > CreateTableEvent createTableEvent = new CreateTableEvent(tbl, true, this); > createTableEvent.setEnvironmentContext(envContext); > for (MetaStoreEventListener transactionalListener : > transactionalListeners) { > transactionalListener.onCreateTable(createTableEvent); // <- > Here the notification ID is generated > } > } > success = ms.commitTransaction(); > } finally { > if (!success) { > ms.rollbackTransaction(); > if (madeDir) { > wh.deleteDir(tblPath, true); > } > } > for (MetaStoreEventListener listener : listeners) { > CreateTableEvent createTableEvent = > new CreateTableEvent(tbl, success, this); > createTableEvent.setEnvironmentContext(envContext); > listener.onCreateTable(createTableEvent);// <- > Here we would like to consume notification ID > } > {noformat} > We could use a specific key name that will be used on the EnvironmentContext, > such as DB_NOTIFICATION_EVENT_ID. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16132) DataSize stats don't seem correct in semijoin opt branch
[ https://issues.apache.org/jira/browse/HIVE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal updated HIVE-16132: -- Attachment: HIVE-16132.3.patch Updating results > DataSize stats don't seem correct in semijoin opt branch > > > Key: HIVE-16132 > URL: https://issues.apache.org/jira/browse/HIVE-16132 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-16132.1.patch, HIVE-16132.2.patch, > HIVE-16132.3.patch > > > For the following operator tree snippet, the second Select is the start of a > semijoin optimization branch. Take a look at the Data size - it is the same > as the data size for its parent Select, even though the second select has > only a single bigint column in its projection (the parent has 2 columns). I > would expect the size to be 533328 (16 bytes * 3). > Fixing this estimate may become important if we need to estimate the cost of > generating the min/max/bloomfilter. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16164) Provide mechanism for passing HMS notification ID between transactional and non-transactional listeners.
[ https://issues.apache.org/jira/browse/HIVE-16164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904284#comment-15904284 ] Alexander Kolbasov commented on HIVE-16164: --- Note that in the comment snippet above the non-transactional listener gets a new createTableEvent, so anything set in the environment context of the transactional listener is lost. Why is the event not reused between transactional and non-transactional listeners? > Provide mechanism for passing HMS notification ID between transactional and > non-transactional listeners. > > > Key: HIVE-16164 > URL: https://issues.apache.org/jira/browse/HIVE-16164 > Project: Hive > Issue Type: Improvement > Components: Metastore >Reporter: Sergio Peña >Assignee: Sergio Peña > > The HMS DB notification listener currently stores an event ID on the HMS > backend DB so that external applications (such as backup apps) can request > incremental notifications based on the last event ID requested. > The HMS DB notification and backup applications are asynchronous. However, > there are times when applications may be required to be in sync with the > latest HMS event in order to process an action. These applications will > provide a listener implementation that is called by the HMS after an HMS > transaction happened. > The problem is that the listener running after the transaction (or during the > non-transactional context) may need the DB event ID in order to sync all > events that happened prior to that event ID, but this ID is never passed to the > non-transactional listeners. > We can pass this event information through the EnvironmentContext found on > each ListenerEvent implementations (such as CreateTableEvent), and send the > EnvironmentContext to the non-transactional listeners to get the event ID. > The DbNotificactionListener already knows the event ID after calling the > ObjectStore.addNotificationEvent(). 
We just need to set this event ID to the > EnvironmentContext from each of the event notifications and make sure that > this EnvironmentContext is sent to the non-transactional listeners. > Here's the code example when creating a table on {{create_table_core}}: > {noformat} > ms.createTable(tbl); > if (transactionalListeners.size() > 0) { > CreateTableEvent createTableEvent = new CreateTableEvent(tbl, true, this); > createTableEvent.setEnvironmentContext(envContext); > for (MetaStoreEventListener transactionalListener : > transactionalListeners) { > transactionalListener.onCreateTable(createTableEvent); // <- > Here the notification ID is generated > } > } > success = ms.commitTransaction(); > } finally { > if (!success) { > ms.rollbackTransaction(); > if (madeDir) { > wh.deleteDir(tblPath, true); > } > } > for (MetaStoreEventListener listener : listeners) { > CreateTableEvent createTableEvent = > new CreateTableEvent(tbl, success, this); > createTableEvent.setEnvironmentContext(envContext); > listener.onCreateTable(createTableEvent);// <- > Here we would like to consume notification ID > } > {noformat} > We could use a specific key name that will be used on the EnvironmentContext, > such as DB_NOTIFICATION_EVENT_ID. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Issue Comment Deleted] (HIVE-16132) DataSize stats don't seem correct in semijoin opt branch
[ https://issues.apache.org/jira/browse/HIVE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal updated HIVE-16132: -- Comment: was deleted (was: Remove analyze for srcpart table in the test.) > DataSize stats don't seem correct in semijoin opt branch > > > Key: HIVE-16132 > URL: https://issues.apache.org/jira/browse/HIVE-16132 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-16132.1.patch, HIVE-16132.2.patch > > > For the following operator tree snippet, the second Select is the start of a > semijoin optimization branch. Take a look at the Data size - it is the same > as the data size for its parent Select, even though the second select has > only a single bigint column in its projection (the parent has 2 columns). I > would expect the size to be 533328 (16 bytes * 3). > Fixing this estimate may become important if we need to estimate the cost of > generating the min/max/bloomfilter. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16132) DataSize stats don't seem correct in semijoin opt branch
[ https://issues.apache.org/jira/browse/HIVE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal updated HIVE-16132: -- Attachment: (was: HIVE-16132.3.patch) > DataSize stats don't seem correct in semijoin opt branch > > > Key: HIVE-16132 > URL: https://issues.apache.org/jira/browse/HIVE-16132 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-16132.1.patch, HIVE-16132.2.patch > > > For the following operator tree snippet, the second Select is the start of a > semijoin optimization branch. Take a look at the Data size - it is the same > as the data size for its parent Select, even though the second select has > only a single bigint column in its projection (the parent has 2 columns). I > would expect the size to be 533328 (16 bytes * 3). > Fixing this estimate may become important if we need to estimate the cost of > generating the min/max/bloomfilter. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16169) Improve StatsOptimizer to deal with groupby partition columns
[ https://issues.apache.org/jira/browse/HIVE-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong reassigned HIVE-16169: -- > Improve StatsOptimizer to deal with groupby partition columns > - > > Key: HIVE-16169 > URL: https://issues.apache.org/jira/browse/HIVE-16169 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > > As reported by [~ashutoshc]: > 1) select sum(c), count(c),... from T group by b; > 2) select max(c), min(c), ... from T group by b; > If b happens to be a partition column, we can also answer these from > metadata. Currently, StatsOptimizer doesn't handle these queries, but we can > extend it to handle those as well. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16132) DataSize stats don't seem correct in semijoin opt branch
[ https://issues.apache.org/jira/browse/HIVE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal updated HIVE-16132: -- Attachment: HIVE-16132.3.patch Remove analyze for srcpart table in the test. > DataSize stats don't seem correct in semijoin opt branch > > > Key: HIVE-16132 > URL: https://issues.apache.org/jira/browse/HIVE-16132 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-16132.1.patch, HIVE-16132.2.patch, > HIVE-16132.3.patch > > > For the following operator tree snippet, the second Select is the start of a > semijoin optimization branch. Take a look at the Data size - it is the same > as the data size for its parent Select, even though the second select has > only a single bigint column in its projection (the parent has 2 columns). I > would expect the size to be 533328 (16 bytes * 3). > Fixing this estimate may become important if we need to estimate the cost of > generating the min/max/bloomfilter. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16104) LLAP: preemption may be too aggressive if the pre-empted task doesn't die immediately
[ https://issues.apache.org/jira/browse/HIVE-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904273#comment-15904273 ] Siddharth Seth commented on HIVE-16104: --- bq. Wrt special values, monotonic/nanotime can take any value so special values should not be used. How about "-1" as a value, don't think we'll see that as a Time.monotonicNow bq. #3 is completely unrelated, strange that you would ask for it here I'll make the change. Think it is related, since the change increases the chances of hitting this (seen while walking through the patch) bq. The class has so many TODO comments that adding another relevant one should not be a problem That's a style choice. There's nothing wrong with using an executor with a single thread. If anything, this should have been Executors.newSingleThreadExecutor Think we're better off using the mega lock in the scheduler for the wait - that way it gets interrupted if some other task completes, another task asks to be scheduled, etc. (Doesn't need to wait for the specific fragment to end) Using isComplete as the wait mechanism has a race when the setIsCompleted() invocation happens in TaskRunnerCallable. {code} lastVictim = handleScheduleAttemptedRejection(task); // We killed something. lastKillTimeMs = clock.getTime(); {code} Should this check for lastVictim being null? > LLAP: preemption may be too aggressive if the pre-empted task doesn't die > immediately > - > > Key: HIVE-16104 > URL: https://issues.apache.org/jira/browse/HIVE-16104 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16104.01.patch, HIVE-16104.02.patch, > HIVE-16104.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
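The null check suggested at the end of the review above can be illustrated with a tiny sketch. All names here (Task as a plain String, Clock, handleScheduleAttemptedRejection) are simplified stand-ins for the LLAP scheduler internals, not the real code; the point is only that the kill timestamp should advance when a victim was actually preempted, and stay put when nothing was killed.

```java
// Hedged sketch of the guarded kill-timestamp update suggested in the
// review above. Types and method names are hypothetical stand-ins.
public class PreemptionSketch {

    // Stand-in clock so the logic is testable without wall time.
    interface Clock { long getTime(); }

    private long lastKillTimeMs = -1; // -1: no kill recorded yet

    // Returns the victim it killed, or null if nothing could be preempted.
    // (Hypothetical; mirrors the reviewed handleScheduleAttemptedRejection.)
    protected String handleScheduleAttemptedRejection(String task) {
        return null; // default: nothing preempted
    }

    void onRejection(String task, Clock clock) {
        String lastVictim = handleScheduleAttemptedRejection(task);
        if (lastVictim != null) {
            // We actually killed something; only now may the timestamp move.
            lastKillTimeMs = clock.getTime();
        }
    }

    long lastKillTimeMs() { return lastKillTimeMs; }
}
```

Without the null check, a failed preemption attempt would still refresh lastKillTimeMs and could suppress a later, legitimate kill.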
[jira] [Commented] (HIVE-16158) Correct mistake in documentation for ALTER TABLE … ADD/REPLACE COLUMNS CASCADE
[ https://issues.apache.org/jira/browse/HIVE-16158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904268#comment-15904268 ] Lefty Leverenz commented on HIVE-16158: --- Good catch [~yalovyyi]! I've fixed that mistake and a few more in the DDL doc. * [Alter Column -- ChangeColumnName/Type/Position/Comment | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-ChangeColumnName/Type/Position/Comment] * [Alter Column -- Add/ReplaceColumns | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Add/ReplaceColumns] * [Describe Database | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-DescribeDatabase] The confusion about 0.15.0 and 1.1.0 is documented on the wiki home page: * [Home -- Hive Versions and Branches | https://cwiki.apache.org/confluence/display/Hive/Home#Home-HiveVersionsandBranches] > Correct mistake in documentation for ALTER TABLE … ADD/REPLACE COLUMNS CASCADE > -- > > Key: HIVE-16158 > URL: https://issues.apache.org/jira/browse/HIVE-16158 > Project: Hive > Issue Type: Bug > Components: Documentation >Affects Versions: 1.0.0 >Reporter: Illya Yalovyy >Assignee: Lefty Leverenz > > Current documentation says that key word CASCADE was introduced in Hive 0.15 > release. That information is incorrect and confuses users. The feature was > actually released in Hive 1.1.0. (HIVE-8839) > https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Add/ReplaceColumns -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16132) DataSize stats don't seem correct in semijoin opt branch
[ https://issues.apache.org/jira/browse/HIVE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904266#comment-15904266 ] Hive QA commented on HIVE-16132: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12857117/HIVE-16132.2.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10331 tests executed *Failed tests:* {noformat} TestHs2Hooks - did not produce a TEST-*.xml file (likely timed out) (batchId=210) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] (batchId=151) org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver (batchId=232) org.apache.hive.service.server.TestHS2HttpServer.testContextRootUrlRewrite (batchId=187) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4058/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4058/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4058/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12857117 - PreCommit-HIVE-Build > DataSize stats don't seem correct in semijoin opt branch > > > Key: HIVE-16132 > URL: https://issues.apache.org/jira/browse/HIVE-16132 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-16132.1.patch, HIVE-16132.2.patch > > > For the following operator tree snippet, the second Select is the start of a > semijoin optimization branch. Take a look at the Data size - it is the same > as the data size for its parent Select, even though the second select has > only a single bigint column in its projection (the parent has 2 columns). I > would expect the size to be 533328 (16 bytes * 3). > Fixing this estimate may become important if we need to estimate the cost of > generating the min/max/bloomfilter. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16168) llap log links should use the NM nodeId port instead of web port
[ https://issues.apache.org/jira/browse/HIVE-16168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904260#comment-15904260 ] Prasanth Jayachandran commented on HIVE-16168: -- // TODO Move the following 2 properties out of Configuration to a constant. move to llap-common? Looks good otherwise. +1 > llap log links should use the NM nodeId port instead of web port > > > Key: HIVE-16168 > URL: https://issues.apache.org/jira/browse/HIVE-16168 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-16168.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-14606) Beeline fails if quoted string ends with \\
[ https://issues.apache.org/jira/browse/HIVE-14606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated HIVE-14606: Attachment: HIVE-14606.2.patch Re-based patch, addressed Vihang's comments. > Beeline fails if quoted string ends with \\ > --- > > Key: HIVE-14606 > URL: https://issues.apache.org/jira/browse/HIVE-14606 > Project: Hive > Issue Type: Bug > Components: Beeline >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-14606.1.patch, HIVE-14606.2.patch > > > The following query fails in Beeline > {code} > select '\\' as literal; > {code} > Exception: > {code} > FAILED: ParseException line 1:22 extraneous input ';' expecting EOF near > '' > 16/08/22 15:46:15 [023ddb3b-1f3c-4db6-bd4e-bba392d6e4bb main]: ERROR > ql.Driver: FAILED: ParseException line 1:22 extraneous input ';' expecting > EOF near '' > org.apache.hadoop.hive.ql.parse.ParseException: line 1:22 extraneous input > ';' expecting EOF near '' > at > org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:215) > at > org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:414) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:335) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1226) > at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1195) > at > org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:197) > at > org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:282) > at > org.apache.hive.service.cli.operation.Operation.run(Operation.java:324) > at > org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:497) > at > org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:485) > at > org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:294) > at > 
org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:505) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hive.jdbc.HiveConnection$SynchronizedHandler.invoke(HiveConnection.java:1412) > at com.sun.proxy.$Proxy30.ExecuteStatement(Unknown Source) > at > org.apache.hive.jdbc.HiveStatement.runAsyncOnServer(HiveStatement.java:309) > at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:250) > at org.apache.hive.beeline.Commands.executeInternal(Commands.java:976) > at org.apache.hive.beeline.Commands.execute(Commands.java:1132) > at org.apache.hive.beeline.Commands.sql(Commands.java:1062) > at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1168) > at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:999) > at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:909) > at > org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:511) > at org.apache.hive.beeline.BeeLine.main(BeeLine.java:494) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > {code} > This bug is a regression introduced by HIVE-12646 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
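One way to see why a quoted string ending in a backslash trips up statement splitting: a quote character only closes the literal when it is preceded by an even number of backslashes. The helper below is a hypothetical illustration of that rule, not Beeline's actual parsing code:

```java
class QuoteScan {
    // True if the character at idx is escaped, i.e. preceded by an odd
    // number of consecutive backslashes. In the literal '\\' the second
    // backslash is escaped by the first, so the quote that follows it is
    // a real closing quote, not an escaped character.
    static boolean isEscaped(String s, int idx) {
        int backslashes = 0;
        for (int i = idx - 1; i >= 0 && s.charAt(i) == '\\'; i--) {
            backslashes++;
        }
        return backslashes % 2 == 1;
    }
}
```

A splitter that only looks at the single preceding character would wrongly treat the closing quote of '\\' as escaped and keep scanning past the semicolon, which matches the "extraneous input ';'" parse failure above.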
[jira] [Updated] (HIVE-16168) llap log links should use the NM nodeId port instead of web port
[ https://issues.apache.org/jira/browse/HIVE-16168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-16168: -- Attachment: HIVE-16168.01.patch [~prasanth_j] - could you please take a look. > llap log links should use the NM nodeId port instead of web port > > > Key: HIVE-16168 > URL: https://issues.apache.org/jira/browse/HIVE-16168 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-16168.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16168) llap log links should use the NM nodeId port instead of web port
[ https://issues.apache.org/jira/browse/HIVE-16168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-16168: -- Status: Patch Available (was: Open) > llap log links should use the NM nodeId port instead of web port > > > Key: HIVE-16168 > URL: https://issues.apache.org/jira/browse/HIVE-16168 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-16168.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16158) Correct mistake in documentation for ALTER TABLE … ADD/REPLACE COLUMNS CASCADE
[ https://issues.apache.org/jira/browse/HIVE-16158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz reassigned HIVE-16158: - Assignee: Lefty Leverenz > Correct mistake in documentation for ALTER TABLE … ADD/REPLACE COLUMNS CASCADE > -- > > Key: HIVE-16158 > URL: https://issues.apache.org/jira/browse/HIVE-16158 > Project: Hive > Issue Type: Bug > Components: Documentation >Affects Versions: 1.0.0 >Reporter: Illya Yalovyy >Assignee: Lefty Leverenz > > Current documentation says that key word CASCADE was introduced in Hive 0.15 > release. That information is incorrect and confuses users. The feature was > actually released in Hive 1.1.0. (HIVE-8839) > https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Add/ReplaceColumns -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16168) llap log links should use the NM nodeId port instead of web port
[ https://issues.apache.org/jira/browse/HIVE-16168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth reassigned HIVE-16168: - > llap log links should use the NM nodeId port instead of web port > > > Key: HIVE-16168 > URL: https://issues.apache.org/jira/browse/HIVE-16168 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Siddharth Seth >Assignee: Siddharth Seth > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16166) HS2 may still waste up to 15% of memory on duplicate strings
[ https://issues.apache.org/jira/browse/HIVE-16166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-16166: -- Attachment: HIVE-16166.01.patch > HS2 may still waste up to 15% of memory on duplicate strings > > > Key: HIVE-16166 > URL: https://issues.apache.org/jira/browse/HIVE-16166 > Project: Hive > Issue Type: Improvement >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev > Attachments: ch_2_excerpt.txt, HIVE-16166.01.patch > > > A heap dump obtained from one of our users shows that 15% of memory is wasted > on duplicate strings, despite the recent optimizations that I made. The > problematic strings just come from different sources this time. See the > excerpt from the jxray (www.jxray.com) analysis attached. > Adding String.intern() calls in the appropriate places reduces the overhead > of duplicate strings with this workload to ~6%. The remaining duplicates come > mostly from JDK internal and MapReduce data structures, and thus are more > difficult to fix. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
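The deduplication fix described above relies on String.intern() returning one canonical instance per distinct value. A self-contained sketch of the pattern — the helper is illustrative, not code from the attached patch:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class InternSketch {
    // Canonicalize strings held in long-lived structures: equal values end
    // up sharing a single instance instead of duplicating their backing
    // character arrays on the heap.
    static List<String> canonicalize(List<String> names) {
        List<String> out = new ArrayList<>(names.size());
        for (String n : names) {
            out.add(n == null ? null : n.intern());
        }
        return out;
    }
}
```

The trade-off noted in the issue applies: interning only helps where the duplicated strings originate in code you control, which is why JDK-internal and MapReduce duplicates remain harder to fix.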
[jira] [Updated] (HIVE-14550) HiveServer2: enable ThriftJDBCBinarySerde use by default
[ https://issues.apache.org/jira/browse/HIVE-14550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ziyang Zhao updated HIVE-14550: --- Status: Patch Available (was: Open) > HiveServer2: enable ThriftJDBCBinarySerde use by default > > > Key: HIVE-14550 > URL: https://issues.apache.org/jira/browse/HIVE-14550 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, JDBC, ODBC >Affects Versions: 2.1.0 >Reporter: Vaibhav Gumashta >Assignee: Ziyang Zhao > Attachments: HIVE-14550.1.patch, HIVE-14550.1.patch, > HIVE-14550.2.patch > > > We've covered all items in HIVE-12427 and created HIVE-14549 for part2 of the > effort. Before closing the umbrella jira, we should enable this feature by > default. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-14550) HiveServer2: enable ThriftJDBCBinarySerde use by default
[ https://issues.apache.org/jira/browse/HIVE-14550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ziyang Zhao updated HIVE-14550: --- Status: Open (was: Patch Available) > HiveServer2: enable ThriftJDBCBinarySerde use by default > > > Key: HIVE-14550 > URL: https://issues.apache.org/jira/browse/HIVE-14550 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, JDBC, ODBC >Affects Versions: 2.1.0 >Reporter: Vaibhav Gumashta >Assignee: Ziyang Zhao > Attachments: HIVE-14550.1.patch, HIVE-14550.1.patch, > HIVE-14550.2.patch > > > We've covered all items in HIVE-12427 and created HIVE-14549 for part2 of the > effort. Before closing the umbrella jira, we should enable this feature by > default. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-7172) Potential resource leak in HiveSchemaTool#getMetaStoreSchemaVersion()
[ https://issues.apache.org/jira/browse/HIVE-7172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14561965#comment-14561965 ] Ted Yu edited comment on HIVE-7172 at 3/10/17 1:43 AM: --- lgtm {code} 141 } catch (SQLException e) { 142 //do nothing. {code} Consider logging the exception. was (Author: yuzhih...@gmail.com): lgtm {code} 141 } catch (SQLException e) { 142 //do nothing. {code} Consider logging the exception. > Potential resource leak in HiveSchemaTool#getMetaStoreSchemaVersion() > - > > Key: HIVE-7172 > URL: https://issues.apache.org/jira/browse/HIVE-7172 > Project: Hive > Issue Type: Bug >Reporter: Ted Yu >Assignee: DJ Choi >Priority: Minor > Attachments: HIVE-7172.patch > > > {code} > ResultSet res = stmt.executeQuery(versionQuery); > if (!res.next()) { > throw new HiveMetaException("Didn't find version data in metastore"); > } > String currentSchemaVersion = res.getString(1); > metastoreConn.close(); > {code} > When HiveMetaException is thrown, metastoreConn.close() would be skipped. > stmt is not closed upon return from the method. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
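The leak described in the issue — close() skipped when HiveMetaException is thrown, and the Statement never closed — is exactly what try-with-resources prevents, since it closes resources on the exceptional path too. A toy model of the pattern, with Resource standing in for the real Statement/ResultSet:

```java
class CloseOnThrow {
    static final StringBuilder closeLog = new StringBuilder();

    static class Resource implements AutoCloseable {
        @Override
        public void close() {
            closeLog.append("closed;");
        }
    }

    // Mirrors the shape of getMetaStoreSchemaVersion(): both resources are
    // closed even when the "no version row" exception is thrown mid-block.
    static String fetchVersion(boolean hasRow) throws Exception {
        try (Resource stmt = new Resource(); Resource res = new Resource()) {
            if (!hasRow) {
                throw new Exception("Didn't find version data in metastore");
            }
            return "some-version"; // placeholder for res.getString(1)
        }
    }
}
```

In the real method, wrapping the Statement and ResultSet this way (with the Connection closed by the caller or in its own try) removes both leak paths at once.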
[jira] [Updated] (HIVE-16076) LLAP packaging - include aux libs
[ https://issues.apache.org/jira/browse/HIVE-16076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-16076: Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Committed to master with changes on commit as discussed. Thanks for the review! > LLAP packaging - include aux libs > -- > > Key: HIVE-16076 > URL: https://issues.apache.org/jira/browse/HIVE-16076 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Sergey Shelukhin > Fix For: 2.2.0 > > Attachments: HIVE-16076.01.patch, HIVE-16076.02.patch, > HIVE-16076.patch > > > The old auxlibs (or whatever) should be packaged by default, if present. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16167) Remove transitive dependency on mysql connector jar
[ https://issues.apache.org/jira/browse/HIVE-16167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-16167: Status: Patch Available (was: Open) > Remove transitive dependency on mysql connector jar > --- > > Key: HIVE-16167 > URL: https://issues.apache.org/jira/browse/HIVE-16167 > Project: Hive > Issue Type: Bug > Components: Build Infrastructure, Druid integration >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-16167.patch > > > Brought in by druid storage handler transitively. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16167) Remove transitive dependency on mysql connector jar
[ https://issues.apache.org/jira/browse/HIVE-16167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-16167: Attachment: HIVE-16167.patch > Remove transitive dependency on mysql connector jar > --- > > Key: HIVE-16167 > URL: https://issues.apache.org/jira/browse/HIVE-16167 > Project: Hive > Issue Type: Bug > Components: Build Infrastructure, Druid integration >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-16167.patch > > > Brought in by druid storage handler transitively. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16167) Remove transitive dependency on mysql connector jar
[ https://issues.apache.org/jira/browse/HIVE-16167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan reassigned HIVE-16167: --- > Remove transitive dependency on mysql connector jar > --- > > Key: HIVE-16167 > URL: https://issues.apache.org/jira/browse/HIVE-16167 > Project: Hive > Issue Type: Bug > Components: Build Infrastructure, Druid integration >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > > Brought in by druid storage handler transitively. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
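A transitive dependency like this is typically cut with a Maven exclusion on the dependency that drags it in. The fragment below is a sketch only — the druid artifact coordinates are assumptions, not taken from the actual patch:

```xml
<!-- Sketch: exclude the mysql connector pulled in transitively by the
     druid storage handler. Coordinates of the enclosing dependency are
     hypothetical; the real patch may cut the edge elsewhere. -->
<dependency>
  <groupId>io.druid</groupId>
  <artifactId>druid-server</artifactId>
  <exclusions>
    <exclusion>
      <groupId>mysql</groupId>
      <artifactId>mysql-connector-java</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```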
[jira] [Commented] (HIVE-16133) Footer cache in Tez AM can take too much memory
[ https://issues.apache.org/jira/browse/HIVE-16133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904208#comment-15904208 ] Hive QA commented on HIVE-16133: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12857116/HIVE-16133.02.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10336 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_auto_smb_mapjoin_14] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_join30] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_join_filters] (batchId=151) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_join_nulls] (batchId=144) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_nullsafe_join] (batchId=156) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_outer_join1] (batchId=148) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_4] (batchId=95) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4057/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4057/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4057/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12857116 - PreCommit-HIVE-Build > Footer cache in Tez AM can take too much memory > --- > > Key: HIVE-16133 > URL: https://issues.apache.org/jira/browse/HIVE-16133 > Project: Hive > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Sergey Shelukhin > Attachments: HIVE-16133.01.patch, HIVE-16133.02.patch, > HIVE-16133.02.patch, HIVE-16133.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16166) HS2 may still waste up to 15% of memory on duplicate strings
[ https://issues.apache.org/jira/browse/HIVE-16166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-16166: -- Attachment: ch_2_excerpt.txt Results of jxray analysis for duplicate strings, baseline code. > HS2 may still waste up to 15% of memory on duplicate strings > > > Key: HIVE-16166 > URL: https://issues.apache.org/jira/browse/HIVE-16166 > Project: Hive > Issue Type: Improvement >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev > Attachments: ch_2_excerpt.txt > > > A heap dump obtained from one of our users shows that 15% of memory is wasted > on duplicate strings, despite the recent optimizations that I made. The > problematic strings just come from different sources this time. See the > excerpt from the jxray (www.jxray.com) analysis attached. > Adding String.intern() calls in the appropriate places reduces the overhead > of duplicate strings with this workload to ~6%. The remaining duplicates come > mostly from JDK internal and MapReduce data structures, and thus are more > difficult to fix. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16166) HS2 may still waste up to 15% of memory on duplicate strings
[ https://issues.apache.org/jira/browse/HIVE-16166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev reassigned HIVE-16166: - > HS2 may still waste up to 15% of memory on duplicate strings > > > Key: HIVE-16166 > URL: https://issues.apache.org/jira/browse/HIVE-16166 > Project: Hive > Issue Type: Improvement >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev > > A heap dump obtained from one of our users shows that 15% of memory is wasted > on duplicate strings, despite the recent optimizations that I made. The > problematic strings just come from different sources this time. See the > excerpt from the jxray (www.jxray.com) analysis attached. > Adding String.intern() calls in the appropriate places reduces the overhead > of duplicate strings with this workload to ~6%. The remaining duplicates come > mostly from JDK internal and MapReduce data structures, and thus are more > difficult to fix. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (HIVE-16165) use database broken on master
[ https://issues.apache.org/jira/browse/HIVE-16165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth resolved HIVE-16165. --- Resolution: Invalid Likely needs a mvn "clean". Will re-open if i hit this after a clean. > use database broken on master > - > > Key: HIVE-16165 > URL: https://issues.apache.org/jira/browse/HIVE-16165 > Project: Hive > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Siddharth Seth >Priority: Blocker > > {code} > 2017-03-09T19:37:20,765 INFO [9171ecb2-4f38-4254-8138-f78a73c24181 main] > parse.CalcitePlanner: Starting Semantic Analysis > 2017-03-09T19:37:20,766 INFO [9171ecb2-4f38-4254-8138-f78a73c24181 main] > parse.CalcitePlanner: Completed phase 1 of Semantic Analysis > 2017-03-09T19:37:20,766 INFO [9171ecb2-4f38-4254-8138-f78a73c24181 main] > parse.CalcitePlanner: Get metadata for source tables > 2017-03-09T19:37:20,766 INFO [9171ecb2-4f38-4254-8138-f78a73c24181 main] > parse.CalcitePlanner: Get metadata for subqueries > 2017-03-09T19:37:20,766 INFO [9171ecb2-4f38-4254-8138-f78a73c24181 main] > parse.CalcitePlanner: Get metadata for destination tables > 2017-03-09T19:37:20,766 INFO [9171ecb2-4f38-4254-8138-f78a73c24181 main] > parse.CalcitePlanner: Completed getting MetaData in Semantic Analysis > 2017-03-09T19:37:20,766 INFO [9171ecb2-4f38-4254-8138-f78a73c24181 main] > parse.BaseSemanticAnalyzer: Not invoking CBO because the statement doesn't > have QUERY or EXPLAIN as root and not a CTAS; is not a query with at least > one source table or there is a subquery without a source table, or CTAS, or > insert > 2017-03-09T19:37:20,810 INFO [9171ecb2-4f38-4254-8138-f78a73c24181 main] > ql.Context: New scratch dir is > hdfs://cn105-10.l42scl.hortonworks.com:8020/tmp/hive/sseth/9171ecb2-4f38-4254-8138-f78a73c24181/hive_2017-03-09_19-37-20_763_6998351573308778636-1 > 2017-03-09T19:37:20,894 INFO [9171ecb2-4f38-4254-8138-f78a73c24181 main] > ppd.OpProcFactory: Processing for TS(0) > 2017-03-09T19:37:20,900 
ERROR [9171ecb2-4f38-4254-8138-f78a73c24181 main] > ql.Driver: FAILED: NullPointerException null > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.stats.StatsUtils.estimateRowSizeFromSchema(StatsUtils.java:543) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.getNumRows(StatsUtils.java:180) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:204) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:154) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:142) > at > org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$TableScanStatsRule.process(StatsRulesProcFactory.java:130) > at > org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) > at > org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) > at > org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89) > at > org.apache.hadoop.hive.ql.lib.LevelOrderWalker.walk(LevelOrderWalker.java:143) > at > org.apache.hadoop.hive.ql.lib.LevelOrderWalker.startWalking(LevelOrderWalker.java:122) > at > org.apache.hadoop.hive.ql.optimizer.stats.annotation.AnnotateWithStatistics.transform(AnnotateWithStatistics.java:78) > at > org.apache.hadoop.hive.ql.parse.TezCompiler.runStatsAnnotation(TezCompiler.java:302) > at > org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeOperatorPlan(TezCompiler.java:96) > at > org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:140) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11174) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:285) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:511) > at 
org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1316) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1456) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1236) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1226) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759) >
[jira] [Commented] (HIVE-16165) use database broken on master
[ https://issues.apache.org/jira/browse/HIVE-16165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904174#comment-15904174 ] Siddharth Seth commented on HIVE-16165: --- cc [~ashutoshc] > use database broken on master > - > > Key: HIVE-16165 > URL: https://issues.apache.org/jira/browse/HIVE-16165 > Project: Hive > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Siddharth Seth >Priority: Blocker > > {code} > 2017-03-09T19:37:20,765 INFO [9171ecb2-4f38-4254-8138-f78a73c24181 main] > parse.CalcitePlanner: Starting Semantic Analysis > 2017-03-09T19:37:20,766 INFO [9171ecb2-4f38-4254-8138-f78a73c24181 main] > parse.CalcitePlanner: Completed phase 1 of Semantic Analysis > 2017-03-09T19:37:20,766 INFO [9171ecb2-4f38-4254-8138-f78a73c24181 main] > parse.CalcitePlanner: Get metadata for source tables > 2017-03-09T19:37:20,766 INFO [9171ecb2-4f38-4254-8138-f78a73c24181 main] > parse.CalcitePlanner: Get metadata for subqueries > 2017-03-09T19:37:20,766 INFO [9171ecb2-4f38-4254-8138-f78a73c24181 main] > parse.CalcitePlanner: Get metadata for destination tables > 2017-03-09T19:37:20,766 INFO [9171ecb2-4f38-4254-8138-f78a73c24181 main] > parse.CalcitePlanner: Completed getting MetaData in Semantic Analysis > 2017-03-09T19:37:20,766 INFO [9171ecb2-4f38-4254-8138-f78a73c24181 main] > parse.BaseSemanticAnalyzer: Not invoking CBO because the statement doesn't > have QUERY or EXPLAIN as root and not a CTAS; is not a query with at least > one source table or there is a subquery without a source table, or CTAS, or > insert > 2017-03-09T19:37:20,810 INFO [9171ecb2-4f38-4254-8138-f78a73c24181 main] > ql.Context: New scratch dir is > hdfs://cn105-10.l42scl.hortonworks.com:8020/tmp/hive/sseth/9171ecb2-4f38-4254-8138-f78a73c24181/hive_2017-03-09_19-37-20_763_6998351573308778636-1 > 2017-03-09T19:37:20,894 INFO [9171ecb2-4f38-4254-8138-f78a73c24181 main] > ppd.OpProcFactory: Processing for TS(0) > 2017-03-09T19:37:20,900 ERROR [9171ecb2-4f38-4254-8138-f78a73c24181 
main] > ql.Driver: FAILED: NullPointerException null > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.stats.StatsUtils.estimateRowSizeFromSchema(StatsUtils.java:543) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.getNumRows(StatsUtils.java:180) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:204) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:154) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:142) > at > org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$TableScanStatsRule.process(StatsRulesProcFactory.java:130) > at > org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) > at > org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) > at > org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89) > at > org.apache.hadoop.hive.ql.lib.LevelOrderWalker.walk(LevelOrderWalker.java:143) > at > org.apache.hadoop.hive.ql.lib.LevelOrderWalker.startWalking(LevelOrderWalker.java:122) > at > org.apache.hadoop.hive.ql.optimizer.stats.annotation.AnnotateWithStatistics.transform(AnnotateWithStatistics.java:78) > at > org.apache.hadoop.hive.ql.parse.TezCompiler.runStatsAnnotation(TezCompiler.java:302) > at > org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeOperatorPlan(TezCompiler.java:96) > at > org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:140) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11174) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:285) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:511) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1316) > at 
org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1456) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1236) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1226) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759) > at
[jira] [Commented] (HIVE-16161) Disable "packaging.minimizeJar" for JDBC build
[ https://issues.apache.org/jira/browse/HIVE-16161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904169#comment-15904169 ] Vaibhav Gumashta commented on HIVE-16161: - +1 > Disable "packaging.minimizeJar" for JDBC build > -- > > Key: HIVE-16161 > URL: https://issues.apache.org/jira/browse/HIVE-16161 > Project: Hive > Issue Type: Bug >Reporter: Tao Li >Assignee: Tao Li >Priority: Critical > Attachments: HIVE-16161.1.patch > > > "packaging.minimizeJar" is set to true for jdbc/pom.xml, which causes the > standalone JDBC jar not having some necessary classes like > "org.apache.hive.org.apache.commons.logging.impl.LogFactoryImpl". We need to > set it to false to have the classes shaded into the jdbc jar as expected. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16161) Disable "packaging.minimizeJar" for JDBC build
[ https://issues.apache.org/jira/browse/HIVE-16161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904164#comment-15904164 ] Tao Li commented on HIVE-16161: --- The size of the jdbc standalone jar increased from 20 MB to 60 MB with this change, which I think is OK. We want to make sure all needed classes are shaded into this jar. > Disable "packaging.minimizeJar" for JDBC build > -- > > Key: HIVE-16161 > URL: https://issues.apache.org/jira/browse/HIVE-16161 > Project: Hive > Issue Type: Bug >Reporter: Tao Li >Assignee: Tao Li >Priority: Critical > Attachments: HIVE-16161.1.patch > > > "packaging.minimizeJar" is set to true for jdbc/pom.xml, which causes the > standalone JDBC jar not having some necessary classes like > "org.apache.hive.org.apache.commons.logging.impl.LogFactoryImpl". We need to > set it to false to have the classes shaded into the jdbc jar as expected. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
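[Editor's note] The fix described in HIVE-16161 amounts to flipping one shade-plugin flag in jdbc/pom.xml. The sketch below is illustrative, not copied from the actual pom: only the `packaging.minimizeJar` property name and its effect come from the issue; the surrounding plugin layout is an assumption based on standard maven-shade-plugin usage.

```xml
<!-- jdbc/pom.xml (illustrative sketch) -->
<properties>
  <!-- was true; minimization strips classes not statically referenced,
       which drops reflection-loaded ones such as
       org.apache.hive.org.apache.commons.logging.impl.LogFactoryImpl -->
  <packaging.minimizeJar>false</packaging.minimizeJar>
</properties>

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <minimizeJar>${packaging.minimizeJar}</minimizeJar>
  </configuration>
</plugin>
```

This explains both the symptom and the size jump: `minimizeJar` keeps only classes reachable from static references, so classes loaded via reflection disappear; disabling it shades every dependency class, growing the jar from 20 MB to 60 MB as noted above.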
[jira] [Updated] (HIVE-16091) Support subqueries in project/select
[ https://issues.apache.org/jira/browse/HIVE-16091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16091: --- Status: Patch Available (was: Open) > Support subqueries in project/select > > > Key: HIVE-16091 > URL: https://issues.apache.org/jira/browse/HIVE-16091 > Project: Hive > Issue Type: Sub-task > Components: Logical Optimizer >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-16091.1.patch, HIVE-16091.2.patch > > > Currently scalar subqueries are supported in filter only (WHERE/HAVING). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16091) Support subqueries in project/select
[ https://issues.apache.org/jira/browse/HIVE-16091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16091: --- Attachment: HIVE-16091.2.patch > Support subqueries in project/select > > > Key: HIVE-16091 > URL: https://issues.apache.org/jira/browse/HIVE-16091 > Project: Hive > Issue Type: Sub-task > Components: Logical Optimizer >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-16091.1.patch, HIVE-16091.2.patch > > > Currently scalar subqueries are supported in filter only (WHERE/HAVING). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16091) Support subqueries in project/select
[ https://issues.apache.org/jira/browse/HIVE-16091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16091: --- Status: Open (was: Patch Available) > Support subqueries in project/select > > > Key: HIVE-16091 > URL: https://issues.apache.org/jira/browse/HIVE-16091 > Project: Hive > Issue Type: Sub-task > Components: Logical Optimizer >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-16091.1.patch > > > Currently scalar subqueries are supported in filter only (WHERE/HAVING). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15978) Support regr_* functions
[ https://issues.apache.org/jira/browse/HIVE-15978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904138#comment-15904138 ] Pengcheng Xiong commented on HIVE-15978: I think you need to create GenericUDAF for each of these? And also you need to support "OVER"? see https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions132.htm > Support regr_* functions > > > Key: HIVE-15978 > URL: https://issues.apache.org/jira/browse/HIVE-15978 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Carter Shanklin >Assignee: Zoltan Haindrich > > Support the standard regr_* functions, regr_slope, regr_intercept, regr_r2, > regr_sxx, regr_syy, regr_sxy, regr_avgx, regr_avgy, regr_count. SQL reference > section 10.9 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16104) LLAP: preemption may be too aggressive if the pre-empted task doesn't die immediately
[ https://issues.apache.org/jira/browse/HIVE-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-16104: Attachment: HIVE-16104.02.patch The rest of the feedback... will look into unit tests > LLAP: preemption may be too aggressive if the pre-empted task doesn't die > immediately > - > > Key: HIVE-16104 > URL: https://issues.apache.org/jira/browse/HIVE-16104 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16104.01.patch, HIVE-16104.02.patch, > HIVE-16104.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15981) Allow empty grouping sets
[ https://issues.apache.org/jira/browse/HIVE-15981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904129#comment-15904129 ] Hive QA commented on HIVE-15981: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12857104/HIVE-15981.1.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 10337 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4056/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4056/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4056/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12857104 - PreCommit-HIVE-Build > Allow empty grouping sets > - > > Key: HIVE-15981 > URL: https://issues.apache.org/jira/browse/HIVE-15981 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Carter Shanklin >Assignee: Zoltan Haindrich > Attachments: HIVE-15981.1.patch > > > group by () should be treated as equivalent to no group by clause. Currently > it throws a parse error -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15691) Create StrictRegexWriter to work with RegexSerializer for Flume Hive Sink
[ https://issues.apache.org/jira/browse/HIVE-15691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904127#comment-15904127 ] Kalyan commented on HIVE-15691: --- Hi [~ekoifman], [~roshan_naik], the current patch only works on hive-2.x. If it looks OK to you, I will provide a patch that works on hive-1.x; currently Flume needs the hive-1.x solution. > Create StrictRegexWriter to work with RegexSerializer for Flume Hive Sink > - > > Key: HIVE-15691 > URL: https://issues.apache.org/jira/browse/HIVE-15691 > Project: Hive > Issue Type: New Feature > Components: HCatalog, Transactions >Reporter: Kalyan >Assignee: Kalyan > Attachments: HIVE-15691.1.patch, HIVE-15691.patch, > HIVE-15691-updated.patch > > > Create StrictRegexWriter to work with RegexSerializer for Flume Hive Sink. > It is similar to StrictJsonWriter available in hive. > Dependency is there in flume to commit. > FLUME-3036 : Create a RegexSerializer for Hive Sink. > Patch is available for Flume, Please verify the below link > https://github.com/kalyanhadooptraining/flume/commit/1c651e81395404321f9964c8d9d2af6f4a2aaef9 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16164) Provide mechanism for passing HMS notification ID between transactional and non-transactional listeners.
[ https://issues.apache.org/jira/browse/HIVE-16164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904126#comment-15904126 ] Sergio Peña commented on HIVE-16164: [~sershe] Have you worked with the DB notification listener? If so, do you think we could use the EnvironmentContext to pass information from a transactional listener to a non-transactional listener? I see that EnvironmentContext is already used for that, but the information is set from the client side. This would be set on the server side by the DbNotificationListener. > Provide mechanism for passing HMS notification ID between transactional and > non-transactional listeners. > > > Key: HIVE-16164 > URL: https://issues.apache.org/jira/browse/HIVE-16164 > Project: Hive > Issue Type: Improvement > Components: Metastore >Reporter: Sergio Peña >Assignee: Sergio Peña > > The HMS DB notification listener currently stores an event ID on the HMS > backend DB so that external applications (such as backup apps) can request > incremental notifications based on the last event ID requested. > The HMS DB notification and backup applications are asynchronous. However, > there are times when applications may be required to be in sync with the > latest HMS event in order to process an action. These applications will > provide a listener implementation that is called by the HMS after an HMS > transaction has happened. > The problem is that the listener running after the transaction (or during the > non-transactional context) may need the DB event ID in order to sync all > events that happened prior to that event ID, but this ID is never passed to the > non-transactional listeners. > We can pass this event information through the EnvironmentContext found on > each ListenerEvent implementation (such as CreateTableEvent), and send the > EnvironmentContext to the non-transactional listeners to get the event ID.
> The DbNotificationListener already knows the event ID after calling > ObjectStore.addNotificationEvent(). We just need to set this event ID on the > EnvironmentContext for each of the event notifications and make sure that > this EnvironmentContext is sent to the non-transactional listeners. > Here's the code example when creating a table in {{create_table_core}}:
{noformat}
ms.createTable(tbl);
if (transactionalListeners.size() > 0) {
  CreateTableEvent createTableEvent = new CreateTableEvent(tbl, true, this);
  createTableEvent.setEnvironmentContext(envContext);
  for (MetaStoreEventListener transactionalListener : transactionalListeners) {
    transactionalListener.onCreateTable(createTableEvent); // <- Here the notification ID is generated
  }
}
success = ms.commitTransaction();
} finally {
  if (!success) {
    ms.rollbackTransaction();
    if (madeDir) {
      wh.deleteDir(tblPath, true);
    }
  }
  for (MetaStoreEventListener listener : listeners) {
    CreateTableEvent createTableEvent =
        new CreateTableEvent(tbl, success, this);
    createTableEvent.setEnvironmentContext(envContext);
    listener.onCreateTable(createTableEvent); // <- Here we would like to consume the notification ID
  }
{noformat}
> We could use a specific key name on the EnvironmentContext, such as > DB_NOTIFICATION_EVENT_ID. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
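[Editor's note] The handoff proposed above can be sketched without the Hive metastore classes. In this illustrative Java sketch, a plain `Map<String, String>` stands in for the string-property map carried by the Thrift `EnvironmentContext`, and the method names are hypothetical; only the `DB_NOTIFICATION_EVENT_ID` key name is taken from the proposal.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the proposed handoff: the transactional listener records the
// notification ID it generated, and the post-commit (non-transactional)
// listener reads it back from the shared context.
public class EventIdHandoff {
    static final String DB_NOTIFICATION_EVENT_ID = "DB_NOTIFICATION_EVENT_ID";

    // Role of DbNotificationListener, called inside the metastore transaction.
    static void onCreateTableTransactional(Map<String, String> envContext, long generatedId) {
        envContext.put(DB_NOTIFICATION_EVENT_ID, Long.toString(generatedId));
    }

    // Role of a non-transactional listener, called after commit/rollback.
    static Long onCreateTableNonTransactional(Map<String, String> envContext) {
        String id = envContext.get(DB_NOTIFICATION_EVENT_ID);
        return id == null ? null : Long.valueOf(id);
    }

    public static void main(String[] args) {
        // The same context object is passed to both listener phases.
        Map<String, String> envContext = new HashMap<>();
        onCreateTableTransactional(envContext, 42L);
        Long seen = onCreateTableNonTransactional(envContext);
        System.out.println(seen); // the ID survives the transaction boundary
    }
}
```

The design choice mirrors the description: no new listener API is needed, because the EnvironmentContext already flows to every listener; only the server-side write of the ID is new.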
[jira] [Comment Edited] (HIVE-15978) Support regr_* functions
[ https://issues.apache.org/jira/browse/HIVE-15978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904117#comment-15904117 ] Zoltan Haindrich edited comment on HIVE-15978 at 3/10/17 12:17 AM: --- [~pxiong] I see multiple ways this could be achieved...and I'm not sure which one to take :) Most of these functions (more or less) could be translated into existing UDAF function usage - it needs some tweaking, but it can be done; I don't really want to reimplement all those things again - I think it would be better to reuse them. # if I create some 'cover' UDAF evaluators for each of these functions and do the evaluation of those inside the new evaluator - that could work; but it will be quite a few very similar classes # the other alternative is to add some slightly extended versions of some existing UDAFs (like count and variance) - and somehow rewrite the {{regr_sxx(y,x)}} invocations to {{extended_COUNT(x, y) * extended_VAR_POP( y , x )}} I guess the 1st alternative may give slightly better runtimes - but not significantly; in the 2nd case the "original" evaluators would do the real work. About why I need to slightly change the existing UDAFs: all these regr_* functions are required to only do any work when neither {{x}} nor {{y}} is null ({{regr_sxx(x,y)}}). Which way seems like the better approach? was (Author: kgyrtkirk): [~pxiong] I see multiple ways this could be achieved...and I'm not sure which one to take :) Most of these functions (more or less) could be translated into existing UDAF function usage - it needs some tweaking, but it can be done; I don't really want to reimplement all those things again - I think it would be better to reuse them.
# if I create some 'cover' UDAF evaluators for each of these functions and do the evaluation of those inside the new evaluator - that could work; but it will be quite a few very similar classes # the other alternative is to add some slightly extended versions of some existing UDAFs (like count and variance) - and somehow rewrite the {{regr_sxx(y,x)}} invocations to {{extended_COUNT(x, y) * extended_VAR_POP( y )}} I guess the 1st alternative may give slightly better runtimes - but not significantly; in the 2nd case the "original" evaluators would do the real work. About why I need to slightly change the existing UDAFs: all these regr_* functions are required to only do any work when neither {{x}} nor {{y}} is null ({{regr_sxx(x,y)}}) > Support regr_* functions > > > Key: HIVE-15978 > URL: https://issues.apache.org/jira/browse/HIVE-15978 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Carter Shanklin >Assignee: Zoltan Haindrich > > Support the standard regr_* functions, regr_slope, regr_intercept, regr_r2, > regr_sxx, regr_syy, regr_sxy, regr_avgx, regr_avgy, regr_count. SQL reference > section 10.9 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16164) Provide mechanism for passing HMS notification ID between transactional and non-transactional listeners.
[ https://issues.apache.org/jira/browse/HIVE-16164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña reassigned HIVE-16164: -- > Provide mechanism for passing HMS notification ID between transactional and > non-transactional listeners. > > > Key: HIVE-16164 > URL: https://issues.apache.org/jira/browse/HIVE-16164 > Project: Hive > Issue Type: Improvement > Components: Metastore >Reporter: Sergio Peña >Assignee: Sergio Peña > > The HMS DB notification listener currently stores an event ID on the HMS > backend DB so that external applications (such as backup apps) can request > incremental notifications based on the last event ID requested. > The HMS DB notification and backup applications are asynchronous. However, > there are times when applications may be required to be in sync with the > latest HMS event in order to process an action. These applications will > provide a listener implementation that is called by the HMS after an HMS > transaction has happened. > The problem is that the listener running after the transaction (or during the > non-transactional context) may need the DB event ID in order to sync all > events that happened prior to that event ID, but this ID is never passed to the > non-transactional listeners. > We can pass this event information through the EnvironmentContext found on > each ListenerEvent implementation (such as CreateTableEvent), and send the > EnvironmentContext to the non-transactional listeners to get the event ID. > The DbNotificationListener already knows the event ID after calling > ObjectStore.addNotificationEvent(). We just need to set this event ID on the > EnvironmentContext for each of the event notifications and make sure that > this EnvironmentContext is sent to the non-transactional listeners.
> Here's the code example when creating a table in {{create_table_core}}:
{noformat}
ms.createTable(tbl);
if (transactionalListeners.size() > 0) {
  CreateTableEvent createTableEvent = new CreateTableEvent(tbl, true, this);
  createTableEvent.setEnvironmentContext(envContext);
  for (MetaStoreEventListener transactionalListener : transactionalListeners) {
    transactionalListener.onCreateTable(createTableEvent); // <- Here the notification ID is generated
  }
}
success = ms.commitTransaction();
} finally {
  if (!success) {
    ms.rollbackTransaction();
    if (madeDir) {
      wh.deleteDir(tblPath, true);
    }
  }
  for (MetaStoreEventListener listener : listeners) {
    CreateTableEvent createTableEvent =
        new CreateTableEvent(tbl, success, this);
    createTableEvent.setEnvironmentContext(envContext);
    listener.onCreateTable(createTableEvent); // <- Here we would like to consume the notification ID
  }
{noformat}
> We could use a specific key name on the EnvironmentContext, such as > DB_NOTIFICATION_EVENT_ID. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15978) Support regr_* functions
[ https://issues.apache.org/jira/browse/HIVE-15978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904117#comment-15904117 ] Zoltan Haindrich commented on HIVE-15978: - [~pxiong] I see multiple ways this could be achieved...and I'm not sure which one to take :) Most of these functions (more or less) could be translated into existing UDAF function usage - it needs some tweaking, but it can be done; I don't really want to reimplement all those things again - I think it would be better to reuse them. # if I create some 'cover' UDAF evaluators for each of these functions and do the evaluation of those inside the new evaluator - that could work; but it will be quite a few very similar classes # the other alternative is to add some slightly extended versions of some existing UDAFs (like count and variance) - and somehow rewrite the {{regr_sxx(y,x)}} invocations to {{extended_COUNT(x, y) * extended_VAR_POP( y )}} I guess the 1st alternative may give slightly better runtimes - but not significantly; in the 2nd case the "original" evaluators would do the real work. About why I need to slightly change the existing UDAFs: all these regr_* functions are required to only do any work when neither {{x}} nor {{y}} is null ({{regr_sxx(x,y)}}) > Support regr_* functions > > > Key: HIVE-15978 > URL: https://issues.apache.org/jira/browse/HIVE-15978 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Carter Shanklin >Assignee: Zoltan Haindrich > > Support the standard regr_* functions, regr_slope, regr_intercept, regr_r2, > regr_sxx, regr_syy, regr_sxy, regr_avgx, regr_avgy, regr_count. SQL reference > section 10.9 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
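[Editor's note] The rewrite sketched in alternative 2 leans on the standard identity regr_sxx(y, x) = regr_count(y, x) * var_pop(x), evaluated over only the rows where neither argument is NULL. Below is a small self-contained Java check of that identity; the data is made up for illustration, and the `extended_COUNT`/`extended_VAR_POP` UDAFs themselves are not modeled, only the arithmetic they would compute.

```java
// Numeric check: over the pairs where neither x nor y is NULL,
// regr_sxx(y, x) equals regr_count(y, x) * var_pop(x).
public class RegrSxxRewrite {
    // Returns {regr_sxx(y,x), regr_count * var_pop(x)} over non-null pairs.
    static double[] bothWays(Double[] y, Double[] x) {
        double n = 0, sumX = 0, sumXX = 0;
        for (int i = 0; i < x.length; i++) {
            if (x[i] == null || y[i] == null) continue; // standard: skip rows where either side is NULL
            n++; sumX += x[i]; sumXX += x[i] * x[i];
        }
        double meanX = sumX / n;
        double varPopX = sumXX / n - meanX * meanX; // population variance of x
        double regrSxx = sumXX - n * meanX * meanX; // sum((x - mean)^2), expanded form
        return new double[] { regrSxx, n * varPopX };
    }

    public static void main(String[] args) {
        Double[] y = {2.0, null, 4.0, 8.0, 16.0};
        Double[] x = {1.0, 2.0, null, 3.0, 5.0}; // two rows dropped by NULL filtering
        double[] r = bothWays(y, x);
        System.out.println(r[0] + " ~= " + r[1]); // equal up to floating-point rounding
    }
}
```

This is why the extended UDAFs need the second argument at all: plain COUNT and VAR_POP see only one column and cannot apply the pairwise NULL filter the regr_* functions require.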
[jira] [Commented] (HIVE-15975) Support the MOD function
[ https://issues.apache.org/jira/browse/HIVE-15975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904109#comment-15904109 ] Lefty Leverenz commented on HIVE-15975: --- Doc note: This needs to be documented in the wiki. * [Hive Operators and UDFs -- Mathematical Functions | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-MathematicalFunctions] This section should also be updated (perhaps with version information): * [Hive Operators and UDFs -- Operators Precedences | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-Operatorsprecedences] Added a TODOC2.2 label. > Support the MOD function > > > Key: HIVE-15975 > URL: https://issues.apache.org/jira/browse/HIVE-15975 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Carter Shanklin >Assignee: Teddy Choi > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-15975.1.patch, HIVE-15975.2.patch, > HIVE-15975.3.patch, HIVE-15975.4.patch > > > SQL defines the mod expression as a function allowing 2 numeric value > expressions. Hive allows the infix notation using %. It would be good for > Hive to support the standard approach as well. SQL standard reference T441 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
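[Editor's note] The feature being documented is small: MOD(a, b) as a function form of the existing infix a % b. A Java sketch of the semantics (Hive's integer % follows Java's remainder rules); the `mod` helper below is a hypothetical stand-in, not Hive's implementation.

```java
// MOD(a, b) as a function is the infix a % b. In Java (and likewise for
// Hive's %), the result takes the sign of the dividend a, which differs
// from the always-nonnegative mathematical modulus.
public class ModExample {
    static long mod(long a, long b) { // hypothetical helper mirroring SQL MOD(a, b)
        return a % b;
    }

    public static void main(String[] args) {
        System.out.println(mod(10, 3));            // 1
        System.out.println(mod(-10, 3));           // -1, sign of the dividend
        System.out.println(Math.floorMod(-10, 3)); // 2, the floor-based variant
    }
}
```

The sign behavior is the main thing worth calling out in the wiki docs Lefty mentions: users coming from languages with floor-based modulus may expect MOD(-10, 3) to be 2 rather than -1.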
[jira] [Updated] (HIVE-15983) Support the named columns join
[ https://issues.apache.org/jira/browse/HIVE-15983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-15983: --- Status: Patch Available (was: Open) > Support the named columns join > -- > > Key: HIVE-15983 > URL: https://issues.apache.org/jira/browse/HIVE-15983 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Carter Shanklin >Assignee: Pengcheng Xiong > Attachments: HIVE-15983.01.patch, HIVE-15983.02.patch > > > The named columns join is a common shortcut allowing joins on identically > named keys. Example: select * from t1 join t2 using c1 is equivalent to > select * from t1 join t2 on t1.c1 = t2.c1. SQL standard reference: Section 7.7 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15983) Support the named columns join
[ https://issues.apache.org/jira/browse/HIVE-15983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-15983: --- Attachment: HIVE-15983.02.patch > Support the named columns join > -- > > Key: HIVE-15983 > URL: https://issues.apache.org/jira/browse/HIVE-15983 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Carter Shanklin >Assignee: Pengcheng Xiong > Attachments: HIVE-15983.01.patch, HIVE-15983.02.patch > > > The named columns join is a common shortcut allowing joins on identically > named keys. Example: select * from t1 join t2 using c1 is equivalent to > select * from t1 join t2 on t1.c1 = t2.c1. SQL standard reference: Section 7.7 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15983) Support the named columns join
[ https://issues.apache.org/jira/browse/HIVE-15983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-15983: --- Status: Open (was: Patch Available) > Support the named columns join > -- > > Key: HIVE-15983 > URL: https://issues.apache.org/jira/browse/HIVE-15983 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Carter Shanklin >Assignee: Pengcheng Xiong > Attachments: HIVE-15983.01.patch, HIVE-15983.02.patch > > > The named columns join is a common shortcut allowing joins on identically > named keys. Example: select * from t1 join t2 using c1 is equivalent to > select * from t1 join t2 on t1.c1 = t2.c1. SQL standard reference: Section 7.7 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15975) Support the MOD function
[ https://issues.apache.org/jira/browse/HIVE-15975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-15975: -- Labels: TODOC2.2 (was: ) > Support the MOD function > > > Key: HIVE-15975 > URL: https://issues.apache.org/jira/browse/HIVE-15975 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Carter Shanklin >Assignee: Teddy Choi > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-15975.1.patch, HIVE-15975.2.patch, > HIVE-15975.3.patch, HIVE-15975.4.patch > > > SQL defines the mod expression as a function allowing 2 numeric value > expressions. Hive allows the infix notation using %. It would be good for > Hive to support the standard approach as well. SQL standard reference T441 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16162) Pass hiveconf variables to ptest master and from ptest master to ptest executor
[ https://issues.apache.org/jira/browse/HIVE-16162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar reassigned HIVE-16162: --- > Pass hiveconf variables to ptest master and from ptest master to ptest > executor > --- > > Key: HIVE-16162 > URL: https://issues.apache.org/jira/browse/HIVE-16162 > Project: Hive > Issue Type: Bug >Reporter: Sahil Takiar >Assignee: Sahil Takiar > > There should be a way to pass hiveconf variables from the command line to the > hive ptest master. The ptest master should then be able to pass the hiveconf > variables from the master to each executor. This will allow the Jenkins job > that launches ptest to pass hiveconf variables to each executor in a > configurable way. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16161) Disable "packaging.minimizeJar" for JDBC build
[ https://issues.apache.org/jira/browse/HIVE-16161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Li updated HIVE-16161: -- Status: Patch Available (was: Open) > Disable "packaging.minimizeJar" for JDBC build > -- > > Key: HIVE-16161 > URL: https://issues.apache.org/jira/browse/HIVE-16161 > Project: Hive > Issue Type: Bug >Reporter: Tao Li >Assignee: Tao Li >Priority: Critical > Attachments: HIVE-16161.1.patch > > > "packaging.minimizeJar" is set to true for jdbc/pom.xml, which causes the > standalone JDBC jar not having some necessary classes like > "org.apache.hive.org.apache.commons.logging.impl.LogFactoryImpl". We need to > set it to false to have the classes shaded into the jdbc jar as expected. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16161) Disable "packaging.minimizeJar" for JDBC build
[ https://issues.apache.org/jira/browse/HIVE-16161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Li updated HIVE-16161: -- Attachment: HIVE-16161.1.patch > Disable "packaging.minimizeJar" for JDBC build > -- > > Key: HIVE-16161 > URL: https://issues.apache.org/jira/browse/HIVE-16161 > Project: Hive > Issue Type: Bug >Reporter: Tao Li >Assignee: Tao Li >Priority: Critical > Attachments: HIVE-16161.1.patch > > > "packaging.minimizeJar" is set to true for jdbc/pom.xml, which causes the > standalone JDBC jar not having some necessary classes like > "org.apache.hive.org.apache.commons.logging.impl.LogFactoryImpl". We need to > set it to false to have the classes shaded into the jdbc jar as expected. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15691) Create StrictRegexWriter to work with RegexSerializer for Flume Hive Sink
[ https://issues.apache.org/jira/browse/HIVE-15691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904081#comment-15904081 ] Eugene Koifman commented on HIVE-15691: --- If there are differences in API between 1.x and 2.x then I think we need 2 patches, with 2.x deprecating the c'tors and 1.x not having the "new" c'tor - I think the patch as is won't even compile against Hive1. The build bot is only running against Hive2. > Create StrictRegexWriter to work with RegexSerializer for Flume Hive Sink > - > > Key: HIVE-15691 > URL: https://issues.apache.org/jira/browse/HIVE-15691 > Project: Hive > Issue Type: New Feature > Components: HCatalog, Transactions >Reporter: Kalyan >Assignee: Kalyan > Attachments: HIVE-15691.1.patch, HIVE-15691.patch, > HIVE-15691-updated.patch > > > Create StrictRegexWriter to work with RegexSerializer for Flume Hive Sink. > It is similar to StrictJsonWriter available in hive. > Dependency is there in flume to commit. > FLUME-3036 : Create a RegexSerializer for Hive Sink. > Patch is available for Flume, Please verify the below link > https://github.com/kalyanhadooptraining/flume/commit/1c651e81395404321f9964c8d9d2af6f4a2aaef9 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16104) LLAP: preemption may be too aggressive if the pre-empted task doesn't die immediately
[ https://issues.apache.org/jira/browse/HIVE-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904079#comment-15904079 ] Sergey Shelukhin commented on HIVE-16104: - Wrt special values, monotonic/nanotime can take any value so special values should not be used. #3 is completely unrelated, strange that you would ask for it here ;) I'll make the change. The class has so many TODO comments that adding another relevant one should not be a problem > LLAP: preemption may be too aggressive if the pre-empted task doesn't die > immediately > - > > Key: HIVE-16104 > URL: https://issues.apache.org/jira/browse/HIVE-16104 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16104.01.patch, HIVE-16104.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
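[Editor's note] Sergey's point about special values matches the `System.nanoTime()` contract: its return value has an arbitrary origin and may be any long, including zero or negative, so sentinel values cannot safely mark "unset". A minimal Java illustration of the safe pattern (the `notStarted` naming is illustrative, not from the LLAP code):

```java
// Why sentinels are unsafe with a monotonic clock: System.nanoTime() may
// legitimately return any long, so a "magic" unset marker like 0 or -1 can
// collide with a real reading. Compare elapsed differences instead, and use
// an explicit flag or nullable value to represent "not yet started".
public class MonotonicClock {
    public static void main(String[] args) throws InterruptedException {
        long start = System.nanoTime(); // arbitrary origin; only deltas are meaningful
        Thread.sleep(5);
        long elapsedNanos = System.nanoTime() - start;

        // Correct: compare via subtraction (t1 - t0 > 0), never t1 > t0,
        // so the comparison stays valid even if the counter wraps around.
        System.out.println(elapsedNanos > 0);

        // "Not yet started" carried out-of-band rather than as a sentinel reading.
        Long notStarted = null;
        System.out.println(notStarted == null);
    }
}
```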
[jira] [Commented] (HIVE-16161) Disable "packaging.minimizeJar" for JDBC build
[ https://issues.apache.org/jira/browse/HIVE-16161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904076#comment-15904076 ] Tao Li commented on HIVE-16161: --- + [~vgumashta] > Disable "packaging.minimizeJar" for JDBC build > -- > > Key: HIVE-16161 > URL: https://issues.apache.org/jira/browse/HIVE-16161 > Project: Hive > Issue Type: Bug >Reporter: Tao Li >Assignee: Tao Li >Priority: Critical > > "packaging.minimizeJar" is set to true for jdbc/pom.xml, which causes the > standalone JDBC jar not having some necessary classes like > "org.apache.hive.org.apache.commons.logging.impl.LogFactoryImpl". We need to > set it to false to have the classes shaded into the jdbc jar as expected. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16161) Disable "packaging.minimizeJar" for JDBC build
[ https://issues.apache.org/jira/browse/HIVE-16161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Li updated HIVE-16161: -- Description: "packaging.minimizeJar" is set to true for jdbc/pom.xml, which causes the standalone JDBC jar not having some necessary classes like "org.apache.hive.org.apache.commons.logging.impl.LogFactoryImpl". We need to set it to false to have the classes shaded into the jdbc jar as expected. > Disable "packaging.minimizeJar" for JDBC build > -- > > Key: HIVE-16161 > URL: https://issues.apache.org/jira/browse/HIVE-16161 > Project: Hive > Issue Type: Bug >Reporter: Tao Li >Assignee: Tao Li >Priority: Critical > > "packaging.minimizeJar" is set to true for jdbc/pom.xml, which causes the > standalone JDBC jar not having some necessary classes like > "org.apache.hive.org.apache.commons.logging.impl.LogFactoryImpl". We need to > set it to false to have the classes shaded into the jdbc jar as expected. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16161) Disable "packaging.minimizeJar" for JDBC build
[ https://issues.apache.org/jira/browse/HIVE-16161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Li reassigned HIVE-16161: - > Disable "packaging.minimizeJar" for JDBC build > -- > > Key: HIVE-16161 > URL: https://issues.apache.org/jira/browse/HIVE-16161 > Project: Hive > Issue Type: Bug >Reporter: Tao Li >Assignee: Tao Li >Priority: Critical > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15160) Can't order by an unselected column
[ https://issues.apache.org/jira/browse/HIVE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904054#comment-15904054 ] Hive QA commented on HIVE-15160:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12857103/HIVE-15160.09.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 141 failed/errored test(s), 10338 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=221)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[write_final_output_blobstore] (batchId=233)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join_without_localtask] (batchId=1)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_SortUnionTransposeRule] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[constant_prop_3] (batchId=40)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cp_sel] (batchId=57)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[decimal_udf] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_basic2] (batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dynamic_rdd_cache] (batchId=50)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby2_limit] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_grouping_sets_grouping] (batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[limit_pushdown2] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[nested_column_pruning] (batchId=31)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[order3] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_types_non_dictionary_encoding_vectorization] (batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_types_vectorization] (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[pcr] (batchId=55)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[pointlookup2] (batchId=75)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[pointlookup3] (batchId=6)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_udf_case] (batchId=40)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_vc] (batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[regex_col] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_cast_constant] (batchId=8)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_char_2] (batchId=64)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_coalesce] (batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_date_1] (batchId=20)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_round] (batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_reduce] (batchId=52)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_interval_1] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_interval_arithmetic] (batchId=4)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_mr_diff_schema_alias] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_orderby_5] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_reduce_groupby_decimal] (batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_13] (batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_14] (batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_15] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_limit] (batchId=34)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_parquet_types] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[view_alias] (batchId=76)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[view_cbo] (batchId=63)
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_queries] (batchId=91)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[dynamic_partition_pruning_2] (batchId=137)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[explainuser_2] (batchId=138)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_groupby] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_gby] (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_gby] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit] (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_semijoin] (batchId=141)
[jira] [Assigned] (HIVE-16160) OutOfMemoryError: GC overhead limit exceeded on Hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-16160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sushanth Sowmyan reassigned HIVE-16160:
---------------------------------------

    Assignee: Sushanth Sowmyan

> OutOfMemoryError: GC overhead limit exceeded on Hiveserver2
> -----------------------------------------------------------
>
>                 Key: HIVE-16160
>                 URL: https://issues.apache.org/jira/browse/HIVE-16160
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>            Reporter: Kavan Suresh
>            Assignee: Sushanth Sowmyan
>            Priority: Critical
>
> Hs2 process killed by OOM:
> {code:java}
> ERROR [HiveServer2-Handler-Pool: Thread-1361771]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(203)) - java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.lang.String.toLowerCase(String.java:2647)
>     at org.datanucleus.Configuration.getInternalNameForProperty(Configuration.java:188)
>     at org.datanucleus.ExecutionContextImpl.getProperty(ExecutionContextImpl.java:1012)
>     at org.datanucleus.state.AbstractStateManager.updateLevel2CacheForFields(AbstractStateManager.java:979)
>     at org.datanucleus.state.AbstractStateManager.loadFieldsInFetchPlan(AbstractStateManager.java:1097)
>     at org.datanucleus.ExecutionContextImpl.performDetachAllOnTxnEndPreparation(ExecutionContextImpl.java:4544)
>     at org.datanucleus.ExecutionContextImpl.preCommit(ExecutionContextImpl.java:4199)
>     at org.datanucleus.ExecutionContextImpl.transactionPreCommit(ExecutionContextImpl.java:770)
>     at org.datanucleus.TransactionImpl.internalPreCommit(TransactionImpl.java:385)
>     at org.datanucleus.TransactionImpl.commit(TransactionImpl.java:275)
>     at org.datanucleus.api.jdo.JDOTransaction.commit(JDOTransaction.java:107)
>     at org.apache.hadoop.hive.metastore.ObjectStore.commitTransaction(ObjectStore.java:570)
>     at org.apache.hadoop.hive.metastore.ObjectStore.getTable(ObjectStore.java:1033)
>     at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:103)
>     at com.sun.proxy.$Proxy7.getTable(Unknown Source)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table_core(HiveMetaStore.java:1915)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1887)
>     at sun.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>     at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
>     at com.sun.proxy.$Proxy12.get_table(Unknown Source)
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1271)
>     at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getTable(SessionHiveMetaStoreClient.java:131)
>     at sun.reflect.GeneratedMethodAccessor37.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Updated] (HIVE-16160) OutOfMemoryError: GC overhead limit exceeded on Hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-16160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kavan Suresh updated HIVE-16160:
--------------------------------
    Summary: OutOfMemoryError: GC overhead limit exceeded on Hiveserver2  (was: OutOfMemoryError: GC overhead limit exceeded on Hs2 )

> OutOfMemoryError: GC overhead limit exceeded on Hiveserver2
> -----------------------------------------------------------
>
>                 Key: HIVE-16160
>                 URL: https://issues.apache.org/jira/browse/HIVE-16160
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>            Reporter: Kavan Suresh
>            Priority: Critical
>
> Hs2 process killed by OOM:
> {code:java}
> ERROR [HiveServer2-Handler-Pool: Thread-1361771]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(203)) - java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.lang.String.toLowerCase(String.java:2647)
>     at org.datanucleus.Configuration.getInternalNameForProperty(Configuration.java:188)
>     at org.datanucleus.ExecutionContextImpl.getProperty(ExecutionContextImpl.java:1012)
>     at org.datanucleus.state.AbstractStateManager.updateLevel2CacheForFields(AbstractStateManager.java:979)
>     at org.datanucleus.state.AbstractStateManager.loadFieldsInFetchPlan(AbstractStateManager.java:1097)
>     at org.datanucleus.ExecutionContextImpl.performDetachAllOnTxnEndPreparation(ExecutionContextImpl.java:4544)
>     at org.datanucleus.ExecutionContextImpl.preCommit(ExecutionContextImpl.java:4199)
>     at org.datanucleus.ExecutionContextImpl.transactionPreCommit(ExecutionContextImpl.java:770)
>     at org.datanucleus.TransactionImpl.internalPreCommit(TransactionImpl.java:385)
>     at org.datanucleus.TransactionImpl.commit(TransactionImpl.java:275)
>     at org.datanucleus.api.jdo.JDOTransaction.commit(JDOTransaction.java:107)
>     at org.apache.hadoop.hive.metastore.ObjectStore.commitTransaction(ObjectStore.java:570)
>     at org.apache.hadoop.hive.metastore.ObjectStore.getTable(ObjectStore.java:1033)
>     at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:103)
>     at com.sun.proxy.$Proxy7.getTable(Unknown Source)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table_core(HiveMetaStore.java:1915)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1887)
>     at sun.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>     at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
>     at com.sun.proxy.$Proxy12.get_table(Unknown Source)
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1271)
>     at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getTable(SessionHiveMetaStoreClient.java:131)
>     at sun.reflect.GeneratedMethodAccessor37.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)