[jira] [Assigned] (HIVE-5618) Hive local task fails to run when run from oozie in a secure cluster
[ https://issues.apache.org/jira/browse/HIVE-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasad Mujumdar reassigned HIVE-5618: - Assignee: Prasad Mujumdar Hive local task fails to run when run from oozie in a secure cluster Key: HIVE-5618 URL: https://issues.apache.org/jira/browse/HIVE-5618 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Environment: Hadoop 2.2.0 Reporter: Venkat Ranganathan Assignee: Prasad Mujumdar When a hive query like the one below == INSERT OVERWRITE DIRECTORY 'outdir' SELECT table1.*, table2.* FROM table1 JOIN table2 ON (table1.col = table2.col); == is run from a hive action in Oozie in a secure cluster, the hive action fails with the following stack trace === org.apache.hadoop.ipc.RemoteException(java.io.IOException): Delegation Token can be issued only with kerberos or web authentication at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getDelegationToken(FSNamesystem.java:5886) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getDelegationToken(NameNodeRpcServer.java:447) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getDelegationToken(ClientNamenodeProtocolServerSideTranslatorPB.java:833) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59648) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2053) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047) at org.apache.hadoop.ipc.Client.call(Client.java:1347) at org.apache.hadoop.ipc.Client.call(Client.java:1300) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) at $Proxy10.getDelegationToken(Unknown Source) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) at $Proxy10.getDelegationToken(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDelegationToken(ClientNamenodeProtocolTranslatorPB.java:805) at org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:847) at org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:1318) at org.apache.hadoop.hive.shims.HadoopShimsSecure.createDelegationTokenFile(HadoopShimsSecure.java:535) at org.apache.hadoop.hive.ql.exec.SecureCmdDoAs.init(SecureCmdDoAs.java:38) at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.execute(MapredLocalTask.java:238) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1437) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1215) at 
org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1043) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348) at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:446) at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:456) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:737) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614) at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:312) at
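The failure above boils down to SecureCmdDoAs asking the NameNode for a brand-new delegation token while the Oozie launcher is itself authenticated with a delegation token rather than a Kerberos TGT, which the NameNode rejects. The sketch below only illustrates that distinction using the public Hadoop security APIs; it is not the HIVE-5618 patch, and the class and method names other than the Hadoop ones are made up for the example.
{noformat}
// Illustration only (not the actual HIVE-5618 fix): request a new delegation
// token only when the caller really holds Kerberos credentials; otherwise rely
// on the tokens already shipped with the job (e.g. by Oozie via
// mapreduce.job.credentials.binary).
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.UserGroupInformation.AuthenticationMethod;
import org.apache.hadoop.security.token.Token;

public class TokenGuardSketch {
  public static Token<?> obtainTokenIfPossible(FileSystem fs) throws Exception {
    UserGroupInformation ugi = UserGroupInformation.getCurrentUser();
    // getRealAuthenticationMethod() reports how the underlying user actually
    // authenticated, even when running as a proxy user.
    if (ugi.getRealAuthenticationMethod() == AuthenticationMethod.KERBEROS) {
      return fs.getDelegationToken(ugi.getShortUserName());
    }
    // Not Kerberos-authenticated (e.g. token-based login under Oozie):
    // asking for a new token would fail exactly as in the stack trace above.
    return null;
  }
}
{noformat}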
[jira] [Commented] (HIVE-5514) webhcat_server.sh foreground option does not work as expected
[ https://issues.apache.org/jira/browse/HIVE-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802630#comment-13802630 ] Hudson commented on HIVE-5514: -- FAILURE: Integrated in Hive-trunk-hadoop2 #517 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/517/]) HIVE-5514 - webhcat_server.sh foreground option does not work as expected (brock: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1534662) * /hive/trunk/hcatalog/webhcat/svr/src/main/bin/webhcat_server.sh webhcat_server.sh foreground option does not work as expected - Key: HIVE-5514 URL: https://issues.apache.org/jira/browse/HIVE-5514 Project: Hive Issue Type: Improvement Reporter: Brock Noland Assignee: Brock Noland Priority: Minor Fix For: 0.13.0 Attachments: HIVE-5514.patch Executing the webhcat script webhcat_server.sh with the foreground option, it calls hadoop without using exec. When you kill the webhcat_server.sh process, it does not kill the real webhcat server. Just need to add the word exec below in webhcat_server.sh: {noformat} function foreground_webhcat() { exec $start_cmd } {noformat} NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5600) Fix PTest2 Maven support
[ https://issues.apache.org/jira/browse/HIVE-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13802631#comment-13802631 ] Hudson commented on HIVE-5600: -- FAILURE: Integrated in Hive-trunk-hadoop2 #517 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/517/]) HIVE-5600 - Fix PTest2 Maven support (brock: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1534648) * /hive/trunk/testutils/ptest2/src/main/resources/batch-exec.vm * /hive/trunk/testutils/ptest2/src/main/resources/smart-apply-patch.sh * /hive/trunk/testutils/ptest2/src/main/resources/source-prep.vm Fix PTest2 Maven support Key: HIVE-5600 URL: https://issues.apache.org/jira/browse/HIVE-5600 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Brock Noland Fix For: 0.13.0 Attachments: HIVE-5600.patch At present we don't download all the dependencies required in the source prep phase therefore tests fail when the maven repo has been cleared. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5441) Async query execution doesn't return resultset status
[ https://issues.apache.org/jira/browse/HIVE-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13802645#comment-13802645 ] Prasad Mujumdar commented on HIVE-5441: --- That's correct. The existing logic for checking fetch task is not changed as part of this patch. Async query execution doesn't return resultset status - Key: HIVE-5441 URL: https://issues.apache.org/jira/browse/HIVE-5441 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Attachments: HIVE-5441.1.patch, HIVE-5441.3.patch For synchronous statement execution (SQL as well as metadata and other), the operation handle includes a boolean flag indicating whether the statement returns a resultset. In case of async execution, that's always set to false. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Moved] (HIVE-5621) Target tar does not exist in the project hcatalog.
[ https://issues.apache.org/jira/browse/HIVE-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Veithen moved ABDERA-353 to HIVE-5621: -- Fix Version/s: (was: 0.4.0) Affects Version/s: (was: 1.1.2) Workflow: no-reopen-closed, patch-avail (was: classic default workflow) Key: HIVE-5621 (was: ABDERA-353) Project: Hive (was: Abdera) Target tar does not exist in the project hcatalog. -- Key: HIVE-5621 URL: https://issues.apache.org/jira/browse/HIVE-5621 Project: Hive Issue Type: Bug Reporter: tony Buildfile: /home/murkuser/hcatalog-src-0.5.0-incubating/build.xml BUILD FAILED Target tar does not exist in the project hcatalog. Total time: 0 seconds -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5403) Move loading of filesystem, ugi, metastore client to hive session
[ https://issues.apache.org/jira/browse/HIVE-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13802656#comment-13802656 ] Hive QA commented on HIVE-5403: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12609737/HIVE-5403.4.patch {color:green}SUCCESS:{color} +1 4430 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1201/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1201/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. Move loading of filesystem, ugi, metastore client to hive session - Key: HIVE-5403 URL: https://issues.apache.org/jira/browse/HIVE-5403 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5403.1.patch, HIVE-5403.2.patch, HIVE-5403.3.patch, HIVE-5403.4.patch As part of HIVE-5184, the metastore connection, loading filesystem were done as part of the tez session so as to speed up query times while paying a cost at startup. We can do this more generally in hive to apply to both the mapreduce and tez side of things. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5622) Add minHS2 for HiveServer2 testing
Prasad Mujumdar created HIVE-5622: - Summary: Add minHS2 for HiveServer2 testing Key: HIVE-5622 URL: https://issues.apache.org/jira/browse/HIVE-5622 Project: Hive Issue Type: Sub-task Components: HiveServer2, Testing Infrastructure, Tests Affects Versions: 0.12.0, 0.11.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5351) Secure-Socket-Layer (SSL) support for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasad Mujumdar updated HIVE-5351: -- Attachment: HIVE-5351.1.patch Secure-Socket-Layer (SSL) support for HiveServer2 - Key: HIVE-5351 URL: https://issues.apache.org/jira/browse/HIVE-5351 Project: Hive Issue Type: Improvement Components: Authorization, HiveServer2, JDBC Affects Versions: 0.11.0, 0.12.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Attachments: HIVE-5351.1.patch HiveServer2 and JDBC driver should support encrypted communication using SSL -- This message was sent by Atlassian JIRA (v6.1#6144)
Review Request 14870: HIVE-5351: Secure-Socket-Layer (SSL) support for HiveServer2
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/14870/ --- Review request for hive, Brock Noland and Thejas Nair. Bugs: HIVE-5351 https://issues.apache.org/jira/browse/HIVE-5351 Repository: hive-git Description --- Add support for encrypted communication for Plain SASL for binary thrift transport. - Optional thrift SSL transport on server side if configured. - Optional thrift SSL transport for JDBC client with configurable trust store - Added a miniHS2 class that for running a hiveserver2 for testing Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java d0895e1 data/files/keystore.jks PRE-CREATION data/files/truststore.jks PRE-CREATION eclipse-templates/TestJdbcMiniHS2.launchtemplate PRE-CREATION jdbc/ivy.xml b9d0cea jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java f155686 jdbc/src/test/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java PRE-CREATION jdbc/src/test/org/apache/hive/jdbc/TestSSL.java PRE-CREATION metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 24b1832 service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java 5a66a6c service/src/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java 9c8f5c1 service/src/test/org/apache/hive/service/miniHS2/AbstarctHiveService.java PRE-CREATION service/src/test/org/apache/hive/service/miniHS2/MiniHS2.java PRE-CREATION service/src/test/org/apache/hive/service/miniHS2/TestHiveServer2.java PRE-CREATION shims/src/0.20/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java f57f09e Diff: https://reviews.apache.org/r/14870/diff/ Testing --- - Basic HiveServer2 test cases with miniHS2 - Added multiple test cases for SSL transport Thanks, Prasad Mujumdar
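For readers of the review above, a hedged usage sketch of what the SSL-enabled JDBC path is meant to look like from a client. The URL parameter names (ssl, sslTrustStore, trustStorePassword) and the truststore path are assumptions taken from the patch description, not verified against the committed code.
{noformat}
// Hypothetical client-side usage of the SSL support described in HIVE-5351.
// Parameter names in the connection URL are assumptions, not confirmed API.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class Hs2SslClientSketch {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    String url = "jdbc:hive2://hs2-host:10000/default;"
        + "ssl=true;"
        + "sslTrustStore=/path/to/truststore.jks;"
        + "trustStorePassword=changeit";
    try (Connection conn = DriverManager.getConnection(url, "user", "password");
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery("SELECT 1")) {
      while (rs.next()) {
        System.out.println(rs.getInt(1));
      }
    }
  }
}
{noformat}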
[jira] [Updated] (HIVE-5351) Secure-Socket-Layer (SSL) support for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasad Mujumdar updated HIVE-5351: -- Status: Patch Available (was: Open) Patch attached Secure-Socket-Layer (SSL) support for HiveServer2 - Key: HIVE-5351 URL: https://issues.apache.org/jira/browse/HIVE-5351 Project: Hive Issue Type: Improvement Components: Authorization, HiveServer2, JDBC Affects Versions: 0.12.0, 0.11.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Attachments: HIVE-5351.1.patch HiveServer2 and JDBC driver should support encrypted communication using SSL -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5351) Secure-Socket-Layer (SSL) support for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13802663#comment-13802663 ] Prasad Mujumdar commented on HIVE-5351: --- The patch HIVE-5351.1.patch includes the miniHS2 test framework as well. Secure-Socket-Layer (SSL) support for HiveServer2 - Key: HIVE-5351 URL: https://issues.apache.org/jira/browse/HIVE-5351 Project: Hive Issue Type: Improvement Components: Authorization, HiveServer2, JDBC Affects Versions: 0.11.0, 0.12.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Attachments: HIVE-5351.1.patch HiveServer2 and JDBC driver should support encrypted communication using SSL -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5623) ORC accessing array column that's empty will fail with java out of bound exception
Eric Chu created HIVE-5623: -- Summary: ORC accessing array column that's empty will fail with java out of bound exception Key: HIVE-5623 URL: https://issues.apache.org/jira/browse/HIVE-5623 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.11.0 Reporter: Eric Chu Priority: Critical In our ORC tests we saw that queries that work on RCFile failed on the corresponding ORC version with Java IndexOutOfBoundsException in OrcStruct.java. The queries failed b/c the table has an array type column and there are rows with an empty array. We noticed that the getList(Object list, int i) method in OrcStruct.java simply returns the i-th element from list without checking if list is not null or if i is within valid range. After fixing that the queries run fine. The fix is really simple, but maybe there are other similar cases that need to be handled. The fix is to check if listObj is null and if i falls within range: public Object getListElement(Object listObj, int i) { if (listObj == null) { return null; } List list = ((List) listObj); if (i < 0 || i >= list.size()) { return null; } return list.get(i); } -- This message was sent by Atlassian JIRA (v6.1#6144)
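To make the proposed guard concrete, here is a standalone sketch of the same null/range check outside OrcStruct, with a tiny check of the two failure cases; it is an illustration of the behaviour Eric describes, not the committed patch.
{noformat}
// Standalone sketch of the guard proposed above (not the committed OrcStruct code).
// Returning null for a null list or an out-of-range index mirrors how Hive treats
// missing array elements, instead of throwing IndexOutOfBoundsException.
import java.util.Collections;
import java.util.List;

public class ListElementGuardSketch {
  public static Object getListElement(Object listObj, int i) {
    if (listObj == null) {
      return null;
    }
    List<?> list = (List<?>) listObj;
    if (i < 0 || i >= list.size()) {
      return null;
    }
    return list.get(i);
  }

  public static void main(String[] args) {
    System.out.println(getListElement(null, 0));                    // null, not an NPE
    System.out.println(getListElement(Collections.emptyList(), 0)); // null, not an exception
  }
}
{noformat}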
[jira] [Commented] (HIVE-5577) Remove TestNegativeCliDriver script_broken_pipe1
[ https://issues.apache.org/jira/browse/HIVE-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13802715#comment-13802715 ] Navis commented on HIVE-5577: - +1 Remove TestNegativeCliDriver script_broken_pipe1 Key: HIVE-5577 URL: https://issues.apache.org/jira/browse/HIVE-5577 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland TestNegativeCliDriver script_broken_pipe1 is extremely flaky and not a terribly important test. Let's remove it. Failures https://builds.apache.org/user/brock/my-views/view/hive/job/Hive-trunk-hadoop1-ptest/206/testReport/org.apache.hadoop.hive.cli/TestNegativeCliDriver/testNegativeCliDriver_script_broken_pipe1/ https://builds.apache.org/user/brock/my-views/view/hive/job/Hive-trunk-hadoop1-ptest/206/testReport/junit/org.apache.hadoop.hive.cli/TestNegativeCliDriver/testNegativeCliDriver_script_broken_pipe1/ https://builds.apache.org/user/brock/my-views/view/hive/job/Hive-trunk-hadoop1-ptest/204/testReport/org.apache.hadoop.hive.cli/TestNegativeCliDriver/testNegativeCliDriver_script_broken_pipe1/ -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-2747) UNION ALL with subquery which selects NULL and performs group by fails
[ https://issues.apache.org/jira/browse/HIVE-2747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13802729#comment-13802729 ] jeff little commented on HIVE-2747: --- Hi, Kevin Wilfong. You can try the hql: from (select key, value, cast( count(1) as int) count from src group by key, value union all select NULL as key, value,cast( count(1) as int) count from src group by value) a select count;. You should modify the data type of ’count‘, otherwise the data type of 'count' in the intermediate result is void type, so it will cause java.lang.NullPointerException. In addition, if the hql sentences have union all operator, you should use 'AS' as the column's alias. UNION ALL with subquery which selects NULL and performs group by fails -- Key: HIVE-2747 URL: https://issues.apache.org/jira/browse/HIVE-2747 Project: Hive Issue Type: Bug Reporter: Kevin Wilfong Queries like the following from (select key, value, count(1) as count from src group by key, value union all select NULL as key, value, count(1) as count from src group by value) a select count(*); fail with the exception java.lang.NullPointerException at org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector.toString(StructObjectInspector.java:60) at java.lang.String.valueOf(String.java:2826) at java.lang.StringBuilder.append(StringBuilder.java:115) at org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:110) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357) at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:427) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357) at org.apache.hadoop.hive.ql.exec.ExecMapper.configure(ExecMapper.java:98) ... 18 more This should at least provide a more informative error message if not work. It works without the group by. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-4994) Add WebHCat (Templeton) documentation to Hive wiki
[ https://issues.apache.org/jira/browse/HIVE-4994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13802757#comment-13802757 ] Lefty Leverenz commented on HIVE-4994: -- All done now. Dynamic partitions, error logs, and storage formats are linked to the Hive docs, and various Hive docs are linked to each other and to the HCatalog/WebHCat docs. Any further changes can be considered improvements. The doc conversion is finished. Whew. Add WebHCat (Templeton) documentation to Hive wiki -- Key: HIVE-4994 URL: https://issues.apache.org/jira/browse/HIVE-4994 Project: Hive Issue Type: Bug Components: Documentation Affects Versions: 0.11.0 Reporter: Lefty Leverenz Assignee: Lefty Leverenz WebHCat (Templeton) documentation in the Apache incubator had xml source files which generated html pdf output files. Now that HCatalog and WebHCat are part of the Hive project, all the WebHCat documents need to be added to the Hive wiki. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5547) webhcat pig job submission should ship hive tar if -usehcatalog is specified
[ https://issues.apache.org/jira/browse/HIVE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13802763#comment-13802763 ] Hive QA commented on HIVE-5547: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12609763/HIVE-5547.2.patch {color:green}SUCCESS:{color} +1 4430 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1204/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1204/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. webhcat pig job submission should ship hive tar if -usehcatalog is specified Key: HIVE-5547 URL: https://issues.apache.org/jira/browse/HIVE-5547 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-5547.2.patch, HIVE-5547.patch Currently when when a Pig job is submitted through WebHCat and the Pig script uses HCatalog, that means that Hive should be installed on the node in the cluster which ends up executing the job. For large clusters is this a manageability issue so we should use DistributedCache to ship the Hive tar file to the target node as part of job submission TestPig_11 in hcatalog/src/test/e2e/templeton/tests/jobsubmission.conf has the test case for this -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5506) Hive SPLIT function does not return array correctly
[ https://issues.apache.org/jira/browse/HIVE-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802826#comment-13802826 ] Hudson commented on HIVE-5506: -- SUCCESS: Integrated in Hive-trunk-hadoop1-ptest #214 (See [https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/214/]) HIVE-5506 : Hive SPLIT function does not return array correctly (Vikram Dixit via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1534775) * /hive/trunk/data/files/input.txt * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSplit.java * /hive/trunk/ql/src/test/queries/clientpositive/split.q * /hive/trunk/ql/src/test/results/clientpositive/split.q.out * /hive/trunk/ql/src/test/results/clientpositive/udf_split.q.out Hive SPLIT function does not return array correctly --- Key: HIVE-5506 URL: https://issues.apache.org/jira/browse/HIVE-5506 Project: Hive Issue Type: Bug Components: SQL, UDF Affects Versions: 0.9.0, 0.10.0, 0.11.0 Environment: Hive Reporter: John Omernik Assignee: Vikram Dixit K Fix For: 0.13.0 Attachments: HIVE-5506.1.patch, HIVE-5506.2.patch Hello all, I think I have outlined a bug in the hive split function: Summary: When calling split on a string of data, it will only return all array items if the last array item has a value. For example, if I have a string of text delimited by tab with 7 columns, and the first four are filled, but the last three are blank, split will only return a 4 position array. If any number of middle columns are empty, but the last item still has a value, then it will return the proper number of columns. This was tested in Hive 0.9 and Hive 0.11. Data: (Note: \t represents a tab char, \x09; the line endings should be \n (UNIX style); not sure what email will do to them.) Basically my data is 7 lines of data with the first 7 letters separated by tab. On some lines I've left out certain letters, but kept the number of tabs exactly the same. input.txt a\tb\tc\td\te\tf\tg a\tb\tc\td\te\t\tg a\tb\t\td\t\tf\tg \t\t\td\te\tf\tg a\tb\tc\td\t\t\t a\t\t\t\te\tf\tg a\t\t\td\t\t\tg I then created a table with one column from that data: DROP TABLE tmp_jo_tab_test; CREATE table tmp_jo_tab_test (message_line STRING) STORED AS TEXTFILE; LOAD DATA LOCAL INPATH '/tmp/input.txt' OVERWRITE INTO TABLE tmp_jo_tab_test; Ok, just to validate, I created a python counting script: #!/usr/bin/python import sys for line in sys.stdin: line = line[0:-1] out = line.split("\t") print len(out) The output there is: $ cat input.txt |./cnt_tabs.py 7 7 7 7 7 7 7 Based on that information, split on tab should return me 7 for each line as well: hive -e "select size(split(message_line, '\\t')) from tmp_jo_tab_test;" 7 7 7 7 4 7 7 However it does not. It would appear that the line where only the first four letters are filled in (and blank is passed in on the last three) only returns 4 splits, where there should technically be 7: 4 for letters included, and three blanks. a\tb\tc\td\t\t\t -- This message was sent by Atlassian JIRA (v6.1#6144)
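The behaviour John describes matches how java.lang.String.split treats trailing empty fields: with the default limit, trailing empty strings are discarded, while a negative limit keeps them. The sketch below demonstrates that core-Java behaviour on his failing row; whether GenericUDFSplit relied on the default limit is an assumption here, and the attached HIVE-5506 patch is the authoritative fix.
{noformat}
// Demonstrates the trailing-empty-field behaviour behind this report.
public class SplitTrailingEmptiesSketch {
  public static void main(String[] args) {
    String line = "a\tb\tc\td\t\t\t";                  // 7 fields, last three empty
    System.out.println(line.split("\t").length);       // 4 - trailing empties dropped
    System.out.println(line.split("\t", -1).length);   // 7 - all fields preserved
  }
}
{noformat}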
[jira] [Updated] (HIVE-3952) merge map-job followed by map-reduce job
[ https://issues.apache.org/jira/browse/HIVE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tianyuan Fu updated HIVE-3952: -- Description: Consider the query like: select count(*)FROM ( select idOne, idTwo, value FROM bigTable JOIN smallTableOne on (bigTable.idOne = smallTableOne.idOne) ) firstjoin JOIN smallTableTwo on (firstjoin.idTwo = smallTableTwo.idTwo); where smallTableOne and smallTableTwo are smaller than hive.auto.convert.join.noconditionaltask.size and hive.auto.convert.join.noconditionaltask is set to true. The joins are collapsed into mapjoins, and it leads to a map-only job (for the map-joins) followed by a map-reduce job (for the group by). Ideally, the map-only job should be merged with the following map-reduce job. was: Consider the query like: select count(*) FROM ( select idOne, idTwo, value FROM bigTable JOIN smallTableOne on (bigTable.idOne = smallTableOne.idOne) ) firstjoin JOIN smallTableTwo on (firstjoin.idTwo = smallTableTwo.idTwo); where smallTableOne and smallTableTwo are smaller than hive.auto.convert.join.noconditionaltask.size and hive.auto.convert.join.noconditionaltask is set to true. The joins are collapsed into mapjoins, and it leads to a map-only job (for the map-joins) followed by a map-reduce job (for the group by). Ideally, the map-only job should be merged with the following map-reduce job. merge map-job followed by map-reduce job Key: HIVE-3952 URL: https://issues.apache.org/jira/browse/HIVE-3952 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Namit Jain Assignee: Vinod Kumar Vavilapalli Fix For: 0.11.0 Attachments: hive.3952.1.patch, HIVE-3952-20130226.txt, HIVE-3952-20130227.1.txt, HIVE-3952-20130301.txt, HIVE-3952-20130421.txt, HIVE-3952-20130424.txt, HIVE-3952-20130428-branch-0.11-bugfix.txt, HIVE-3952-20130428-branch-0.11.txt, HIVE-3952-20130428-branch-0.11-v2.txt Consider the query like: select count(*)FROM ( select idOne, idTwo, value FROM bigTable JOIN smallTableOne on (bigTable.idOne = smallTableOne.idOne) ) firstjoin JOIN smallTableTwo on (firstjoin.idTwo = smallTableTwo.idTwo); where smallTableOne and smallTableTwo are smaller than hive.auto.convert.join.noconditionaltask.size and hive.auto.convert.join.noconditionaltask is set to true. The joins are collapsed into mapjoins, and it leads to a map-only job (for the map-joins) followed by a map-reduce job (for the group by). Ideally, the map-only job should be merged with the following map-reduce job. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5276) Skip useless string encoding stage for hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-5276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13802871#comment-13802871 ] Hive QA commented on HIVE-5276: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12609770/HIVE-5276.4.patch.txt {color:green}SUCCESS:{color} +1 4430 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1205/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1205/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. Skip useless string encoding stage for hiveserver2 -- Key: HIVE-5276 URL: https://issues.apache.org/jira/browse/HIVE-5276 Project: Hive Issue Type: Improvement Components: HiveServer2 Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-5276.3.patch.txt, HIVE-5276.4.patch.txt Current hiveserver2 acquires rows in string format which is used for cli output. Then convert them into row again and convert to final format lastly. This is inefficient and memory consuming. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5403) Move loading of filesystem, ugi, metastore client to hive session
[ https://issues.apache.org/jira/browse/HIVE-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5403: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Vikram! Move loading of filesystem, ugi, metastore client to hive session - Key: HIVE-5403 URL: https://issues.apache.org/jira/browse/HIVE-5403 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: 0.13.0 Attachments: HIVE-5403.1.patch, HIVE-5403.2.patch, HIVE-5403.3.patch, HIVE-5403.4.patch As part of HIVE-5184, the metastore connection, loading filesystem were done as part of the tez session so as to speed up query times while paying a cost at startup. We can do this more generally in hive to apply to both the mapreduce and tez side of things. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-784) Support uncorrelated subqueries in the WHERE clause
[ https://issues.apache.org/jira/browse/HIVE-784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-784: -- Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Harish! Support uncorrelated subqueries in the WHERE clause --- Key: HIVE-784 URL: https://issues.apache.org/jira/browse/HIVE-784 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Ning Zhang Assignee: Harish Butani Fix For: 0.13.0 Attachments: D13443.1.patch, D13443.2.patch, HIVE-784.1.patch.txt, HIVE-784.2.patch, SubQuerySpec.pdf, tpchQueriesUsingSubQueryClauses.sql Hive currently only support views in the FROM-clause, some Facebook use cases suggest that Hive should support subqueries such as those connected by IN/EXISTS in the WHERE-clause. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5605) AddResourceOperation, DeleteResourceOperation, DfsOperation, SetOperation should be removed from org.apache.hive.service.cli.operation
[ https://issues.apache.org/jira/browse/HIVE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-5605: --- Resolution: Fixed Status: Resolved (was: Patch Available) Thank you for the contribution Vaibhav! I have committed this to trunk. AddResourceOperation, DeleteResourceOperation, DfsOperation, SetOperation should be removed from org.apache.hive.service.cli.operation --- Key: HIVE-5605 URL: https://issues.apache.org/jira/browse/HIVE-5605 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Priority: Minor Fix For: 0.13.0 Attachments: HIVE-5605.1.patch These classes are not used as the processing for Add, Delete, DFS and Set commands is done by HiveCommandOperation -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5350) Cleanup exception handling around parallel orderby
[ https://issues.apache.org/jira/browse/HIVE-5350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-5350: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Thank you very much for the contribution Navis! I have committed this to trunk! Cleanup exception handling around parallel orderby -- Key: HIVE-5350 URL: https://issues.apache.org/jira/browse/HIVE-5350 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Navis Priority: Minor Fix For: 0.13.0 Attachments: D13617.1.patch I think we should log the message to the console and the full exception to the log: ExecDriver: {noformat} try { handleSampling(driverContext, mWork, job, conf); job.setPartitionerClass(HiveTotalOrderPartitioner.class); } catch (Exception e) { console.printInfo("Not enough sampling data.. Rolling back to single reducer task"); rWork.setNumReduceTasks(1); job.setNumReduceTasks(1); } {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
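A hedged sketch of the cleanup Brock suggests, i.e. keep a short message on the console but record the full stack trace in the log before falling back to a single reducer. The class, interface, and use of commons-logging here are stand-ins for Hive's ExecDriver/console wiring, not the committed patch.
{noformat}
// Sketch only: short console message, full exception to the log, then fall back.
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class SamplingFallbackSketch {
  private static final Log LOG = LogFactory.getLog(SamplingFallbackSketch.class);

  interface Sampler { void handleSampling() throws Exception; }

  /** Returns the number of reduce tasks to use after attempting sampling. */
  public static int runWithFallback(Sampler sampler, int plannedReducers) {
    try {
      sampler.handleSampling();
      return plannedReducers;
    } catch (Exception e) {
      // Stand-in for console.printInfo(...): short, user-facing message only.
      System.err.println("Not enough sampling data. Rolling back to single reducer task");
      // Full exception goes to the log file for debugging.
      LOG.error("Sampling for parallel order-by failed, using a single reducer", e);
      return 1;
    }
  }
}
{noformat}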
[jira] [Updated] (HIVE-5599) Change default logging level to INFO
[ https://issues.apache.org/jira/browse/HIVE-5599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-5599: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Thank you for the review Thejas! I have committed this to trunk. Change default logging level to INFO Key: HIVE-5599 URL: https://issues.apache.org/jira/browse/HIVE-5599 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Brock Noland Assignee: Brock Noland Fix For: 0.13.0 Attachments: HIVE-5599.patch The default logging level is warn: https://github.com/apache/hive/blob/trunk/common/src/java/conf/hive-log4j.properties#L19 but hive logs lots of good information at INFO level. Additionally most hadoop projects log at INFO by default. Let's change the logging level to INFO by default. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5616) fix saveVersion.sh to work on mac
[ https://issues.apache.org/jira/browse/HIVE-5616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-5616: --- Resolution: Fixed Status: Resolved (was: Patch Available) Thank you for the contribution Owen! I have committed this to branch. fix saveVersion.sh to work on mac - Key: HIVE-5616 URL: https://issues.apache.org/jira/browse/HIVE-5616 Project: Hive Issue Type: Sub-task Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: h-5616.patch There is no reason to not support builds on macs. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5624) Remove ant artifacts from project
Brock Noland created HIVE-5624: -- Summary: Remove ant artifacts from project Key: HIVE-5624 URL: https://issues.apache.org/jira/browse/HIVE-5624 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Before marking HIVE-5107 resolved we should remove the build.xml files and other ant artifacts. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5430) Refactor VectorizationContext and handle NOT expression with nulls.
[ https://issues.apache.org/jira/browse/HIVE-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5430: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Jitendra! Refactor VectorizationContext and handle NOT expression with nulls. --- Key: HIVE-5430 URL: https://issues.apache.org/jira/browse/HIVE-5430 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Fix For: 0.13.0 Attachments: HIVE-5430.1.patch, HIVE-5430.2.patch, HIVE-5430.3.patch, HIVE-5430.4.patch, HIVE-5430.5.patch, HIVE-5430.6.patch NOT expression doesn't handle nulls correctly. -- This message was sent by Atlassian JIRA (v6.1#6144)
Please add Harsh J as a contributor
So I can attribute a patch to him. Thanks! Brock
[jira] [Updated] (HIVE-5454) HCatalog runs a partition listing with an empty filter
[ https://issues.apache.org/jira/browse/HIVE-5454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-5454: --- Resolution: Fixed Fix Version/s: 0.13.0 Assignee: Brock Noland Status: Resolved (was: Patch Available) Thank you for the contribution Harsh! I have committed this to trunk and will attribute it to you when you are added as a contributor. Note: I am assigning it to myself in the interim so I don't forget. HCatalog runs a partition listing with an empty filter -- Key: HIVE-5454 URL: https://issues.apache.org/jira/browse/HIVE-5454 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Harsh J Assignee: Brock Noland Fix For: 0.13.0 Attachments: D13317.1.patch, D13317.2.patch, D13317.3.patch This is a HCATALOG-527 caused regression, wherein the HCatLoader's way of calling HCatInputFormat causes it to do 2x partition lookups - once without the filter, and then again with the filter. For tables with large number partitions (10, say), the non-filter lookup proves fatal both to the client (Read timed out errors from ThriftMetaStoreClient cause the server doesn't respond) and to the server (too much data loaded into the cache, OOME, or slowdown). The fix would be to use a single call that also passes a partition filter information, as was in the case of HCatalog 0.4 sources before HCATALOG-527. (HCatalog-release-wise, this affects all 0.5.x users) -- This message was sent by Atlassian JIRA (v6.1#6144)
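For illustration of the single-call pattern the description asks for, here is a sketch of configuring the input with the partition filter passed up front, so the metastore is queried once with the filter instead of once unfiltered and once filtered. The HCatInputFormat.setInput and InputJobInfo.create signatures shown are taken from HCatalog-0.5-era examples and should be treated as assumptions, not verified against this patch.
{noformat}
// Sketch of a single, filtered partition lookup (API shape assumed, see above).
import org.apache.hadoop.mapreduce.Job;
import org.apache.hcatalog.mapreduce.HCatInputFormat;
import org.apache.hcatalog.mapreduce.InputJobInfo;

public class FilteredInputSketch {
  public static void configure(Job job, String db, String table, String partitionFilter)
      throws Exception {
    // One metastore call, already restricted by the partition filter.
    HCatInputFormat.setInput(job, InputJobInfo.create(db, table, partitionFilter));
  }
}
{noformat}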
[jira] [Updated] (HIVE-5560) Hive produces incorrect results on multi-distinct query
[ https://issues.apache.org/jira/browse/HIVE-5560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5560: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Navis! Hive produces incorrect results on multi-distinct query --- Key: HIVE-5560 URL: https://issues.apache.org/jira/browse/HIVE-5560 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0, 0.12.0 Reporter: Vikram Dixit K Assignee: Navis Fix For: 0.13.0 Attachments: D13599.1.patch, D13599.2.patch {noformat} select key, count(distinct key) + count(distinct value) from src tablesample (10 ROWS) group by key POSTHOOK: type: QUERY POSTHOOK: Input: default@src A masked pattern was here 165 1 val_165 1 238 1 val_238 1 255 1 val_255 1 27 1 val_27 1 278 1 val_278 1 311 1 val_311 1 409 1 val_409 1 484 1 val_484 1 86 1 val_86 1 98 1 val_98 1 {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
Re: Please add Harsh J as a contributor
Done. On Wed, Oct 23, 2013 at 8:15 AM, Brock Noland br...@cloudera.com wrote: So I can attribute a patch to him. Thanks! Brock
[jira] [Commented] (HIVE-5403) Move loading of filesystem, ugi, metastore client to hive session
[ https://issues.apache.org/jira/browse/HIVE-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802959#comment-13802959 ] Brock Noland commented on HIVE-5403: Hey guys, thank you very much for your work on this! I know this is already committed, but the following is incorrect: {noformat} +// session creation should fail since the schema didn't get created +try { + SessionState.start(new CliSessionState(hiveConf)); +} catch (RuntimeException re) { + assertTrue(re.getCause().getCause() instanceof MetaException); +} {noformat} It should be {noformat} +// session creation should fail since the schema didn't get created +try { + SessionState.start(new CliSessionState(hiveConf)); fail("Expected exception"); +} catch (RuntimeException re) { + assertTrue(re.getCause().getCause() instanceof MetaException); +} {noformat} Can you do a follow up jira? Move loading of filesystem, ugi, metastore client to hive session - Key: HIVE-5403 URL: https://issues.apache.org/jira/browse/HIVE-5403 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: 0.13.0 Attachments: HIVE-5403.1.patch, HIVE-5403.2.patch, HIVE-5403.3.patch, HIVE-5403.4.patch As part of HIVE-5184, the metastore connection, loading filesystem were done as part of the tez session so as to speed up query times while paying a cost at startup. We can do this more generally in hive to apply to both the mapreduce and tez side of things. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5454) HCatalog runs a partition listing with an empty filter
[ https://issues.apache.org/jira/browse/HIVE-5454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-5454: --- Assignee: Harsh J (was: Brock Noland) HCatalog runs a partition listing with an empty filter -- Key: HIVE-5454 URL: https://issues.apache.org/jira/browse/HIVE-5454 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Harsh J Assignee: Harsh J Fix For: 0.13.0 Attachments: D13317.1.patch, D13317.2.patch, D13317.3.patch This is a HCATALOG-527 caused regression, wherein the HCatLoader's way of calling HCatInputFormat causes it to do 2x partition lookups - once without the filter, and then again with the filter. For tables with large number partitions (10, say), the non-filter lookup proves fatal both to the client (Read timed out errors from ThriftMetaStoreClient cause the server doesn't respond) and to the server (too much data loaded into the cache, OOME, or slowdown). The fix would be to use a single call that also passes a partition filter information, as was in the case of HCatalog 0.4 sources before HCATALOG-527. (HCatalog-release-wise, this affects all 0.5.x users) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-4523) round() function with specified decimal places not consistent with mysql
[ https://issues.apache.org/jira/browse/HIVE-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13802978#comment-13802978 ] Xuefu Zhang commented on HIVE-4523: --- The problem (most of it) stated here will be addressed in decimal precision/scale initiative. round() function with specified decimal places not consistent with mysql - Key: HIVE-4523 URL: https://issues.apache.org/jira/browse/HIVE-4523 Project: Hive Issue Type: Improvement Components: UDF Affects Versions: 0.7.1 Reporter: Fred Desing Assignee: Xuefu Zhang Priority: Minor Attachments: HIVE-4523.patch // hive hive select round(150.000, 2) from temp limit 1; 150.0 hive select round(150, 2) from temp limit 1; 150.0 // mysql mysql select round(150.000, 2) from DUAL limit 1; round(150.000, 2) 150.00 mysql select round(150, 2) from DUAL limit 1; round(150, 2) 150 http://dev.mysql.com/doc/refman/5.1/en/mathematical-functions.html#function_round -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5483) use metastore statistics to optimize max/min/etc. queries
[ https://issues.apache.org/jira/browse/HIVE-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13802986#comment-13802986 ] Ashutosh Chauhan commented on HIVE-5483: Fair points, Prashanth. I think option 2) is better because of two reasons. First, not all file formats have this capability, so tying these kind of optimization with a particular format should be avoided whenever possible. Secondly, we anyway would want to have stats fresh as much as possible in metastore for query planning purposes, so we are already down the path of making stats fresh. By the way, there is already a way to collect stats fast without full scan, for RC (via HIVE-3958 ). We can do same for ORC via HIVE-4177 I also agree we need to streamline our stats collection, stats storage and stats access api. use metastore statistics to optimize max/min/etc. queries - Key: HIVE-5483 URL: https://issues.apache.org/jira/browse/HIVE-5483 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Ashutosh Chauhan Attachments: HIVE-5483.patch We have discussed this a little bit. Hive can answer queries such as select max(c1) from t purely from metastore using partition statistics, provided that we know the statistics are up to date. All data changes (e.g. adding new partitions) currently go thru metastore so we can track up-to-date-ness. If they are not up-to-date, the queries will have to read data (at least for outdated partitions) until someone runs analyze table. We can also analyze new partitions after add, if that is configured/specified in the command. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5625) Fix issue with metastore version revision test.
Vikram Dixit K created HIVE-5625: Summary: Fix issue with metastore version revision test. Key: HIVE-5625 URL: https://issues.apache.org/jira/browse/HIVE-5625 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Based on Brock's comments, the change made in HIVE-5403 change the nature of the test. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5481) WebHCat e2e test: TestStreaming -ve tests should also check for job completion success
[ https://issues.apache.org/jira/browse/HIVE-5481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13803002#comment-13803002 ] Eugene Koifman commented on HIVE-5481: -- currently all webhcat e2e tests pass on trunk with Hadoop1 even w/o this patch. How do you explain this? WebHCat e2e test: TestStreaming -ve tests should also check for job completion success -- Key: HIVE-5481 URL: https://issues.apache.org/jira/browse/HIVE-5481 Project: Hive Issue Type: Bug Components: WebHCat Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Priority: Minor Fix For: 0.13.0 Attachments: HIVE-5481.1.patch Since TempletonController will anyway succeed for the -ve tests as well. However, the exit value should be non-zero. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5625) Fix issue with metastore version restriction test.
[ https://issues.apache.org/jira/browse/HIVE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5625: - Summary: Fix issue with metastore version restriction test. (was: Fix issue with metastore version revision test.) Fix issue with metastore version restriction test. -- Key: HIVE-5625 URL: https://issues.apache.org/jira/browse/HIVE-5625 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Based on Brock's comments, the change made in HIVE-5403 change the nature of the test. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5220) Add option for removing intermediate directory for partition, which is empty
[ https://issues.apache.org/jira/browse/HIVE-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13803013#comment-13803013 ] Hive QA commented on HIVE-5220: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12609774/D12729.2.patch {color:green}SUCCESS:{color} +1 4470 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1207/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1207/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. Add option for removing intermediate directory for partition, which is empty Key: HIVE-5220 URL: https://issues.apache.org/jira/browse/HIVE-5220 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: D12729.2.patch, HIVE-5220.D12729.1.patch For deeply nested partitioned table, intermediate directories are not removed even if there is no partitions in it by removing them. {noformat} /deep_part/c=09/d=01 /deep_part/c=09/d=01/e=01 /deep_part/c=09/d=01/e=02 /deep_part/c=09/d=02 /deep_part/c=09/d=02/e=01 /deep_part/c=09/d=02/e=02 {noformat} After removing partition (c='09'), directory remains like this, {noformat} /deep_part/c=09/d=01 /deep_part/c=09/d=02 {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5220) Add option for removing intermediate directory for partition, which is empty
[ https://issues.apache.org/jira/browse/HIVE-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5220: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Navis! Add option for removing intermediate directory for partition, which is empty Key: HIVE-5220 URL: https://issues.apache.org/jira/browse/HIVE-5220 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Fix For: 0.13.0 Attachments: D12729.2.patch, HIVE-5220.D12729.1.patch For deeply nested partitioned table, intermediate directories are not removed even if there is no partitions in it by removing them. {noformat} /deep_part/c=09/d=01 /deep_part/c=09/d=01/e=01 /deep_part/c=09/d=01/e=02 /deep_part/c=09/d=02 /deep_part/c=09/d=02/e=01 /deep_part/c=09/d=02/e=02 {noformat} After removing partition (c='09'), directory remains like this, {noformat} /deep_part/c=09/d=01 /deep_part/c=09/d=02 {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5218) datanucleus does not work with MS SQLServer in Hive metastore
[ https://issues.apache.org/jira/browse/HIVE-5218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13803019#comment-13803019 ] Andy Jefferson commented on HIVE-5218: -- FYI 3.2.7 of datanucleus-rdbms is released datanucleus does not work with MS SQLServer in Hive metastore - Key: HIVE-5218 URL: https://issues.apache.org/jira/browse/HIVE-5218 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.12.0 Reporter: shanyu zhao Attachments: 0001-HIVE-5218-datanucleus-does-not-work-with-SQLServer-i.patch, HIVE-5218.patch HIVE-3632 upgraded datanucleus version to 3.2.x, however, this version of datanucleus doesn't work with SQLServer as the metastore. The problem is that datanucleus tries to use fully qualified object name to find a table in the database but couldn't find it. If I downgrade the version to HIVE-2084, SQLServer works fine. It could be a bug in datanucleus. This is the detailed exception I'm getting when using datanucleus 3.2.x with SQL Server: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTa sk. MetaException(message:javax.jdo.JDOException: Exception thrown calling table .exists() for a2ee36af45e9f46c19e995bfd2d9b5fd1hivemetastore..SEQUENCE_TABLE at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusExc eption(NucleusJDOHelper.java:596) at org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPe rsistenceManager.java:732) … at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawS tore.java:111) at $Proxy0.createTable(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_tabl e_core(HiveMetaStore.java:1071) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_tabl e_with_environment_context(HiveMetaStore.java:1104) … at $Proxy11.create_table_with_environment_context(Unknown Source) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$cr eate_table_with_environment_context.getResult(ThriftHiveMetastore.java:6417) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$cr eate_table_with_environment_context.getResult(ThriftHiveMetastore.java:6401) NestedThrowablesStackTrace: com.microsoft.sqlserver.jdbc.SQLServerException: There is already an object name d 'SEQUENCE_TABLE' in the database. at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError (SQLServerException.java:197) at com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServ erStatement.java:1493) at com.microsoft.sqlserver.jdbc.SQLServerStatement.doExecuteStatement(SQ LServerStatement.java:775) at com.microsoft.sqlserver.jdbc.SQLServerStatement$StmtExecCmd.doExecute (SQLServerStatement.java:676) at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:4615) at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLSe rverConnection.java:1400) at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLSer verStatement.java:179) at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLS erverStatement.java:154) at com.microsoft.sqlserver.jdbc.SQLServerStatement.execute(SQLServerStat ement.java:649) at com.jolbox.bonecp.StatementHandle.execute(StatementHandle.java:300) at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatement(A bstractTable.java:760) at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatementLi st(AbstractTable.java:711) at org.datanucleus.store.rdbms.table.AbstractTable.create(AbstractTable. 
java:425) at org.datanucleus.store.rdbms.table.AbstractTable.exists(AbstractTable. java:488) at org.datanucleus.store.rdbms.valuegenerator.TableGenerator.repositoryE xists(TableGenerator.java:242) at org.datanucleus.store.rdbms.valuegenerator.AbstractRDBMSGenerator.obt ainGenerationBlock(AbstractRDBMSGenerator.java:86) at org.datanucleus.store.valuegenerator.AbstractGenerator.obtainGenerati onBlock(AbstractGenerator.java:197) at org.datanucleus.store.valuegenerator.AbstractGenerator.next(AbstractG enerator.java:105) at org.datanucleus.store.rdbms.RDBMSStoreManager.getStrategyValueForGene rator(RDBMSStoreManager.java:2019) at org.datanucleus.store.AbstractStoreManager.getStrategyValue(AbstractS toreManager.java:1385) at org.datanucleus.ExecutionContextImpl.newObjectId(ExecutionContextImpl .java:3727) at
[jira] [Commented] (HIVE-5605) AddResourceOperation, DeleteResourceOperation, DfsOperation, SetOperation should be removed from org.apache.hive.service.cli.operation
[ https://issues.apache.org/jira/browse/HIVE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13803020#comment-13803020 ] Vaibhav Gumashta commented on HIVE-5605: Thanks Brock! AddResourceOperation, DeleteResourceOperation, DfsOperation, SetOperation should be removed from org.apache.hive.service.cli.operation --- Key: HIVE-5605 URL: https://issues.apache.org/jira/browse/HIVE-5605 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Priority: Minor Fix For: 0.13.0 Attachments: HIVE-5605.1.patch These classes are not used as the processing for Add, Delete, DFS and Set commands is done by HiveCommandOperation -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5625) Fix issue with metastore version restriction test.
[ https://issues.apache.org/jira/browse/HIVE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5625: - Status: Patch Available (was: Open) Fix issue with metastore version restriction test. -- Key: HIVE-5625 URL: https://issues.apache.org/jira/browse/HIVE-5625 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5625.1.patch Based on Brock's comments, the change made in HIVE-5403 change the nature of the test. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5625) Fix issue with metastore version restriction test.
[ https://issues.apache.org/jira/browse/HIVE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5625: - Attachment: HIVE-5625.1.patch Fix issue with metastore version restriction test. -- Key: HIVE-5625 URL: https://issues.apache.org/jira/browse/HIVE-5625 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5625.1.patch Based on Brock's comments, the change made in HIVE-5403 change the nature of the test. -- This message was sent by Atlassian JIRA (v6.1#6144)
Review Request 14877: HIVE-5625: Fix issue with metastore version restriction test.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/14877/ --- Review request for hive, Ashutosh Chauhan and Brock Noland. Bugs: HIVE-5625 https://issues.apache.org/jira/browse/HIVE-5625 Repository: hive-git Description --- Fix issue with metastore version restriction test. Diffs - metastore/src/test/org/apache/hadoop/hive/metastore/TestMetastoreVersion.java d7761f4 Diff: https://reviews.apache.org/r/14877/diff/ Testing --- Ran all metastore tests. Thanks, Vikram Dixit Kumaraswamy
[jira] [Commented] (HIVE-5625) Fix issue with metastore version restriction test.
[ https://issues.apache.org/jira/browse/HIVE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803026#comment-13803026 ] Brock Noland commented on HIVE-5625: +1 Fix issue with metastore version restriction test. -- Key: HIVE-5625 URL: https://issues.apache.org/jira/browse/HIVE-5625 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5625.1.patch Based on Brock's comments, the change made in HIVE-5403 changed the nature of the test. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5619) Allow concat() to accept mixed string/binary args
[ https://issues.apache.org/jira/browse/HIVE-5619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-5619: - Status: Patch Available (was: Open) my test run got botched, submitting patch to allow pre-commit build to run Allow concat() to accept mixed string/binary args - Key: HIVE-5619 URL: https://issues.apache.org/jira/browse/HIVE-5619 Project: Hive Issue Type: Improvement Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-5619.1.patch concat() is currently strict about allowing either all binary or all non-binary arguments. Loosen this to permit mixed params. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5218) datanucleus does not work with MS SQLServer in Hive metastore
[ https://issues.apache.org/jira/browse/HIVE-5218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803036#comment-13803036 ] Brock Noland commented on HIVE-5218: Great! @shanyu, I'd be happy to review a patch upgrading to 3.2.7. datanucleus does not work with MS SQLServer in Hive metastore - Key: HIVE-5218 URL: https://issues.apache.org/jira/browse/HIVE-5218 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.12.0 Reporter: shanyu zhao Attachments: 0001-HIVE-5218-datanucleus-does-not-work-with-SQLServer-i.patch, HIVE-5218.patch HIVE-3632 upgraded datanucleus version to 3.2.x, however, this version of datanucleus doesn't work with SQLServer as the metastore. The problem is that datanucleus tries to use fully qualified object name to find a table in the database but couldn't find it. If I downgrade the version to HIVE-2084, SQLServer works fine. It could be a bug in datanucleus. This is the detailed exception I'm getting when using datanucleus 3.2.x with SQL Server: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:javax.jdo.JDOException: Exception thrown calling table.exists() for a2ee36af45e9f46c19e995bfd2d9b5fd1hivemetastore..SEQUENCE_TABLE at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:596) at org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:732) … at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111) at $Proxy0.createTable(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1071) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1104) … at $Proxy11.create_table_with_environment_context(Unknown Source) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:6417) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:6401) NestedThrowablesStackTrace: com.microsoft.sqlserver.jdbc.SQLServerException: There is already an object named 'SEQUENCE_TABLE' in the database. at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:197) at com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1493) at com.microsoft.sqlserver.jdbc.SQLServerStatement.doExecuteStatement(SQLServerStatement.java:775) at com.microsoft.sqlserver.jdbc.SQLServerStatement$StmtExecCmd.doExecute(SQLServerStatement.java:676) at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:4615) at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:1400) at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:179) at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:154) at com.microsoft.sqlserver.jdbc.SQLServerStatement.execute(SQLServerStatement.java:649) at com.jolbox.bonecp.StatementHandle.execute(StatementHandle.java:300) at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatement(AbstractTable.java:760) at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatementList(AbstractTable.java:711) at org.datanucleus.store.rdbms.table.AbstractTable.create(AbstractTable.java:425) at org.datanucleus.store.rdbms.table.AbstractTable.exists(AbstractTable.java:488) at org.datanucleus.store.rdbms.valuegenerator.TableGenerator.repositoryExists(TableGenerator.java:242) at org.datanucleus.store.rdbms.valuegenerator.AbstractRDBMSGenerator.obtainGenerationBlock(AbstractRDBMSGenerator.java:86) at org.datanucleus.store.valuegenerator.AbstractGenerator.obtainGenerationBlock(AbstractGenerator.java:197) at org.datanucleus.store.valuegenerator.AbstractGenerator.next(AbstractGenerator.java:105) at org.datanucleus.store.rdbms.RDBMSStoreManager.getStrategyValueForGenerator(RDBMSStoreManager.java:2019) at org.datanucleus.store.AbstractStoreManager.getStrategyValue(AbstractStoreManager.java:1385) at org.datanucleus.ExecutionContextImpl.newObjectId(ExecutionContextImpl.java:3727) at
[jira] [Commented] (HIVE-5511) percentComplete returned by job status from WebHCat is null
[ https://issues.apache.org/jira/browse/HIVE-5511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13803078#comment-13803078 ] Eugene Koifman commented on HIVE-5511: -- previous comment should read: blocks HIVE-5547 since this patch needs to be applied first percentComplete returned by job status from WebHCat is null --- Key: HIVE-5511 URL: https://issues.apache.org/jira/browse/HIVE-5511 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-5511.3.patch In hadoop1 the logging from MR is sent to stderr. In H2, by default, to syslog. templeton.tool.LaunchMapper expects to see the output on stderr to produce 'percentComplete' in job status. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5481) WebHCat e2e test: TestStreaming -ve tests should also check for job completion success
[ https://issues.apache.org/jira/browse/HIVE-5481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803093#comment-13803093 ] Vaibhav Gumashta commented on HIVE-5481: [~ekoifman] It could be because of [HIVE-5510|https://issues.apache.org/jira/browse/HIVE-5510], due to which the values returned were a mix of the TempletonController job and the launched job (and possibly the job completion was for the launched job), which I believe will now be changed to return the values only for the TempletonController job. Thus, the TempletonController job should always succeed unless it's a -ve test for the TempletonController job. WebHCat e2e test: TestStreaming -ve tests should also check for job completion success -- Key: HIVE-5481 URL: https://issues.apache.org/jira/browse/HIVE-5481 Project: Hive Issue Type: Bug Components: WebHCat Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Priority: Minor Fix For: 0.13.0 Attachments: HIVE-5481.1.patch The TempletonController job will succeed for the -ve tests as well; however, the exit value should be non-zero. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Assigned] (HIVE-5581) Implement vectorized year/month/day... etc. for string arguments
[ https://issues.apache.org/jira/browse/HIVE-5581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Teddy Choi reassigned HIVE-5581: Assignee: Teddy Choi Implement vectorized year/month/day... etc. for string arguments Key: HIVE-5581 URL: https://issues.apache.org/jira/browse/HIVE-5581 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: 0.13.0 Reporter: Eric Hanson Assignee: Teddy Choi Functions year(), month(), day(), weekofyear(), hour(), minute(), second() need to be implemented for string arguments in vectorized mode. They already work for timestamp arguments. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5626) enable metastore direct SQL for drop/similar queries
[ https://issues.apache.org/jira/browse/HIVE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803119#comment-13803119 ] Sergey Shelukhin commented on HIVE-5626: [~ashutoshc] fyi enable metastore direct SQL for drop/similar queries Key: HIVE-5626 URL: https://issues.apache.org/jira/browse/HIVE-5626 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Priority: Minor Metastore direct SQL is currently disabled for any queries running inside an external transaction (i.e. all modification queries, like dropping stuff). This was done to keep direct SQL strictly a performance optimization when using Postgres, which, unlike other RDBMSes, fails the transaction on any syntax error; so, if direct SQL is broken there's no way to fall back. That is why it is disabled for these cases. It is not as important because drop commands are rare, but we might want to address it, either by some config setting or by making it work on non-Postgres DBs. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5626) enable metastore direct SQL for drop/similar queries
Sergey Shelukhin created HIVE-5626: -- Summary: enable metastore direct SQL for drop/similar queries Key: HIVE-5626 URL: https://issues.apache.org/jira/browse/HIVE-5626 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Priority: Minor Metastore direct SQL is currently disabled for any queries running inside an external transaction (i.e. all modification queries, like dropping stuff). This was done to keep direct SQL strictly a performance optimization when using Postgres, which, unlike other RDBMSes, fails the transaction on any syntax error; so, if direct SQL is broken there's no way to fall back. That is why it is disabled for these cases. It is not as important because drop commands are rare, but we might want to address it, either by some config setting or by making it work on non-Postgres DBs. -- This message was sent by Atlassian JIRA (v6.1#6144)
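For illustration, a minimal Java sketch of the fall-back pattern the issue above refers to, written against hypothetical interfaces rather than Hive's actual metastore classes: try the hand-written SQL path and fall back to the ORM path on failure, except inside a caller-owned transaction, where (on Postgres) the failed statement would poison the transaction and no safe fallback exists.
{noformat}
import java.util.List;

// Hypothetical stand-in for the metastore's partition-fetching path (not a real Hive API).
interface PartitionFetcher {
  List<String> listPartitionNames(String table) throws Exception;
}

// Sketch only: direct SQL treated as a pure optimization with an ORM fallback.
final class DirectSqlWithFallback implements PartitionFetcher {
  private final PartitionFetcher directSql;   // fast, hand-written SQL path
  private final PartitionFetcher orm;         // slower but portable JDO/ORM path
  private final boolean insideExternalTx;     // transaction owned by the caller?

  DirectSqlWithFallback(PartitionFetcher directSql, PartitionFetcher orm, boolean insideExternalTx) {
    this.directSql = directSql;
    this.orm = orm;
    this.insideExternalTx = insideExternalTx;
  }

  @Override
  public List<String> listPartitionNames(String table) throws Exception {
    if (insideExternalTx) {
      // No safe fallback: on Postgres a failed direct-SQL statement aborts the caller's
      // transaction, so take the ORM path directly (the behavior described above).
      return orm.listPartitionNames(table);
    }
    try {
      return directSql.listPartitionNames(table);
    } catch (Exception e) {
      return orm.listPartitionNames(table);   // optimization failed; fall back
    }
  }
}
{noformat}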
[jira] [Resolved] (HIVE-4994) Add WebHCat (Templeton) documentation to Hive wiki
[ https://issues.apache.org/jira/browse/HIVE-4994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair resolved HIVE-4994. - Resolution: Fixed Marking it as resolved. Thanks for the contribution Lefty! Add WebHCat (Templeton) documentation to Hive wiki -- Key: HIVE-4994 URL: https://issues.apache.org/jira/browse/HIVE-4994 Project: Hive Issue Type: Bug Components: Documentation Affects Versions: 0.11.0 Reporter: Lefty Leverenz Assignee: Lefty Leverenz WebHCat (Templeton) documentation in the Apache incubator had xml source files which generated html and pdf output files. Now that HCatalog and WebHCat are part of the Hive project, all the WebHCat documents need to be added to the Hive wiki. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-4446) [HCatalog] Documentation for HIVE-4442, HIVE-4443, HIVE-4444
[ https://issues.apache.org/jira/browse/HIVE-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803152#comment-13803152 ] Daniel Dai commented on HIVE-4446: -- Thanks Lefty, the document for this Jira looks good to me. There are additional document changes not yet ported to cwiki, such as HIVE-5031, HIVE-4531, etc. [HCatalog] Documentation for HIVE-4442, HIVE-4443, HIVE-4444 Key: HIVE-4446 URL: https://issues.apache.org/jira/browse/HIVE-4446 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Assignee: Lefty Leverenz Attachments: HIVE-4446-1.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5627) Document 'usehcatalog' parameter on WebHCat calls
Eugene Koifman created HIVE-5627: Summary: Document 'usehcatalog' parameter on WebHCat calls Key: HIVE-5627 URL: https://issues.apache.org/jira/browse/HIVE-5627 Project: Hive Issue Type: Sub-task Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Lefty Leverenz The following REST calls in WebHCat: 1. mapreduce/jar 2. pig 3. hive now support an additional parameter 'usehcatalog'. The JavaDoc on corresponding methods in org.apache.hive.hcatalog.templeton.Server describe this parameter. Additionally, templeton.hive.archive, etc This should be added to the sections in https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference that correspond to these methods. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5628) ListBucketingPrunnerTest and DynamicMultiDimeCollectionTest should start with Test not end with it
Brock Noland created HIVE-5628: -- Summary: ListBucketingPrunnerTest and DynamicMultiDimeCollectionTest should start with Test not end with it Key: HIVE-5628 URL: https://issues.apache.org/jira/browse/HIVE-5628 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland ListBucketingPrunnerTest and DynamicMultiDimeCollectionTest will not be run by PTest because they end with Test and PTest requires tests start with Test. -- This message was sent by Atlassian JIRA (v6.1#6144)
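A minimal illustration of the naming convention the report above describes (the class body is a placeholder; only the name matters to PTest's discovery):
{noformat}
import static org.junit.Assert.assertTrue;
import org.junit.Test;

// Renamed from ListBucketingPrunnerTest: PTest only picks up classes whose
// names start with "Test", so the suffix form is silently skipped.
public class TestListBucketingPrunner {
  @Test
  public void testIsDiscoveredByPTest() {
    assertTrue(true);   // placeholder; the fix is the class name, not the body
  }
}
{noformat}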
[jira] [Created] (HIVE-5629) Fix two javadoc failures in HCatalog
Brock Noland created HIVE-5629: -- Summary: Fix two javadoc failures in HCatalog Key: HIVE-5629 URL: https://issues.apache.org/jira/browse/HIVE-5629 Project: Hive Issue Type: Bug Reporter: Brock Noland I am seeing two javadoc failures on HCatalog. These are not being seen by PTest and indeed I cannot reproduce on my Mac but can on Linux. Regardless they should be fixed. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5629) Fix two javadoc failures in HCatalog
[ https://issues.apache.org/jira/browse/HIVE-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-5629: --- Attachment: HIVE-5629.patch Fix two javadoc failures in HCatalog Key: HIVE-5629 URL: https://issues.apache.org/jira/browse/HIVE-5629 Project: Hive Issue Type: Bug Reporter: Brock Noland Attachments: HIVE-5629.patch I am seeing two javadoc failures on HCatalog. These are not being seen by PTest and indeed I cannot reproduce on my Mac but can on Linux. Regardless they should be fixed. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5269) Use thrift binary type for conveying binary values in hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-5269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13803182#comment-13803182 ] Hive QA commented on HIVE-5269: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12609771/HIVE-5269.2.patch.txt {color:green}SUCCESS:{color} +1 4470 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1208/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1208/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. Use thrift binary type for conveying binary values in hiveserver2 - Key: HIVE-5269 URL: https://issues.apache.org/jira/browse/HIVE-5269 Project: Hive Issue Type: Improvement Components: HiveServer2 Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-5269.2.patch.txt, HIVE-5269.D12873.1.patch Currently, binary type is encoded to string in hiveserver2 and decoded in client. Just using binary type might make it simpler. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5629) Fix two javadoc failures in HCatalog
[ https://issues.apache.org/jira/browse/HIVE-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-5629: --- Assignee: Brock Noland Status: Patch Available (was: Open) Fix two javadoc failures in HCatalog Key: HIVE-5629 URL: https://issues.apache.org/jira/browse/HIVE-5629 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5629.patch I am seeing two javadoc failures on HCatalog. These are not being seen by PTest and indeed I cannot reproduce on my Mac but can on Linux. Regardless they should be fixed. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5627) Document 'usehcatalog' parameter on WebHCat calls
[ https://issues.apache.org/jira/browse/HIVE-5627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-5627: - Description: The following REST calls in WebHCat: 1. mapreduce/jar 2. pig now support an additional parameter 'usehcatalog'. This is a mechanism for the caller to tell WebHCat that the submitted job uses HCat, and thus needs to access the metastore, which requires additional steps for WebHCat to perform. The JavaDoc on corresponding methods in org.apache.hive.hcatalog.templeton.Server describe this parameter. Additionally, if templeton.hive.archive, templeton.hive.home and templeton.hcat.home are defined in webhcat-site.xml (documented in webhcat-default.xml) then WebHCat will ship the Hive tar to the target node where the job actually runs. This means that Hive doesn't need to be installed on every node in the Hadoop cluster. (This part was added in HIVE-5547) This should be added to the sections in https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference that correspond to these methods. was: The following REST calls in WebHCat: 1. mapreduce/jar 2. pig 3. hive now support an additional parameter 'usehcatalog'. The JavaDoc on corresponding methods in org.apache.hive.hcatalog.templeton.Server describe this parameter. Additionally, templeton.hive.archive, etc This should be added to the sections in https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference that correspond to these methods. Document 'usehcatalog' parameter on WebHCat calls - Key: HIVE-5627 URL: https://issues.apache.org/jira/browse/HIVE-5627 Project: Hive Issue Type: Sub-task Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Lefty Leverenz Fix For: 0.13.0 The following REST calls in WebHCat: 1. mapreduce/jar 2. pig now support an additional parameter 'usehcatalog'. This is a mechanism for the caller to tell WebHCat that the submitted job uses HCat, and thus needs to access the metastore, which requires additional steps for WebHCat to perform. The JavaDoc on corresponding methods in org.apache.hive.hcatalog.templeton.Server describe this parameter. Additionally, if templeton.hive.archive, templeton.hive.home and templeton.hcat.home are defined in webhcat-site.xml (documented in webhcat-default.xml) then WebHCat will ship the Hive tar to the target node where the job actually runs. This means that Hive doesn't need to be installed on every node in the Hadoop cluster. (This part was added in HIVE-5547) This should be added to the sections in https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference that correspond to these methods. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5629) Fix two javadoc failures in HCatalog
[ https://issues.apache.org/jira/browse/HIVE-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13803229#comment-13803229 ] Ashutosh Chauhan commented on HIVE-5629: +1 Fix two javadoc failures in HCatalog Key: HIVE-5629 URL: https://issues.apache.org/jira/browse/HIVE-5629 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5629.patch I am seeing two javadoc failures on HCatalog. These are not being seen by PTest and indeed I cannot reproduce on my Mac but can on Linux. Regardless they should be fixed. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5627) Document 'usehcatalog' parameter on WebHCat calls
[ https://issues.apache.org/jira/browse/HIVE-5627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-5627: - Description: The following REST calls in WebHCat: 1. mapreduce/jar 2. pig now support an additional parameter 'usehcatalog'. This is a mechanism for the caller to tell WebHCat that the submitted job uses HCat, and thus needs to access the metastore, which requires additional steps for WebHCat to perform in a secure cluster. The JavaDoc on corresponding methods in org.apache.hive.hcatalog.templeton.Server describe this parameter. Additionally, if templeton.hive.archive, templeton.hive.home and templeton.hcat.home are defined in webhcat-site.xml (documented in webhcat-default.xml) then WebHCat will ship the Hive tar to the target node where the job actually runs. This means that Hive doesn't need to be installed on every node in the Hadoop cluster. (This part was added in HIVE-5547). This is independent of security, but improves manageability. This should be added to the sections in https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference that correspond to these methods. was: The following REST calls in WebHCat: 1. mapreduce/jar 2. pig now support an additional parameter 'usehcatalog'. This is a mechanism for the caller to tell WebHCat that the submitted job uses HCat, and thus needs to access the metastore, which requires additional steps for WebHCat to perform. The JavaDoc on corresponding methods in org.apache.hive.hcatalog.templeton.Server describe this parameter. Additionally, if templeton.hive.archive, templeton.hive.home and templeton.hcat.home are defined in webhcat-site.xml (documented in webhcat-default.xml) then WebHCat will ship the Hive tar to the target node where the job actually runs. This means that Hive doesn't need to be installed on every node in the Hadoop cluster. (This part was added in HIVE-5547) This should be added to the sections in https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference that correspond to these methods. Document 'usehcatalog' parameter on WebHCat calls - Key: HIVE-5627 URL: https://issues.apache.org/jira/browse/HIVE-5627 Project: Hive Issue Type: Sub-task Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Lefty Leverenz Fix For: 0.13.0 The following REST calls in WebHCat: 1. mapreduce/jar 2. pig now support an additional parameter 'usehcatalog'. This is a mechanism for the caller to tell WebHCat that the submitted job uses HCat, and thus needs to access the metastore, which requires additional steps for WebHCat to perform in a secure cluster. The JavaDoc on corresponding methods in org.apache.hive.hcatalog.templeton.Server describe this parameter. Additionally, if templeton.hive.archive, templeton.hive.home and templeton.hcat.home are defined in webhcat-site.xml (documented in webhcat-default.xml) then WebHCat will ship the Hive tar to the target node where the job actually runs. This means that Hive doesn't need to be installed on every node in the Hadoop cluster. (This part was added in HIVE-5547). This is independent of security, but improves manageability. This should be added to the sections in https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference that correspond to these methods. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5629) Fix two javadoc failures in HCatalog
[ https://issues.apache.org/jira/browse/HIVE-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13803240#comment-13803240 ] Eugene Koifman commented on HIVE-5629: -- Why is {@link HCatInputFormat#setInput(org.apache.hadoop.mapreduce.Job, InputJobInfo)} causing an issue? this is standard JavaDoc http://www.oracle.com/technetwork/java/javase/documentation/index-137868.html#examples Fix two javadoc failures in HCatalog Key: HIVE-5629 URL: https://issues.apache.org/jira/browse/HIVE-5629 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5629.patch I am seeing two javadoc failures on HCatalog. These are not being seen by PTest and indeed I cannot reproduce on my Mac but can on Linux. Regardless they should be fixed. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5629) Fix two javadoc failures in HCatalog
[ https://issues.apache.org/jira/browse/HIVE-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13803242#comment-13803242 ] Brock Noland commented on HIVE-5629: That method was removed in HIVE-5454 therefore I removed the javadoc markup as that comment is still useful for legacy purposes without the link. Fix two javadoc failures in HCatalog Key: HIVE-5629 URL: https://issues.apache.org/jira/browse/HIVE-5629 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5629.patch I am seeing two javadoc failures on HCatalog. These are not being seen by PTest and indeed I cannot reproduce on my Mac but can on Linux. Regardless they should be fixed. -- This message was sent by Atlassian JIRA (v6.1#6144)
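To make the failure concrete, a hedged before/after sketch (HCatInputFormat and InputJobInfo are the names from the comment above; the wrapper classes are invented for the example):
{noformat}
// Before: javadoc tries to resolve the linked method, which was removed in HIVE-5454,
// and the build fails where javadoc treats the broken reference as an error.
/** See {@link HCatInputFormat#setInput(org.apache.hadoop.mapreduce.Job, InputJobInfo)}. */
class BeforeFix {}

// After: keep the sentence for legacy readers but drop the {@link} markup,
// so there is nothing left for javadoc to resolve.
/** See HCatInputFormat#setInput(org.apache.hadoop.mapreduce.Job, InputJobInfo), removed in HIVE-5454. */
class AfterFix {}
{noformat}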
[jira] [Commented] (HIVE-5547) webhcat pig job submission should ship hive tar if -usehcatalog is specified
[ https://issues.apache.org/jira/browse/HIVE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803248#comment-13803248 ] Eugene Koifman commented on HIVE-5547: -- HIVE-5627 covers the doc for this bug webhcat pig job submission should ship hive tar if -usehcatalog is specified Key: HIVE-5547 URL: https://issues.apache.org/jira/browse/HIVE-5547 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-5547.2.patch, HIVE-5547.patch Currently, when a Pig job is submitted through WebHCat and the Pig script uses HCatalog, Hive must be installed on the node in the cluster which ends up executing the job. For large clusters this is a manageability issue, so we should use DistributedCache to ship the Hive tar file to the target node as part of job submission. TestPig_11 in hcatalog/src/test/e2e/templeton/tests/jobsubmission.conf has the test case for this. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5630) http://hive.apache.org/docs/r0.12.0/api/ does not include any HCat classes
Eugene Koifman created HIVE-5630: Summary: http://hive.apache.org/docs/r0.12.0/api/ does not include any HCat classes Key: HIVE-5630 URL: https://issues.apache.org/jira/browse/HIVE-5630 Project: Hive Issue Type: Bug Components: Documentation Affects Versions: 0.12.0, 0.11.0, 0.10.0 Reporter: Eugene Koifman same is true for 0.10 and 0.11 -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5630) http://hive.apache.org/docs/r0.12.0/api/ does not include any HCat classes
[ https://issues.apache.org/jira/browse/HIVE-5630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-5630: - Component/s: HCatalog http://hive.apache.org/docs/r0.12.0/api/ does not include any HCat classes -- Key: HIVE-5630 URL: https://issues.apache.org/jira/browse/HIVE-5630 Project: Hive Issue Type: Bug Components: Documentation, HCatalog Affects Versions: 0.10.0, 0.11.0, 0.12.0 Reporter: Eugene Koifman same is true for 0.10 and 0.11 -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-4388: --- Attachment: HIVE-4388.10.patch Attaching updated patch and re-marking patch-available so that precommit tests pick it up. HBase tests fail against Hadoop 2 - Key: HIVE-4388 URL: https://issues.apache.org/jira/browse/HIVE-4388 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Gunther Hagleitner Assignee: Brock Noland Attachments: HIVE-4388.10.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388-wip.txt Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-4388: --- Status: Open (was: Patch Available) HBase tests fail against Hadoop 2 - Key: HIVE-4388 URL: https://issues.apache.org/jira/browse/HIVE-4388 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Gunther Hagleitner Assignee: Brock Noland Attachments: HIVE-4388.10.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388-wip.txt Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-4388: --- Status: Patch Available (was: Open) HBase tests fail against Hadoop 2 - Key: HIVE-4388 URL: https://issues.apache.org/jira/browse/HIVE-4388 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Gunther Hagleitner Assignee: Brock Noland Attachments: HIVE-4388.10.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388-wip.txt Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13803276#comment-13803276 ] Brock Noland commented on HIVE-4388: For my part it looks good! The only things I noted were: * I don't see the version upgrade? I think the protobufs stuff will be invalid without hbase 0.96? * 0.96 has been released so I think we can remove the SNAPSHOT stuff in addition to adding apache SNAPSHOT's to the hcat pom. HBase tests fail against Hadoop 2 - Key: HIVE-4388 URL: https://issues.apache.org/jira/browse/HIVE-4388 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Gunther Hagleitner Assignee: Brock Noland Attachments: HIVE-4388.10.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388-wip.txt Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5519) Use paging mechanism for templeton get requests.
[ https://issues.apache.org/jira/browse/HIVE-5519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-5519: Description: Issuing a command to retrieve the jobs field using https://mwinkledemo.azurehdinsight.net:563/templeton/v1/queue/job_id?user.name=admin&fields=* --user u:p will result in a timeout on a Windows machine. The issue happens because of the amount of data that needs to be fetched. The proposal is to use a paging-based encoding scheme so that we flush the contents regularly and the client does not time out. was: Issuing a command to retrieve the jobs field using https://mwinkledemo.azurehdinsight.net:563/templeton/v1/queue/job_id?user.name=admin&fields=* --user u:p will result in a timeout on a Windows machine. The issue happens because of the amount of data that needs to be fetched. The proposal is to introduce a new API to retrieve a list of job details rather than retrieve all the information using a single command. Summary: Use paging mechanism for templeton get requests. (was: Support ranges of job ids for templeton) Use paging mechanism for templeton get requests. Key: HIVE-5519 URL: https://issues.apache.org/jira/browse/HIVE-5519 Project: Hive Issue Type: Bug Components: WebHCat Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Issuing a command to retrieve the jobs field using https://mwinkledemo.azurehdinsight.net:563/templeton/v1/queue/job_id?user.name=admin&fields=* --user u:p will result in a timeout on a Windows machine. The issue happens because of the amount of data that needs to be fetched. The proposal is to use a paging-based encoding scheme so that we flush the contents regularly and the client does not time out. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-4969) HCatalog HBaseHCatStorageHandler is not returning all the data
[ https://issues.apache.org/jira/browse/HIVE-4969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803287#comment-13803287 ] Venki Korukanti commented on HIVE-4969: --- I haven't tested this on latest trunk. I will test it and attach a unittest. HCatalog HBaseHCatStorageHandler is not returning all the data -- Key: HIVE-4969 URL: https://issues.apache.org/jira/browse/HIVE-4969 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0 Reporter: Venki Korukanti Priority: Critical Attachments: HIVE-4969-1.patch Repro steps: 1) Create an HCatalog table mapped to HBase table. hcat -e "CREATE TABLE studentHCat(rownum int, name string, age int, gpa float) STORED BY 'org.apache.hcatalog.hbase.HBaseHCatStorageHandler' TBLPROPERTIES('hbase.table.name' ='studentHBase', 'hbase.columns.mapping' = ':key,onecf:name,twocf:age,threecf:gpa');" 2) Load the following data from Pig. cat student_data 1^Asarah laertes^A23^A2.40 2^Atom allen^A72^A1.57 3^Abob ovid^A61^A2.67 4^Aethan nixon^A38^A2.15 5^Acalvin robinson^A28^A2.53 6^Airene ovid^A65^A2.56 7^Ayuri garcia^A36^A1.65 8^Acalvin nixon^A41^A1.04 9^Ajessica davidson^A48^A2.11 10^Akatie king^A39^A1.05 grunt> A = LOAD 'student_data' AS (rownum:int,name:chararray,age:int,gpa:float); grunt> STORE A INTO 'studentHCat' USING org.apache.hcatalog.pig.HCatStorer(); 3) Now from HBase do a scan on the studentHBase table hbase(main):026:0> scan 'studentPig', {LIMIT => 5} 4) From pig access the data in table grunt> A = LOAD 'studentHCat' USING org.apache.hcatalog.pig.HCatLoader(); grunt> STORE A INTO '/user/root/studentPig'; 5) Verify the output written in StudentPig hadoop fs -cat /user/root/studentPig/part-r-0 1 23 2 72 3 61 4 38 5 28 6 65 7 36 8 41 9 48 10 39 The data returned has only two fields (rownum and age). Problem: While reading the data from HBase table, HbaseSnapshotRecordReader gets data row in Result (org.apache.hadoop.hbase.client.Result) object and processes the KeyValue fields in it. After processing, it creates another Result object out of the processed KeyValue array. Problem here is KeyValue array is not sorted. Result object expects the input KeyValue array to have sorted elements. When we call the Result.getValue() it returns no value for some of the fields as it does a binary search on un-ordered array. -- This message was sent by Atlassian JIRA (v6.1#6144)
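A minimal sketch of the fix direction the description above implies, assuming the HBase 0.94-era client API that the HCatalog HBase handler used at the time (this is not the actual HIVE-4969 patch):
{noformat}
import java.util.Arrays;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Result;

public final class SortedResultBuilder {
  private SortedResultBuilder() {}

  /**
   * Result.getValue() does a binary search over the backing KeyValue array, so an
   * unsorted array makes some cells "disappear". Sort with KeyValue.COMPARATOR
   * before wrapping the processed cells in a new Result.
   */
  public static Result toSortedResult(KeyValue[] processed) {
    KeyValue[] sorted = Arrays.copyOf(processed, processed.length);
    Arrays.sort(sorted, KeyValue.COMPARATOR);
    return new Result(sorted);
  }
}
{noformat}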
[jira] [Created] (HIVE-5631) Index creation on a skew table fails
Venki Korukanti created HIVE-5631: - Summary: Index creation on a skew table fails Key: HIVE-5631 URL: https://issues.apache.org/jira/browse/HIVE-5631 Project: Hive Issue Type: Bug Components: Database/Schema Affects Versions: 0.12.0 Reporter: Venki Korukanti Assignee: Venki Korukanti Fix For: 0.13.0 REPRO STEPS: create database skewtest; use skewtest; create table skew (id bigint, acct string) skewed by (acct) on ('CC','CH'); create index skew_indx on table skew (id) as 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' WITH DEFERRED REBUILD; The last DDL fails with the following error. FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. InvalidObjectException(message:Invalid skew column [acct]) When creating a table, Hive has sanity tests to make sure the columns have proper names and the skewed columns are a subset of the table columns. Here we fail because the index table has skewed column info. The index table's skewed columns include {acct} and its columns are {id, _bucketname, _offsets}. As the skewed column {acct} is not part of the table columns, Hive throws the exception. The reason the index table got skewed column info even though its definition has no such info is: when creating the index table, a deep copy of the base table's StorageDescriptor (SD) (in this case 'skew') is made. In that copied SD, index-specific parameters are set and unrelated parameters are reset. Here the skewed column info is not reset (there are a few other params that are not reset). That's why the index table contains the skewed column info. Fix: Instead of deep copying the base table's StorageDescriptor, create a new one from the gathered info. This way the index table does not inherit unnecessary properties in its SD from the base table. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5403) Move loading of filesystem, ugi, metastore client to hive session
[ https://issues.apache.org/jira/browse/HIVE-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13803292#comment-13803292 ] Gunther Hagleitner commented on HIVE-5403: -- Another one: PerfLogger doesn't work on the backend with this change anymore. The problem is that the SessionState now uses MetaException in the code path to start the session. That's not available on the backend. PerfLogger has logic to determine whether it runs front or backend. It does so by checking SessionState.get() == null. That check cannot be executed anymore because loading the SessionState tries to resolve the MetaException (MetaStore api). The easiest fix would be to collapse the exception handlers to one that catches exception (super class of meta store) and wraps that into a runtime exception. Logically that's no different from what's performed right now. Can we have a follow up to this one as well? Move loading of filesystem, ugi, metastore client to hive session - Key: HIVE-5403 URL: https://issues.apache.org/jira/browse/HIVE-5403 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: 0.13.0 Attachments: HIVE-5403.1.patch, HIVE-5403.2.patch, HIVE-5403.3.patch, HIVE-5403.4.patch As part of HIVE-5184, the metastore connection, loading filesystem were done as part of the tez session so as to speed up query times while paying a cost at startup. We can do this more generally in hive to apply to both the mapreduce and tez side of things. -- This message was sent by Atlassian JIRA (v6.1#6144)
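For illustration, a minimal sketch of the fix suggested in the comment above, using hypothetical method names rather than the actual SessionState code: collapse the metastore-specific catch block into a generic one so that merely starting the session does not force MetaException onto the backend classpath.
{noformat}
// Sketch only; startSessionInternal() is a stand-in for the real session bootstrap.
public class SessionStartSketch {

  private static void startSessionInternal() throws Exception {
    // ... connect to the metastore, load filesystem and ugi, etc. (hypothetical)
  }

  public static void start() {
    try {
      startSessionInternal();
    } catch (RuntimeException e) {
      throw e;                        // don't double-wrap unchecked failures
    } catch (Exception e) {
      // One handler for every checked exception (MetaException included), wrapped in a
      // RuntimeException -- logically equivalent to the current per-type handlers.
      throw new RuntimeException("Failed to start Hive session", e);
    }
  }
}
{noformat}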
Review Request 14887: Subquery support: disallow nesting of SubQueries
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/14887/ --- Review request for hive and Ashutosh Chauhan. Bugs: HIVE-5613 https://issues.apache.org/jira/browse/HIVE-5613 Repository: hive-git Description --- This is Restriction 9 from the SubQuery design doc: We will not do algebraic transformations for these kinds of queries: {noformat} -query 1 select ... from x where x.b in (select u from y where y.c = 10 and exists (select m from z where z.A = x.C) ) - query 2 select ... from x where x.b in (select u from y where y.c = 10 and exists (select m from z where z.A = y.D) {noformat} Diffs - ql/src/java/org/apache/hadoop/hive/ql/parse/QB.java 50b5a77 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 6fc3cd5 ql/src/java/org/apache/hadoop/hive/ql/parse/SubQueryUtils.java 2d7775c ql/src/test/queries/clientnegative/subquery_nested_subquery.q PRE-CREATION ql/src/test/results/clientnegative/subquery_nested_subquery.q.out PRE-CREATION Diff: https://reviews.apache.org/r/14887/diff/ Testing --- tested subquery tests added new subquery_nested_subquery.q negative test Thanks, Harish Butani
[jira] [Commented] (HIVE-5629) Fix two javadoc failures in HCatalog
[ https://issues.apache.org/jira/browse/HIVE-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13803314#comment-13803314 ] Eugene Koifman commented on HIVE-5629: -- I see Fix two javadoc failures in HCatalog Key: HIVE-5629 URL: https://issues.apache.org/jira/browse/HIVE-5629 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5629.patch I am seeing two javadoc failures on HCatalog. These are not being seen by PTest and indeed I cannot reproduce on my Mac but can on Linux. Regardless they should be fixed. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5632) Eliminate splits based on SARGs using stripe statistics in ORC
Prasanth J created HIVE-5632: Summary: Eliminate splits based on SARGs using stripe statistics in ORC Key: HIVE-5632 URL: https://issues.apache.org/jira/browse/HIVE-5632 Project: Hive Issue Type: Improvement Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J HIVE-5562 provides stripe level statistics in ORC. Stripe level statistics combined with predicate pushdown in ORC (HIVE-4246) can be used to eliminate the stripes (and thereby the splits) that don't satisfy the predicate condition. This can greatly reduce unnecessary reads. -- This message was sent by Atlassian JIRA (v6.1#6144)
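The stripe-elimination idea can be shown with a small range check; this is a generic sketch of the concept only, not ORC's actual SearchArgument or reader API:
{noformat}
// Generic sketch of SARG-based stripe elimination. A stripe records min/max statistics
// per column; if the predicate's value range cannot overlap the stripe's range, the
// stripe (and hence its split) can be skipped without reading any of its data.
public final class StripeEliminationSketch {

  // e.g. predicate: col BETWEEN predMin AND predMax
  static boolean stripeCanMatch(long stripeMin, long stripeMax, long predMin, long predMax) {
    return stripeMax >= predMin && stripeMin <= predMax;   // ranges overlap
  }

  public static void main(String[] args) {
    // A stripe with col values in [100, 200] cannot satisfy "col < 50": skip it.
    System.out.println(stripeCanMatch(100, 200, Long.MIN_VALUE, 49));   // false -> eliminate
    System.out.println(stripeCanMatch(100, 200, 150, 150));             // true  -> keep
  }
}
{noformat}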
[jira] [Updated] (HIVE-5632) Eliminate splits based on SARGs using stripe statistics in ORC
[ https://issues.apache.org/jira/browse/HIVE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-5632: - Attachment: HIVE-5632.1.patch.txt Eliminate splits based on SARGs using stripe statistics in ORC -- Key: HIVE-5632 URL: https://issues.apache.org/jira/browse/HIVE-5632 Project: Hive Issue Type: Improvement Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-5632.1.patch.txt HIVE-5562 provides stripe level statistics in ORC. Stripe level statistics combined with predicate pushdown in ORC (HIVE-4246) can be used to eliminate the stripes (and thereby the splits) that don't satisfy the predicate condition. This can greatly reduce unnecessary reads. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5632) Eliminate splits based on SARGs using stripe statistics in ORC
[ https://issues.apache.org/jira/browse/HIVE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803342#comment-13803342 ] Prasanth J commented on HIVE-5632: -- This patch is generated on top of HIVE-5562. Test cases need to be added. Eliminate splits based on SARGs using stripe statistics in ORC -- Key: HIVE-5632 URL: https://issues.apache.org/jira/browse/HIVE-5632 Project: Hive Issue Type: Improvement Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-5632.1.patch.txt HIVE-5562 provides stripe level statistics in ORC. Stripe level statistics combined with predicate pushdown in ORC (HIVE-4246) can be used to eliminate the stripes (and thereby the splits) that don't satisfy the predicate condition. This can greatly reduce unnecessary reads. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-4388: --- Status: Open (was: Patch Available) HBase tests fail against Hadoop 2 - Key: HIVE-4388 URL: https://issues.apache.org/jira/browse/HIVE-4388 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Gunther Hagleitner Assignee: Brock Noland Attachments: HIVE-4388.10.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388-wip.txt Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13803348#comment-13803348 ] Sushanth Sowmyan commented on HIVE-4388: Ack, that's because I was building with a -Dhbase.version.with.hadoop.version whenever I built. Sorry, updating. And agreed, it makes sense to remove that ${use.hbase.snapshot} special casing. Removing it. HBase tests fail against Hadoop 2 - Key: HIVE-4388 URL: https://issues.apache.org/jira/browse/HIVE-4388 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Gunther Hagleitner Assignee: Brock Noland Attachments: HIVE-4388.10.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388-wip.txt Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-4388: --- Attachment: HIVE-4388.11.patch HBase tests fail against Hadoop 2 - Key: HIVE-4388 URL: https://issues.apache.org/jira/browse/HIVE-4388 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Gunther Hagleitner Assignee: Brock Noland Attachments: HIVE-4388.10.patch, HIVE-4388.11.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388-wip.txt Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-4388: --- Status: Patch Available (was: Open) HBase tests fail against Hadoop 2 - Key: HIVE-4388 URL: https://issues.apache.org/jira/browse/HIVE-4388 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Gunther Hagleitner Assignee: Brock Noland Attachments: HIVE-4388.10.patch, HIVE-4388.11.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388-wip.txt Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396. -- This message was sent by Atlassian JIRA (v6.1#6144)
Review Request 14890: Index creation on a skew table fails
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/14890/ --- Review request for hive, Ashutosh Chauhan and Thejas Nair. Bugs: HIVE-5631 https://issues.apache.org/jira/browse/HIVE-5631 Repository: hive-git Description --- Repro steps: CREATE DATABASE skewtest; USE skewtest; CREATE TABLE skew (id bigint, acct string) SKEWED BY (acct) ON ('CC','CH'); CREATE INDEX skew_indx ON TABLE skew (id) as 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' WITH DEFERRED REBUILD; The last DDL fails with the following error. FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. InvalidObjectException(message:Invalid skew column [acct]) When creating a table, Hive has sanity tests to make sure the columns have proper names and the skewed columns are a subset of the table columns. Here we fail because the index table has skewed column info. The index table's skewed columns include {acct} and its columns are {id, _bucketname, _offsets}. As the skewed column {acct} is not part of the table columns, Hive throws the exception. The reason the index table got skewed column info even though its definition has no such info is: when creating the index table, a deep copy of the base table's StorageDescriptor (SD) (in this case 'skew') is made. In that copied SD, index-specific parameters are set and unrelated parameters are reset. Here the skewed column info is not reset (there are a few other params that are not reset). That's why the index table contains the skewed column info. Fix: Instead of deep copying the base table's StorageDescriptor, create a new one from the gathered info. This way the index table does not inherit unnecessary properties in its SD from the base table. Diffs - ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java b0f124b ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java d0cbed6 ql/src/test/queries/clientpositive/index_skewtable.q PRE-CREATION ql/src/test/results/clientpositive/index_skewtable.q.out PRE-CREATION Diff: https://reviews.apache.org/r/14890/diff/ Testing --- Added a unittest and ran the index-related unittest queries Thanks, Venki Korukanti
[jira] [Commented] (HIVE-5631) Index creation on a skew table fails
[ https://issues.apache.org/jira/browse/HIVE-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803363#comment-13803363 ] Venki Korukanti commented on HIVE-5631: --- Review: https://reviews.apache.org/r/14890/ Index creation on a skew table fails Key: HIVE-5631 URL: https://issues.apache.org/jira/browse/HIVE-5631 Project: Hive Issue Type: Bug Components: Database/Schema Affects Versions: 0.12.0 Reporter: Venki Korukanti Assignee: Venki Korukanti Fix For: 0.13.0 REPRO STEPS: create database skewtest; use skewtest; create table skew (id bigint, acct string) skewed by (acct) on ('CC','CH'); create index skew_indx on table skew (id) as 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' WITH DEFERRED REBUILD; The last DDL fails with the following error. FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. InvalidObjectException(message:Invalid skew column [acct]) When creating a table, Hive has sanity tests to make sure the columns have proper names and the skewed columns are a subset of the table columns. Here we fail because the index table has skewed column info. The index table's skewed columns include {acct} and its columns are {id, _bucketname, _offsets}. As the skewed column {acct} is not part of the table columns, Hive throws the exception. The reason the index table got skewed column info even though its definition has no such info is: when creating the index table, a deep copy of the base table's StorageDescriptor (SD) (in this case 'skew') is made. In that copied SD, index-specific parameters are set and unrelated parameters are reset. Here the skewed column info is not reset (there are a few other params that are not reset). That's why the index table contains the skewed column info. Fix: Instead of deep copying the base table's StorageDescriptor, create a new one from the gathered info. This way the index table does not inherit unnecessary properties in its SD from the base table. -- This message was sent by Atlassian JIRA (v6.1#6144)
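A sketch of the fix direction described above (not the actual HIVE-5631 patch): build the index table's StorageDescriptor from scratch instead of deep-copying the base table's, so base-table-only metadata such as skewed-column info is never inherited. The helper name and parameter list are invented for the example.
{noformat}
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hive.metastore.api.FieldSchema;
import org.apache.hadoop.hive.metastore.api.SerDeInfo;
import org.apache.hadoop.hive.metastore.api.StorageDescriptor;

public final class IndexSdBuilder {
  private IndexSdBuilder() {}

  public static StorageDescriptor newIndexSd(List<FieldSchema> indexCols, SerDeInfo serde,
                                             String inputFormat, String outputFormat) {
    StorageDescriptor sd = new StorageDescriptor();
    sd.setCols(new ArrayList<FieldSchema>(indexCols));  // only the index table's own columns
    sd.setSerdeInfo(serde);
    sd.setInputFormat(inputFormat);
    sd.setOutputFormat(outputFormat);
    // Deliberately nothing else is copied: no skewed-column info, bucketing, or other
    // base-table-specific settings leak into the index table's SD.
    return sd;
  }
}
{noformat}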
Re: Review Request 14870: HIVE-5351: Secure-Socket-Layer (SSL) support for HiveServer2
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/14870/ --- (Updated Oct. 23, 2013, 9:44 p.m.) Review request for hive, Brock Noland and Thejas Nair. Bugs: HIVE-5351 https://issues.apache.org/jira/browse/HIVE-5351 Repository: hive-git Description --- Add support for encrypted communication with Plain SASL over the binary thrift transport. - Optional thrift SSL transport on the server side if configured. - Optional thrift SSL transport for the JDBC client with a configurable trust store - Added a MiniHS2 class for running a HiveServer2 instance for testing Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java abfde42 data/files/keystore.jks PRE-CREATION data/files/truststore.jks PRE-CREATION eclipse-templates/TestJdbcMiniHS2.launchtemplate PRE-CREATION jdbc/ivy.xml b9d0cea jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java f155686 jdbc/src/test/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java PRE-CREATION jdbc/src/test/org/apache/hive/jdbc/TestSSL.java PRE-CREATION metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 24b1832 service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java 5a66a6c service/src/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java 9c8f5c1 service/src/test/org/apache/hive/service/miniHS2/AbstarctHiveService.java PRE-CREATION service/src/test/org/apache/hive/service/miniHS2/MiniHS2.java PRE-CREATION service/src/test/org/apache/hive/service/miniHS2/TestHiveServer2.java PRE-CREATION shims/src/0.20/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java 623ebcd Diff: https://reviews.apache.org/r/14870/diff/ Testing --- - Basic HiveServer2 test cases with miniHS2 - Added multiple test cases for SSL transport Thanks, Prasad Mujumdar
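For illustration, a small JDBC client sketch of how a trust-store-configured SSL connection to HiveServer2 might look. The URL parameter names (ssl, sslTrustStore, trustStorePassword) and the host/paths are assumptions based on the review description above, not confirmed syntax from the final patch.
{noformat}
import java.sql.Connection;
import java.sql.DriverManager;

public class Hs2SslClient {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    // Hypothetical endpoint and trust store; parameter names assumed from the review.
    String url = "jdbc:hive2://hs2.example.com:10000/default;"
               + "ssl=true;sslTrustStore=/path/to/truststore.jks;trustStorePassword=changeit";
    try (Connection conn = DriverManager.getConnection(url, "user", "password")) {
      System.out.println("Connected over SSL: " + !conn.isClosed());
    }
  }
}
{noformat}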
[jira] [Updated] (HIVE-5351) Secure-Socket-Layer (SSL) support for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasad Mujumdar updated HIVE-5351: -- Attachment: HIVE-5351.2.patch Secure-Socket-Layer (SSL) support for HiveServer2 - Key: HIVE-5351 URL: https://issues.apache.org/jira/browse/HIVE-5351 Project: Hive Issue Type: Improvement Components: Authorization, HiveServer2, JDBC Affects Versions: 0.11.0, 0.12.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Attachments: HIVE-5351.1.patch, HIVE-5351.2.patch HiveServer2 and JDBC driver should support encrypted communication using SSL -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5351) Secure-Socket-Layer (SSL) support for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasad Mujumdar updated HIVE-5351: -- Attachment: (was: HIVE-5351.2.patch) Secure-Socket-Layer (SSL) support for HiveServer2 - Key: HIVE-5351 URL: https://issues.apache.org/jira/browse/HIVE-5351 Project: Hive Issue Type: Improvement Components: Authorization, HiveServer2, JDBC Affects Versions: 0.11.0, 0.12.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Attachments: HIVE-5351.1.patch, HIVE-5351.2.patch HiveServer2 and JDBC driver should support encrypted communication using SSL -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5631) Index creation on a skew table fails
[ https://issues.apache.org/jira/browse/HIVE-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Venki Korukanti updated HIVE-5631: -- Attachment: HIVE-5631.1.patch.txt Index creation on a skew table fails Key: HIVE-5631 URL: https://issues.apache.org/jira/browse/HIVE-5631 Project: Hive Issue Type: Bug Components: Database/Schema Affects Versions: 0.12.0 Reporter: Venki Korukanti Assignee: Venki Korukanti Fix For: 0.13.0 Attachments: HIVE-5631.1.patch.txt REPRO STEPS: create database skewtest; use skewtest; create table skew (id bigint, acct string) skewed by (acct) on ('CC','CH'); create index skew_indx on table skew (id) as 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' WITH DEFERRED REBUILD; The last DDL fails with the following error. FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. InvalidObjectException(message:Invalid skew column [acct]) When creating a table, Hive runs sanity checks to make sure the columns have proper names and the skewed columns are a subset of the table columns. Here we fail because the index table carries skewed column info: the index table's skewed columns include {acct}, while its columns are {id, _bucketname, _offsets}. Since the skewed column {acct} is not among the table columns, Hive throws the exception. The reason the index table picked up skewed column info even though its definition has none is that, when the index table is created, a deep copy of the base table's StorageDescriptor (SD) (in this case 'skew') is made. In that copied SD, index-specific parameters are set and unrelated parameters are reset, but the skewed column info is not reset (a few other params are not reset either). That is why the index table ends up with the skewed column info. Fix: Instead of deep copying the base table's StorageDescriptor, create a new one from the gathered info. This keeps the index table from inheriting unnecessary SD properties from the base table. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5631) Index creation on a skew table fails
[ https://issues.apache.org/jira/browse/HIVE-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Venki Korukanti updated HIVE-5631: -- Status: Patch Available (was: Open) Index creation on a skew table fails Key: HIVE-5631 URL: https://issues.apache.org/jira/browse/HIVE-5631 Project: Hive Issue Type: Bug Components: Database/Schema Affects Versions: 0.12.0 Reporter: Venki Korukanti Assignee: Venki Korukanti Fix For: 0.13.0 Attachments: HIVE-5631.1.patch.txt REPRO STEPS: create database skewtest; use skewtest; create table skew (id bigint, acct string) skewed by (acct) on ('CC','CH'); create index skew_indx on table skew (id) as 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' WITH DEFERRED REBUILD; The last DDL fails with the following error. FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. InvalidObjectException(message:Invalid skew column [acct]) When creating a table, Hive runs sanity checks to make sure the columns have proper names and the skewed columns are a subset of the table columns. Here we fail because the index table carries skewed column info: the index table's skewed columns include {acct}, while its columns are {id, _bucketname, _offsets}. Since the skewed column {acct} is not among the table columns, Hive throws the exception. The reason the index table picked up skewed column info even though its definition has none is that, when the index table is created, a deep copy of the base table's StorageDescriptor (SD) (in this case 'skew') is made. In that copied SD, index-specific parameters are set and unrelated parameters are reset, but the skewed column info is not reset (a few other params are not reset either). That is why the index table ends up with the skewed column info. Fix: Instead of deep copying the base table's StorageDescriptor, create a new one from the gathered info. This keeps the index table from inheriting unnecessary SD properties from the base table. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5506) Hive SPLIT function does not return array correctly
[ https://issues.apache.org/jira/browse/HIVE-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803378#comment-13803378 ] Hudson commented on HIVE-5506: -- ABORTED: Integrated in Hive-trunk-hadoop2 #518 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/518/]) HIVE-5506 : Hive SPLIT function does not return array correctly (Vikram Dixit via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1534775) * /hive/trunk/data/files/input.txt * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSplit.java * /hive/trunk/ql/src/test/queries/clientpositive/split.q * /hive/trunk/ql/src/test/results/clientpositive/split.q.out * /hive/trunk/ql/src/test/results/clientpositive/udf_split.q.out Hive SPLIT function does not return array correctly --- Key: HIVE-5506 URL: https://issues.apache.org/jira/browse/HIVE-5506 Project: Hive Issue Type: Bug Components: SQL, UDF Affects Versions: 0.9.0, 0.10.0, 0.11.0 Environment: Hive Reporter: John Omernik Assignee: Vikram Dixit K Fix For: 0.13.0 Attachments: HIVE-5506.1.patch, HIVE-5506.2.patch Hello all, I think I have found a bug in the Hive split function: Summary: When calling split on a string of data, it will only return all array items if the last array item has a value. For example, if I have a string of text delimited by tab with 7 columns, and the first four are filled but the last three are blank, split will only return a 4-position array. If any number of middle columns are empty, but the last item still has a value, then it will return the proper number of columns. This was tested in Hive 0.9 and Hive 0.11. Data: (Note: \t represents a tab char, \x09; the line endings should be \n (UNIX style), not sure what email will do to them.) Basically my data is 7 lines of data with the first 7 letters separated by tab. On some lines I've left out certain letters, but kept the number of tabs exactly the same. input.txt a\tb\tc\td\te\tf\tg a\tb\tc\td\te\t\tg a\tb\t\td\t\tf\tg \t\t\td\te\tf\tg a\tb\tc\td\t\t\t a\t\t\t\te\tf\tg a\t\t\td\t\t\tg I then created a table with one column from that data: DROP TABLE tmp_jo_tab_test; CREATE table tmp_jo_tab_test (message_line STRING) STORED AS TEXTFILE; LOAD DATA LOCAL INPATH '/tmp/input.txt' OVERWRITE INTO TABLE tmp_jo_tab_test; OK, just to validate, I created a Python counting script: #!/usr/bin/python import sys for line in sys.stdin: line = line[0:-1] out = line.split("\t") print len(out) The output there is: $ cat input.txt |./cnt_tabs.py 7 7 7 7 7 7 7 Based on that information, split on tab should return me 7 for each line as well: hive -e "select size(split(message_line, '\\t')) from tmp_jo_tab_test;" 7 7 7 7 4 7 7 However it does not. It would appear that the line where only the first four letters are filled in (and blanks are passed in on the last three) only returns 4 splits, where there should technically be 7: 4 for the letters included and three blanks. a\tb\tc\td\t\t\t -- This message was sent by Atlassian JIRA (v6.1#6144)
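The committed fix touches GenericUDFSplit.java (listed above). The underlying behavior is easy to reproduce in plain Java: String.split(regex) silently drops trailing empty strings, while passing a negative limit keeps them, which is what the UDF needs in order to return 7 fields for the failing row. A minimal illustration of that behavior, not the Hive patch itself:

public class SplitTrailingDemo {
  public static void main(String[] args) {
    String line = "a\tb\tc\td\t\t\t";                  // the failing row from the report
    System.out.println(line.split("\t").length);       // 4 -- trailing empty fields dropped
    System.out.println(line.split("\t", -1).length);   // 7 -- trailing empty fields kept
  }
}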
[jira] [Assigned] (HIVE-5216) Need to annotate public API in HCatalog
[ https://issues.apache.org/jira/browse/HIVE-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman reassigned HIVE-5216: Assignee: Eugene Koifman Need to annotate public API in HCatalog --- Key: HIVE-5216 URL: https://issues.apache.org/jira/browse/HIVE-5216 Project: Hive Issue Type: Bug Components: HCatalog, WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-5216.patch need to annotate which API is considered public using something like @InterfaceAudience.Public @InterfaceStability.Evolving Currently this is what is considered (at a minimum) public API HCatLoader HCatStorer HCatInputFormat HCatOutputFormat HCatReader HCatWriter HCatRecord HCatSchema This is needed so that clients/dependent projects know which API they can rely on and which can change w/o notice. -- This message was sent by Atlassian JIRA (v6.1#6144)
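For illustration, annotating one of the listed classes would look roughly like the sketch below. It assumes Hadoop's org.apache.hadoop.classification annotations are available on the classpath (the actual HIVE-5216 patch may place or name the annotations differently), and the empty class body stands in for the real HCatLoader implementation:

import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.classification.InterfaceStability;

@InterfaceAudience.Public
@InterfaceStability.Evolving
public class HCatLoader {
  // existing implementation unchanged; only the class-level annotations are added
}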
[jira] [Updated] (HIVE-5216) Need to annotate public API in HCatalog
[ https://issues.apache.org/jira/browse/HIVE-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-5216: - Status: Patch Available (was: Open) Need to annotate public API in HCatalog --- Key: HIVE-5216 URL: https://issues.apache.org/jira/browse/HIVE-5216 Project: Hive Issue Type: Bug Components: HCatalog, WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-5216.patch need to annotate which API is considered public using something like @InterfaceAudience.Public @InterfaceStability.Evolving Currently this is what is considered (at a minimum) public API HCatLoader HCatStorer HCatInputFormat HCatOutputFormat HCatReader HCatWriter HCatRecord HCatSchema This is needed so that clients/dependent projects know which API they can rely on and which can change w/o notice. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5216) Need to annotate public API in HCatalog
[ https://issues.apache.org/jira/browse/HIVE-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-5216: - Attachment: HIVE-5216.patch Need to annotate public API in HCatalog --- Key: HIVE-5216 URL: https://issues.apache.org/jira/browse/HIVE-5216 Project: Hive Issue Type: Bug Components: HCatalog, WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-5216.patch need to annotate which API is considered public using something like @InterfaceAudience.Public @InterfaceStability.Evolving Currently this is what is considered (at a minimum) public API HCatLoader HCatStorer HCatInputFormat HCatOutputFormat HCatReader HCatWriter HCatRecord HCatSchema This is needed so that clients/dependent projects know which API they can rely on and which can change w/o notice. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5283) Merge vectorization branch to trunk
[ https://issues.apache.org/jira/browse/HIVE-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803389#comment-13803389 ] Thejas M Nair commented on HIVE-5283: - Added fix version of 0.13 in addition to vectorization-branch for these 106 jiras (fixed + fix-version=vectorization-branch). Merge vectorization branch to trunk --- Key: HIVE-5283 URL: https://issues.apache.org/jira/browse/HIVE-5283 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Fix For: 0.13.0 Attachments: alltypesorc, HIVE-5283.1.patch, HIVE-5283.2.patch, HIVE-5283.3.patch, HIVE-5283.4.patch The purpose of this jira is to upload the vectorization patch, run tests, etc. The actual work will continue under the HIVE-4160 umbrella jira. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5216) Need to annotate public API in HCatalog
[ https://issues.apache.org/jira/browse/HIVE-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803392#comment-13803392 ] Thejas M Nair commented on HIVE-5216: - +1 Need to annotate public API in HCatalog --- Key: HIVE-5216 URL: https://issues.apache.org/jira/browse/HIVE-5216 Project: Hive Issue Type: Bug Components: HCatalog, WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-5216.patch need to annotate which API is considered public using something like @InterfaceAudience.Public @InterfaceStability.Evolving Currently this is what is considered (at a minimum) public API HCatLoader HCatStorer HCatInputFormat HCatOutputFormat HCatReader HCatWriter HCatRecord HCatSchema This is needed so that clients/dependent projects know which API they can rely on and which can change w/o notice. -- This message was sent by Atlassian JIRA (v6.1#6144)