[jira] [Commented] (HIVE-6545) analyze table throws NPE for non-existent tables.
[ https://issues.apache.org/jira/browse/HIVE-6545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13919834#comment-13919834 ] Harish Butani commented on HIVE-6545: - +1 analyze table throws NPE for non-existent tables. - Key: HIVE-6545 URL: https://issues.apache.org/jira/browse/HIVE-6545 Project: Hive Issue Type: Bug Components: Statistics Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-6545.patch Instead of an NPE, we should give an error message to the user. -- This message was sent by Atlassian JIRA (v6.2#6252)
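A minimal sketch of the kind of check the issue asks for: fail with a user-facing error instead of letting a later dereference throw a NullPointerException. The class and method names here are hypothetical illustrations, not the actual HIVE-6545 patch.

```java
public class AnalyzeTableCheck {
    // Stand-in for the metastore lookup; returns null for unknown tables
    // (hypothetical helper, only for this sketch).
    static Object getTable(String name) {
        return null; // simulate a non-existent table
    }

    static void analyzeTable(String name) {
        Object table = getTable(name);
        if (table == null) {
            // Fail fast with a clear message instead of NPE-ing later.
            throw new IllegalArgumentException("Table not found: " + name);
        }
    }

    public static void main(String[] args) {
        try {
            analyzeTable("no_such_table");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```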
Re: Timeline for the Hive 0.13 release?
Yes sure. I am fine with porting over HIVE-5317 and dependents. Besides, couldn't handle 104 angry fans (uh watchers) :) Let’s follow this procedure: if you have features that should go into branch-0.13 please post a message here, give the community a chance to voice their opinions. regards, Harish. On Mar 4, 2014, at 8:03 AM, Alan Gates ga...@hortonworks.com wrote: Sure. I’d really like to get the work related to HIVE-5317 in 0.13. HIVE-5843 is patch available and hopefully can be checked in today. There are several more that depend on that one and can’t be made patch available until then (HIVE-6060, HIVE-6319, HIVE-6460, and HIVE-5687). I don’t want to hold up the branching, but are you ok with those going in after the branch? Alan. On Mar 3, 2014, at 7:53 PM, Harish Butani hbut...@hortonworks.com wrote: I plan to create the branch 5pm PST tomorrow. Ok with everybody? regards, Harish. On Feb 21, 2014, at 5:44 PM, Lefty Leverenz leftylever...@gmail.com wrote: That's appropriate -- let the Hive release march forth on March 4th. -- Lefty On Fri, Feb 21, 2014 at 4:04 PM, Harish Butani hbut...@hortonworks.comwrote: Ok,let’s set it for March 4th . regards, Harish. On Feb 21, 2014, at 12:14 PM, Brock Noland br...@cloudera.com wrote: Might as well make it March 4th or 5th. Otherwise folks will burn weekend time to get patches in. On Fri, Feb 21, 2014 at 2:10 PM, Harish Butani hbut...@hortonworks.com wrote: Yes makes sense. How about we postpone the branching until 10am PST March 3rd, which is the following Monday. Don’t see a point of setting the branch time to a Friday evening. Do people agree? regards, Harish. On Feb 21, 2014, at 11:04 AM, Brock Noland br...@cloudera.com wrote: +1 On Fri, Feb 21, 2014 at 1:02 PM, Thejas Nair the...@hortonworks.com wrote: Can we wait for some few more days for the branching ? I have a few more security fixes that I would like to get in, and we also have a long pre-commit queue ahead right now. 
How about branching around Friday next week ? By then hadoop 2.3 should also be out as that vote has been concluded, and we can get HIVE-6037 in as well. -Thejas On Sun, Feb 16, 2014 at 5:32 PM, Brock Noland br...@cloudera.com wrote: I'd love to see HIVE-6037 in the 0.13 release. I have +1'ed it pending tests. Brock On Sun, Feb 16, 2014 at 7:23 PM, Navis류승우 navis@nexr.com wrote: HIVE-6037 is for generating hive-default.template file from HiveConf. Could it be included in this release? If it's not, I'll suspend further rebasing of it till next release (conflicts too frequently). 2014-02-16 20:38 GMT+09:00 Lefty Leverenz leftylever...@gmail.com : I'll try to catch up on the wikidocs backlog for 0.13.0 patches in time for the release. It's a long and growing list, though, so no promises. Feel free to do your own documentation, or hand it off to a friendly in-house writer. -- Lefty, self-appointed Hive docs maven On Sat, Feb 15, 2014 at 1:28 PM, Thejas Nair the...@hortonworks.com wrote: Sounds good to me. On Fri, Feb 14, 2014 at 7:29 PM, Harish Butani hbut...@hortonworks.com wrote: Hi, Its mid feb. Wanted to check if the community is ready to cut a branch. Could we cut the branch in a week , say 5pm PST 2/21/14? The goal is to keep the release cycle short: couple of weeks; so after the branch we go into stabilizing mode for hive 0.13, checking in only blocker/critical bug fixes. regards, Harish. On Jan 20, 2014, at 9:25 AM, Brock Noland br...@cloudera.com wrote: Hi, I agree that picking a date to branch and then restricting commits to that branch would be a less time intensive plan for the RM. Brock On Sat, Jan 18, 2014 at 4:21 PM, Harish Butani hbut...@hortonworks.com wrote: Yes agree it is time to start planning for the next release. I would like to volunteer to do the release management duties for this release(will be a great experience for me) Will be happy to do it, if the community is fine with this. regards, Harish. 
On Jan 17, 2014, at 7:05 PM, Thejas Nair the...@hortonworks.com wrote: Yes, I think it is time to start planning for the next release. For 0.12 release I created a branch and then accepted patches that people asked to be included for sometime, before moving a phase of accepting only critical bug fixes. This turned out to be laborious. I think we should instead give everyone a few weeks to get any patches they are working on to be ready, cut the branch, and take in only critical bug fixes to the branch after that. How about cutting the branch around mid-February and targeting to release in a week or two after that. Thanks, Thejas On Fri, Jan 17, 2014 at 4:39 PM, Carl Steinbach c...@apache.org wrote: I was wondering what people think about setting a tentative date for the Hive 0.13 release? At an old Hive
Re: Review Request 18464: Support secure Subject.doAs() in HiveServer2 JDBC client
Hi Shiv - I believe that the auth mechanism in play is still considered kerberos in this case. It is just based on a preauthenticated subject rather than a UGI. In the end - it is kerberos. On Tue, Mar 4, 2014 at 2:34 PM, Shivaraju Gowda shiv...@cisco.com wrote: On Feb. 27, 2014, 4:59 p.m., Vaibhav Gumashta wrote: service/src/java/org/apache/hive/service/auth/KerberosSaslHelper.java, line 68 https://reviews.apache.org/r/18464/diff/1/?file=503361#file503361line68 Can you push this to HadoopThriftAuthBridge.Client#createClientTransport just like the way the else portion does instead of the createSubjectAssumedTransport method? From within the method you can return the TSubjectAssumingTransport. Shivaraju Gowda wrote: Again this was in my first cut. I was passing the value as tokenStrForm parameter to keep the method signature same. I later moved away from it since it was not elegant and changing the method signature involved broader implications. I felt this functionality didn't belong in Hadoop shim layer. Having the change in there also meant one more jar getting affected(hive-exec.jar) Shivaraju Gowda wrote: Another issue was the dependency on hadoop.core.jar. The calls AuthMethod.valueOf(AuthMethod.class, methodStr) and SaslRpcServer.splitKerberosName(serverPrincipal) in HadoopThriftAuthBridge.Client#createClientTransport are from hadoop.core.jar Vaibhav Gumashta wrote: Actually in case of a kerberos setting, those jars are already required in the client's classpath ( https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-JDBCClientSetupforaSecureCluster- check Running the JDBC Sample Code section). And this jira is applicable only to a kerberos setup. Correct. But my point is we don't have to have that dependency on external Hadoop component for using kerberos in this way. On Feb. 
27, 2014, 4:59 p.m., Vaibhav Gumashta wrote: jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java, line 136 https://reviews.apache.org/r/18464/diff/1/?file=503360#file503360line136 I think, instead of having to do identityContext=fromKerberosSubject, we can just use assumeSubject=true/false, keeping the default to false. Shivaraju Gowda wrote: Passing it as an assumeSubject boolean url property was my first cut. However, I thought assumeSubject on its own doesn't convey its intended use (you need to refer to the documentation); making it a key-value pair might give it some more meaning, and there is also a possibility of it being later used for other use cases (say, hypothetically, the value could be fromKeyTab, fromTicketCache or fromLogin etc.). Shivaraju Gowda wrote: Do you think it might be better if we use the auth property here, i.e. auth=fromKerberosSubject? Right now the only such value is auth=noSasl. Vaibhav Gumashta wrote: The auth property is kind of meant to map to the hiveserver2 auth modes [none, sasl, nosasl, kerberos]. The way it is used currently is not very clean and there are some jiras out there to clean that up and make the mapping more evident. OK, I look at this feature as an authentication mechanism. We are authenticating using the Kerberos Subject passed by the user. - Shivaraju --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18464/#review35730 --- On Feb. 25, 2014, 6:50 a.m., Kevin Minder wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18464/ --- (Updated Feb. 25, 2014, 6:50 a.m.) Review request for hive, Kevin Minder and Vaibhav Gumashta.
Bugs: HIVE-6486 https://issues.apache.org/jira/browse/HIVE-6486 Repository: hive-git Description --- Support secure Subject.doAs() in HiveServer2 JDBC client Diffs - jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java 17b4d39 service/src/java/org/apache/hive/service/auth/KerberosSaslHelper.java 379dafb service/src/java/org/apache/hive/service/auth/TSubjectAssumingTransport.java PRE-CREATION Diff: https://reviews.apache.org/r/18464/diff/ Testing --- Manual testing Thanks, Kevin Minder
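The call shape under discussion can be sketched as follows. In a real middleware server the Subject comes from a pre-authenticated JAAS Kerberos login; an empty Subject is used here only so the example runs standalone, and the JDBC URL in the comment is illustrative, not taken from the patch.

```java
import javax.security.auth.Subject;
import java.security.PrivilegedAction;

public class DoAsSketch {
    public static void main(String[] args) {
        // Normally: the Subject returned by the container's Kerberos login.
        Subject subject = new Subject();
        String result = Subject.doAs(subject, (PrivilegedAction<String>) () -> {
            // The real client would open the connection inside the action, e.g.
            // DriverManager.getConnection(
            //     "jdbc:hive2://host:10000/default;principal=hive/host@REALM");
            return "ran-inside-doAs";
        });
        System.out.println(result);
    }
}
```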
[jira] [Created] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
Eric Hanson created HIVE-6546: - Summary: WebHCat job submission for pig with -useHCatalog argument fails on Windows Key: HIVE-6546 URL: https://issues.apache.org/jira/browse/HIVE-6546 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0, 0.11.0, 0.13.0 Environment: Windows Azure HDINSIGHT and Windows one-box installations. Reporter: Eric Hanson -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6546: -- Description: On a one-box Windows setup, do the following from a PowerShell prompt: cmd /c curl.exe -s ` -d user.name=hadoop ` -d arg=-useHCatalog ` -d execute=emp = load '/data/emp/emp_0.dat'; dump emp; ` -d statusdir=/tmp/webhcat.output01 ` 'http://localhost:50111/templeton/v1/pig' -v The job fails with error code 7, but it should run. I traced this down to the following. In the job configuration for the TempletonJobController, we have templeton.args set to cmd,/c,call,C:\\hadooppig-0.11.0.1.3.0.0-0846/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog,-execute,emp = load '/data/emp/emp_0.dat'; dump emp; Notice the = sign before -useHCatalog. I think this should be a comma. The bad string -D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog gets created in org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows(). It happens at line 434:
{code}
} else {
  if (i < args.length - 1) {
    prop += "=" + args[++i]; // RIGHT HERE! at iterations i = 37, 38
  }
}
{code}
The bug is here:
{code}
if (prop != null) {
  if (prop.contains("=")) {
    // everything good
  } else {
    // -D__WEBHCAT_TOKEN_FILE_LOCATION__ does not contain '=', so this
    // branch runs and appends "=-useHCatalog"
    if (i < args.length - 1) {
      prop += "=" + args[++i];
    }
  }
  newArgs.add(prop);
}
{code}
One possible fix is to change the string constant org.apache.hcatalog.templeton.tool.TempletonControllerJob.TOKEN_FILE_ARG_PLACEHOLDER to have an = sign in it. Or, preProcessForWindows() itself could be changed. WebHCat job submission for pig with -useHCatalog argument fails on Windows -- Key: HIVE-6546 URL: https://issues.apache.org/jira/browse/HIVE-6546 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.11.0, 0.12.0, 0.13.0 Environment: Windows Azure HDINSIGHT and Windows one-box installations.
Reporter: Eric Hanson -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6546: -- Environment: HDInsight deploying HDP 1.3: c:\apps\dist\pig-0.11.0.1.3.2.0-05 Also on Windows HDP 1.3 one-box configuration. was: Windows Azure HDINSIGHT and Windows one-box installations. WebHCat job submission for pig with -useHCatalog argument fails on Windows -- Key: HIVE-6546 URL: https://issues.apache.org/jira/browse/HIVE-6546 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.11.0, 0.12.0, 0.13.0 Environment: HDInsight deploying HDP 1.3: c:\apps\dist\pig-0.11.0.1.3.2.0-05 Also on Windows HDP 1.3 one-box configuration. Reporter: Eric Hanson -- This message was sent by Atlassian JIRA (v6.2#6252)
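The arg-fusing behavior described in this issue, and the proposed placeholder fix, can be demonstrated with a simplified stand-alone re-creation of the pre-processing loop. This is not the actual GenericOptionsParser source; the class name and `-D__TOKEN__` placeholder are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class PreProcessDemo {
    // Simplified sketch of the Windows pre-processing loop: a -D property
    // with no '=' swallows the following argument.
    static List<String> preProcess(String[] args) {
        List<String> newArgs = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            String prop = args[i].startsWith("-D") ? args[i] : null;
            if (prop != null) {
                if (!prop.contains("=") && i < args.length - 1) {
                    prop += "=" + args[++i]; // fuses the next arg onto the property
                }
                newArgs.add(prop);
            } else {
                newArgs.add(args[i]);
            }
        }
        return newArgs;
    }

    public static void main(String[] a) {
        // Placeholder without '=': "-useHCatalog" gets fused onto it (the bug).
        System.out.println(preProcess(new String[]{"-D__TOKEN__", "-useHCatalog"}));
        // Placeholder already containing '=': "-useHCatalog" survives (the fix).
        System.out.println(preProcess(new String[]{"-D__TOKEN__=x", "-useHCatalog"}));
    }
}
```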
[jira] [Created] (HIVE-6547) normalize struct Role in metastore thrift interface
Thejas M Nair created HIVE-6547: --- Summary: normalize struct Role in metastore thrift interface Key: HIVE-6547 URL: https://issues.apache.org/jira/browse/HIVE-6547 Project: Hive Issue Type: Bug Components: Metastore, Thrift API Affects Versions: 0.13.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.13.0 As discussed in HIVE-5931, it will be cleaner to have the information about Role-to-role-member mapping removed from the Role object, as it is not part of a logical Role. This information is not relevant for actions such as creating a Role. As part of this change, a get_role_grants_for_principal API will be added, so that it can be used in place of list_roles when role mapping information is desired. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6411) Support more generic way of using composite key for HBaseHandler
[ https://issues.apache.org/jira/browse/HIVE-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13919872#comment-13919872 ] Hive QA commented on HIVE-6411: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12632418/HIVE-6411.3.patch.txt {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5240 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_custom_key2 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16 org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_mapreduce_stack_trace_hadoop20 {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1616/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1616/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12632418 Support more generic way of using composite key for HBaseHandler Key: HIVE-6411 URL: https://issues.apache.org/jira/browse/HIVE-6411 Project: Hive Issue Type: Improvement Components: HBase Handler Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6411.1.patch.txt, HIVE-6411.2.patch.txt, HIVE-6411.3.patch.txt HIVE-2599 introduced using custom object for the row key. But it forces key objects to extend HBaseCompositeKey, which is again extension of LazyStruct. If user provides proper Object and OI, we can replace internal key and keyOI with those. Initial implementation is based on factory interface. 
{code}
public interface HBaseKeyFactory {
  void init(SerDeParameters parameters, Properties properties) throws SerDeException;
  ObjectInspector createObjectInspector(TypeInfo type) throws SerDeException;
  LazyObjectBase createObject(ObjectInspector inspector) throws SerDeException;
}
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Timeline for the Hive 0.13 release?
I would like to include HIVE-5943/HIVE-5942 (describe role support) and HIVE-6547 (metastore api - Role struct cleanup) in the release. I should have the patch for describe-role ready in a day or two, and for HIVE-6547 as well this week. On Tue, Mar 4, 2014 at 11:55 AM, Harish Butani hbut...@hortonworks.com wrote:
[jira] [Commented] (HIVE-6537) NullPointerException when loading hashtable for MapJoin directly
[ https://issues.apache.org/jira/browse/HIVE-6537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13919998#comment-13919998 ] Sergey Shelukhin commented on HIVE-6537: I don't think failure is related (passed for me, too). New test looks good for me. Do I have +1 for the patch itself? I can commit (after 24 hours :)) NullPointerException when loading hashtable for MapJoin directly Key: HIVE-6537 URL: https://issues.apache.org/jira/browse/HIVE-6537 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-6537.01.patch, HIVE-6537.2.patch.txt, HIVE-6537.patch We see the following error: {noformat} 2014-02-20 23:33:15,743 FATAL [main] org.apache.hadoop.hive.ql.exec.mr.ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.load(HashTableLoader.java:103) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:149) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:164) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1026) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1030) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1030) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:489) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at 
org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: java.lang.NullPointerException at java.util.Arrays.fill(Arrays.java:2685) at org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.loadDirectly(HashTableLoader.java:155) at org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.load(HashTableLoader.java:81) ... 15 more {noformat} It appears that the tables array passed to the Arrays.fill call is null. I don't really have a full understanding of this path, but what I have gleaned so far is this... From what I see, tables would be set unconditionally in initializeOp of the sink, and in no other place, so I assume for this code to ever work that startForward calls it at least some of the time. Here, it doesn't call it, so it's null. The previous loop also uses tables, and should have NPE-d before fill was ever called; it didn't, so I'd assume it never executed. There's a little bit of inconsistency in the above code where directWorks are added to parents unconditionally, but the sink is only added as a child conditionally. I think it may be that some of the direct works are not table scans; in fact, given that the loop never executes, they may be null (which is rather strange). Regardless, it seems that the logic should be fixed; it may be the root cause. -- This message was sent by Atlassian JIRA (v6.2#6252)
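The failure mode at the bottom of the stack trace above (Arrays.fill on a null array) can be reproduced minimally; the variable name is just a stand-in for the uninitialized sink tables.

```java
import java.util.Arrays;

public class NullFillDemo {
    public static void main(String[] args) {
        // Stands in for the 'tables' array that was never initialized
        // because initializeOp/startForward did not run.
        Object[] tables = null;
        try {
            Arrays.fill(tables, new Object());
        } catch (NullPointerException e) {
            System.out.println("NPE from Arrays.fill on null array");
        }
    }
}
```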
[jira] [Commented] (HIVE-5317) Implement insert, update, and delete in Hive with full ACID support
[ https://issues.apache.org/jira/browse/HIVE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1392#comment-1392 ] Vinod Kumar Vavilapalli commented on HIVE-5317: --- bq. MAPREDUCE-279, at 109, currently out scores us. There may be others, but it would be cool to have more watchers than Yarn. Hehe, looks like we have a race. I'll go ask some of us YARN folks who are also watching this JIRA to stop watching this one :D Implement insert, update, and delete in Hive with full ACID support --- Key: HIVE-5317 URL: https://issues.apache.org/jira/browse/HIVE-5317 Project: Hive Issue Type: New Feature Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: InsertUpdatesinHive.pdf Many customers want to be able to insert, update and delete rows from Hive tables with full ACID support. The use cases are varied, but the form of the queries that should be supported are: * INSERT INTO tbl SELECT … * INSERT INTO tbl VALUES ... * UPDATE tbl SET … WHERE … * DELETE FROM tbl WHERE … * MERGE INTO tbl USING src ON … WHEN MATCHED THEN ... WHEN NOT MATCHED THEN ... * SET TRANSACTION LEVEL … * BEGIN/END TRANSACTION Use Cases * Once an hour, a set of inserts and updates (up to 500k rows) for various dimension tables (eg. customer, inventory, stores) needs to be processed. The dimension tables have primary keys and are typically bucketed and sorted on those keys. * Once a day a small set (up to 100k rows) of records need to be deleted for regulatory compliance. * Once an hour a log of transactions is exported from a RDBS and the fact tables need to be updated (up to 1m rows) to reflect the new data. The transactions are a combination of inserts, updates, and deletes. The table is partitioned and bucketed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.
[ https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920015#comment-13920015 ] Thejas M Nair commented on HIVE-6486: - +1 [~shivshi] Can you include the usage notes in the release notes section of the jira, so we can pick information from there for documentation ? If you would like to also help with adding this to the wiki documentation, that would be great too! Support secure Subject.doAs() in HiveServer2 JDBC client. - Key: HIVE-6486 URL: https://issues.apache.org/jira/browse/HIVE-6486 Project: Hive Issue Type: Improvement Components: Authentication, HiveServer2, JDBC Affects Versions: 0.11.0, 0.12.0 Reporter: Shivaraju Gowda Assignee: Shivaraju Gowda Fix For: 0.13.0 Attachments: HIVE-6486.1.patch, Hive_011_Support-Subject_doAS.patch, TestHive_SujectDoAs.java HIVE-5155 addresses the problem of kerberos authentication in multi-user middleware server using proxy user. In this mode the principal used by the middle ware server has privileges to impersonate selected users in Hive/Hadoop. This enhancement is to support Subject.doAs() authentication in Hive JDBC layer so that the end users Kerberos Subject is passed through in the middle ware server. With this improvement there won't be any additional setup in the server to grant proxy privileges to some users and there won't be need to specify a proxy user in the JDBC client. This version should also be more secure since it won't require principals with the privileges to impersonate other users in Hive/Hadoop setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Timeline for the Hive 0.13 release?
I think we should include the following patches as well; they are in patch available/review stage: HIVE-5155 (proxy user support for HS2) is patch available and has undergone some reviews. HIVE-6486 - Support secure Subject.doAs() in HiveServer2 JDBC client. I have reviewed and +1'd it. On Tue, Mar 4, 2014 at 12:13 PM, Thejas Nair the...@hortonworks.com wrote:
Otherwise folks will burn weekend time to get patches in. On Fri, Feb 21, 2014 at 2:10 PM, Harish Butani hbut...@hortonworks.com wrote: Yes makes sense. How about we postpone the branching until 10am PST March 3rd, which is the following Monday. Don’t see a point of setting the branch time to a Friday evening. Do people agree? regards, Harish. On Feb 21, 2014, at 11:04 AM, Brock Noland br...@cloudera.com wrote: +1 On Fri, Feb 21, 2014 at 1:02 PM, Thejas Nair the...@hortonworks.com wrote: Can we wait for some few more days for the branching ? I have a few more security fixes that I would like to get in, and we also have a long pre-commit queue ahead right now. How about branching around Friday next week ? By then hadoop 2.3 should also be out as that vote has been concluded, and we can get HIVE-6037 in as well. -Thejas On Sun, Feb 16, 2014 at 5:32 PM, Brock Noland br...@cloudera.com wrote: I'd love to see HIVE-6037 in the 0.13 release. I have +1'ed it pending tests. Brock On Sun, Feb 16, 2014 at 7:23 PM, Navis류승우 navis@nexr.com wrote: HIVE-6037 is for generating hive-default.template file from HiveConf. Could it be included in this release? If it's not, I'll suspend further rebasing of it till next release (conflicts too frequently). 2014-02-16 20:38 GMT+09:00 Lefty Leverenz leftylever...@gmail.com : I'll try to catch up on the wikidocs backlog for 0.13.0 patches in time for the release. It's a long and growing list, though, so no promises. Feel free to do your own documentation, or hand it off to a friendly in-house writer. -- Lefty, self-appointed Hive docs maven On Sat, Feb 15, 2014 at 1:28 PM, Thejas Nair the...@hortonworks.com wrote: Sounds good to me. On Fri, Feb 14, 2014 at 7:29 PM, Harish Butani hbut...@hortonworks.com wrote: Hi, Its mid feb. Wanted to check if the community is ready to cut a branch. Could we cut the branch in a week , say 5pm PST 2/21/14? 
The goal is to keep the release cycle short: couple of weeks; so after the branch we go into stabilizing mode for hive 0.13, checking in only blocker/critical bug fixes. regards, Harish. On Jan 20, 2014, at 9:25 AM, Brock Noland br...@cloudera.com wrote: Hi, I agree that picking a date to branch and then restricting commits to that branch would be a less time intensive plan for the RM. Brock On Sat, Jan 18, 2014 at 4:21 PM, Harish Butani hbut...@hortonworks.com wrote: Yes agree it is time to start planning for the next release. I would like to volunteer to do the release management duties for this release(will be a great experience for me) Will be happy to do it, if the community is fine with this. regards, Harish. On Jan
Re: Timeline for the Hive 0.13 release?
I would like HIVE-6455 and HIVE-4177 to go in the release.
HIVE-6455 - Scalable dynamic partitioning optimization (I already have a patch for it and it is under code review)
HIVE-4177 - Support partial scan for analyze command - ORC (I will post a patch within this week)
Thanks
Prasanth Jayachandran
Re: Timeline for the Hive 0.13 release?
I'd like to have the following go in:
HIVE-4764 [Support Kerberos HTTP authentication for HiveServer2 running in http mode] https://issues.apache.org/jira/browse/HIVE-4764
HIVE-6306 [HiveServer2 running in http mode should support doAs functionality] https://issues.apache.org/jira/browse/HIVE-6306
HIVE-6350 [Support LDAP authentication for HiveServer2 in http mode] https://issues.apache.org/jira/browse/HIVE-6350
HIVE-6485 [Downgrade to httpclient-4.2.5 in JDBC from httpclient-4.3.2] https://issues.apache.org/jira/browse/HIVE-6485
And it would be awesome to have HIVE-5155 (proxy user support for HS2).
Thanks,
--Vaibhav
[jira] [Commented] (HIVE-6414) ParquetInputFormat provides data values that do not match the object inspectors
[ https://issues.apache.org/jira/browse/HIVE-6414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920048#comment-13920048 ] Szehon Ho commented on HIVE-6414: - Hi Justin, thanks for taking care of it. Do you want to resubmit the patch for testing for this issue? There had been an issue where the pre-commit test queue got lost. ParquetInputFormat provides data values that do not match the object inspectors --- Key: HIVE-6414 URL: https://issues.apache.org/jira/browse/HIVE-6414 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Remus Rusanu Assignee: Justin Coffey Labels: Parquet Fix For: 0.13.0 Attachments: HIVE-6414.2.patch, HIVE-6414.3.patch, HIVE-6414.patch While working on HIVE-5998 I noticed that the ParquetRecordReader returns IntWritable for all 'int like' types, in disaccord with the row object inspectors. I thought, fine, and worked my way around it. But I see now that the issue triggers failures in other places, e.g. in aggregates:
{noformat}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {cint:528534767,ctinyint:31,csmallint:4963,cfloat:31.0,cdouble:4963.0,cstring1:cvLH6Eat2yFsyy7p}
  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
  ... 8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to java.lang.Short
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:808)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
  at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
  at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:524)
  ... 9 more
Caused by: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to java.lang.Short
  at org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaShortObjectInspector.get(JavaShortObjectInspector.java:41)
  at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:671)
  at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:631)
  at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMin$GenericUDAFMinEvaluator.merge(GenericUDAFMin.java:109)
  at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMin$GenericUDAFMinEvaluator.iterate(GenericUDAFMin.java:96)
  at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:183)
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:641)
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:838)
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:735)
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:803)
  ... 15 more
{noformat}
My test is (I'm writing a test .q from HIVE-5998, but the repro does not involve vectorization):
{noformat}
create table if not exists alltypes_parquet (
  cint int,
  ctinyint tinyint,
  csmallint smallint,
  cfloat float,
  cdouble double,
  cstring1 string) stored as parquet;

insert overwrite table alltypes_parquet
  select cint, ctinyint, csmallint, cfloat, cdouble, cstring1 from alltypesorc;

explain select * from alltypes_parquet limit 10;
select * from alltypes_parquet limit 10;

explain select ctinyint, max(cint), min(csmallint), count(cstring1), avg(cfloat), stddev_pop(cdouble)
  from alltypes_parquet group by ctinyint;
select ctinyint, max(cint), min(csmallint), count(cstring1), avg(cfloat), stddev_pop(cdouble)
  from alltypes_parquet group by ctinyint;
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
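The root cause in the trace above is a reader handing back a wider boxed type than the object inspector expects. A toy, Hadoop-free Java illustration of that failure mode (plain Integer stands in for IntWritable, Short for the expected wrapper — both substitutions are mine, not from the patch):

```java
// Toy reproduction of the failure class from the JIRA trace: a value boxed as
// one type cannot be cast to a different wrapper at runtime, mirroring the
// IntWritable -> Short mismatch in GenericUDAFMin above. No Hadoop dependency.
public class CastDemo {
    public static void main(String[] args) {
        Object value = Integer.valueOf(4963);  // what the record reader produced
        try {
            Short s = (Short) value;           // what the object inspector expected
            System.out.println("cast ok: " + s);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException, as in the JIRA trace");
        }
    }
}
```

The cast compiles (any Object may be cast to Short) but fails at runtime, which is why the bug only surfaces once an operator such as the group-by aggregate actually consumes the value.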
[jira] [Commented] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.
[ https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920061#comment-13920061 ] Vaibhav Gumashta commented on HIVE-6486: [~shivshi] Thanks for the updated patch. Can you also update the rb diff? It seems to have the older patch. Support secure Subject.doAs() in HiveServer2 JDBC client. - Key: HIVE-6486 URL: https://issues.apache.org/jira/browse/HIVE-6486 Project: Hive Issue Type: Improvement Components: Authentication, HiveServer2, JDBC Affects Versions: 0.11.0, 0.12.0 Reporter: Shivaraju Gowda Assignee: Shivaraju Gowda Fix For: 0.13.0 Attachments: HIVE-6486.1.patch, Hive_011_Support-Subject_doAS.patch, TestHive_SujectDoAs.java HIVE-5155 addresses the problem of Kerberos authentication in a multi-user middleware server using a proxy user. In this mode the principal used by the middleware server has privileges to impersonate selected users in Hive/Hadoop. This enhancement is to support Subject.doAs() authentication in the Hive JDBC layer so that the end user's Kerberos Subject is passed through in the middleware server. With this improvement there won't be any additional setup in the server to grant proxy privileges to some users, and there won't be a need to specify a proxy user in the JDBC client. This version should also be more secure since it won't require principals with the privileges to impersonate other users in the Hive/Hadoop setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
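A minimal, Kerberos-free sketch of the Subject.doAs() pattern the description refers to: the middleware runs the JDBC call inside Subject.doAs() so the security layer sees the end user's Subject. The empty Subject and the returned marker string are placeholders of mine; a real client would obtain the Subject from a JAAS Kerberos login and open the HiveServer2 connection inside the action:

```java
import javax.security.auth.Subject;
import java.security.PrivilegedExceptionAction;

public class DoAsSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder: a real middleware server would obtain this Subject from
        // the end user's JAAS/Kerberos LoginContext, not construct it empty.
        Subject endUser = new Subject();

        // Run the privileged action with the end user's Subject in scope.
        String result = Subject.doAs(endUser, (PrivilegedExceptionAction<String>) () -> {
            // A real client would call DriverManager.getConnection(jdbcUrl) here,
            // so the Kerberos credentials attached to the Subject are used.
            return "ran-with-" + endUser.getPrincipals().size() + "-principals";
        });
        System.out.println(result);
    }
}
```

This is why no proxy-user grant is needed server-side: the credentials presented to Hive are the end user's own, carried by the Subject rather than impersonated by the middleware principal.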
[jira] [Commented] (HIVE-6503) document pluggable authentication modules (PAM) in template config, wiki
[ https://issues.apache.org/jira/browse/HIVE-6503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920063#comment-13920063 ] Lefty Leverenz commented on HIVE-6503: -- You could update the parameter description in a release note, either here or on HIVE-6466 (or both). The wiki needs to be updated here: * [Setting Up HiveServer2: Authentication/Security Configuration |https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2#SettingUpHiveServer2-Authentication/SecurityConfiguration] (Eventually HiveServer2 configuration parameters will be documented in Configuration Properties, but they're not there yet.) document pluggable authentication modules (PAM) in template config, wiki Key: HIVE-6503 URL: https://issues.apache.org/jira/browse/HIVE-6503 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Thejas M Nair Assignee: Vaibhav Gumashta Priority: Blocker Fix For: 0.13.0 HIVE-6466 adds support for PAM as a supported value for hive.server2.authentication. It also adds a config parameter hive.server2.authentication.pam.services. The default template file needs to be updated to document these. The wiki docs should also document the support for pluggable authentication modules. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6414) ParquetInputFormat provides data values that do not match the object inspectors
[ https://issues.apache.org/jira/browse/HIVE-6414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920071#comment-13920071 ] Xuefu Zhang commented on HIVE-6414: --- +1 to the patch #3. ParquetInputFormat provides data values that do not match the object inspectors --- Key: HIVE-6414 URL: https://issues.apache.org/jira/browse/HIVE-6414 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Remus Rusanu Assignee: Justin Coffey Labels: Parquet Fix For: 0.13.0 Attachments: HIVE-6414.2.patch, HIVE-6414.3.patch, HIVE-6414.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6433) SQL std auth - allow grant/revoke roles if user has ADMIN OPTION
[ https://issues.apache.org/jira/browse/HIVE-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920084#comment-13920084 ] Lefty Leverenz commented on HIVE-6433: -- Does this need any documentation, besides general docs for the parent HIVE-5837? SQL std auth - allow grant/revoke roles if user has ADMIN OPTION Key: HIVE-6433 URL: https://issues.apache.org/jira/browse/HIVE-6433 Project: Hive Issue Type: Sub-task Reporter: Thejas M Nair Assignee: Ashutosh Chauhan Fix For: 0.13.0 Attachments: HIVE-6433.1.patch, HIVE-6433.2.patch, HIVE-6433.patch Follow-up jira for HIVE-5952. If a user/role has the admin option on a role, then the user should be able to grant/revoke other users to/from the role. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-1662) Add file pruning into Hive.
[ https://issues.apache.org/jira/browse/HIVE-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920087#comment-13920087 ] Hive QA commented on HIVE-1662: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12632450/HIVE-1662.12.patch.txt {color:green}SUCCESS:{color} +1 5240 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1619/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1619/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12632450 Add file pruning into Hive. --- Key: HIVE-1662 URL: https://issues.apache.org/jira/browse/HIVE-1662 Project: Hive Issue Type: New Feature Reporter: He Yongqiang Assignee: Navis Attachments: HIVE-1662.10.patch.txt, HIVE-1662.11.patch.txt, HIVE-1662.12.patch.txt, HIVE-1662.8.patch.txt, HIVE-1662.9.patch.txt, HIVE-1662.D8391.1.patch, HIVE-1662.D8391.2.patch, HIVE-1662.D8391.3.patch, HIVE-1662.D8391.4.patch, HIVE-1662.D8391.5.patch, HIVE-1662.D8391.6.patch, HIVE-1662.D8391.7.patch Hive now supports a filename virtual column. If a filename filter is present in a query, Hive should be able to add only the files that pass the filter to the input paths. -- This message was sent by Atlassian JIRA (v6.2#6252)
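For context, the filename virtual column the description refers to is Hive's INPUT__FILE__NAME. A sketch of the kind of query file pruning would optimize (the table name and path pattern are illustrative, not from the patch):

```sql
-- With file pruning, a filter on the INPUT__FILE__NAME virtual column could
-- let Hive add only the matching files to the input paths up front, instead
-- of scanning every file and discarding rows afterwards.
-- Table name `src_files` is hypothetical.
SELECT key, value
FROM src_files
WHERE INPUT__FILE__NAME LIKE '%part-00001%';
```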
[jira] [Updated] (HIVE-6492) limit partition number involved in a table scan
[ https://issues.apache.org/jira/browse/HIVE-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Selina Zhang updated HIVE-6492: --- Attachment: HIVE-6492.4.patch.txt Gunther, thanks for your comments! Removed the logic for the simple fetch query. Let the query pass if it is a fetch operator (no MapReduce job launched). However, I still need to put the logic right after the physical optimizers, because only at that point do I have the information on whether the query is a metadata-only query. limit partition number involved in a table scan --- Key: HIVE-6492 URL: https://issues.apache.org/jira/browse/HIVE-6492 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.12.0 Reporter: Selina Zhang Fix For: 0.13.0 Attachments: HIVE-6492.1.patch.txt, HIVE-6492.2.patch.txt, HIVE-6492.3.patch.txt, HIVE-6492.4.patch.txt Original Estimate: 24h Remaining Estimate: 24h To protect the cluster, a new configuration variable, hive.limit.query.max.table.partition, is added to the Hive configuration to limit the number of table partitions involved in a table scan. The default value is -1, which means there is no limit by default. This variable does not affect metadata-only queries. -- This message was sent by Atlassian JIRA (v6.2#6252)
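A sketch of how the new knob would be used in a session (the property name is taken from the JIRA description; the value 1000 and the `web_logs`/`dt` names are arbitrary illustrations of mine):

```sql
-- Cap any single table scan at 1000 partitions for this session.
-- The default of -1 means unlimited; property name per the JIRA text.
SET hive.limit.query.max.table.partition=1000;

-- A query touching more than 1000 partitions of a partitioned table would
-- now be rejected before a MapReduce job is launched; a metadata-only query
-- over the same table is unaffected. Table and column are hypothetical.
SELECT count(*) FROM web_logs WHERE dt >= '2014-01-01';
```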
[jira] [Updated] (HIVE-4293) Predicates following UDTF operator are removed by PPD
[ https://issues.apache.org/jira/browse/HIVE-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-4293: Attachment: HIVE-4293.10.patch Predicates following UDTF operator are removed by PPD - Key: HIVE-4293 URL: https://issues.apache.org/jira/browse/HIVE-4293 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Critical Attachments: D9933.6.patch, HIVE-4293.10.patch, HIVE-4293.7.patch.txt, HIVE-4293.8.patch.txt, HIVE-4293.9.patch.txt, HIVE-4293.D9933.1.patch, HIVE-4293.D9933.2.patch, HIVE-4293.D9933.3.patch, HIVE-4293.D9933.4.patch, HIVE-4293.D9933.5.patch For example,
{noformat}
explain SELECT value from (
  select explode(array(key, value)) as (value) from (
    select * FROM src WHERE key > 200
  ) A
) B WHERE value > 300;
{noformat}
Makes a plan like this, removing the last predicate:
{noformat}
TableScan
  alias: src
  Filter Operator
    predicate:
        expr: (key > 200.0)
        type: boolean
    Select Operator
      expressions:
            expr: array(key,value)
            type: array<string>
      outputColumnNames: _col0
      UDTF Operator
        function name: explode
        Select Operator
          expressions:
                expr: col
                type: string
          outputColumnNames: _col0
          File Output Operator
            compressed: false
            GlobalTableId: 0
            table:
                input format: org.apache.hadoop.mapred.TextInputFormat
                output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-4293) Predicates following UDTF operator are removed by PPD
[ https://issues.apache.org/jira/browse/HIVE-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920107#comment-13920107 ] Harish Butani commented on HIVE-4293: - [~navis] this looks good +1 attaching an updated patch, since the last one is a couple of months old. Also added the testcase from HIVE-5964. Please take a look; hope you don't mind that I uploaded an updated patch. Predicates following UDTF operator are removed by PPD - Key: HIVE-4293 URL: https://issues.apache.org/jira/browse/HIVE-4293 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Critical Attachments: D9933.6.patch, HIVE-4293.7.patch.txt, HIVE-4293.8.patch.txt, HIVE-4293.9.patch.txt, HIVE-4293.D9933.1.patch, HIVE-4293.D9933.2.patch, HIVE-4293.D9933.3.patch, HIVE-4293.D9933.4.patch, HIVE-4293.D9933.5.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (HIVE-4293) Predicates following UDTF operator are removed by PPD
[ https://issues.apache.org/jira/browse/HIVE-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920107#comment-13920107 ] Harish Butani edited comment on HIVE-4293 at 3/4/14 10:17 PM: -- [~navis] this looks good +1 attaching an updated patch, since the last one is couple of months old. - Had to resolve minor conflicts in SemAly. - Regenned .q.out files. - Also added the testcase from HIVE-5964 Please take a look; hope you don't mind that I uploaded an updated patch. was (Author: rhbutani): [~navis] this looks good +1 attaching an updated patch, since the last one is couple of months old. Also added the testcase from HIVE-5964 Please take a look; hope you don't mind that I uploaded an updated patch. Predicates following UDTF operator are removed by PPD - Key: HIVE-4293 URL: https://issues.apache.org/jira/browse/HIVE-4293 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Critical Attachments: D9933.6.patch, HIVE-4293.10.patch, HIVE-4293.7.patch.txt, HIVE-4293.8.patch.txt, HIVE-4293.9.patch.txt, HIVE-4293.D9933.1.patch, HIVE-4293.D9933.2.patch, HIVE-4293.D9933.3.patch, HIVE-4293.D9933.4.patch, HIVE-4293.D9933.5.patch For example, {noformat} explain SELECT value from ( select explode(array(key, value)) as (value) from ( select * FROM src WHERE key 200 ) A ) B WHERE value 300 ; {noformat} Makes plan like this, removing last predicates {noformat} TableScan alias: src Filter Operator predicate: expr: (key 200.0) type: boolean Select Operator expressions: expr: array(key,value) type: arraystring outputColumnNames: _col0 UDTF Operator function name: explode Select Operator expressions: expr: col type: string outputColumnNames: _col0 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat {noformat} -- This message was sent by Atlassian JIRA 
(v6.2#6252)
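The bug above is easiest to see by hand-running the example. A minimal Python sketch (the sample rows are made up, not from the Hive test suite) of the evaluation order the plan must preserve: the outer `value > 300` predicate applies to the rows *produced by* explode, so it must stay above the UDTF rather than being dropped by predicate pushdown:

```python
# Sketch of the HIVE-4293 example with made-up sample data: the outer
# predicate must run on the UDTF's output, not be removed by PPD.
rows = [{"key": 238, "value": 311}, {"key": 86, "value": 150}]

def explode(arrays):
    """Mimic Hive's explode(): one output row per array element."""
    for arr in arrays:
        for v in arr:
            yield v

# Inner query: SELECT * FROM src WHERE key > 200  (pushing this down is fine)
inner = [r for r in rows if r["key"] > 200]

# UDTF: explode(array(key, value))
exploded = list(explode([[r["key"], r["value"]] for r in inner]))

# Outer predicate: WHERE value > 300 -- must be applied AFTER the explode;
# the buggy plan simply dropped this filter.
result = [v for v in exploded if v > 300]
print(result)  # prints [311]
```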
[jira] [Commented] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.
[ https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920109#comment-13920109 ] Prasad Mujumdar commented on HIVE-6486: --- [~shivshi] My apologies for not looking into it earlier. The patch looks fine to me. Thanks for addressing the issue. I understand that we can't add a unit test for this since it needs all the security setup. There's an integration test added as part of the proposed HIVE-5155 patch. Once that's committed, I will try to add a test case to cover this. +1 Support secure Subject.doAs() in HiveServer2 JDBC client. - Key: HIVE-6486 URL: https://issues.apache.org/jira/browse/HIVE-6486 Project: Hive Issue Type: Improvement Components: Authentication, HiveServer2, JDBC Affects Versions: 0.11.0, 0.12.0 Reporter: Shivaraju Gowda Assignee: Shivaraju Gowda Fix For: 0.13.0 Attachments: HIVE-6486.1.patch, Hive_011_Support-Subject_doAS.patch, TestHive_SujectDoAs.java HIVE-5155 addresses the problem of kerberos authentication in a multi-user middleware server using a proxy user. In this mode the principal used by the middleware server has privileges to impersonate selected users in Hive/Hadoop. This enhancement is to support Subject.doAs() authentication in the Hive JDBC layer so that the end user's Kerberos Subject is passed through in the middleware server. With this improvement there won't be any additional setup in the server to grant proxy privileges to some users and there won't be a need to specify a proxy user in the JDBC client. This version should also be more secure since it won't require principals with the privileges to impersonate other users in the Hive/Hadoop setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6541) Need to write documentation for ACID work
[ https://issues.apache.org/jira/browse/HIVE-6541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920115#comment-13920115 ] Alan Gates commented on HIVE-6541: -- [~leftylev] I have a first draft of the docs. Should I just post it here in text format so we can iterate on it, and then post it to JIRA when this stuff gets committed? Need to write documentation for ACID work - Key: HIVE-6541 URL: https://issues.apache.org/jira/browse/HIVE-6541 Project: Hive Issue Type: Sub-task Components: Documentation Affects Versions: 0.13.0 Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.13.0 ACID introduces a number of new config file options, tables in the metastore, keywords in the grammar, and a new interface for use by tools like Storm and Flume. These need to be documented. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6492) limit partition number involved in a table scan
[ https://issues.apache.org/jira/browse/HIVE-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6492: - Attachment: HIVE-6492.4.patch_suggestion [~selinazh] - see .4_suggestion for what I meant by doing it in the semantic analyzer. That way it will work for both Tez and MR. I've also added a couple of tests. If you like, you can throw out the fetch task part to make it simpler. limit partition number involved in a table scan --- Key: HIVE-6492 URL: https://issues.apache.org/jira/browse/HIVE-6492 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.12.0 Reporter: Selina Zhang Fix For: 0.13.0 Attachments: HIVE-6492.1.patch.txt, HIVE-6492.2.patch.txt, HIVE-6492.3.patch.txt, HIVE-6492.4.patch.txt, HIVE-6492.4.patch_suggestion Original Estimate: 24h Remaining Estimate: 24h To protect the cluster, a new configuration variable hive.limit.query.max.table.partition is added to the hive configuration to limit the number of table partitions involved in a table scan. The default value will be set to -1, which means there is no limit by default. This variable will not affect metadata-only queries. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6492) limit partition number involved in a table scan
[ https://issues.apache.org/jira/browse/HIVE-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Selina Zhang updated HIVE-6492: --- Attachment: HIVE-6492.5.patch.txt Thank you, Gunther! I like your patch, though it does not really take care of metadata-only queries. But I agree putting it in SemanticAnalyzer is better. I just renamed the suggestion patch and re-submitted it. limit partition number involved in a table scan --- Key: HIVE-6492 URL: https://issues.apache.org/jira/browse/HIVE-6492 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.12.0 Reporter: Selina Zhang Fix For: 0.13.0 Attachments: HIVE-6492.1.patch.txt, HIVE-6492.2.patch.txt, HIVE-6492.3.patch.txt, HIVE-6492.4.patch.txt, HIVE-6492.4.patch_suggestion, HIVE-6492.5.patch.txt Original Estimate: 24h Remaining Estimate: 24h To protect the cluster, a new configuration variable hive.limit.query.max.table.partition is added to the hive configuration to limit the number of table partitions involved in a table scan. The default value will be set to -1, which means there is no limit by default. This variable will not affect metadata-only queries. -- This message was sent by Atlassian JIRA (v6.2#6252)
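For reference, a sketch of how the proposed knob would be exercised from a Hive session (the property name comes from the JIRA description; the limit value, table, and query are illustrative, and the exact error text may differ in the committed patch):

```sql
-- Cap any single table scan at 10 partitions; the default of -1 means no limit.
SET hive.limit.query.max.table.partition=10;

-- A scan touching more than 10 partitions of a table would now be rejected
-- at compile time with a SemanticException instead of running.
SELECT count(*) FROM srcpart WHERE ds >= '2008-04-08';
```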
Re: Review Request 13845: HIVE-5155: Support secure proxy user access to HiveServer2
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/13845/ --- (Updated March 4, 2014, 10:47 p.m.) Review request for hive, Brock Noland, Carl Steinbach, and Thejas Nair. Changes --- Corrected a merge conflict. Bugs: HIVE-5155 https://issues.apache.org/jira/browse/HIVE-5155 Repository: hive-git Description --- Delegation token support - Enable delegation token connection for HiveServer2 Enhance the TCLIService interface to support delegation token requests Support passing the delegation token connection type via JDBC URL and Beeline option Direct proxy access - Define new proxy user property Shim interfaces to validate proxy access for a given user Note that the diff doesn't include thrift generated code. Diffs (updated) - beeline/pom.xml 7449430 beeline/src/java/org/apache/hive/beeline/BeeLine.java 563d242 beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java 91e20ec beeline/src/java/org/apache/hive/beeline/Commands.java d2d7fd3 beeline/src/java/org/apache/hive/beeline/DatabaseConnection.java 94178ef beeline/src/test/org/apache/hive/beeline/ProxyAuthTest.java PRE-CREATION common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 388a604 conf/hive-default.xml.template 3f01e0b data/files/ProxyAuth.res PRE-CREATION itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java 8210e75 jdbc/src/java/org/apache/hadoop/hive/jdbc/HiveConnection.java d08e05b jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java 4102d7a jdbc/src/java/org/apache/hive/jdbc/Utils.java 608837e service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java d8ba3aa service/src/java/org/apache/hive/service/auth/KerberosSaslHelper.java 519556c service/src/java/org/apache/hive/service/auth/PlainSaslHelper.java 15b1675 service/src/java/org/apache/hive/service/cli/CLIService.java 2b1e712 service/src/java/org/apache/hive/service/cli/CLIServiceClient.java b9d1489 service/src/java/org/apache/hive/service/cli/EmbeddedCLIServiceClient.java 
a31ea94 service/src/java/org/apache/hive/service/cli/ICLIService.java 621d689 service/src/java/org/apache/hive/service/cli/session/HiveSession.java c8fb8ec service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java d6d0d27 service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java b934ebe service/src/java/org/apache/hive/service/cli/session/SessionManager.java cec3b04 service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 26bda5a service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIServiceClient.java 3675e86 service/src/test/org/apache/hive/service/auth/TestPlainSaslHelper.java 8fa4afd service/src/test/org/apache/hive/service/cli/session/TestSessionHooks.java 2fac800 shims/0.20/src/main/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java 51c8051 shims/common-secure/src/main/java/org/apache/hadoop/hive/shims/HadoopShimsSecure.java e205caa shims/common-secure/src/main/java/org/apache/hadoop/hive/thrift/DelegationTokenSecretManager.java 29114f0 shims/common-secure/src/main/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java dc89de1 shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java e15ab4e shims/common/src/main/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java 03f4e51 Diff: https://reviews.apache.org/r/13845/diff/ Testing --- Since this requires kerberos setup, its tested by a standalone test program that runs various existing and new secure connection scenarios. The test code is attached to the ticket at https://issues.apache.org/jira/secure/attachment/12600119/ProxyAuth.java Thanks, Prasad Mujumdar
[jira] [Commented] (HIVE-6492) limit partition number involved in a table scan
[ https://issues.apache.org/jira/browse/HIVE-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920146#comment-13920146 ] Gunther Hagleitner commented on HIVE-6492: -- [~selinazh] - sorry if I messed up the metadata-only part. Can you give me an example where the patch doesn't work? limit partition number involved in a table scan --- Key: HIVE-6492 URL: https://issues.apache.org/jira/browse/HIVE-6492 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.12.0 Reporter: Selina Zhang Fix For: 0.13.0 Attachments: HIVE-6492.1.patch.txt, HIVE-6492.2.patch.txt, HIVE-6492.3.patch.txt, HIVE-6492.4.patch.txt, HIVE-6492.4.patch_suggestion, HIVE-6492.5.patch.txt Original Estimate: 24h Remaining Estimate: 24h To protect the cluster, a new configuration variable hive.limit.query.max.table.partition is added to the hive configuration to limit the number of table partitions involved in a table scan. The default value will be set to -1, which means there is no limit by default. This variable will not affect metadata-only queries. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5155) Support secure proxy user access to HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasad Mujumdar updated HIVE-5155: -- Attachment: HIVE-5155-noThrift.8.patch Support secure proxy user access to HiveServer2 --- Key: HIVE-5155 URL: https://issues.apache.org/jira/browse/HIVE-5155 Project: Hive Issue Type: Improvement Components: Authentication, HiveServer2, JDBC Affects Versions: 0.12.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Attachments: HIVE-5155-1-nothrift.patch, HIVE-5155-noThrift.2.patch, HIVE-5155-noThrift.4.patch, HIVE-5155-noThrift.5.patch, HIVE-5155-noThrift.6.patch, HIVE-5155-noThrift.7.patch, HIVE-5155-noThrift.8.patch, HIVE-5155.1.patch, HIVE-5155.2.patch, HIVE-5155.3.patch, ProxyAuth.java, ProxyAuth.out, TestKERBEROS_Hive_JDBC.java The HiveServer2 can authenticate a client via Kerberos and impersonate the connecting user with the underlying secure hadoop. This becomes a gateway for a remote client to access a secure hadoop cluster. This works fine when the client obtains a Kerberos ticket and directly connects to HiveServer2. There's another big use case for middleware tools where the end user wants to access Hive via another server. For example, an Oozie action or Hue submitting queries, or a BI tool server accessing HiveServer2. In these cases, the third-party server doesn't have the end user's Kerberos credentials and hence it can't submit queries to HiveServer2 on behalf of the end user. This ticket is for enabling proxy access to HiveServer2 for third-party tools on behalf of end users. There are two parts to the solution proposed in this ticket: 1) Delegation token based connection for Oozie (OOZIE-1457) This is the common mechanism for Hadoop ecosystem components. Hive Remote Metastore and HCatalog already support this. This is suitable for a tool like Oozie that submits MR jobs as actions on behalf of its client. Oozie already uses a similar mechanism for Metastore/HCatalog access.
2) Direct proxy access for privileged hadoop users The delegation token implementation can be a challenge for non-hadoop (especially non-java) components. This second part enables a privileged user to directly specify an alternate session user during the connection. If the connecting user has hadoop-level privilege to impersonate the requested userid, then HiveServer2 will run the session as that requested user. For example, user Hue is allowed to impersonate user Bob (via core-site.xml proxy user configuration). Then user Hue can connect to HiveServer2 and specify Bob as the session user via a session property. HiveServer2 will verify Hue's proxy user privilege and then impersonate user Bob instead of Hue. This will enable any third-party tool to impersonate an alternate userid without having to implement a delegation token connection. -- This message was sent by Atlassian JIRA (v6.2#6252)
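The two connection modes described in this ticket correspond to different JDBC URL parameters. A hedged sketch of what the URLs might look like (the parameter names `auth=delegationToken` and `hive.server2.proxy.user` are taken from the patch under review and could differ in the committed version; the host, realm, and user names are placeholders):

```text
# 1) Delegation token based connection, e.g. from an Oozie action that has
#    already obtained a HiveServer2 token on the user's behalf:
jdbc:hive2://hs2host:10000/default;auth=delegationToken

# 2) Direct proxy access: privileged user "hue" connects with its own
#    Kerberos ticket and asks to run the session as user "bob":
jdbc:hive2://hs2host:10000/default;principal=hive/_HOST@EXAMPLE.COM;hive.server2.proxy.user=bob
```

In the second form, HiveServer2 checks hadoop's core-site.xml proxy-user rules before honoring the requested session user, so no Hive-specific grant is needed beyond the existing hadoop impersonation configuration.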
[jira] [Commented] (HIVE-5155) Support secure proxy user access to HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920152#comment-13920152 ] Prasad Mujumdar commented on HIVE-5155: --- [~thejas] [~vaibhavgumashta] The rebased patch is attached and the review was updated yesterday. I found a minor rebase conflict that I just fixed. Please take a look when you get a chance. Thanks! Support secure proxy user access to HiveServer2 --- Key: HIVE-5155 URL: https://issues.apache.org/jira/browse/HIVE-5155 Project: Hive Issue Type: Improvement Components: Authentication, HiveServer2, JDBC Affects Versions: 0.12.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Attachments: HIVE-5155-1-nothrift.patch, HIVE-5155-noThrift.2.patch, HIVE-5155-noThrift.4.patch, HIVE-5155-noThrift.5.patch, HIVE-5155-noThrift.6.patch, HIVE-5155-noThrift.7.patch, HIVE-5155-noThrift.8.patch, HIVE-5155.1.patch, HIVE-5155.2.patch, HIVE-5155.3.patch, ProxyAuth.java, ProxyAuth.out, TestKERBEROS_Hive_JDBC.java The HiveServer2 can authenticate a client via Kerberos and impersonate the connecting user with the underlying secure hadoop. This becomes a gateway for a remote client to access a secure hadoop cluster. This works fine when the client obtains a Kerberos ticket and directly connects to HiveServer2. There's another big use case for middleware tools where the end user wants to access Hive via another server. For example, an Oozie action or Hue submitting queries, or a BI tool server accessing HiveServer2. In these cases, the third-party server doesn't have the end user's Kerberos credentials and hence it can't submit queries to HiveServer2 on behalf of the end user. This ticket is for enabling proxy access to HiveServer2 for third-party tools on behalf of end users. There are two parts to the solution proposed in this ticket: 1) Delegation token based connection for Oozie (OOZIE-1457) This is the common mechanism for Hadoop ecosystem components.
Hive Remote Metastore and HCatalog already support this. This is suitable for a tool like Oozie that submits MR jobs as actions on behalf of its client. Oozie already uses a similar mechanism for Metastore/HCatalog access. 2) Direct proxy access for privileged hadoop users The delegation token implementation can be a challenge for non-hadoop (especially non-java) components. This second part enables a privileged user to directly specify an alternate session user during the connection. If the connecting user has hadoop-level privilege to impersonate the requested userid, then HiveServer2 will run the session as that requested user. For example, user Hue is allowed to impersonate user Bob (via core-site.xml proxy user configuration). Then user Hue can connect to HiveServer2 and specify Bob as the session user via a session property. HiveServer2 will verify Hue's proxy user privilege and then impersonate user Bob instead of Hue. This will enable any third-party tool to impersonate an alternate userid without having to implement a delegation token connection. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5687) Streaming support in Hive
[ https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated HIVE-5687: -- Issue Type: Sub-task (was: Bug) Parent: HIVE-5317 Streaming support in Hive - Key: HIVE-5687 URL: https://issues.apache.org/jira/browse/HIVE-5687 Project: Hive Issue Type: Sub-task Reporter: Roshan Naik Assignee: Roshan Naik Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, HIVE-5687.patch, HIVE-5687.v2.patch Implement support for Streaming data into HIVE. - Provide a client streaming API - Transaction support: Clients should be able to periodically commit a batch of records atomically - Immediate visibility: Records should be immediately visible to queries on commit - Should not overload HDFS with too many small files Use Cases: - Streaming logs into HIVE via Flume - Streaming results of computations from Storm -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.
[ https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shivaraju Gowda updated HIVE-6486: -- Attachment: HIVE-6486.2.patch The test failure in the pre-commit tests looks unrelated to the patch. The test case passed in my setup. I have rebased the patch to the trunk and am uploading it again. Support secure Subject.doAs() in HiveServer2 JDBC client. - Key: HIVE-6486 URL: https://issues.apache.org/jira/browse/HIVE-6486 Project: Hive Issue Type: Improvement Components: Authentication, HiveServer2, JDBC Affects Versions: 0.11.0, 0.12.0 Reporter: Shivaraju Gowda Assignee: Shivaraju Gowda Fix For: 0.13.0 Attachments: HIVE-6486.1.patch, HIVE-6486.2.patch, Hive_011_Support-Subject_doAS.patch, TestHive_SujectDoAs.java HIVE-5155 addresses the problem of kerberos authentication in a multi-user middleware server using a proxy user. In this mode the principal used by the middleware server has privileges to impersonate selected users in Hive/Hadoop. This enhancement is to support Subject.doAs() authentication in the Hive JDBC layer so that the end user's Kerberos Subject is passed through in the middleware server. With this improvement there won't be any additional setup in the server to grant proxy privileges to some users and there won't be a need to specify a proxy user in the JDBC client. This version should also be more secure since it won't require principals with the privileges to impersonate other users in the Hive/Hadoop setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.
[ https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shivaraju Gowda updated HIVE-6486: -- Status: Patch Available (was: Open) Support secure Subject.doAs() in HiveServer2 JDBC client. - Key: HIVE-6486 URL: https://issues.apache.org/jira/browse/HIVE-6486 Project: Hive Issue Type: Improvement Components: Authentication, HiveServer2, JDBC Affects Versions: 0.12.0, 0.11.0 Reporter: Shivaraju Gowda Assignee: Shivaraju Gowda Fix For: 0.13.0 Attachments: HIVE-6486.1.patch, HIVE-6486.2.patch, Hive_011_Support-Subject_doAS.patch, TestHive_SujectDoAs.java HIVE-5155 addresses the problem of kerberos authentication in a multi-user middleware server using a proxy user. In this mode the principal used by the middleware server has privileges to impersonate selected users in Hive/Hadoop. This enhancement is to support Subject.doAs() authentication in the Hive JDBC layer so that the end user's Kerberos Subject is passed through in the middleware server. With this improvement there won't be any additional setup in the server to grant proxy privileges to some users and there won't be a need to specify a proxy user in the JDBC client. This version should also be more secure since it won't require principals with the privileges to impersonate other users in the Hive/Hadoop setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.
[ https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shivaraju Gowda updated HIVE-6486: -- Status: Open (was: Patch Available) Support secure Subject.doAs() in HiveServer2 JDBC client. - Key: HIVE-6486 URL: https://issues.apache.org/jira/browse/HIVE-6486 Project: Hive Issue Type: Improvement Components: Authentication, HiveServer2, JDBC Affects Versions: 0.12.0, 0.11.0 Reporter: Shivaraju Gowda Assignee: Shivaraju Gowda Fix For: 0.13.0 Attachments: HIVE-6486.1.patch, HIVE-6486.2.patch, Hive_011_Support-Subject_doAS.patch, TestHive_SujectDoAs.java HIVE-5155 addresses the problem of kerberos authentication in a multi-user middleware server using a proxy user. In this mode the principal used by the middleware server has privileges to impersonate selected users in Hive/Hadoop. This enhancement is to support Subject.doAs() authentication in the Hive JDBC layer so that the end user's Kerberos Subject is passed through in the middleware server. With this improvement there won't be any additional setup in the server to grant proxy privileges to some users and there won't be a need to specify a proxy user in the JDBC client. This version should also be more secure since it won't require principals with the privileges to impersonate other users in the Hive/Hadoop setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5950) ORC SARG creation fails with NPE for predicate conditions with decimal/date/char/varchar datatypes
[ https://issues.apache.org/jira/browse/HIVE-5950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-5950: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks Prasanth! ORC SARG creation fails with NPE for predicate conditions with decimal/date/char/varchar datatypes -- Key: HIVE-5950 URL: https://issues.apache.org/jira/browse/HIVE-5950 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-5950.1.patch, HIVE-5950.2.patch, HIVE-5950.3.patch, HIVE-5950.4.patch, HIVE-5950.5.patch When a decimal or date column is used, the type field in PredicateLeafImpl will be set to null. This will result in an NPE during predicate leaf generation because of null dereferencing in the hashcode computation. SARG creation should be extended to support/handle decimal and date data types. -- This message was sent by Atlassian JIRA (v6.2#6252)
Review Request 18757: HIVE-6486 Support secure Subject.doAs() in HiveServer2 JDBC client
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18757/ --- Review request for hive, Kevin Minder, Prasad Mujumdar, Thejas Nair, and Vaibhav Gumashta. Bugs: HIVE-6486 https://issues.apache.org/jira/browse/HIVE-6486 Repository: hive Description --- Support secure Subject.doAs() in HiveServer2 JDBC client. Original review: https://reviews.apache.org/r/18464/ Diffs - http://svn.apache.org/repos/asf/hive/trunk/jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java 1574208 http://svn.apache.org/repos/asf/hive/trunk/service/src/java/org/apache/hive/service/auth/KerberosSaslHelper.java 1574208 http://svn.apache.org/repos/asf/hive/trunk/service/src/java/org/apache/hive/service/auth/TSubjectAssumingTransport.java PRE-CREATION Diff: https://reviews.apache.org/r/18757/diff/ Testing --- Manual testing. Thanks, Shivaraju Gowda
[jira] [Commented] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.
[ https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920193#comment-13920193 ] Shivaraju Gowda commented on HIVE-6486: --- Vaibhav Gumashta: I have created a review with the rebased trunk (ReviewBoard #18757). I couldn't edit the current review because I was not the owner. Support secure Subject.doAs() in HiveServer2 JDBC client. - Key: HIVE-6486 URL: https://issues.apache.org/jira/browse/HIVE-6486 Project: Hive Issue Type: Improvement Components: Authentication, HiveServer2, JDBC Affects Versions: 0.11.0, 0.12.0 Reporter: Shivaraju Gowda Assignee: Shivaraju Gowda Fix For: 0.13.0 Attachments: HIVE-6486.1.patch, HIVE-6486.2.patch, Hive_011_Support-Subject_doAS.patch, TestHive_SujectDoAs.java HIVE-5155 addresses the problem of kerberos authentication in a multi-user middleware server using a proxy user. In this mode the principal used by the middleware server has privileges to impersonate selected users in Hive/Hadoop. This enhancement is to support Subject.doAs() authentication in the Hive JDBC layer so that the end user's Kerberos Subject is passed through in the middleware server. With this improvement there won't be any additional setup in the server to grant proxy privileges to some users and there won't be a need to specify a proxy user in the JDBC client. This version should also be more secure since it won't require principals with the privileges to impersonate other users in the Hive/Hadoop setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.
[ https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920201#comment-13920201 ] Shivaraju Gowda commented on HIVE-6486: --- Prasad Mujumdar: Thanks for the review and the offer to add the test. The test case attached to this issue might serve as a good starting point. Support secure Subject.doAs() in HiveServer2 JDBC client. - Key: HIVE-6486 URL: https://issues.apache.org/jira/browse/HIVE-6486 Project: Hive Issue Type: Improvement Components: Authentication, HiveServer2, JDBC Affects Versions: 0.11.0, 0.12.0 Reporter: Shivaraju Gowda Assignee: Shivaraju Gowda Fix For: 0.13.0 Attachments: HIVE-6486.1.patch, HIVE-6486.2.patch, Hive_011_Support-Subject_doAS.patch, TestHive_SujectDoAs.java HIVE-5155 addresses the problem of kerberos authentication in a multi-user middleware server using a proxy user. In this mode the principal used by the middleware server has privileges to impersonate selected users in Hive/Hadoop. This enhancement is to support Subject.doAs() authentication in the Hive JDBC layer so that the end user's Kerberos Subject is passed through in the middleware server. With this improvement there won't be any additional setup in the server to grant proxy privileges to some users and there won't be a need to specify a proxy user in the JDBC client. This version should also be more secure since it won't require principals with the privileges to impersonate other users in the Hive/Hadoop setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.
[ https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920205#comment-13920205 ] Shivaraju Gowda commented on HIVE-6486: --- Thejas M Nair: Thanks for the review. Vaibhav Gumashta had some concerns about using the auth URL property to enable this functionality; once that is cleared, I will add the usage notes to the release notes section of the JIRA. I don't know where the wiki documentation is; can you point me to it? I will see if I can help. Support secure Subject.doAs() in HiveServer2 JDBC client. - Key: HIVE-6486 URL: https://issues.apache.org/jira/browse/HIVE-6486 Project: Hive Issue Type: Improvement Components: Authentication, HiveServer2, JDBC Affects Versions: 0.11.0, 0.12.0 Reporter: Shivaraju Gowda Assignee: Shivaraju Gowda Fix For: 0.13.0 Attachments: HIVE-6486.1.patch, HIVE-6486.2.patch, Hive_011_Support-Subject_doAS.patch, TestHive_SujectDoAs.java HIVE-5155 addresses the problem of kerberos authentication in a multi-user middleware server using a proxy user. In this mode the principal used by the middleware server has privileges to impersonate selected users in Hive/Hadoop. This enhancement is to support Subject.doAs() authentication in the Hive JDBC layer so that the end user's Kerberos Subject is passed through in the middleware server. With this improvement there won't be any additional setup in the server to grant proxy privileges to some users and there won't be a need to specify a proxy user in the JDBC client. This version should also be more secure since it won't require principals with the privileges to impersonate other users in the Hive/Hadoop setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6325) Enable using multiple concurrent sessions in tez
[ https://issues.apache.org/jira/browse/HIVE-6325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-6325: - Attachment: HIVE-6325.11.patch Rebasing patch. No tests affected. Enable using multiple concurrent sessions in tez Key: HIVE-6325 URL: https://issues.apache.org/jira/browse/HIVE-6325 Project: Hive Issue Type: Improvement Components: Tez Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-6325.1.patch, HIVE-6325.10.patch, HIVE-6325.11.patch, HIVE-6325.2.patch, HIVE-6325.3.patch, HIVE-6325.4.patch, HIVE-6325.5.patch, HIVE-6325.6.patch, HIVE-6325.7.patch, HIVE-6325.8.patch, HIVE-6325.9.patch We would like to enable multiple concurrent sessions in tez via hive server 2. This will enable users to make efficient use of the cluster when it has been partitioned using yarn queues. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6325) Enable using multiple concurrent sessions in tez
[ https://issues.apache.org/jira/browse/HIVE-6325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-6325: - Status: Patch Available (was: Open) Enable using multiple concurrent sessions in tez Key: HIVE-6325 URL: https://issues.apache.org/jira/browse/HIVE-6325 Project: Hive Issue Type: Improvement Components: Tez Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-6325.1.patch, HIVE-6325.10.patch, HIVE-6325.11.patch, HIVE-6325.2.patch, HIVE-6325.3.patch, HIVE-6325.4.patch, HIVE-6325.5.patch, HIVE-6325.6.patch, HIVE-6325.7.patch, HIVE-6325.8.patch, HIVE-6325.9.patch We would like to enable multiple concurrent sessions in tez via hive server 2. This will enable users to make efficient use of the cluster when it has been partitioned using yarn queues. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6539) Couple of issues in fs based stats collection
[ https://issues.apache.org/jira/browse/HIVE-6539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6539: --- Affects Version/s: 0.13.0 Couple of issues in fs based stats collection - Key: HIVE-6539 URL: https://issues.apache.org/jira/browse/HIVE-6539 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.13.0 Attachments: HIVE-6539.patch While testing on a cluster, found a couple of bugs: * NPE in a certain case. * map object reuse causing a problem -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6539) Couple of issues in fs based stats collection
[ https://issues.apache.org/jira/browse/HIVE-6539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6539: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Committed to trunk. Couple of issues in fs based stats collection - Key: HIVE-6539 URL: https://issues.apache.org/jira/browse/HIVE-6539 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.13.0 Attachments: HIVE-6539.patch While testing on a cluster, found a couple of bugs: * NPE in a certain case. * map object reuse causing a problem -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6545) analyze table throws NPE for non-existent tables.
[ https://issues.apache.org/jira/browse/HIVE-6545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920226#comment-13920226 ] Hive QA commented on HIVE-6545: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12632573/HIVE-6545.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5240 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_parallel_orderby {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1621/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1621/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12632573 analyze table throws NPE for non-existent tables. - Key: HIVE-6545 URL: https://issues.apache.org/jira/browse/HIVE-6545 Project: Hive Issue Type: Bug Components: Statistics Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-6545.patch Instead of NPE, we should give error message to user. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6492) limit partition number involved in a table scan
[ https://issues.apache.org/jira/browse/HIVE-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920237#comment-13920237 ] Selina Zhang commented on HIVE-6492: In the new test case limit_partition_2.q: select distinct hr from srcpart; should pass, because hr is the partition key. With the new patch, it is blocked: FAILED: SemanticException Number of partitions scanned (=4) on table srcpart exceeds limit (=1). This is controlled by hive.limit.query.max.table.partition. limit partition number involved in a table scan --- Key: HIVE-6492 URL: https://issues.apache.org/jira/browse/HIVE-6492 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.12.0 Reporter: Selina Zhang Fix For: 0.13.0 Attachments: HIVE-6492.1.patch.txt, HIVE-6492.2.patch.txt, HIVE-6492.3.patch.txt, HIVE-6492.4.patch.txt, HIVE-6492.4.patch_suggestion, HIVE-6492.5.patch.txt Original Estimate: 24h Remaining Estimate: 24h To protect the cluster, a new configuration variable hive.limit.query.max.table.partition is added to the hive configuration to limit the table partitions involved in a table scan. The default value will be set to -1, which means there is no limit by default. This variable will not affect metadata-only queries. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6529) Tez output files are out of date
[ https://issues.apache.org/jira/browse/HIVE-6529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920236#comment-13920236 ] Gunther Hagleitner commented on HIVE-6529: -- +1 Tez output files are out of date Key: HIVE-6529 URL: https://issues.apache.org/jira/browse/HIVE-6529 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Minor Attachments: HIVE-6529.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6545) analyze table throws NPE for non-existent tables.
[ https://issues.apache.org/jira/browse/HIVE-6545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6545: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Committed to trunk. analyze table throws NPE for non-existent tables. - Key: HIVE-6545 URL: https://issues.apache.org/jira/browse/HIVE-6545 Project: Hive Issue Type: Bug Components: Statistics Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.13.0 Attachments: HIVE-6545.patch Instead of NPE, we should give error message to user. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6548) Missing owner name and type fields in schema script for DBS table
Ashutosh Chauhan created HIVE-6548: -- Summary: Missing owner name and type fields in schema script for DBS table Key: HIVE-6548 URL: https://issues.apache.org/jira/browse/HIVE-6548 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.13.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan HIVE-6386 introduced new columns in DBS table, but those are missing from schema scripts. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6548) Missing owner name and type fields in schema script for DBS table
[ https://issues.apache.org/jira/browse/HIVE-6548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6548: --- Attachment: HIVE-6548.patch Patch to add missing columns in schema scripts. Missing owner name and type fields in schema script for DBS table -- Key: HIVE-6548 URL: https://issues.apache.org/jira/browse/HIVE-6548 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.13.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-6548.patch HIVE-6386 introduced new columns in DBS table, but those are missing from schema scripts. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6548) Missing owner name and type fields in schema script for DBS table
[ https://issues.apache.org/jira/browse/HIVE-6548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6548: --- Status: Patch Available (was: Open) Missing owner name and type fields in schema script for DBS table -- Key: HIVE-6548 URL: https://issues.apache.org/jira/browse/HIVE-6548 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.13.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-6548.patch HIVE-6386 introduced new columns in DBS table, but those are missing from schema scripts. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Timeline for the Hive 0.13 release?
I have two patches, still patch-available, that have +1s as well but are waiting on the pre-commit tests to pick them up before going into 0.13: https://issues.apache.org/jira/browse/HIVE-6507 (refactor of table property names from string constants to an enum in OrcFile) https://issues.apache.org/jira/browse/HIVE-6499 (fixes a bug where calls like create table and drop table can fail if metastore-side authorization is used in conjunction with custom inputformat/outputformat/serdes that are not loadable from the metastore-side)
[jira] [Updated] (HIVE-5843) Transaction manager for Hive
[ https://issues.apache.org/jira/browse/HIVE-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5843: --- Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Alan! Transaction manager for Hive Key: HIVE-5843 URL: https://issues.apache.org/jira/browse/HIVE-5843 Project: Hive Issue Type: Sub-task Affects Versions: 0.12.0 Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.13.0 Attachments: 5843.5-wip.patch, HIVE-5843-src-only.6.patch, HIVE-5843-src-only.patch, HIVE-5843.10.patch, HIVE-5843.2.patch, HIVE-5843.3-src.path, HIVE-5843.3.patch, HIVE-5843.4-src.patch, HIVE-5843.4.patch, HIVE-5843.6.patch, HIVE-5843.7.patch, HIVE-5843.8.patch, HIVE-5843.8.src-only.patch, HIVE-5843.9.patch, HIVE-5843.patch, HiveTransactionManagerDetailedDesign (1).pdf As part of the ACID work proposed in HIVE-5317 a transaction manager is required. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 15873: Query cancel should stop running MR tasks
On Feb. 27, 2014, 11:08 p.m., Thejas Nair wrote: ql/src/java/org/apache/hadoop/hive/ql/DriverContext.java, line 110 https://reviews.apache.org/r/15873/diff/3/?file=478815#file478815line110 When pollFinished is running, this shutdown() function will not be able to make progress, which means that the query cancellation will happen only after a task (which could be an MR task) is complete. It seems synchronizing around shutdown should be sufficient, either by making it volatile or having synchronized methods around it. Since thread-safe concurrent collection classes are being used here, I don't see other concurrency issues that would make it necessary to make all these functions synchronized. Navis Ryu wrote: It just polls the status of running tasks and goes into the wait state quite quickly, so it would not hinder the shutdown process. Furthermore, the two threads, polling and shutdown, have a race condition on both collections, runnable and running, so those should be guarded by something shared. Thejas Nair wrote: Yes, it will go into the wait state quickly. But I haven't understood how the wait helps here. There is no notify in this code, so the wait will always wait for 2 seconds. It will be no different from a sleep(2000). So it looks like the outer polling loop will continue until all the currently running jobs are complete. Per the javadoc for Object.wait(): "The current thread must own this object's monitor. The thread releases ownership of this monitor and waits until another thread notifies threads waiting on this object's monitor." In the wait state, any other thread can take the monitor (with sleep, that's not possible), so the shutdown thread does not need to wait for 2 seconds. The polling thread might notice shutdown up to 2 seconds late, as you said, because it's not notified. But I think that's not a big deal, is it? - Navis --- This is an automatically generated e-mail. 
To reply, visit: https://reviews.apache.org/r/15873/#review35625 --- On March 4, 2014, 8:02 a.m., Navis Ryu wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15873/ --- (Updated March 4, 2014, 8:02 a.m.) Review request for hive. Bugs: HIVE-5901 https://issues.apache.org/jira/browse/HIVE-5901 Repository: hive-git Description --- Currently, query canceling does not stop running MR job immediately. Diffs - ql/src/java/org/apache/hadoop/hive/ql/Driver.java 332cadb ql/src/java/org/apache/hadoop/hive/ql/DriverContext.java c51a9c8 ql/src/java/org/apache/hadoop/hive/ql/exec/ConditionalTask.java 854cd52 ql/src/java/org/apache/hadoop/hive/ql/exec/TaskRunner.java ead7b59 Diff: https://reviews.apache.org/r/15873/diff/ Testing --- Thanks, Navis Ryu
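The wait-versus-sleep point debated above can be demonstrated outside Hive. The following is a minimal standalone sketch (an illustration of the Java monitor semantics only, not Hive's actual DriverContext code; class and method names are invented for the example): a "poller" thread parks in Object.wait(2000) inside a synchronized block, and a "shutdown" thread shows it can still acquire the same monitor and notify almost immediately, which would not be possible if the poller were holding the lock inside Thread.sleep(2000).

```java
// Sketch: Object.wait() releases the monitor while waiting, so another thread
// can enter the same synchronized block right away; Thread.sleep() would not.
public class WaitVsSleepDemo {
    private static final Object lock = new Object();

    // Measures how long (ms) a "shutdown" thread needs to grab the monitor
    // while a "poller" thread is parked inside lock.wait(2000).
    static long timeShutdownAcquire() {
        Thread poller = new Thread(() -> {
            synchronized (lock) {
                try {
                    lock.wait(2000);   // releases the monitor while waiting
                } catch (InterruptedException ignored) { }
            }
        });
        try {
            poller.start();
            Thread.sleep(100);         // give the poller time to enter wait()
            long start = System.nanoTime();
            synchronized (lock) {      // acquired immediately: wait() freed the lock
                lock.notifyAll();      // also wakes the poller early
            }
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            poller.join();
            return elapsedMs;
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("shutdown thread acquired the monitor in ~"
                + timeShutdownAcquire() + " ms");
    }
}
```

This is Navis's point in the thread: the shutdown thread is not blocked for the 2-second interval; only the poller's wake-up may lag by up to 2 seconds because nothing notifies it in the real code.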
[jira] [Updated] (HIVE-6523) Tests with -Phadoop-2 and MiniMRCluster error if it doesn't find yarn-site.xml
[ https://issues.apache.org/jira/browse/HIVE-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6523: --- Resolution: Won't Fix Status: Resolved (was: Patch Available) Tests with -Phadoop-2 and MiniMRCluster error if it doesn't find yarn-site.xml -- Key: HIVE-6523 URL: https://issues.apache.org/jira/browse/HIVE-6523 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Environment: Hadoop 2.4.* Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-6523.patch With the newer hadoop versions (2.4+) in tests, MiniMRCluster throws an error loading resources if it can't find a yarn-site.xml in its classpath, which affects test runs with -Phadoop-2 and minimrclusters. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6523) Tests with -Phadoop-2 and MiniMRCluster error if it doesn't find yarn-site.xml
[ https://issues.apache.org/jira/browse/HIVE-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920277#comment-13920277 ] Sushanth Sowmyan commented on HIVE-6523: Closing as WONTFIX, since YARN-1758 has been fixed. This can be reopened at some time if this issue is observed again. Tests with -Phadoop-2 and MiniMRCluster error if it doesn't find yarn-site.xml -- Key: HIVE-6523 URL: https://issues.apache.org/jira/browse/HIVE-6523 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Environment: Hadoop 2.4.* Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-6523.patch With the newer hadoop versions (2.4+) in tests, MiniMRCluster throws an error loading resources if it can't find a yarn-site.xml in its classpath, which affects test runs with -Phadoop-2 and minimrclusters. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6549) removed templeton.jar from webhcat-default.xml
Eugene Koifman created HIVE-6549: Summary: removed templeton.jar from webhcat-default.xml Key: HIVE-6549 URL: https://issues.apache.org/jira/browse/HIVE-6549 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman This property is no longer used; also removed the corresponding AppConfig.TEMPLETON_JAR_NAME. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6549) removed templeton.jar from webhcat-default.xml
[ https://issues.apache.org/jira/browse/HIVE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6549: - Priority: Minor (was: Major) removed templeton.jar from webhcat-default.xml -- Key: HIVE-6549 URL: https://issues.apache.org/jira/browse/HIVE-6549 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Minor this property is no longer used also removed corresponding AppConfig.TEMPLETON_JAR_NAME -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Timeline for the Hive 0.13 release?
branching now. Will be changing the pom files on trunk. Will send another email when the branch and trunk changes are in. On Mar 4, 2014, at 4:03 PM, Sushanth Sowmyan khorg...@gmail.com wrote: I have two patches still as patch-available, that have had +1s as well, but are waiting on pre-commit tests picking them up go in to 0.13: https://issues.apache.org/jira/browse/HIVE-6507 (refactor of table property names from string constants to an enum in OrcFile) https://issues.apache.org/jira/browse/HIVE-6499 (fixes bug where calls like create table and drop table can fail if metastore-side authorization is used in conjunction with custom inputformat/outputformat/serdes that are not loadable from the metastore-side) -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
Re: Timeline for the Hive 0.13 release?
Hi, I have https://issues.apache.org/jira/browse/HIVE-6325 in patch available. It is awaiting pre-commit tests to run. I would like for it to go in as well. Thanks Vikram. On Tue, Mar 4, 2014 at 5:05 PM, Harish Butani hbut...@hortonworks.com wrote: branching now. Will be changing the pom files on trunk. Will send another email when the branch and trunk changes are in. On Mar 4, 2014, at 4:03 PM, Sushanth Sowmyan khorg...@gmail.com wrote: I have two patches still as patch-available, that have had +1s as well, but are waiting on pre-commit tests picking them up go in to 0.13: https://issues.apache.org/jira/browse/HIVE-6507 (refactor of table property names from string constants to an enum in OrcFile) https://issues.apache.org/jira/browse/HIVE-6499 (fixes bug where calls like create table and drop table can fail if metastore-side authorization is used in conjunction with custom inputformat/outputformat/serdes that are not loadable from the metastore-side)
[jira] [Commented] (HIVE-5888) group by after join operation product no result when hive.optimize.skewjoin = true
[ https://issues.apache.org/jira/browse/HIVE-5888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920354#comment-13920354 ] Jian Fang commented on HIVE-5888: - We had the same problem for Hive 0.11. Ankit created a JIRA at https://issues.apache.org/jira/browse/HIVE-6520. Based on Ankit's observation, here is the root cause of the problem: -- Hive's Skew join optimization is a physical optimization that changes the operator DAG (At compile time, Hive first creates a basic operator DAG and then various optimizations optimize it). After compile time skew join optimization, the skew join related nodes will look like: (MR job with Reduce Join Operator (Stage-1))-(Conditional Skew Join Task that performs Map Join (Stage-2)). When Skew Join optimization kicks in at compile time, it sets a flag handleSkewJoin in Stage-1. At run time, Stage-1 performs following (provided handleSkewJoin flag was set): 1. Join unskewed keys through normal MR job. 2. Copies data with skewed keys (from all tables) in a specific directory structure in hdfs. Stage-2 then picks the skewed data and performs Map Join. The Map Join of the skewed keys is the real optimization because it saves running reducer which has to copy intermediate data from mappers. Hive also has Map Join Optimization and this is the cause of the problem. A normal map-reduce join is converted to map join if (n-1) small tables can fit in memory. If this happens, at compile time, after both map join and skew join optimization, nodes will look like: (MR job with Map Join Operator (Stage-1))-(Conditional Skew Join Task that performs Map Join (Stage-2)). Now the problem is that Skew Join optimization sets handleSkewJoin only for Reduce Join Operator in Stage-1 (it assumes there will be a reducer). So, in case there is Map Join Operator, handleSkewJoin flag is not set and Stage-1 doesn't copy skewed keys in hdfs. 
When Stage-2 runs, it is not able to find the skewed key directory and it gets eliminated at run time. Therefore, no results are displayed. - I tried to set hive.optimize.skewjoin=true and hive.auto.convert.join=false so that stage-1 would not be converted to a mapjoin, to work around this problem. But the reduce phase in stage-1 took an extremely long time. We had 200 reducers; most of them only had 5 or 6 input keys, and all the remaining keys were distributed to two reducers. It seems the two reducers created very big RowContainer files on local disk, for example: -rwxrwxrwx 1 hadoop hadoop 334G Mar 5 00:56 RowContainer6650985529012862786.[129].tmp -rw-r--r-- 1 hadoop hadoop 2.7G Mar 5 00:56 .RowContainer6650985529012862786.[129].tmp.crc This behavior is really weird. The inconsistent results caused a lot of trouble for us. Is there any way to work around this problem? group by after join operation product no result when hive.optimize.skewjoin = true Key: HIVE-5888 URL: https://issues.apache.org/jira/browse/HIVE-5888 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: cyril liao -- This message was sent by Atlassian JIRA (v6.2#6252)
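The workaround described in the comment above can be written out as a short HiveQL fragment (a sketch only: the table names t1/t2 and key column k are hypothetical, not the reporter's actual job):

```sql
-- Force the reduce-side join so the compile-time skew-join branch applies,
-- working around the map-join + skew-join interaction described above.
set hive.optimize.skewjoin=true;
set hive.auto.convert.join=false;

-- Hypothetical query shape: group-by after a join on a skewed key t1.k.
select t1.k, count(*)
from t1 join t2 on (t1.k = t2.k)
group by t1.k;
```

As the comment notes, this trades performance for correctness: with auto-conversion disabled, the skewed keys pile up on a handful of reducers instead of being map-joined, which is exactly the slow reduce phase the reporter observed.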
[jira] [Commented] (HIVE-6492) limit partition number involved in a table scan
[ https://issues.apache.org/jira/browse/HIVE-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920358#comment-13920358 ] Selina Zhang commented on HIVE-6492: The test case in limit_partition_3.q should also pass: set hive.compute.query.using.stats=true; set hive.limit.query.max.table.partition=1; select count(*) from part; since it does not need a table scan. limit partition number involved in a table scan --- Key: HIVE-6492 URL: https://issues.apache.org/jira/browse/HIVE-6492 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.12.0 Reporter: Selina Zhang Fix For: 0.13.0 Attachments: HIVE-6492.1.patch.txt, HIVE-6492.2.patch.txt, HIVE-6492.3.patch.txt, HIVE-6492.4.patch.txt, HIVE-6492.4.patch_suggestion, HIVE-6492.5.patch.txt Original Estimate: 24h Remaining Estimate: 24h To protect the cluster, a new configuration variable hive.limit.query.max.table.partition is added to the hive configuration to limit the table partitions involved in a table scan. The default value will be set to -1, which means there is no limit by default. This variable will not affect metadata-only queries. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6392) Hive (and HCatalog) don't allow super-users to add partitions to tables.
[ https://issues.apache.org/jira/browse/HIVE-6392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920359#comment-13920359 ] Hive QA commented on HIVE-6392: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12632575/HIVE-6392.patch {color:green}SUCCESS:{color} +1 5244 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1622/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1622/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12632575 Hive (and HCatalog) don't allow super-users to add partitions to tables. Key: HIVE-6392 URL: https://issues.apache.org/jira/browse/HIVE-6392 Project: Hive Issue Type: Bug Components: Authorization Affects Versions: 0.12.0, 0.13.0 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Attachments: HIVE-6392.branch-0.12.patch, HIVE-6392.patch HDFS allows for users to be added to a supergroup (identified by the dfs.permissions.superusergroup key in hdfs-site.xml). Users in this group are allowed to modify HDFS contents regardless of the path's ogw permissions. However, Hive's StorageBasedAuthProvider disallows such a superuser from adding partitions to any table that doesn't explicitly grant write permissions to said superuser. This causes the odd scenario where the superuser writes data to a partition-directory (under the table's path), but can't register the appropriate partition. I have a patch that brings the Metastore's behaviour in line with what the HDFS allows. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Timeline for the Hive 0.13 release?
the branch is created. have changed the poms in both branches. Planning to setup a wikipage to track jiras that will get ported to 0.13 regards, Harish. On Mar 4, 2014, at 5:05 PM, Harish Butani hbut...@hortonworks.com wrote: branching now. Will be changing the pom files on trunk. Will send another email when the branch and trunk changes are in. On Mar 4, 2014, at 4:03 PM, Sushanth Sowmyan khorg...@gmail.com wrote: I have two patches still as patch-available, that have had +1s as well, but are waiting on pre-commit tests picking them up go in to 0.13: https://issues.apache.org/jira/browse/HIVE-6507 (refactor of table property names from string constants to an enum in OrcFile) https://issues.apache.org/jira/browse/HIVE-6499 (fixes bug where calls like create table and drop table can fail if metastore-side authorization is used in conjunction with custom inputformat/outputformat/serdes that are not loadable from the metastore-side)
[jira] [Commented] (HIVE-6548) Missing owner name and type fields in schema script for DBS table
[ https://issues.apache.org/jira/browse/HIVE-6548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920421#comment-13920421 ] Thejas M Nair commented on HIVE-6548: - +1 Missing owner name and type fields in schema script for DBS table -- Key: HIVE-6548 URL: https://issues.apache.org/jira/browse/HIVE-6548 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.13.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-6548.patch HIVE-6386 introduced new columns in DBS table, but those are missing from schema scripts. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-5931) SQL std auth - add metastore get_role_participants api - to support DESCRIBE ROLE
[ https://issues.apache.org/jira/browse/HIVE-5931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920423#comment-13920423 ] Ashutosh Chauhan commented on HIVE-5931: Few comments on proposed api: * Better name for 1st method : get_principals_in_role() ? * Better name for 2nd method : get_roles_granted_to_principal() ? * Also struct needs better name. Also, put explanation for struct, since it carries redundant info, depending on method it is used in. * principalType in struct should be enum SQL std auth - add metastore get_role_participants api - to support DESCRIBE ROLE - Key: HIVE-5931 URL: https://issues.apache.org/jira/browse/HIVE-5931 Project: Hive Issue Type: Sub-task Components: Authorization Reporter: Thejas M Nair Attachments: HIVE-5931.thriftapi.followup.patch, HIVE-5931.thriftapi.patch Original Estimate: 24h Remaining Estimate: 24h This is necessary for DESCRIBE ROLE role statement. This will list all users and roles that participate in a role. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6460) Need new show functionality for transactions
[ https://issues.apache.org/jira/browse/HIVE-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-6460: - Status: Patch Available (was: Open) Need new show functionality for transactions -- Key: HIVE-6460 URL: https://issues.apache.org/jira/browse/HIVE-6460 Project: Hive Issue Type: Sub-task Components: SQL Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.13.0 Attachments: 6460.wip.patch, HIVE-6460.patch With the addition of transactions and compactions for delta files some new show commands are required. * show transactions to show currently open or aborted transactions * show compactions to show currently waiting or running compactions * show locks needs to work with the new db style of locks as well. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6460) Need new show functionality for transactions
[ https://issues.apache.org/jira/browse/HIVE-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-6460: - Attachment: HIVE-6460.patch Need new show functionality for transactions -- Key: HIVE-6460 URL: https://issues.apache.org/jira/browse/HIVE-6460 Project: Hive Issue Type: Sub-task Components: SQL Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.13.0 Attachments: 6460.wip.patch, HIVE-6460.patch With the addition of transactions and compactions for delta files some new show commands are required. * show transactions to show currently open or aborted transactions * show compactions to show currently waiting or running compactions * show locks needs to work with the new db style of locks as well. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6392) Hive (and HCatalog) don't allow super-users to add partitions to tables.
[ https://issues.apache.org/jira/browse/HIVE-6392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-6392: Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Patch committed to trunk. Thanks for the contribution Mithun! Hive (and HCatalog) don't allow super-users to add partitions to tables. Key: HIVE-6392 URL: https://issues.apache.org/jira/browse/HIVE-6392 Project: Hive Issue Type: Bug Components: Authorization Affects Versions: 0.12.0, 0.13.0 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Fix For: 0.13.0 Attachments: HIVE-6392.branch-0.12.patch, HIVE-6392.patch HDFS allows for users to be added to a supergroup (identified by the dfs.permissions.superusergroup key in hdfs-site.xml). Users in this group are allowed to modify HDFS contents regardless of the path's ogw permissions. However, Hive's StorageBasedAuthProvider disallows such a superuser from adding partitions to any table that doesn't explicitly grant write permissions to said superuser. This causes the odd scenario where the superuser writes data to a partition-directory (under the table's path), but can't register the appropriate partition. I have a patch that brings the Metastore's behaviour in line with what the HDFS allows. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6549) removed templeton.jar from webhcat-default.xml
[ https://issues.apache.org/jira/browse/HIVE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920425#comment-13920425 ] Lefty Leverenz commented on HIVE-6549: -- When this gets committed, the wiki needs to be edited (with version information): * [WebHCat Configuration: Configuration Variables |https://cwiki.apache.org/confluence/display/Hive/WebHCat+Configure#WebHCatConfigure-ConfigurationVariables] The existing table shows configuration defaults for Hive 0.11.0, so they ought to be updated too. But if the only changes are 11 or 12 or 13 in file names and paths, then a note could explain that in the intro to the table. removed templeton.jar from webhcat-default.xml -- Key: HIVE-6549 URL: https://issues.apache.org/jira/browse/HIVE-6549 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Minor this property is no longer used also removed corresponding AppConfig.TEMPLETON_JAR_NAME -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6550) SemanticAnalyzer.reset() doesn't clear all the state
Laljo John Pullokkaran created HIVE-6550: Summary: SemanticAnalyzer.reset() doesn't clear all the state Key: HIVE-6550 URL: https://issues.apache.org/jira/browse/HIVE-6550 Project: Hive Issue Type: Bug Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Timeline for the Hive 0.13 release?
Tracking jiras to be applied to branch 0.13 here: https://cwiki.apache.org/confluence/display/Hive/Hive+0.13+release+status On Mar 4, 2014, at 5:45 PM, Harish Butani hbut...@hortonworks.com wrote: the branch is created. have changed the poms in both branches. Planning to setup a wikipage to track jiras that will get ported to 0.13 regards, Harish. On Mar 4, 2014, at 5:05 PM, Harish Butani hbut...@hortonworks.com wrote: branching now. Will be changing the pom files on trunk. Will send another email when the branch and trunk changes are in. On Mar 4, 2014, at 4:03 PM, Sushanth Sowmyan khorg...@gmail.com wrote: I have two patches still as patch-available, that have had +1s as well, but are waiting on pre-commit tests picking them up go in to 0.13: https://issues.apache.org/jira/browse/HIVE-6507 (refactor of table property names from string constants to an enum in OrcFile) https://issues.apache.org/jira/browse/HIVE-6499 (fixes bug where calls like create table and drop table can fail if metastore-side authorization is used in conjunction with custom inputformat/outputformat/serdes that are not loadable from the metastore-side)
[jira] [Commented] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.
[ https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920437#comment-13920437 ] Lefty Leverenz commented on HIVE-6486: -- Here's the user doc for HiveServer2 JDBC clients: * [HiveServer2 Clients: JDBC |https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-JDBC] Administration doc is here: * [Setting Up HiveServer2 |https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2] In particular: * [Setting Up HiveServer2: Authentication/Security Configuration |https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2#SettingUpHiveServer2-Authentication/SecurityConfiguration] Support secure Subject.doAs() in HiveServer2 JDBC client. - Key: HIVE-6486 URL: https://issues.apache.org/jira/browse/HIVE-6486 Project: Hive Issue Type: Improvement Components: Authentication, HiveServer2, JDBC Affects Versions: 0.11.0, 0.12.0 Reporter: Shivaraju Gowda Assignee: Shivaraju Gowda Fix For: 0.13.0 Attachments: HIVE-6486.1.patch, HIVE-6486.2.patch, Hive_011_Support-Subject_doAS.patch, TestHive_SujectDoAs.java HIVE-5155 addresses the problem of kerberos authentication in multi-user middleware server using proxy user. In this mode the principal used by the middle ware server has privileges to impersonate selected users in Hive/Hadoop. This enhancement is to support Subject.doAs() authentication in Hive JDBC layer so that the end users Kerberos Subject is passed through in the middle ware server. With this improvement there won't be any additional setup in the server to grant proxy privileges to some users and there won't be need to specify a proxy user in the JDBC client. This version should also be more secure since it won't require principals with the privileges to impersonate other users in Hive/Hadoop setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6433) SQL std auth - allow grant/revoke roles if user has ADMIN OPTION
[ https://issues.apache.org/jira/browse/HIVE-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920439#comment-13920439 ] Thejas M Nair commented on HIVE-6433: - I will comb through the issues and create a consolidated doc for parent HIVE-5837. This change is specific to HIVE-5837. SQL std auth - allow grant/revoke roles if user has ADMIN OPTION Key: HIVE-6433 URL: https://issues.apache.org/jira/browse/HIVE-6433 Project: Hive Issue Type: Sub-task Reporter: Thejas M Nair Assignee: Ashutosh Chauhan Fix For: 0.13.0 Attachments: HIVE-6433.1.patch, HIVE-6433.2.patch, HIVE-6433.patch Follow up jira for HIVE-5952. If a user/role has admin option on a role, then user should be able to grant /revoke other users to/from the role. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HIVE-6432) Remove deprecated methods in HCatalog
[ https://issues.apache.org/jira/browse/HIVE-6432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan reassigned HIVE-6432: -- Assignee: Sushanth Sowmyan Remove deprecated methods in HCatalog - Key: HIVE-6432 URL: https://issues.apache.org/jira/browse/HIVE-6432 Project: Hive Issue Type: Task Components: HCatalog Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan There are a lot of methods in HCatalog that have been deprecated in HCatalog 0.5, and some that were recently deprecated in Hive 0.11 (joint release with HCatalog). The goal for HCatalog deprecation is that in general, after something has been deprecated, it is expected to stay around for 2 releases, which means hive-0.13 will be the last release to ship with all the methods that were deprecated in hive-0.11 (the org.apache.hcatalog.* files should all be removed afterwards), and it is also good for us to clean out and nuke all other older deprecated methods. We should take this on early in a dev/release cycle to allow us time to resolve all fallout, so I propose that we remove all HCatalog deprecated methods after we branch out 0.13 and 0.14 becomes trunk. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6433) SQL std auth - allow grant/revoke roles if user has ADMIN OPTION
[ https://issues.apache.org/jira/browse/HIVE-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-6433: Release Note: If a user/role has admin option on a role, then user should be able to grant /revoke other users to/from the role. SQL std auth - allow grant/revoke roles if user has ADMIN OPTION Key: HIVE-6433 URL: https://issues.apache.org/jira/browse/HIVE-6433 Project: Hive Issue Type: Sub-task Reporter: Thejas M Nair Assignee: Ashutosh Chauhan Fix For: 0.13.0 Attachments: HIVE-6433.1.patch, HIVE-6433.2.patch, HIVE-6433.patch Follow up jira for HIVE-5952. If a user/role has admin option on a role, then user should be able to grant /revoke other users to/from the role. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6432) Remove deprecated methods in HCatalog
[ https://issues.apache.org/jira/browse/HIVE-6432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6432: --- Attachment: HIVE-6432.wip.1.patch Now that 0.13 has forked out, and 0.14 is trunk, it's time for mass destruction! I'm uploading a first attempt work-in-progress patch, which removes all org.apache.hcatalog entries. This is not backward-compatible, and removes the storage-handlers directory in hcat altogether. I still need to remove deprecated functions and api points in various classes. Remove deprecated methods in HCatalog - Key: HIVE-6432 URL: https://issues.apache.org/jira/browse/HIVE-6432 Project: Hive Issue Type: Task Components: HCatalog Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-6432.wip.1.patch There are a lot of methods in HCatalog that have been deprecated in HCatalog 0.5, and some that were recently deprecated in Hive 0.11 (joint release with HCatalog). The goal for HCatalog deprecation is that in general, after something has been deprecated, it is expected to stay around for 2 releases, which means hive-0.13 will be the last release to ship with all the methods that were deprecated in hive-0.11 (the org.apache.hcatalog.* files should all be removed afterwards), and it is also good for us to clean out and nuke all other older deprecated methods. We should take this on early in a dev/release cycle to allow us time to resolve all fallout, so I propose that we remove all HCatalog deprecated methods after we branch out 0.13 and 0.14 becomes trunk. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 18179: Support more generic way of using composite key for HBaseHandler
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18179/ --- (Updated March 5, 2014, 3:47 a.m.) Review request for hive. Changes --- Merged functionality of HIVE-2599 Bugs: HIVE-6411 https://issues.apache.org/jira/browse/HIVE-6411 Repository: hive-git Description --- HIVE-2599 introduced using custom object for the row key. But it forces key objects to extend HBaseCompositeKey, which is again extension of LazyStruct. If user provides proper Object and OI, we can replace internal key and keyOI with those. Initial implementation is based on factory interface. {code} public interface HBaseKeyFactory { void init(SerDeParameters parameters, Properties properties) throws SerDeException; ObjectInspector createObjectInspector(TypeInfo type) throws SerDeException; LazyObjectBase createObject(ObjectInspector inspector) throws SerDeException; } {code} Diffs (updated) - hbase-handler/pom.xml 7c3524c hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseCompositeKey.java 5008f15 hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseKeyFactory.java PRE-CREATION hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseLazyObjectFactory.java PRE-CREATION hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseScanRange.java PRE-CREATION hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java 2cd65cb hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java 29e5da5 hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseWritableKeyFactory.java PRE-CREATION hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java 704fcb9 hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseRow.java fc40195 hbase-handler/src/test/org/apache/hadoop/hive/hbase/HBaseTestCompositeKey.java 13c344b hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestHBaseKeyFactory.java PRE-CREATION hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestHBaseKeyFactory2.java PRE-CREATION 
hbase-handler/src/test/queries/positive/hbase_custom_key.q PRE-CREATION hbase-handler/src/test/queries/positive/hbase_custom_key2.q PRE-CREATION hbase-handler/src/test/results/positive/hbase_custom_key.q.out PRE-CREATION hbase-handler/src/test/results/positive/hbase_custom_key2.q.out PRE-CREATION itests/util/pom.xml 9885c53 ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 5995c14 ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java d39ee2e ql/src/java/org/apache/hadoop/hive/ql/index/IndexSearchCondition.java 5f1329c ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 647a9a6 ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStoragePredicateHandler.java 9f35575 ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java e50026b ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 10bae4d ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 40298e1 serde/src/java/org/apache/hadoop/hive/serde2/StructObject.java PRE-CREATION serde/src/java/org/apache/hadoop/hive/serde2/StructObjectBaseInspector.java PRE-CREATION serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarStructBase.java 1fd6853 serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyObject.java 10f4c05 serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyObjectBase.java 3334dff serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyStruct.java 8a1ea46 serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/LazySimpleStructObjectInspector.java 8a5386a serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryObject.java 598683f serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryStruct.java caf3517 Diff: https://reviews.apache.org/r/18179/diff/ Testing --- Thanks, Navis Ryu
[jira] [Updated] (HIVE-6411) Support more generic way of using composite key for HBaseHandler
[ https://issues.apache.org/jira/browse/HIVE-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-6411: Attachment: HIVE-6411.4.patch.txt Support more generic way of using composite key for HBaseHandler Key: HIVE-6411 URL: https://issues.apache.org/jira/browse/HIVE-6411 Project: Hive Issue Type: Improvement Components: HBase Handler Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6411.1.patch.txt, HIVE-6411.2.patch.txt, HIVE-6411.3.patch.txt, HIVE-6411.4.patch.txt HIVE-2599 introduced using custom object for the row key. But it forces key objects to extend HBaseCompositeKey, which is again extension of LazyStruct. If user provides proper Object and OI, we can replace internal key and keyOI with those. Initial implementation is based on factory interface.
{code}
public interface HBaseKeyFactory {
  void init(SerDeParameters parameters, Properties properties) throws SerDeException;
  ObjectInspector createObjectInspector(TypeInfo type) throws SerDeException;
  LazyObjectBase createObject(ObjectInspector inspector) throws SerDeException;
}
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
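The factory idea described above — let the user plug in an object that turns a raw HBase row key into struct fields — can be illustrated with a small, self-contained sketch. The `KeyFactory` and `DelimitedKeyFactory` types below are simplified stand-ins invented for illustration only; the real `HBaseKeyFactory` in the patch works with SerDe parameters and ObjectInspectors, not plain strings.

```java
import java.util.Arrays;
import java.util.List;

// Toy model of a pluggable composite-key factory (NOT the real Hive interface).
public class CompositeKeySketch {

    // Stand-in for the factory contract: parse a raw row key into struct fields.
    interface KeyFactory {
        List<String> parse(byte[] rowKey);
    }

    // One possible user implementation: key fields separated by a fixed delimiter.
    static class DelimitedKeyFactory implements KeyFactory {
        private final String delim;

        DelimitedKeyFactory(String delim) {
            this.delim = delim;
        }

        @Override
        public List<String> parse(byte[] rowKey) {
            return Arrays.asList(new String(rowKey).split(delim));
        }
    }

    public static void main(String[] args) {
        KeyFactory factory = new DelimitedKeyFactory("_");
        // A row key like "us_2014_hive" becomes three struct fields.
        System.out.println(factory.parse("us_2014_hive".getBytes())); // [us, 2014, hive]
    }
}
```

The point of the factory indirection is that the storage handler never needs to know the key layout; only the user-supplied factory does.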
[jira] [Updated] (HIVE-6455) Scalable dynamic partitioning and bucketing optimization
[ https://issues.apache.org/jira/browse/HIVE-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-6455: - Attachment: HIVE-6455.11.patch Addressed [~hagleitn]'s code review comments. This is intermediate checkin to look for precommit test failures. Scalable dynamic partitioning and bucketing optimization Key: HIVE-6455 URL: https://issues.apache.org/jira/browse/HIVE-6455 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: optimization Attachments: HIVE-6455.1.patch, HIVE-6455.1.patch, HIVE-6455.10.patch, HIVE-6455.10.patch, HIVE-6455.11.patch, HIVE-6455.2.patch, HIVE-6455.3.patch, HIVE-6455.4.patch, HIVE-6455.4.patch, HIVE-6455.5.patch, HIVE-6455.6.patch, HIVE-6455.7.patch, HIVE-6455.8.patch, HIVE-6455.9.patch, HIVE-6455.9.patch The current implementation of dynamic partition works by keeping at least one record writer open per dynamic partition directory. In case of bucketing there can be multispray file writers which further adds up to the number of open record writers. The record writers of column oriented file format (like ORC, RCFile etc.) keeps some sort of in-memory buffers (value buffer or compression buffers) open all the time to buffer up the rows and compress them before flushing it to disk. Since these buffers are maintained per column basis the amount of constant memory that will required at runtime increases as the number of partitions and number of columns per partition increases. This often leads to OutOfMemory (OOM) exception in mappers or reducers depending on the number of open record writers. Users often tune the JVM heapsize (runtime memory) to get over such OOM issues. With this optimization, the dynamic partition columns and bucketing columns (in case of bucketed tables) are sorted before being fed to the reducers. 
Since the partitioning and bucketing columns are sorted, each reducers can keep only one record writer open at any time thereby reducing the memory pressure on the reducers. This optimization is highly scalable as the number of partition and number of columns per partition increases at the cost of sorting the columns. -- This message was sent by Atlassian JIRA (v6.2#6252)
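The memory argument above — sorting rows by partition key lets each reducer hold only one open record writer — can be made concrete with a toy model. This is not Hive code: `peakOpenWriters` is a hypothetical helper that assumes a writer must stay open from a partition's first row to its last, and computes how many writers are open simultaneously at the worst moment.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of record-writer memory pressure in dynamic partition inserts.
public class WriterPressure {

    // A writer for a partition is open from that partition's first row to its
    // last row; the peak is the maximum number of overlapping open intervals.
    static int peakOpenWriters(List<String> partitionKeys) {
        Map<String, Integer> first = new HashMap<>();
        Map<String, Integer> last = new HashMap<>();
        for (int i = 0; i < partitionKeys.size(); i++) {
            first.putIfAbsent(partitionKeys.get(i), i);
            last.put(partitionKeys.get(i), i);
        }
        int peak = 0;
        for (int i = 0; i < partitionKeys.size(); i++) {
            int open = 0;
            for (String k : first.keySet()) {
                if (first.get(k) <= i && i <= last.get(k)) open++;
            }
            peak = Math.max(peak, open);
        }
        return peak;
    }

    public static void main(String[] args) {
        // Unsorted rows: every partition's interval overlaps -> 3 writers open at once.
        System.out.println(peakOpenWriters(Arrays.asList("a", "b", "c", "a", "b", "c"))); // 3
        // Rows sorted by partition key: intervals are disjoint -> 1 writer at a time.
        System.out.println(peakOpenWriters(Arrays.asList("a", "a", "b", "b", "c", "c"))); // 1
    }
}
```

With unsorted input the peak grows with the number of partitions (and, for columnar formats, with columns per partition); with sorted input it stays constant, which is exactly the OOM relief the jira describes.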
[jira] [Commented] (HIVE-6541) Need to write documentation for ACID work
[ https://issues.apache.org/jira/browse/HIVE-6541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920456#comment-13920456 ] Lefty Leverenz commented on HIVE-6541: -- bq. Should I just post it in here in text format Sounds good to me. Is it pretty much the same as InsertUpdatesinHive.pdf (attached to HIVE-5317)? Need to write documentation for ACID work - Key: HIVE-6541 URL: https://issues.apache.org/jira/browse/HIVE-6541 Project: Hive Issue Type: Sub-task Components: Documentation Affects Versions: 0.13.0 Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.13.0 ACID introduces a number of new config file options, tables in the metastore, keywords in the grammar, and a new interface for use of tools like storm and flume. These need to be documented. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-4293) Predicates following UDTF operator are removed by PPD
[ https://issues.apache.org/jira/browse/HIVE-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-4293: Attachment: HIVE-4293.11.patch.txt Predicates following UDTF operator are removed by PPD - Key: HIVE-4293 URL: https://issues.apache.org/jira/browse/HIVE-4293 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Critical Attachments: D9933.6.patch, HIVE-4293.10.patch, HIVE-4293.11.patch.txt, HIVE-4293.7.patch.txt, HIVE-4293.8.patch.txt, HIVE-4293.9.patch.txt, HIVE-4293.D9933.1.patch, HIVE-4293.D9933.2.patch, HIVE-4293.D9933.3.patch, HIVE-4293.D9933.4.patch, HIVE-4293.D9933.5.patch For example, {noformat} explain SELECT value from ( select explode(array(key, value)) as (value) from ( select * FROM src WHERE key 200 ) A ) B WHERE value 300 ; {noformat} Makes plan like this, removing last predicates {noformat} TableScan alias: src Filter Operator predicate: expr: (key 200.0) type: boolean Select Operator expressions: expr: array(key,value) type: arraystring outputColumnNames: _col0 UDTF Operator function name: explode Select Operator expressions: expr: col type: string outputColumnNames: _col0 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-4293) Predicates following UDTF operator are removed by PPD
[ https://issues.apache.org/jira/browse/HIVE-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920459#comment-13920459 ] Navis commented on HIVE-4293: - Merged your patch with partial fix in HIVE-4598. Let's see the test result. Predicates following UDTF operator are removed by PPD - Key: HIVE-4293 URL: https://issues.apache.org/jira/browse/HIVE-4293 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Critical Attachments: D9933.6.patch, HIVE-4293.10.patch, HIVE-4293.11.patch.txt, HIVE-4293.7.patch.txt, HIVE-4293.8.patch.txt, HIVE-4293.9.patch.txt, HIVE-4293.D9933.1.patch, HIVE-4293.D9933.2.patch, HIVE-4293.D9933.3.patch, HIVE-4293.D9933.4.patch, HIVE-4293.D9933.5.patch For example, {noformat} explain SELECT value from ( select explode(array(key, value)) as (value) from ( select * FROM src WHERE key 200 ) A ) B WHERE value 300 ; {noformat} Makes plan like this, removing last predicates {noformat} TableScan alias: src Filter Operator predicate: expr: (key 200.0) type: boolean Select Operator expressions: expr: array(key,value) type: arraystring outputColumnNames: _col0 UDTF Operator function name: explode Select Operator expressions: expr: col type: string outputColumnNames: _col0 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6549) removed templeton.jar from webhcat-default.xml
[ https://issues.apache.org/jira/browse/HIVE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920461#comment-13920461 ] Eugene Koifman commented on HIVE-6549: -- I'm not sure it's useful to maintain Configuration Variables section. Each variable is/should be documented in webhcat-default.xml (there is a special 'description' xml element there for it). Copying it to wiki only adds maintenance effort. The rest of the page is useful. removed templeton.jar from webhcat-default.xml -- Key: HIVE-6549 URL: https://issues.apache.org/jira/browse/HIVE-6549 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Minor this property is no longer used also removed corresponding AppConfig.TEMPLETON_JAR_NAME -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 16281: Predicates following UDTF operator are removed by PPD
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/16281/ --- (Updated March 5, 2014, 4:07 a.m.) Review request for hive. Changes --- + Works of Harish + Partial fix in HIVE-4598 Bugs: HIVE-4293 https://issues.apache.org/jira/browse/HIVE-4293 Repository: hive-git Description --- For example, {noformat} explain SELECT value from ( select explode(array(key, value)) as (value) from ( select * FROM src WHERE key 200 ) A ) B WHERE value 300 ; {noformat} Makes plan like this, removing last predicates {noformat} TableScan alias: src Filter Operator predicate: expr: (key 200.0) type: boolean Select Operator expressions: expr: array(key,value) type: arraystring outputColumnNames: _col0 UDTF Operator function name: explode Select Operator expressions: expr: col type: string outputColumnNames: _col0 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat {noformat} Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/exec/LateralViewJoinOperator.java 2fbb81b ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java c378dc7 ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java 326654f ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java 0798470 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 89d2a9c ql/src/java/org/apache/hadoop/hive/ql/plan/LateralViewJoinDesc.java ebfcfc8 ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerInfo.java 6a3dd99 ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 40298e1 ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicatePushDown.java cd5ae51 ql/src/test/queries/clientpositive/lateral_view_ppd.q 7be86a6 ql/src/test/queries/clientpositive/ppd_join4.q PRE-CREATION ql/src/test/queries/clientpositive/ppd_transform.q 65a498d ql/src/test/queries/clientpositive/ppd_udtf.q PRE-CREATION 
ql/src/test/results/clientpositive/cluster.q.out 0cd0886 ql/src/test/results/clientpositive/ctas_colname.q.out 3d568ab ql/src/test/results/clientpositive/lateral_view_ppd.q.out da77f75 ql/src/test/results/clientpositive/ppd2.q.out 2f2c558 ql/src/test/results/clientpositive/ppd_gby.q.out 68092e0 ql/src/test/results/clientpositive/ppd_gby2.q.out a8ccace ql/src/test/results/clientpositive/ppd_join4.q.out PRE-CREATION ql/src/test/results/clientpositive/ppd_transform.q.out e7c07ed ql/src/test/results/clientpositive/ppd_udtf.q.out PRE-CREATION ql/src/test/results/clientpositive/udtf_json_tuple.q.out f151740 ql/src/test/results/clientpositive/udtf_parse_url_tuple.q.out 74d9e96 ql/src/test/results/compiler/plan/join1.q.xml 12b01ce ql/src/test/results/compiler/plan/join2.q.xml ed5bbb8 ql/src/test/results/compiler/plan/join3.q.xml 5437afa ql/src/test/results/compiler/plan/join4.q.xml aa69ada ql/src/test/results/compiler/plan/join5.q.xml ef0c69d ql/src/test/results/compiler/plan/join6.q.xml da528f5 ql/src/test/results/compiler/plan/join7.q.xml fcacc6d ql/src/test/results/compiler/plan/join8.q.xml c7591a4 Diff: https://reviews.apache.org/r/16281/diff/ Testing --- Thanks, Navis Ryu
[jira] [Commented] (HIVE-5761) Implement vectorized support for the DATE data type
[ https://issues.apache.org/jira/browse/HIVE-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920464#comment-13920464 ] Lefty Leverenz commented on HIVE-5761: -- Does this need any user documentation? Implement vectorized support for the DATE data type --- Key: HIVE-5761 URL: https://issues.apache.org/jira/browse/HIVE-5761 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Teddy Choi Attachments: HIVE-5761.1.patch, HIVE-5761.2.patch, HIVE-5761.3.patch, HIVE-5761.4.patch, HIVE-5761.5.patch, HIVE-5761.6.patch, HIVE-5761.6.patch Add support to allow queries referencing DATE columns and expression results to run efficiently in vectorized mode. This should re-use the code for the the integer/timestamp types to the extent possible and beneficial. Include unit tests and end-to-end tests. Consider re-using or extending existing end-to-end tests for vectorized integer and/or timestamp operations. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6549) removed templeton.jar from webhcat-default.xml
[ https://issues.apache.org/jira/browse/HIVE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920483#comment-13920483 ] Lefty Leverenz commented on HIVE-6549: -- Readability is the main advantage of putting config variables in the wiki. Some readers might also like seeing all the variables along with general configuration information, without having to hunt for webhcat-default.xml. But you're right about the maintenance problem. I'd say go ahead and remove the table from the wiki, but perhaps we need a few more opinions. removed templeton.jar from webhcat-default.xml -- Key: HIVE-6549 URL: https://issues.apache.org/jira/browse/HIVE-6549 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Minor this property is no longer used also removed corresponding AppConfig.TEMPLETON_JAR_NAME -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-5888) group by after join operation product no result when hive.optimize.skewjoin = true
[ https://issues.apache.org/jira/browse/HIVE-5888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920493#comment-13920493 ] Navis commented on HIVE-5888: - I believe this is fixed by HIVE-6041, which is included in hive-0.13.0. There remains a minor issue in explain result. But it makes valid result now. group by after join operation product no result when hive.optimize.skewjoin = true Key: HIVE-5888 URL: https://issues.apache.org/jira/browse/HIVE-5888 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: cyril liao -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6290) Add support for hbase filters for composite keys
[ https://issues.apache.org/jira/browse/HIVE-6290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920496#comment-13920496 ] Navis commented on HIVE-6290: - I've regarded that as a following issue but I've merged your patch into HIVE-6411. Add support for hbase filters for composite keys Key: HIVE-6290 URL: https://issues.apache.org/jira/browse/HIVE-6290 Project: Hive Issue Type: Sub-task Components: HBase Handler Affects Versions: 0.12.0 Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni Attachments: HIVE-6290.1.patch.txt, HIVE-6290.2.patch.txt, HIVE-6290.3.patch.txt Add support for filters to be provided via the composite key class -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6520) Skew Join optimization doesn't work if parent gets converted to MapJoin task
[ https://issues.apache.org/jira/browse/HIVE-6520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920505#comment-13920505 ] Navis commented on HIVE-6520: - MapJoinOperator cannot handle skew join, which requires knowing the total row count for each join key. We could disable converting to MapJoin when it is for a skew join, but if the join can be converted to MapJoin, that would be faster than handling it via the classical skew join path. Skew Join optimization doesn't work if parent gets converted to MapJoin task Key: HIVE-6520 URL: https://issues.apache.org/jira/browse/HIVE-6520 Project: Hive Issue Type: Bug Affects Versions: 0.11.0 Reporter: Ankit Kamboj Skew join optimization (GenMRSkewJoinProcessor.java) assumes that its parent stage (that will create directory structure for skewed keys) will have a Reduce Join Operator. GenMRSkewJoinProcessor sets the handleSkewJoin flag only in that case. But it is possible that parent stage gets converted to MapJoin task (because of hive.auto.convert.join flag). In that case handleSkewJoin is not set for parent stage and it will not create directory structure for skewed keys in hdfs. This eventually leads to elimination of skew join conditional task (and its children) because the conditional task is not able to find the skewed key directories. Shouldn't the MapJoinOperator also handle skew join and create directory structure for skewed keys in addition to performing map join for the non-skewed keys? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6519) Allow optional as in subquery definition
[ https://issues.apache.org/jira/browse/HIVE-6519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920506#comment-13920506 ] Lefty Leverenz commented on HIVE-6519: -- Documented in the wiki here: * [SubQueries: Subqueries in the FROM Clause |https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SubQueries#LanguageManualSubQueries-SubqueriesintheFROMClause] by adding a second line of syntax: {code} SELECT ... FROM (subquery) name ... SELECT ... FROM (subquery) AS name ... (Note: Only valid starting with Hive 0.13.0) {code} and this text: bq. The optional keyword AS can be included before the subquery name in Hive 0.13.0 and later versions (HIVE-6519). Allow optional as in subquery definition -- Key: HIVE-6519 URL: https://issues.apache.org/jira/browse/HIVE-6519 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Priority: Minor Fix For: 0.13.0 Attachments: HIVE-6519.1.patch Allow both: select * from (select * from foo) bar select * from (select * from foo) as bar -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6551) group by after join with skew join optimization references invalid task sometimes
Navis created HIVE-6551: --- Summary: group by after join with skew join optimization references invalid task sometimes Key: HIVE-6551 URL: https://issues.apache.org/jira/browse/HIVE-6551 Project: Hive Issue Type: Bug Reporter: Navis Assignee: Navis Priority: Trivial For example, {noformat} hive set hive.auto.convert.join = true; hive set hive.optimize.skewjoin = true; hive set hive.skewjoin.key = 3; hive EXPLAIN FROM (SELECT src.* FROM src) x JOIN (SELECT src.* FROM src) Y ON (x.key = Y.key) SELECT sum(hash(Y.key)), sum(hash(Y.value)); OK STAGE DEPENDENCIES: Stage-8 is a root stage Stage-6 depends on stages: Stage-8 Stage-5 depends on stages: Stage-6 , consists of Stage-4, Stage-2 Stage-4 Stage-2 depends on stages: Stage-4, Stage-1 Stage-0 is a root stage ... {noformat} Stage-2 references not-existing Stage-1 -- This message was sent by Atlassian JIRA (v6.2#6252)
Getting difficulty in the way to work on HiveQL (Appache Hadoop, Big Data Analytic Platform)
Dear Sir, With due respect I would like to mention that I, Gaurav Kumar, am a M.Tech. student in Computer Science at Jawaharlal Nehru University, New Delhi. I am currently working on my dissertation titled 'Optimization of SQL query on Hive Platform (Apache Hive)'. I am unable to carry forward my work properly due to lack of valuable guidance and experience, leading to enormously increasing stress. As such, it would be very helpful if you would provide me precious and constructive suggestions/guidance on my dissertation work. Kindly please spare a few moments of your time to help me with your valuable knowledge and experience garnered while working on this subject. Hope to get a positive reply at the earliest. Thanking You. Yours sincerely Gaurav Kumar gauravsp1...@yahoo.com
[jira] [Updated] (HIVE-6060) Define API for RecordUpdater and UpdateReader
[ https://issues.apache.org/jira/browse/HIVE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-6060: Attachment: h-6060.patch This patch puts everything together: * Defines AcidInputFormat and AcidOutputFormat. * Extends OrcInputFormat and OrcOutputFormat to implement them. * Creates AcidUtils to figure out which base and deltas need to be read. * Provides raw interfaces that the compactor uses to re-write small files. * Moves ValidTxnList and ValidTxnListImpl to common where they can be used by code in mapreduce tasks and the metastore. * Adds an interface to Orc Writers that provides callbacks when stripes are being written. * Adds a method to Orc Writers that allow the client to write the current stripe to disk and writes a temporary footer before the writer continues to write new stripes. Define API for RecordUpdater and UpdateReader - Key: HIVE-6060 URL: https://issues.apache.org/jira/browse/HIVE-6060 Project: Hive Issue Type: Sub-task Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: acid-io.patch, h-5317.patch, h-5317.patch, h-5317.patch, h-6060.patch, h-6060.patch We need to define some new APIs for how Hive interacts with the file formats since it needs to be much richer than the current RecordReader and RecordWriter. -- This message was sent by Atlassian JIRA (v6.2#6252)
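The AcidUtils responsibility described in the patch summary — figuring out which base and deltas need to be read — can be sketched roughly. The `base_N` / `delta_lo_hi` directory-name convention and the `filesToRead` helper below are assumptions made for illustration (a base holding all rows through transaction N, and deltas holding transactions lo..hi), not the patch's actual code or on-disk format.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hedged sketch of base/delta selection for an ACID read (assumed naming scheme).
public class AcidDirSketch {

    // Pick the newest base, then every delta whose transactions are not
    // already compacted into that base.
    static List<String> filesToRead(List<String> dirs) {
        long bestBase = -1;
        for (String d : dirs) {
            if (d.startsWith("base_")) {
                bestBase = Math.max(bestBase, Long.parseLong(d.substring(5)));
            }
        }
        List<String> result = new ArrayList<>();
        if (bestBase >= 0) result.add("base_" + bestBase);
        for (String d : dirs) {
            if (d.startsWith("delta_")) {
                String[] parts = d.split("_");
                long lo = Long.parseLong(parts[1]);
                if (lo > bestBase) result.add(d); // not yet covered by the base
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList("base_5", "base_10",
                "delta_6_10", "delta_11_15", "delta_16_20");
        System.out.println(filesToRead(dirs));
        // -> [base_10, delta_11_15, delta_16_20]
    }
}
```

This also suggests why the raw compactor interfaces mentioned in the patch exist: once deltas are merged into a new base, older bases and covered deltas can simply be dropped from the read set.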
[jira] [Updated] (HIVE-6060) Define API for RecordUpdater and UpdateReader
[ https://issues.apache.org/jira/browse/HIVE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-6060: Status: Patch Available (was: Open) Define API for RecordUpdater and UpdateReader - Key: HIVE-6060 URL: https://issues.apache.org/jira/browse/HIVE-6060 Project: Hive Issue Type: Sub-task Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: acid-io.patch, h-5317.patch, h-5317.patch, h-5317.patch, h-6060.patch, h-6060.patch We need to define some new APIs for how Hive interacts with the file formats since it needs to be much richer than the current RecordReader and RecordWriter.
Re: Getting difficulty in the way to work on HiveQL (Appache Hadoop, Big Data Analytic Platform)
Hey Gaurav, what is the problem you are facing? On 5 Mar 2014 10:41, Gaurav Kumar gauravsp1...@yahoo.com wrote: [original message quoted above]
[jira] [Commented] (HIVE-6492) limit partition number involved in a table scan
[ https://issues.apache.org/jira/browse/HIVE-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920579#comment-13920579 ] Hive QA commented on HIVE-6492:
---
{color:red}Overall{color}: -1, at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12632689/HIVE-6492.5.patch.txt

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5358 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin6
org.apache.hive.beeline.TestSchemaTool.testSchemaInit
org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade
{noformat}

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1623/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1623/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated. ATTACHMENT ID: 12632689

limit partition number involved in a table scan --- Key: HIVE-6492 URL: https://issues.apache.org/jira/browse/HIVE-6492 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.12.0 Reporter: Selina Zhang Fix For: 0.13.0 Attachments: HIVE-6492.1.patch.txt, HIVE-6492.2.patch.txt, HIVE-6492.3.patch.txt, HIVE-6492.4.patch.txt, HIVE-6492.4.patch_suggestion, HIVE-6492.5.patch.txt Original Estimate: 24h Remaining Estimate: 24h To protect the cluster, a new configuration variable hive.limit.query.max.table.partition is added to limit the number of table partitions involved in a table scan. The default value is -1, which means there is no limit. This variable does not affect metadata-only queries.
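The guard that hive.limit.query.max.table.partition describes might look roughly like this. This is an illustrative sketch only, not Hive's implementation: the function name and the error message are invented; only the semantics (a per-scan partition cap, with -1 meaning unlimited) come from the issue description above.

```python
# Illustrative sketch of the per-scan partition cap described above.
# A value of -1 (the stated default) disables the check entirely.
def check_partition_limit(num_partitions, max_partitions=-1):
    if max_partitions >= 0 and num_partitions > max_partitions:
        raise ValueError(
            f"Query scans {num_partitions} partitions, exceeding the "
            f"configured limit of {max_partitions}"
        )
```

The planner would run such a check per table scan after partition pruning, so that pruned-away partitions do not count against the limit.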
[jira] [Updated] (HIVE-6551) group by after join with skew join optimization references invalid task sometimes
[ https://issues.apache.org/jira/browse/HIVE-6551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-6551: Attachment: HIVE-6551.1.patch.txt

group by after join with skew join optimization references invalid task sometimes - Key: HIVE-6551 URL: https://issues.apache.org/jira/browse/HIVE-6551 Project: Hive Issue Type: Bug Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-6551.1.patch.txt

For example:
{noformat}
hive> set hive.auto.convert.join = true;
hive> set hive.optimize.skewjoin = true;
hive> set hive.skewjoin.key = 3;
hive> EXPLAIN
    > FROM (SELECT src.* FROM src) x JOIN (SELECT src.* FROM src) Y ON (x.key = Y.key)
    > SELECT sum(hash(Y.key)), sum(hash(Y.value));
OK
STAGE DEPENDENCIES:
  Stage-8 is a root stage
  Stage-6 depends on stages: Stage-8
  Stage-5 depends on stages: Stage-6 , consists of Stage-4, Stage-2
  Stage-4
  Stage-2 depends on stages: Stage-4, Stage-1
  Stage-0 is a root stage
...
{noformat}
Stage-2 references the non-existent Stage-1.
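The bug above amounts to a stage depending on a stage that was never defined in the plan. A validation pass over such a dependency listing could be sketched as follows; this is illustrative only, not Hive's planner code, and the dictionary encoding of the EXPLAIN output is an assumption.

```python
# Illustrative sketch: flag any stage whose dependency list names a stage
# never defined in the plan (like Stage-2 -> Stage-1 in the EXPLAIN above).
def missing_dependencies(deps):
    defined = set(deps)
    return {
        stage: sorted(set(parents) - defined)
        for stage, parents in deps.items()
        if set(parents) - defined
    }

# The stage graph from the EXPLAIN output above, as {stage: [dependencies]}.
plan = {
    "Stage-8": [],
    "Stage-6": ["Stage-8"],
    "Stage-5": ["Stage-6"],
    "Stage-4": [],
    "Stage-2": ["Stage-4", "Stage-1"],  # Stage-1 is never defined
    "Stage-0": [],
}
```

Running such a check over the plan for this query would report Stage-2's dangling reference to Stage-1, which is exactly the symptom the issue describes.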