[jira] [Commented] (HIVE-6545) analyze table throws NPE for non-existent tables.
[ https://issues.apache.org/jira/browse/HIVE-6545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13919834#comment-13919834 ] Harish Butani commented on HIVE-6545: - +1 analyze table throws NPE for non-existent tables. - Key: HIVE-6545 URL: https://issues.apache.org/jira/browse/HIVE-6545 Project: Hive Issue Type: Bug Components: Statistics Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-6545.patch Instead of an NPE, we should give an error message to the user. -- This message was sent by Atlassian JIRA (v6.2#6252)
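A minimal sketch of the kind of check the issue asks for: fail with a user-facing error instead of letting a later dereference throw a NullPointerException. The class and method names here are hypothetical illustrations, not the actual HIVE-6545 patch.

```java
public class AnalyzeTableCheck {
    // Stand-in for the metastore lookup; returns null for unknown tables
    // (hypothetical helper, only for this sketch).
    static Object getTable(String name) {
        return null; // simulate a non-existent table
    }

    static void analyzeTable(String name) {
        Object table = getTable(name);
        if (table == null) {
            // Fail fast with a clear message instead of NPE-ing later.
            throw new IllegalArgumentException("Table not found: " + name);
        }
    }

    public static void main(String[] args) {
        try {
            analyzeTable("no_such_table");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```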
Re: Timeline for the Hive 0.13 release?
Yes sure. I am fine with porting over HIVE-5317 and dependents. Besides, couldn't handle 104 angry fans (uh watchers) :) Let’s follow this procedure: if you have features that should go into branch-0.13 please post a message here, give the community a chance to voice their opinions. regards, Harish. On Mar 4, 2014, at 8:03 AM, Alan Gates ga...@hortonworks.com wrote: Sure. I’d really like to get the work related to HIVE-5317 in 0.13. HIVE-5843 is patch available and hopefully can be checked in today. There are several more that depend on that one and can’t be made patch available until then (HIVE-6060, HIVE-6319, HIVE-6460, and HIVE-5687). I don’t want to hold up the branching, but are you ok with those going in after the branch? Alan. On Mar 3, 2014, at 7:53 PM, Harish Butani hbut...@hortonworks.com wrote: I plan to create the branch 5pm PST tomorrow. Ok with everybody? regards, Harish. On Feb 21, 2014, at 5:44 PM, Lefty Leverenz leftylever...@gmail.com wrote: That's appropriate -- let the Hive release march forth on March 4th. -- Lefty On Fri, Feb 21, 2014 at 4:04 PM, Harish Butani hbut...@hortonworks.comwrote: Ok,let’s set it for March 4th . regards, Harish. On Feb 21, 2014, at 12:14 PM, Brock Noland br...@cloudera.com wrote: Might as well make it March 4th or 5th. Otherwise folks will burn weekend time to get patches in. On Fri, Feb 21, 2014 at 2:10 PM, Harish Butani hbut...@hortonworks.com wrote: Yes makes sense. How about we postpone the branching until 10am PST March 3rd, which is the following Monday. Don’t see a point of setting the branch time to a Friday evening. Do people agree? regards, Harish. On Feb 21, 2014, at 11:04 AM, Brock Noland br...@cloudera.com wrote: +1 On Fri, Feb 21, 2014 at 1:02 PM, Thejas Nair the...@hortonworks.com wrote: Can we wait for some few more days for the branching ? I have a few more security fixes that I would like to get in, and we also have a long pre-commit queue ahead right now. 
How about branching around Friday next week ? By then hadoop 2.3 should also be out as that vote has been concluded, and we can get HIVE-6037 in as well. -Thejas On Sun, Feb 16, 2014 at 5:32 PM, Brock Noland br...@cloudera.com wrote: I'd love to see HIVE-6037 in the 0.13 release. I have +1'ed it pending tests. Brock On Sun, Feb 16, 2014 at 7:23 PM, Navis류승우 navis@nexr.com wrote: HIVE-6037 is for generating hive-default.template file from HiveConf. Could it be included in this release? If it's not, I'll suspend further rebasing of it till next release (conflicts too frequently). 2014-02-16 20:38 GMT+09:00 Lefty Leverenz leftylever...@gmail.com : I'll try to catch up on the wikidocs backlog for 0.13.0 patches in time for the release. It's a long and growing list, though, so no promises. Feel free to do your own documentation, or hand it off to a friendly in-house writer. -- Lefty, self-appointed Hive docs maven On Sat, Feb 15, 2014 at 1:28 PM, Thejas Nair the...@hortonworks.com wrote: Sounds good to me. On Fri, Feb 14, 2014 at 7:29 PM, Harish Butani hbut...@hortonworks.com wrote: Hi, Its mid feb. Wanted to check if the community is ready to cut a branch. Could we cut the branch in a week , say 5pm PST 2/21/14? The goal is to keep the release cycle short: couple of weeks; so after the branch we go into stabilizing mode for hive 0.13, checking in only blocker/critical bug fixes. regards, Harish. On Jan 20, 2014, at 9:25 AM, Brock Noland br...@cloudera.com wrote: Hi, I agree that picking a date to branch and then restricting commits to that branch would be a less time intensive plan for the RM. Brock On Sat, Jan 18, 2014 at 4:21 PM, Harish Butani hbut...@hortonworks.com wrote: Yes agree it is time to start planning for the next release. I would like to volunteer to do the release management duties for this release(will be a great experience for me) Will be happy to do it, if the community is fine with this. regards, Harish. 
On Jan 17, 2014, at 7:05 PM, Thejas Nair the...@hortonworks.com wrote: Yes, I think it is time to start planning for the next release. For 0.12 release I created a branch and then accepted patches that people asked to be included for sometime, before moving a phase of accepting only critical bug fixes. This turned out to be laborious. I think we should instead give everyone a few weeks to get any patches they are working on to be ready, cut the branch, and take in only critical bug fixes to the branch after that. How about cutting the branch around mid-February and targeting to release in a week or two after that. Thanks, Thejas On Fri, Jan 17, 2014 at 4:39 PM, Carl Steinbach c...@apache.org wrote: I was wondering what people think about setting a tentative date for the Hive 0.13 release? At an old Hive
Re: Review Request 18464: Support secure Subject.doAs() in HiveServer2 JDBC client
Hi Shiv - I believe that the auth mechanism in play is still considered kerberos in this case. It is just based on a preauthenticated subject rather than a UGI. In the end - it is kerberos. On Tue, Mar 4, 2014 at 2:34 PM, Shivaraju Gowda shiv...@cisco.com wrote: On Feb. 27, 2014, 4:59 p.m., Vaibhav Gumashta wrote: service/src/java/org/apache/hive/service/auth/KerberosSaslHelper.java, line 68 https://reviews.apache.org/r/18464/diff/1/?file=503361#file503361line68 Can you push this to HadoopThriftAuthBridge.Client#createClientTransport just like the way the else portion does instead of the createSubjectAssumedTransport method? From within the method you can return the TSubjectAssumingTransport. Shivaraju Gowda wrote: Again this was in my first cut. I was passing the value as tokenStrForm parameter to keep the method signature same. I later moved away from it since it was not elegant and changing the method signature involved broader implications. I felt this functionality didn't belong in Hadoop shim layer. Having the change in there also meant one more jar getting affected(hive-exec.jar) Shivaraju Gowda wrote: Another issue was the dependency on hadoop.core.jar. The calls AuthMethod.valueOf(AuthMethod.class, methodStr) and SaslRpcServer.splitKerberosName(serverPrincipal) in HadoopThriftAuthBridge.Client#createClientTransport are from hadoop.core.jar Vaibhav Gumashta wrote: Actually in case of a kerberos setting, those jars are already required in the client's classpath ( https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-JDBCClientSetupforaSecureCluster- check Running the JDBC Sample Code section). And this jira is applicable only to a kerberos setup. Correct. But my point is we don't have to have that dependency on external Hadoop component for using kerberos in this way. On Feb. 
27, 2014, 4:59 p.m., Vaibhav Gumashta wrote: jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java, line 136 https://reviews.apache.org/r/18464/diff/1/?file=503360#file503360line136 I think, instead of having to do identityContext=fromKerberosSubject, we can just use assumeSubject=true/false, keeping the default to false. Shivaraju Gowda wrote: Passing it as an assumeSubject boolean url property was my first cut. However, I thought assumeSubject on its own doesn't convey its intended use (you need to refer to the documentation); making it a key-value pair might give it some more meaning, and there is also a possibility of it being later used for other use cases (say, hypothetically, the value could be fromKeyTab, fromTicketCache or fromLogin etc.). Shivaraju Gowda wrote: Do you think it might be better if we use the auth property here, i.e. auth=fromKerberosSubject? Right now the only such value is auth=noSasl. Vaibhav Gumashta wrote: The auth property is kind of meant to map to the hiveserver2 auth modes [none, sasl, nosasl, kerberos]. The way it is used currently is not very clean and there are some jiras out there to clean that up and make the mapping more evident. OK, I look at this feature as an authentication mechanism. We are authenticating using the Kerberos Subject passed by the user. - Shivaraju --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18464/#review35730 --- On Feb. 25, 2014, 6:50 a.m., Kevin Minder wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18464/ --- (Updated Feb. 25, 2014, 6:50 a.m.) Review request for hive, Kevin Minder and Vaibhav Gumashta.
Bugs: HIVE-6486 https://issues.apache.org/jira/browse/HIVE-6486 Repository: hive-git Description --- Support secure Subject.doAs() in HiveServer2 JDBC client Diffs - jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java 17b4d39 service/src/java/org/apache/hive/service/auth/KerberosSaslHelper.java 379dafb service/src/java/org/apache/hive/service/auth/TSubjectAssumingTransport.java PRE-CREATION Diff: https://reviews.apache.org/r/18464/diff/ Testing --- Manual testing Thanks, Kevin Minder
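The call shape under discussion can be sketched as follows. In a real middleware server the Subject comes from a pre-authenticated JAAS Kerberos login; an empty Subject is used here only so the example runs standalone, and the JDBC URL in the comment is illustrative, not taken from the patch.

```java
import javax.security.auth.Subject;
import java.security.PrivilegedAction;

public class DoAsSketch {
    public static void main(String[] args) {
        // Normally: the Subject returned by the container's Kerberos login.
        Subject subject = new Subject();
        String result = Subject.doAs(subject, (PrivilegedAction<String>) () -> {
            // The real client would open the connection inside the action, e.g.
            // DriverManager.getConnection(
            //     "jdbc:hive2://host:10000/default;principal=hive/host@REALM");
            return "ran-inside-doAs";
        });
        System.out.println(result);
    }
}
```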
[jira] [Created] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
Eric Hanson created HIVE-6546: - Summary: WebHCat job submission for pig with -useHCatalog argument fails on Windows Key: HIVE-6546 URL: https://issues.apache.org/jira/browse/HIVE-6546 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0, 0.11.0, 0.13.0 Environment: Windows Azure HDINSIGHT and Windows one-box installations. Reporter: Eric Hanson -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6546: -- Description: On a one-box Windows setup, do the following from a PowerShell prompt: cmd /c curl.exe -s ` -d user.name=hadoop ` -d arg=-useHCatalog ` -d execute=emp = load '/data/emp/emp_0.dat'; dump emp; ` -d statusdir=/tmp/webhcat.output01 ` 'http://localhost:50111/templeton/v1/pig' -v The job fails with error code 7, but it should run. I traced this down to the following. In the job configuration for the TempletonJobController, we have templeton.args set to cmd,/c,call,C:\\hadooppig-0.11.0.1.3.0.0-0846/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog,-execute,emp = load '/data/emp/emp_0.dat'; dump emp; Notice the = sign before -useHCatalog. I think this should be a comma. The bad string -D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog gets created in org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows(). It happens at line 434:
{code}
} else {
  if (i < args.length - 1) {
    prop += "=" + args[++i]; // RIGHT HERE! at iterations i = 37, 38
  }
}
{code}
The bug is here:
{code}
if (prop != null) {
  if (prop.contains("=")) {
    // everything good
  } else {
    // -D__WEBHCAT_TOKEN_FILE_LOCATION__ does not contain '=', so this
    // branch runs and appends "=-useHCatalog"
    if (i < args.length - 1) {
      prop += "=" + args[++i];
    }
  }
  newArgs.add(prop);
}
{code}
One possible fix is to change the string constant org.apache.hcatalog.templeton.tool.TempletonControllerJob.TOKEN_FILE_ARG_PLACEHOLDER to have an = sign in it. Or, preProcessForWindows() itself could be changed. WebHCat job submission for pig with -useHCatalog argument fails on Windows -- Key: HIVE-6546 URL: https://issues.apache.org/jira/browse/HIVE-6546 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.11.0, 0.12.0, 0.13.0 Environment: Windows Azure HDINSIGHT and Windows one-box installations.
Reporter: Eric Hanson -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6546: -- Environment: HDInsight deploying HDP 1.3: c:\apps\dist\pig-0.11.0.1.3.2.0-05 Also on Windows HDP 1.3 one-box configuration. was: Windows Azure HDINSIGHT and Windows one-box installations. WebHCat job submission for pig with -useHCatalog argument fails on Windows -- Key: HIVE-6546 URL: https://issues.apache.org/jira/browse/HIVE-6546 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.11.0, 0.12.0, 0.13.0 Environment: HDInsight deploying HDP 1.3: c:\apps\dist\pig-0.11.0.1.3.2.0-05 Also on Windows HDP 1.3 one-box configuration. Reporter: Eric Hanson -- This message was sent by Atlassian JIRA (v6.2#6252)
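The arg-fusing behavior described in this issue, and the proposed placeholder fix, can be demonstrated with a simplified stand-alone re-creation of the pre-processing loop. This is not the actual GenericOptionsParser source; the class name and `-D__TOKEN__` placeholder are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class PreProcessDemo {
    // Simplified sketch of the Windows pre-processing loop: a -D property
    // with no '=' swallows the following argument.
    static List<String> preProcess(String[] args) {
        List<String> newArgs = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            String prop = args[i].startsWith("-D") ? args[i] : null;
            if (prop != null) {
                if (!prop.contains("=") && i < args.length - 1) {
                    prop += "=" + args[++i]; // fuses the next arg onto the property
                }
                newArgs.add(prop);
            } else {
                newArgs.add(args[i]);
            }
        }
        return newArgs;
    }

    public static void main(String[] a) {
        // Placeholder without '=': "-useHCatalog" gets fused onto it (the bug).
        System.out.println(preProcess(new String[]{"-D__TOKEN__", "-useHCatalog"}));
        // Placeholder already containing '=': "-useHCatalog" survives (the fix).
        System.out.println(preProcess(new String[]{"-D__TOKEN__=x", "-useHCatalog"}));
    }
}
```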
[jira] [Created] (HIVE-6547) normalize struct Role in metastore thrift interface
Thejas M Nair created HIVE-6547: --- Summary: normalize struct Role in metastore thrift interface Key: HIVE-6547 URL: https://issues.apache.org/jira/browse/HIVE-6547 Project: Hive Issue Type: Bug Components: Metastore, Thrift API Affects Versions: 0.13.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.13.0 As discussed in HIVE-5931, it will be cleaner to have the information about Role-to-role-member mapping removed from the Role object, as it is not part of a logical Role. This information is not relevant for actions such as creating a Role. As part of this change, a get_role_grants_for_principal API will be added, so that it can be used in place of list_roles when role mapping information is desired. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6411) Support more generic way of using composite key for HBaseHandler
[ https://issues.apache.org/jira/browse/HIVE-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13919872#comment-13919872 ] Hive QA commented on HIVE-6411: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12632418/HIVE-6411.3.patch.txt {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5240 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_custom_key2 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16 org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_mapreduce_stack_trace_hadoop20 {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1616/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1616/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12632418 Support more generic way of using composite key for HBaseHandler Key: HIVE-6411 URL: https://issues.apache.org/jira/browse/HIVE-6411 Project: Hive Issue Type: Improvement Components: HBase Handler Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6411.1.patch.txt, HIVE-6411.2.patch.txt, HIVE-6411.3.patch.txt HIVE-2599 introduced using custom object for the row key. But it forces key objects to extend HBaseCompositeKey, which is again extension of LazyStruct. If user provides proper Object and OI, we can replace internal key and keyOI with those. Initial implementation is based on factory interface. 
{code}
public interface HBaseKeyFactory {
  void init(SerDeParameters parameters, Properties properties) throws SerDeException;
  ObjectInspector createObjectInspector(TypeInfo type) throws SerDeException;
  LazyObjectBase createObject(ObjectInspector inspector) throws SerDeException;
}
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Timeline for the Hive 0.13 release?
I would like to include HIVE-5943/HIVE-5942 (describe role support) and HIVE-6547 (metastore api - Role struct cleanup) in the release. I should have the patch for describe-role ready in a day or two, and for HIVE-6547 as well this week. On Tue, Mar 4, 2014 at 11:55 AM, Harish Butani hbut...@hortonworks.com wrote:
[jira] [Commented] (HIVE-6537) NullPointerException when loading hashtable for MapJoin directly
[ https://issues.apache.org/jira/browse/HIVE-6537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13919998#comment-13919998 ] Sergey Shelukhin commented on HIVE-6537: I don't think failure is related (passed for me, too). New test looks good for me. Do I have +1 for the patch itself? I can commit (after 24 hours :)) NullPointerException when loading hashtable for MapJoin directly Key: HIVE-6537 URL: https://issues.apache.org/jira/browse/HIVE-6537 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-6537.01.patch, HIVE-6537.2.patch.txt, HIVE-6537.patch We see the following error: {noformat} 2014-02-20 23:33:15,743 FATAL [main] org.apache.hadoop.hive.ql.exec.mr.ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.load(HashTableLoader.java:103) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:149) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:164) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1026) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1030) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1030) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:489) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at 
org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: java.lang.NullPointerException at java.util.Arrays.fill(Arrays.java:2685) at org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.loadDirectly(HashTableLoader.java:155) at org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.load(HashTableLoader.java:81) ... 15 more {noformat} It appears that the tables array passed to the Arrays.fill call is null. I don't really have a full understanding of this path, but what I have gleaned so far is this... From what I see, tables would be set unconditionally in initializeOp of the sink, and in no other place, so I assume for this code to ever work that startForward calls it at least some of the time. Here, it doesn't call it, so it's null. The previous loop also uses tables, and should have NPE-d before fill was ever called; it didn't, so I'd assume it never executed. There's a little bit of inconsistency in the above code where directWorks are added to parents unconditionally, but the sink is only added as a child conditionally. I think it may be that some of the direct works are not table scans; in fact, given that the loop never executes, they may be null (which is rather strange). Regardless, it seems that the logic should be fixed; it may be the root cause. -- This message was sent by Atlassian JIRA (v6.2#6252)
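The failure mode at the bottom of the stack trace above (Arrays.fill on a null array) can be reproduced minimally; the variable name is just a stand-in for the uninitialized sink tables.

```java
import java.util.Arrays;

public class NullFillDemo {
    public static void main(String[] args) {
        // Stands in for the 'tables' array that was never initialized
        // because initializeOp/startForward did not run.
        Object[] tables = null;
        try {
            Arrays.fill(tables, new Object());
        } catch (NullPointerException e) {
            System.out.println("NPE from Arrays.fill on null array");
        }
    }
}
```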
[jira] [Commented] (HIVE-5317) Implement insert, update, and delete in Hive with full ACID support
[ https://issues.apache.org/jira/browse/HIVE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1392#comment-1392 ] Vinod Kumar Vavilapalli commented on HIVE-5317: --- bq. MAPREDUCE-279, at 109, currently out scores us. There may be others, but it would be cool to have more watchers than Yarn. Hehe, looks like we have a race. I'll go ask some of us YARN folks who are also watching this JIRA to stop watching this one :D Implement insert, update, and delete in Hive with full ACID support --- Key: HIVE-5317 URL: https://issues.apache.org/jira/browse/HIVE-5317 Project: Hive Issue Type: New Feature Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: InsertUpdatesinHive.pdf Many customers want to be able to insert, update and delete rows from Hive tables with full ACID support. The use cases are varied, but the form of the queries that should be supported are: * INSERT INTO tbl SELECT … * INSERT INTO tbl VALUES ... * UPDATE tbl SET … WHERE … * DELETE FROM tbl WHERE … * MERGE INTO tbl USING src ON … WHEN MATCHED THEN ... WHEN NOT MATCHED THEN ... * SET TRANSACTION LEVEL … * BEGIN/END TRANSACTION Use Cases * Once an hour, a set of inserts and updates (up to 500k rows) for various dimension tables (eg. customer, inventory, stores) needs to be processed. The dimension tables have primary keys and are typically bucketed and sorted on those keys. * Once a day a small set (up to 100k rows) of records need to be deleted for regulatory compliance. * Once an hour a log of transactions is exported from a RDBS and the fact tables need to be updated (up to 1m rows) to reflect the new data. The transactions are a combination of inserts, updates, and deletes. The table is partitioned and bucketed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.
[ https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920015#comment-13920015 ] Thejas M Nair commented on HIVE-6486: - +1 [~shivshi] Can you include the usage notes in the release notes section of the jira, so we can pick information from there for documentation ? If you would like to also help with adding this to the wiki documentation, that would be great too! Support secure Subject.doAs() in HiveServer2 JDBC client. - Key: HIVE-6486 URL: https://issues.apache.org/jira/browse/HIVE-6486 Project: Hive Issue Type: Improvement Components: Authentication, HiveServer2, JDBC Affects Versions: 0.11.0, 0.12.0 Reporter: Shivaraju Gowda Assignee: Shivaraju Gowda Fix For: 0.13.0 Attachments: HIVE-6486.1.patch, Hive_011_Support-Subject_doAS.patch, TestHive_SujectDoAs.java HIVE-5155 addresses the problem of kerberos authentication in multi-user middleware server using proxy user. In this mode the principal used by the middle ware server has privileges to impersonate selected users in Hive/Hadoop. This enhancement is to support Subject.doAs() authentication in Hive JDBC layer so that the end users Kerberos Subject is passed through in the middle ware server. With this improvement there won't be any additional setup in the server to grant proxy privileges to some users and there won't be need to specify a proxy user in the JDBC client. This version should also be more secure since it won't require principals with the privileges to impersonate other users in Hive/Hadoop setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Timeline for the Hive 0.13 release?
I think we should include the following patches as well; they are in patch available/review stage: HIVE-5155 (proxy user support for HS2) is patch available and has undergone some reviews. HIVE-6486 - Support secure Subject.doAs() in HiveServer2 JDBC client. I have reviewed and +1'd it. On Tue, Mar 4, 2014 at 12:13 PM, Thejas Nair the...@hortonworks.com wrote:
Otherwise folks will burn weekend time to get patches in. On Fri, Feb 21, 2014 at 2:10 PM, Harish Butani hbut...@hortonworks.com wrote: Yes makes sense. How about we postpone the branching until 10am PST March 3rd, which is the following Monday. Don’t see a point of setting the branch time to a Friday evening. Do people agree? regards, Harish. On Feb 21, 2014, at 11:04 AM, Brock Noland br...@cloudera.com wrote: +1 On Fri, Feb 21, 2014 at 1:02 PM, Thejas Nair the...@hortonworks.com wrote: Can we wait for some few more days for the branching ? I have a few more security fixes that I would like to get in, and we also have a long pre-commit queue ahead right now. How about branching around Friday next week ? By then hadoop 2.3 should also be out as that vote has been concluded, and we can get HIVE-6037 in as well. -Thejas On Sun, Feb 16, 2014 at 5:32 PM, Brock Noland br...@cloudera.com wrote: I'd love to see HIVE-6037 in the 0.13 release. I have +1'ed it pending tests. Brock On Sun, Feb 16, 2014 at 7:23 PM, Navis류승우 navis@nexr.com wrote: HIVE-6037 is for generating hive-default.template file from HiveConf. Could it be included in this release? If it's not, I'll suspend further rebasing of it till next release (conflicts too frequently). 2014-02-16 20:38 GMT+09:00 Lefty Leverenz leftylever...@gmail.com : I'll try to catch up on the wikidocs backlog for 0.13.0 patches in time for the release. It's a long and growing list, though, so no promises. Feel free to do your own documentation, or hand it off to a friendly in-house writer. -- Lefty, self-appointed Hive docs maven On Sat, Feb 15, 2014 at 1:28 PM, Thejas Nair the...@hortonworks.com wrote: Sounds good to me. On Fri, Feb 14, 2014 at 7:29 PM, Harish Butani hbut...@hortonworks.com wrote: Hi, Its mid feb. Wanted to check if the community is ready to cut a branch. Could we cut the branch in a week , say 5pm PST 2/21/14? 
The goal is to keep the release cycle short: couple of weeks; so after the branch we go into stabilizing mode for hive 0.13, checking in only blocker/critical bug fixes. regards, Harish. On Jan 20, 2014, at 9:25 AM, Brock Noland br...@cloudera.com wrote: Hi, I agree that picking a date to branch and then restricting commits to that branch would be a less time intensive plan for the RM. Brock On Sat, Jan 18, 2014 at 4:21 PM, Harish Butani hbut...@hortonworks.com wrote: Yes agree it is time to start planning for the next release. I would like to volunteer to do the release management duties for this release(will be a great experience for me) Will be happy to do it, if the community is fine with this. regards, Harish. On Jan
Re: Timeline for the Hive 0.13 release?
I would like HIVE-6455 and HIVE-4177 to go in the release.
HIVE-6455 - Scalable dynamic partitioning optimization (I already have a patch for it and it is under code review)
HIVE-4177 - Support partial scan for analyze command - ORC (I will post a patch within this week)
Thanks
Prasanth Jayachandran
Re: Timeline for the Hive 0.13 release?
I'd like to have the following go in:
HIVE-4764 [Support Kerberos HTTP authentication for HiveServer2 running in http mode] https://issues.apache.org/jira/browse/HIVE-4764
HIVE-6306 [HiveServer2 running in http mode should support doAs functionality] https://issues.apache.org/jira/browse/HIVE-6306
HIVE-6350 [Support LDAP authentication for HiveServer2 in http mode] https://issues.apache.org/jira/browse/HIVE-6350
HIVE-6485 [Downgrade to httpclient-4.2.5 in JDBC from httpclient-4.3.2] https://issues.apache.org/jira/browse/HIVE-6485
And it would be awesome to have HIVE-5155 (proxy user support for HS2).
Thanks,
--Vaibhav
[jira] [Commented] (HIVE-6414) ParquetInputFormat provides data values that do not match the object inspectors
[ https://issues.apache.org/jira/browse/HIVE-6414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920048#comment-13920048 ] Szehon Ho commented on HIVE-6414: - Hi Justin, thanks for taking care of it. Do you want to resubmit the patch for testing for this issue? There had been an issue where the pre-commit test queue got lost. ParquetInputFormat provides data values that do not match the object inspectors --- Key: HIVE-6414 URL: https://issues.apache.org/jira/browse/HIVE-6414 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Remus Rusanu Assignee: Justin Coffey Labels: Parquet Fix For: 0.13.0 Attachments: HIVE-6414.2.patch, HIVE-6414.3.patch, HIVE-6414.patch While working on HIVE-5998 I noticed that the ParquetRecordReader returns IntWritable for all 'int like' types, in disaccord with the row object inspectors. I thought, fine, and worked my way around it. But I see now that the issue triggers failures in other places, e.g. in aggregates:
{noformat}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {cint:528534767,ctinyint:31,csmallint:4963,cfloat:31.0,cdouble:4963.0,cstring1:cvLH6Eat2yFsyy7p}
  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
  ... 8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to java.lang.Short
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:808)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
  at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
  at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:524)
  ... 9 more
Caused by: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to java.lang.Short
  at org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaShortObjectInspector.get(JavaShortObjectInspector.java:41)
  at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:671)
  at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:631)
  at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMin$GenericUDAFMinEvaluator.merge(GenericUDAFMin.java:109)
  at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMin$GenericUDAFMinEvaluator.iterate(GenericUDAFMin.java:96)
  at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:183)
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:641)
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:838)
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:735)
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:803)
  ... 15 more
{noformat}
My test is (I'm writing a test .q from HIVE-5998, but the repro does not involve vectorization):
{noformat}
create table if not exists alltypes_parquet (
  cint int,
  ctinyint tinyint,
  csmallint smallint,
  cfloat float,
  cdouble double,
  cstring1 string) stored as parquet;

insert overwrite table alltypes_parquet
  select cint, ctinyint, csmallint, cfloat, cdouble, cstring1 from alltypesorc;

explain select * from alltypes_parquet limit 10;
select * from alltypes_parquet limit 10;

explain select ctinyint, max(cint), min(csmallint), count(cstring1), avg(cfloat), stddev_pop(cdouble)
  from alltypes_parquet group by ctinyint;
select ctinyint, max(cint), min(csmallint), count(cstring1), avg(cfloat), stddev_pop(cdouble)
  from alltypes_parquet group by ctinyint;
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
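The root cause in the trace above is a reader handing back a wider boxed type than the object inspector expects. A toy, Hadoop-free Java illustration of that failure mode (plain Integer stands in for IntWritable, Short for the expected wrapper — both substitutions are mine, not from the patch):

```java
// Toy reproduction of the failure class from the JIRA trace: a value boxed as
// one type cannot be cast to a different wrapper at runtime, mirroring the
// IntWritable -> Short mismatch in GenericUDAFMin above. No Hadoop dependency.
public class CastDemo {
    public static void main(String[] args) {
        Object value = Integer.valueOf(4963);  // what the record reader produced
        try {
            Short s = (Short) value;           // what the object inspector expected
            System.out.println("cast ok: " + s);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException, as in the JIRA trace");
        }
    }
}
```

The cast compiles (any Object may be cast to Short) but fails at runtime, which is why the bug only surfaces once an operator such as the group-by aggregate actually consumes the value.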
[jira] [Commented] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.
[ https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920061#comment-13920061 ] Vaibhav Gumashta commented on HIVE-6486: [~shivshi] Thanks for the updated patch. Can you also update the rb diff? It seems to have the older patch. Support secure Subject.doAs() in HiveServer2 JDBC client. - Key: HIVE-6486 URL: https://issues.apache.org/jira/browse/HIVE-6486 Project: Hive Issue Type: Improvement Components: Authentication, HiveServer2, JDBC Affects Versions: 0.11.0, 0.12.0 Reporter: Shivaraju Gowda Assignee: Shivaraju Gowda Fix For: 0.13.0 Attachments: HIVE-6486.1.patch, Hive_011_Support-Subject_doAS.patch, TestHive_SujectDoAs.java HIVE-5155 addresses the problem of Kerberos authentication in a multi-user middleware server using a proxy user. In this mode the principal used by the middleware server has privileges to impersonate selected users in Hive/Hadoop. This enhancement is to support Subject.doAs() authentication in the Hive JDBC layer so that the end user's Kerberos Subject is passed through in the middleware server. With this improvement there won't be any additional setup in the server to grant proxy privileges to some users, and there won't be a need to specify a proxy user in the JDBC client. This version should also be more secure since it won't require principals with the privileges to impersonate other users in the Hive/Hadoop setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
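A minimal, Kerberos-free sketch of the Subject.doAs() pattern the description refers to: the middleware runs the JDBC call inside Subject.doAs() so the security layer sees the end user's Subject. The empty Subject and the returned marker string are placeholders of mine; a real client would obtain the Subject from a JAAS Kerberos login and open the HiveServer2 connection inside the action:

```java
import javax.security.auth.Subject;
import java.security.PrivilegedExceptionAction;

public class DoAsSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder: a real middleware server would obtain this Subject from
        // the end user's JAAS/Kerberos LoginContext, not construct it empty.
        Subject endUser = new Subject();

        // Run the privileged action with the end user's Subject in scope.
        String result = Subject.doAs(endUser, (PrivilegedExceptionAction<String>) () -> {
            // A real client would call DriverManager.getConnection(jdbcUrl) here,
            // so the Kerberos credentials attached to the Subject are used.
            return "ran-with-" + endUser.getPrincipals().size() + "-principals";
        });
        System.out.println(result);
    }
}
```

This is why no proxy-user grant is needed server-side: the credentials presented to Hive are the end user's own, carried by the Subject rather than impersonated by the middleware principal.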
[jira] [Commented] (HIVE-6503) document pluggable authentication modules (PAM) in template config, wiki
[ https://issues.apache.org/jira/browse/HIVE-6503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920063#comment-13920063 ] Lefty Leverenz commented on HIVE-6503: -- You could update the parameter description in a release note, either here or on HIVE-6466 (or both). The wiki needs to be updated here: * [Setting Up HiveServer2: Authentication/Security Configuration |https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2#SettingUpHiveServer2-Authentication/SecurityConfiguration] (Eventually HiveServer2 configuration parameters will be documented in Configuration Properties, but they're not there yet.) document pluggable authentication modules (PAM) in template config, wiki Key: HIVE-6503 URL: https://issues.apache.org/jira/browse/HIVE-6503 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Thejas M Nair Assignee: Vaibhav Gumashta Priority: Blocker Fix For: 0.13.0 HIVE-6466 adds support for PAM as a supported value for hive.server2.authentication. It also adds a config parameter hive.server2.authentication.pam.services. The default template file needs to be updated to document these. The wiki docs should also document the support for pluggable authentication modules. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6414) ParquetInputFormat provides data values that do not match the object inspectors
[ https://issues.apache.org/jira/browse/HIVE-6414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920071#comment-13920071 ] Xuefu Zhang commented on HIVE-6414: --- +1 to the patch #3. ParquetInputFormat provides data values that do not match the object inspectors --- Key: HIVE-6414 URL: https://issues.apache.org/jira/browse/HIVE-6414 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Remus Rusanu Assignee: Justin Coffey Labels: Parquet Fix For: 0.13.0 Attachments: HIVE-6414.2.patch, HIVE-6414.3.patch, HIVE-6414.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6433) SQL std auth - allow grant/revoke roles if user has ADMIN OPTION
[ https://issues.apache.org/jira/browse/HIVE-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920084#comment-13920084 ] Lefty Leverenz commented on HIVE-6433: -- Does this need any documentation, besides general docs for the parent HIVE-5837? SQL std auth - allow grant/revoke roles if user has ADMIN OPTION Key: HIVE-6433 URL: https://issues.apache.org/jira/browse/HIVE-6433 Project: Hive Issue Type: Sub-task Reporter: Thejas M Nair Assignee: Ashutosh Chauhan Fix For: 0.13.0 Attachments: HIVE-6433.1.patch, HIVE-6433.2.patch, HIVE-6433.patch Follow-up jira for HIVE-5952. If a user/role has the admin option on a role, then the user should be able to grant/revoke other users to/from the role. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-1662) Add file pruning into Hive.
[ https://issues.apache.org/jira/browse/HIVE-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920087#comment-13920087 ] Hive QA commented on HIVE-1662: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12632450/HIVE-1662.12.patch.txt {color:green}SUCCESS:{color} +1 5240 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1619/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1619/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12632450 Add file pruning into Hive. --- Key: HIVE-1662 URL: https://issues.apache.org/jira/browse/HIVE-1662 Project: Hive Issue Type: New Feature Reporter: He Yongqiang Assignee: Navis Attachments: HIVE-1662.10.patch.txt, HIVE-1662.11.patch.txt, HIVE-1662.12.patch.txt, HIVE-1662.8.patch.txt, HIVE-1662.9.patch.txt, HIVE-1662.D8391.1.patch, HIVE-1662.D8391.2.patch, HIVE-1662.D8391.3.patch, HIVE-1662.D8391.4.patch, HIVE-1662.D8391.5.patch, HIVE-1662.D8391.6.patch, HIVE-1662.D8391.7.patch Hive now supports a filename virtual column. If a filename filter is present in a query, Hive should be able to add only the files that pass the filter to the input paths. -- This message was sent by Atlassian JIRA (v6.2#6252)
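For context, the filename virtual column the description refers to is Hive's INPUT__FILE__NAME. A sketch of the kind of query file pruning would optimize (the table name and path pattern are illustrative, not from the patch):

```sql
-- With file pruning, a filter on the INPUT__FILE__NAME virtual column could
-- let Hive add only the matching files to the input paths up front, instead
-- of scanning every file and discarding rows afterwards.
-- Table name `src_files` is hypothetical.
SELECT key, value
FROM src_files
WHERE INPUT__FILE__NAME LIKE '%part-00001%';
```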
[jira] [Updated] (HIVE-6492) limit partition number involved in a table scan
[ https://issues.apache.org/jira/browse/HIVE-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Selina Zhang updated HIVE-6492: --- Attachment: HIVE-6492.4.patch.txt Gunther, thanks for your comments! Removed the logic for the simple fetch query. Let the query pass if it is a fetch operator (no MapReduce job launched). However, I still need to put the logic right after the physical optimizers, because only at that point do I have the information on whether the query is a metadata-only query. limit partition number involved in a table scan --- Key: HIVE-6492 URL: https://issues.apache.org/jira/browse/HIVE-6492 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.12.0 Reporter: Selina Zhang Fix For: 0.13.0 Attachments: HIVE-6492.1.patch.txt, HIVE-6492.2.patch.txt, HIVE-6492.3.patch.txt, HIVE-6492.4.patch.txt Original Estimate: 24h Remaining Estimate: 24h To protect the cluster, a new configuration variable, hive.limit.query.max.table.partition, is added to the Hive configuration to limit the number of table partitions involved in a table scan. The default value is -1, which means there is no limit by default. This variable does not affect metadata-only queries. -- This message was sent by Atlassian JIRA (v6.2#6252)
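A sketch of how the new knob would be used in a session (the property name is taken from the JIRA description; the value 1000 and the `web_logs`/`dt` names are arbitrary illustrations of mine):

```sql
-- Cap any single table scan at 1000 partitions for this session.
-- The default of -1 means unlimited; property name per the JIRA text.
SET hive.limit.query.max.table.partition=1000;

-- A query touching more than 1000 partitions of a partitioned table would
-- now be rejected before a MapReduce job is launched; a metadata-only query
-- over the same table is unaffected. Table and column are hypothetical.
SELECT count(*) FROM web_logs WHERE dt >= '2014-01-01';
```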
[jira] [Updated] (HIVE-4293) Predicates following UDTF operator are removed by PPD
[ https://issues.apache.org/jira/browse/HIVE-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-4293: Attachment: HIVE-4293.10.patch Predicates following UDTF operator are removed by PPD - Key: HIVE-4293 URL: https://issues.apache.org/jira/browse/HIVE-4293 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Critical Attachments: D9933.6.patch, HIVE-4293.10.patch, HIVE-4293.7.patch.txt, HIVE-4293.8.patch.txt, HIVE-4293.9.patch.txt, HIVE-4293.D9933.1.patch, HIVE-4293.D9933.2.patch, HIVE-4293.D9933.3.patch, HIVE-4293.D9933.4.patch, HIVE-4293.D9933.5.patch For example,
{noformat}
explain SELECT value from (
  select explode(array(key, value)) as (value) from (
    select * FROM src WHERE key > 200
  ) A
) B WHERE value > 300;
{noformat}
Makes a plan like this, removing the last predicate:
{noformat}
TableScan
  alias: src
  Filter Operator
    predicate:
        expr: (key > 200.0)
        type: boolean
    Select Operator
      expressions:
            expr: array(key,value)
            type: array<string>
      outputColumnNames: _col0
      UDTF Operator
        function name: explode
        Select Operator
          expressions:
                expr: col
                type: string
          outputColumnNames: _col0
          File Output Operator
            compressed: false
            GlobalTableId: 0
            table:
                input format: org.apache.hadoop.mapred.TextInputFormat
                output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-4293) Predicates following UDTF operator are removed by PPD
[ https://issues.apache.org/jira/browse/HIVE-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920107#comment-13920107 ] Harish Butani commented on HIVE-4293: - [~navis] this looks good +1 attaching an updated patch, since the last one is a couple of months old. Also added the testcase from HIVE-5964. Please take a look; hope you don't mind that I uploaded an updated patch. Predicates following UDTF operator are removed by PPD - Key: HIVE-4293 URL: https://issues.apache.org/jira/browse/HIVE-4293 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Critical Attachments: D9933.6.patch, HIVE-4293.7.patch.txt, HIVE-4293.8.patch.txt, HIVE-4293.9.patch.txt, HIVE-4293.D9933.1.patch, HIVE-4293.D9933.2.patch, HIVE-4293.D9933.3.patch, HIVE-4293.D9933.4.patch, HIVE-4293.D9933.5.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (HIVE-4293) Predicates following UDTF operator are removed by PPD
[ https://issues.apache.org/jira/browse/HIVE-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920107#comment-13920107 ] Harish Butani edited comment on HIVE-4293 at 3/4/14 10:17 PM: -- [~navis] this looks good +1 attaching an updated patch, since the last one is couple of months old. - Had to resolve minor conflicts in SemAly. - Regenned .q.out files. - Also added the testcase from HIVE-5964 Please take a look; hope you don't mind that I uploaded an updated patch. was (Author: rhbutani): [~navis] this looks good +1 attaching an updated patch, since the last one is couple of months old. Also added the testcase from HIVE-5964 Please take a look; hope you don't mind that I uploaded an updated patch. Predicates following UDTF operator are removed by PPD - Key: HIVE-4293 URL: https://issues.apache.org/jira/browse/HIVE-4293 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Critical Attachments: D9933.6.patch, HIVE-4293.10.patch, HIVE-4293.7.patch.txt, HIVE-4293.8.patch.txt, HIVE-4293.9.patch.txt, HIVE-4293.D9933.1.patch, HIVE-4293.D9933.2.patch, HIVE-4293.D9933.3.patch, HIVE-4293.D9933.4.patch, HIVE-4293.D9933.5.patch For example, {noformat} explain SELECT value from ( select explode(array(key, value)) as (value) from ( select * FROM src WHERE key 200 ) A ) B WHERE value 300 ; {noformat} Makes plan like this, removing last predicates {noformat} TableScan alias: src Filter Operator predicate: expr: (key 200.0) type: boolean Select Operator expressions: expr: array(key,value) type: arraystring outputColumnNames: _col0 UDTF Operator function name: explode Select Operator expressions: expr: col type: string outputColumnNames: _col0 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat {noformat} -- This message was sent by Atlassian JIRA 
(v6.2#6252)
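The bug above is easiest to see by hand-running the example. A minimal Python sketch (the sample rows are made up, not from the Hive test suite) of the evaluation order the plan must preserve: the outer `value > 300` predicate applies to the rows *produced by* explode, so it must stay above the UDTF rather than being dropped by predicate pushdown:

```python
# Sketch of the HIVE-4293 example with made-up sample data: the outer
# predicate must run on the UDTF's output, not be removed by PPD.
rows = [{"key": 238, "value": 311}, {"key": 86, "value": 150}]

def explode(arrays):
    """Mimic Hive's explode(): one output row per array element."""
    for arr in arrays:
        for v in arr:
            yield v

# Inner query: SELECT * FROM src WHERE key > 200  (pushing this down is fine)
inner = [r for r in rows if r["key"] > 200]

# UDTF: explode(array(key, value))
exploded = list(explode([[r["key"], r["value"]] for r in inner]))

# Outer predicate: WHERE value > 300 -- must be applied AFTER the explode;
# the buggy plan simply dropped this filter.
result = [v for v in exploded if v > 300]
print(result)  # prints [311]
```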
[jira] [Commented] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.
[ https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920109#comment-13920109 ] Prasad Mujumdar commented on HIVE-6486: --- [~shivshi] My apologies for not looking into it earlier. The patch looks fine to me. Thanks for addressing the issue. I understand that we can't add a unit test for this since it needs all the security setup. There's an integration test added as part of the proposed HIVE-5155 patch. Once that's committed, I will try to add a test case to cover this. +1 Support secure Subject.doAs() in HiveServer2 JDBC client. - Key: HIVE-6486 URL: https://issues.apache.org/jira/browse/HIVE-6486 Project: Hive Issue Type: Improvement Components: Authentication, HiveServer2, JDBC Affects Versions: 0.11.0, 0.12.0 Reporter: Shivaraju Gowda Assignee: Shivaraju Gowda Fix For: 0.13.0 Attachments: HIVE-6486.1.patch, Hive_011_Support-Subject_doAS.patch, TestHive_SujectDoAs.java HIVE-5155 addresses the problem of kerberos authentication in a multi-user middleware server using a proxy user. In this mode the principal used by the middleware server has privileges to impersonate selected users in Hive/Hadoop. This enhancement is to support Subject.doAs() authentication in the Hive JDBC layer so that the end user's Kerberos Subject is passed through in the middleware server. With this improvement there won't be any additional setup in the server to grant proxy privileges to some users and there won't be a need to specify a proxy user in the JDBC client. This version should also be more secure since it won't require principals with the privileges to impersonate other users in the Hive/Hadoop setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6541) Need to write documentation for ACID work
[ https://issues.apache.org/jira/browse/HIVE-6541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920115#comment-13920115 ] Alan Gates commented on HIVE-6541: -- [~leftylev] I have a first draft of the docs. Should I just post it here in text format so we can iterate on it, and then post it to JIRA when this stuff gets committed? Need to write documentation for ACID work - Key: HIVE-6541 URL: https://issues.apache.org/jira/browse/HIVE-6541 Project: Hive Issue Type: Sub-task Components: Documentation Affects Versions: 0.13.0 Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.13.0 ACID introduces a number of new config file options, tables in the metastore, keywords in the grammar, and a new interface for use by tools like Storm and Flume. These need to be documented. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6492) limit partition number involved in a table scan
[ https://issues.apache.org/jira/browse/HIVE-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6492: - Attachment: HIVE-6492.4.patch_suggestion [~selinazh] - see .4_suggestion for what I meant by doing it in the semantic analyzer. That way it will work for both Tez and MR. I've also added a couple of tests. If you like, you can throw out the fetch task part to make it simpler. limit partition number involved in a table scan --- Key: HIVE-6492 URL: https://issues.apache.org/jira/browse/HIVE-6492 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.12.0 Reporter: Selina Zhang Fix For: 0.13.0 Attachments: HIVE-6492.1.patch.txt, HIVE-6492.2.patch.txt, HIVE-6492.3.patch.txt, HIVE-6492.4.patch.txt, HIVE-6492.4.patch_suggestion Original Estimate: 24h Remaining Estimate: 24h To protect the cluster, a new configuration variable hive.limit.query.max.table.partition is added to the hive configuration to limit the number of table partitions involved in a table scan. The default value will be set to -1, which means there is no limit by default. This variable will not affect metadata-only queries. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6492) limit partition number involved in a table scan
[ https://issues.apache.org/jira/browse/HIVE-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Selina Zhang updated HIVE-6492: --- Attachment: HIVE-6492.5.patch.txt Thank you, Gunther! I like your patch, though it does not really take care of metadata-only queries. But I agree putting it in SemanticAnalyzer is better. I just renamed the suggestion patch and re-submitted it. limit partition number involved in a table scan --- Key: HIVE-6492 URL: https://issues.apache.org/jira/browse/HIVE-6492 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.12.0 Reporter: Selina Zhang Fix For: 0.13.0 Attachments: HIVE-6492.1.patch.txt, HIVE-6492.2.patch.txt, HIVE-6492.3.patch.txt, HIVE-6492.4.patch.txt, HIVE-6492.4.patch_suggestion, HIVE-6492.5.patch.txt Original Estimate: 24h Remaining Estimate: 24h To protect the cluster, a new configuration variable hive.limit.query.max.table.partition is added to the hive configuration to limit the number of table partitions involved in a table scan. The default value will be set to -1, which means there is no limit by default. This variable will not affect metadata-only queries. -- This message was sent by Atlassian JIRA (v6.2#6252)
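For reference, a sketch of how the proposed knob would be exercised from a Hive session (the property name comes from the JIRA description; the limit value, table, and query are illustrative, and the exact error text may differ in the committed patch):

```sql
-- Cap any single table scan at 10 partitions; the default of -1 means no limit.
SET hive.limit.query.max.table.partition=10;

-- A scan touching more than 10 partitions of a table would now be rejected
-- at compile time with a SemanticException instead of running.
SELECT count(*) FROM srcpart WHERE ds >= '2008-04-08';
```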
Re: Review Request 13845: HIVE-5155: Support secure proxy user access to HiveServer2
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/13845/ --- (Updated March 4, 2014, 10:47 p.m.) Review request for hive, Brock Noland, Carl Steinbach, and Thejas Nair. Changes --- Corrected a merge conflict. Bugs: HIVE-5155 https://issues.apache.org/jira/browse/HIVE-5155 Repository: hive-git Description --- Delegation token support - Enable delegation token connection for HiveServer2 Enhance the TCLIService interface to support delegation token requests Support passing the delegation token connection type via JDBC URL and Beeline option Direct proxy access - Define new proxy user property Shim interfaces to validate proxy access for a given user Note that the diff doesn't include thrift generated code. Diffs (updated) - beeline/pom.xml 7449430 beeline/src/java/org/apache/hive/beeline/BeeLine.java 563d242 beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java 91e20ec beeline/src/java/org/apache/hive/beeline/Commands.java d2d7fd3 beeline/src/java/org/apache/hive/beeline/DatabaseConnection.java 94178ef beeline/src/test/org/apache/hive/beeline/ProxyAuthTest.java PRE-CREATION common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 388a604 conf/hive-default.xml.template 3f01e0b data/files/ProxyAuth.res PRE-CREATION itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java 8210e75 jdbc/src/java/org/apache/hadoop/hive/jdbc/HiveConnection.java d08e05b jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java 4102d7a jdbc/src/java/org/apache/hive/jdbc/Utils.java 608837e service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java d8ba3aa service/src/java/org/apache/hive/service/auth/KerberosSaslHelper.java 519556c service/src/java/org/apache/hive/service/auth/PlainSaslHelper.java 15b1675 service/src/java/org/apache/hive/service/cli/CLIService.java 2b1e712 service/src/java/org/apache/hive/service/cli/CLIServiceClient.java b9d1489 service/src/java/org/apache/hive/service/cli/EmbeddedCLIServiceClient.java 
a31ea94 service/src/java/org/apache/hive/service/cli/ICLIService.java 621d689 service/src/java/org/apache/hive/service/cli/session/HiveSession.java c8fb8ec service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java d6d0d27 service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java b934ebe service/src/java/org/apache/hive/service/cli/session/SessionManager.java cec3b04 service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 26bda5a service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIServiceClient.java 3675e86 service/src/test/org/apache/hive/service/auth/TestPlainSaslHelper.java 8fa4afd service/src/test/org/apache/hive/service/cli/session/TestSessionHooks.java 2fac800 shims/0.20/src/main/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java 51c8051 shims/common-secure/src/main/java/org/apache/hadoop/hive/shims/HadoopShimsSecure.java e205caa shims/common-secure/src/main/java/org/apache/hadoop/hive/thrift/DelegationTokenSecretManager.java 29114f0 shims/common-secure/src/main/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java dc89de1 shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java e15ab4e shims/common/src/main/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java 03f4e51 Diff: https://reviews.apache.org/r/13845/diff/ Testing --- Since this requires kerberos setup, its tested by a standalone test program that runs various existing and new secure connection scenarios. The test code is attached to the ticket at https://issues.apache.org/jira/secure/attachment/12600119/ProxyAuth.java Thanks, Prasad Mujumdar
[jira] [Commented] (HIVE-6492) limit partition number involved in a table scan
[ https://issues.apache.org/jira/browse/HIVE-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920146#comment-13920146 ] Gunther Hagleitner commented on HIVE-6492: -- [~selinazh] - sorry if I messed up the metadata-only part. Can you give me an example where the patch doesn't work? limit partition number involved in a table scan --- Key: HIVE-6492 URL: https://issues.apache.org/jira/browse/HIVE-6492 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.12.0 Reporter: Selina Zhang Fix For: 0.13.0 Attachments: HIVE-6492.1.patch.txt, HIVE-6492.2.patch.txt, HIVE-6492.3.patch.txt, HIVE-6492.4.patch.txt, HIVE-6492.4.patch_suggestion, HIVE-6492.5.patch.txt Original Estimate: 24h Remaining Estimate: 24h To protect the cluster, a new configuration variable hive.limit.query.max.table.partition is added to the hive configuration to limit the number of table partitions involved in a table scan. The default value will be set to -1, which means there is no limit by default. This variable will not affect metadata-only queries. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5155) Support secure proxy user access to HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasad Mujumdar updated HIVE-5155: -- Attachment: HIVE-5155-noThrift.8.patch Support secure proxy user access to HiveServer2 --- Key: HIVE-5155 URL: https://issues.apache.org/jira/browse/HIVE-5155 Project: Hive Issue Type: Improvement Components: Authentication, HiveServer2, JDBC Affects Versions: 0.12.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Attachments: HIVE-5155-1-nothrift.patch, HIVE-5155-noThrift.2.patch, HIVE-5155-noThrift.4.patch, HIVE-5155-noThrift.5.patch, HIVE-5155-noThrift.6.patch, HIVE-5155-noThrift.7.patch, HIVE-5155-noThrift.8.patch, HIVE-5155.1.patch, HIVE-5155.2.patch, HIVE-5155.3.patch, ProxyAuth.java, ProxyAuth.out, TestKERBEROS_Hive_JDBC.java The HiveServer2 can authenticate a client via Kerberos and impersonate the connecting user with the underlying secure hadoop. This becomes a gateway for a remote client to access a secure hadoop cluster. This works fine when the client obtains a Kerberos ticket and directly connects to HiveServer2. There's another big use case for middleware tools where the end user wants to access Hive via another server. For example, an Oozie action or Hue submitting queries, or a BI tool server accessing HiveServer2. In these cases, the third-party server doesn't have the end user's Kerberos credentials and hence it can't submit queries to HiveServer2 on behalf of the end user. This ticket is for enabling proxy access to HiveServer2 for third-party tools on behalf of end users. There are two parts to the solution proposed in this ticket: 1) Delegation token based connection for Oozie (OOZIE-1457) This is the common mechanism for Hadoop ecosystem components. Hive Remote Metastore and HCatalog already support this. This is suitable for a tool like Oozie that submits MR jobs as actions on behalf of its client. Oozie already uses a similar mechanism for Metastore/HCatalog access.
2) Direct proxy access for privileged hadoop users The delegation token implementation can be a challenge for non-hadoop (especially non-java) components. This second part enables a privileged user to directly specify an alternate session user during the connection. If the connecting user has hadoop-level privilege to impersonate the requested userid, then HiveServer2 will run the session as that requested user. For example, user Hue is allowed to impersonate user Bob (via core-site.xml proxy user configuration). Then user Hue can connect to HiveServer2 and specify Bob as the session user via a session property. HiveServer2 will verify Hue's proxy user privilege and then impersonate user Bob instead of Hue. This will enable any third-party tool to impersonate an alternate userid without having to implement a delegation token connection. -- This message was sent by Atlassian JIRA (v6.2#6252)
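The two connection modes described in this ticket correspond to different JDBC URL parameters. A hedged sketch of what the URLs might look like (the parameter names `auth=delegationToken` and `hive.server2.proxy.user` are taken from the patch under review and could differ in the committed version; the host, realm, and user names are placeholders):

```text
# 1) Delegation token based connection, e.g. from an Oozie action that has
#    already obtained a HiveServer2 token on the user's behalf:
jdbc:hive2://hs2host:10000/default;auth=delegationToken

# 2) Direct proxy access: privileged user "hue" connects with its own
#    Kerberos ticket and asks to run the session as user "bob":
jdbc:hive2://hs2host:10000/default;principal=hive/_HOST@EXAMPLE.COM;hive.server2.proxy.user=bob
```

In the second form, HiveServer2 checks hadoop's core-site.xml proxy-user rules before honoring the requested session user, so no Hive-specific grant is needed beyond the existing hadoop impersonation configuration.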
[jira] [Commented] (HIVE-5155) Support secure proxy user access to HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920152#comment-13920152 ] Prasad Mujumdar commented on HIVE-5155: --- [~thejas] [~vaibhavgumashta] The rebased patch is attached and the review was updated yesterday. I found a minor rebase conflict that I just fixed. Please take a look when you get a chance. Thanks! Support secure proxy user access to HiveServer2 --- Key: HIVE-5155 URL: https://issues.apache.org/jira/browse/HIVE-5155 Project: Hive Issue Type: Improvement Components: Authentication, HiveServer2, JDBC Affects Versions: 0.12.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Attachments: HIVE-5155-1-nothrift.patch, HIVE-5155-noThrift.2.patch, HIVE-5155-noThrift.4.patch, HIVE-5155-noThrift.5.patch, HIVE-5155-noThrift.6.patch, HIVE-5155-noThrift.7.patch, HIVE-5155-noThrift.8.patch, HIVE-5155.1.patch, HIVE-5155.2.patch, HIVE-5155.3.patch, ProxyAuth.java, ProxyAuth.out, TestKERBEROS_Hive_JDBC.java The HiveServer2 can authenticate a client via Kerberos and impersonate the connecting user with the underlying secure hadoop. This becomes a gateway for a remote client to access a secure hadoop cluster. This works fine when the client obtains a Kerberos ticket and directly connects to HiveServer2. There's another big use case for middleware tools where the end user wants to access Hive via another server. For example, an Oozie action or Hue submitting queries, or a BI tool server accessing HiveServer2. In these cases, the third-party server doesn't have the end user's Kerberos credentials and hence it can't submit queries to HiveServer2 on behalf of the end user. This ticket is for enabling proxy access to HiveServer2 for third-party tools on behalf of end users. There are two parts to the solution proposed in this ticket: 1) Delegation token based connection for Oozie (OOZIE-1457) This is the common mechanism for Hadoop ecosystem components.
Hive Remote Metastore and HCatalog already support this. This is suitable for a tool like Oozie that submits MR jobs as actions on behalf of its client. Oozie already uses a similar mechanism for Metastore/HCatalog access. 2) Direct proxy access for privileged hadoop users The delegation token implementation can be a challenge for non-hadoop (especially non-java) components. This second part enables a privileged user to directly specify an alternate session user during the connection. If the connecting user has hadoop-level privilege to impersonate the requested userid, then HiveServer2 will run the session as that requested user. For example, user Hue is allowed to impersonate user Bob (via core-site.xml proxy user configuration). Then user Hue can connect to HiveServer2 and specify Bob as the session user via a session property. HiveServer2 will verify Hue's proxy user privilege and then impersonate user Bob instead of Hue. This will enable any third-party tool to impersonate an alternate userid without having to implement a delegation token connection. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5687) Streaming support in Hive
[ https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated HIVE-5687: -- Issue Type: Sub-task (was: Bug) Parent: HIVE-5317 Streaming support in Hive - Key: HIVE-5687 URL: https://issues.apache.org/jira/browse/HIVE-5687 Project: Hive Issue Type: Sub-task Reporter: Roshan Naik Assignee: Roshan Naik Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, HIVE-5687.patch, HIVE-5687.v2.patch Implement support for Streaming data into HIVE. - Provide a client streaming API - Transaction support: Clients should be able to periodically commit a batch of records atomically - Immediate visibility: Records should be immediately visible to queries on commit - Should not overload HDFS with too many small files Use Cases: - Streaming logs into HIVE via Flume - Streaming results of computations from Storm -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.
[ https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shivaraju Gowda updated HIVE-6486: -- Attachment: HIVE-6486.2.patch The test failure in the pre-commit tests looks unrelated to the patch. The test case passed in my setup. I have rebased the patch to the trunk and am uploading it again. Support secure Subject.doAs() in HiveServer2 JDBC client. - Key: HIVE-6486 URL: https://issues.apache.org/jira/browse/HIVE-6486 Project: Hive Issue Type: Improvement Components: Authentication, HiveServer2, JDBC Affects Versions: 0.11.0, 0.12.0 Reporter: Shivaraju Gowda Assignee: Shivaraju Gowda Fix For: 0.13.0 Attachments: HIVE-6486.1.patch, HIVE-6486.2.patch, Hive_011_Support-Subject_doAS.patch, TestHive_SujectDoAs.java HIVE-5155 addresses the problem of kerberos authentication in a multi-user middleware server using a proxy user. In this mode the principal used by the middleware server has privileges to impersonate selected users in Hive/Hadoop. This enhancement is to support Subject.doAs() authentication in the Hive JDBC layer so that the end user's Kerberos Subject is passed through in the middleware server. With this improvement there won't be any additional setup in the server to grant proxy privileges to some users and there won't be a need to specify a proxy user in the JDBC client. This version should also be more secure since it won't require principals with the privileges to impersonate other users in the Hive/Hadoop setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.
[ https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shivaraju Gowda updated HIVE-6486: -- Status: Patch Available (was: Open) Support secure Subject.doAs() in HiveServer2 JDBC client. - Key: HIVE-6486 URL: https://issues.apache.org/jira/browse/HIVE-6486 Project: Hive Issue Type: Improvement Components: Authentication, HiveServer2, JDBC Affects Versions: 0.12.0, 0.11.0 Reporter: Shivaraju Gowda Assignee: Shivaraju Gowda Fix For: 0.13.0 Attachments: HIVE-6486.1.patch, HIVE-6486.2.patch, Hive_011_Support-Subject_doAS.patch, TestHive_SujectDoAs.java HIVE-5155 addresses the problem of kerberos authentication in a multi-user middleware server using a proxy user. In this mode the principal used by the middleware server has privileges to impersonate selected users in Hive/Hadoop. This enhancement is to support Subject.doAs() authentication in the Hive JDBC layer so that the end user's Kerberos Subject is passed through in the middleware server. With this improvement there won't be any additional setup in the server to grant proxy privileges to some users and there won't be a need to specify a proxy user in the JDBC client. This version should also be more secure since it won't require principals with the privileges to impersonate other users in the Hive/Hadoop setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.
[ https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shivaraju Gowda updated HIVE-6486: -- Status: Open (was: Patch Available) Support secure Subject.doAs() in HiveServer2 JDBC client. - Key: HIVE-6486 URL: https://issues.apache.org/jira/browse/HIVE-6486 Project: Hive Issue Type: Improvement Components: Authentication, HiveServer2, JDBC Affects Versions: 0.12.0, 0.11.0 Reporter: Shivaraju Gowda Assignee: Shivaraju Gowda Fix For: 0.13.0 Attachments: HIVE-6486.1.patch, HIVE-6486.2.patch, Hive_011_Support-Subject_doAS.patch, TestHive_SujectDoAs.java HIVE-5155 addresses the problem of kerberos authentication in a multi-user middleware server using a proxy user. In this mode the principal used by the middleware server has privileges to impersonate selected users in Hive/Hadoop. This enhancement is to support Subject.doAs() authentication in the Hive JDBC layer so that the end user's Kerberos Subject is passed through in the middleware server. With this improvement there won't be any additional setup in the server to grant proxy privileges to some users and there won't be a need to specify a proxy user in the JDBC client. This version should also be more secure since it won't require principals with the privileges to impersonate other users in the Hive/Hadoop setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5950) ORC SARG creation fails with NPE for predicate conditions with decimal/date/char/varchar datatypes
[ https://issues.apache.org/jira/browse/HIVE-5950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-5950: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks Prasanth! ORC SARG creation fails with NPE for predicate conditions with decimal/date/char/varchar datatypes -- Key: HIVE-5950 URL: https://issues.apache.org/jira/browse/HIVE-5950 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-5950.1.patch, HIVE-5950.2.patch, HIVE-5950.3.patch, HIVE-5950.4.patch, HIVE-5950.5.patch When a decimal or date column is used, the type field in PredicateLeafImpl will be set to null. This will result in an NPE during predicate leaf generation because of null dereferencing in the hashcode computation. SARG creation should be extended to support/handle decimal and date data types. -- This message was sent by Atlassian JIRA (v6.2#6252)
Review Request 18757: HIVE-6486 Support secure Subject.doAs() in HiveServer2 JDBC client
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18757/ --- Review request for hive, Kevin Minder, Prasad Mujumdar, Thejas Nair, and Vaibhav Gumashta. Bugs: HIVE-6486 https://issues.apache.org/jira/browse/HIVE-6486 Repository: hive Description --- Support secure Subject.doAs() in HiveServer2 JDBC client. Original review: https://reviews.apache.org/r/18464/ Diffs - http://svn.apache.org/repos/asf/hive/trunk/jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java 1574208 http://svn.apache.org/repos/asf/hive/trunk/service/src/java/org/apache/hive/service/auth/KerberosSaslHelper.java 1574208 http://svn.apache.org/repos/asf/hive/trunk/service/src/java/org/apache/hive/service/auth/TSubjectAssumingTransport.java PRE-CREATION Diff: https://reviews.apache.org/r/18757/diff/ Testing --- Manual testing. Thanks, Shivaraju Gowda
[jira] [Commented] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.
[ https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920193#comment-13920193 ] Shivaraju Gowda commented on HIVE-6486: --- Vaibhav Gumashta: I have created a review with the rebased trunk (ReviewBoard #18757). I couldn't edit the current review because I was not the owner. Support secure Subject.doAs() in HiveServer2 JDBC client. - Key: HIVE-6486 URL: https://issues.apache.org/jira/browse/HIVE-6486 Project: Hive Issue Type: Improvement Components: Authentication, HiveServer2, JDBC Affects Versions: 0.11.0, 0.12.0 Reporter: Shivaraju Gowda Assignee: Shivaraju Gowda Fix For: 0.13.0 Attachments: HIVE-6486.1.patch, HIVE-6486.2.patch, Hive_011_Support-Subject_doAS.patch, TestHive_SujectDoAs.java HIVE-5155 addresses the problem of kerberos authentication in a multi-user middleware server using a proxy user. In this mode the principal used by the middleware server has privileges to impersonate selected users in Hive/Hadoop. This enhancement is to support Subject.doAs() authentication in the Hive JDBC layer so that the end user's Kerberos Subject is passed through in the middleware server. With this improvement there won't be any additional setup in the server to grant proxy privileges to some users and there won't be a need to specify a proxy user in the JDBC client. This version should also be more secure since it won't require principals with the privileges to impersonate other users in the Hive/Hadoop setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.
[ https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920201#comment-13920201 ] Shivaraju Gowda commented on HIVE-6486: --- Prasad Mujumdar: Thanks for the review and the offer to add the test. The test case attached to this issue might serve as a good starting point. Support secure Subject.doAs() in HiveServer2 JDBC client. - Key: HIVE-6486 URL: https://issues.apache.org/jira/browse/HIVE-6486 Project: Hive Issue Type: Improvement Components: Authentication, HiveServer2, JDBC Affects Versions: 0.11.0, 0.12.0 Reporter: Shivaraju Gowda Assignee: Shivaraju Gowda Fix For: 0.13.0 Attachments: HIVE-6486.1.patch, HIVE-6486.2.patch, Hive_011_Support-Subject_doAS.patch, TestHive_SujectDoAs.java HIVE-5155 addresses the problem of kerberos authentication in a multi-user middleware server using a proxy user. In this mode the principal used by the middleware server has privileges to impersonate selected users in Hive/Hadoop. This enhancement is to support Subject.doAs() authentication in the Hive JDBC layer so that the end user's Kerberos Subject is passed through in the middleware server. With this improvement there won't be any additional setup in the server to grant proxy privileges to some users and there won't be a need to specify a proxy user in the JDBC client. This version should also be more secure since it won't require principals with the privileges to impersonate other users in the Hive/Hadoop setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.
[ https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920205#comment-13920205 ] Shivaraju Gowda commented on HIVE-6486: --- Thejas M Nair: Thanks for the review. Vaibhav Gumashta had some concerns about using the auth URL property to enable this functionality; once that is cleared, I will add the usage notes to the release notes section of the JIRA. I don't know where the wiki documentation is; can you point me to it? I will see if I can help. Support secure Subject.doAs() in HiveServer2 JDBC client. - Key: HIVE-6486 URL: https://issues.apache.org/jira/browse/HIVE-6486 Project: Hive Issue Type: Improvement Components: Authentication, HiveServer2, JDBC Affects Versions: 0.11.0, 0.12.0 Reporter: Shivaraju Gowda Assignee: Shivaraju Gowda Fix For: 0.13.0 Attachments: HIVE-6486.1.patch, HIVE-6486.2.patch, Hive_011_Support-Subject_doAS.patch, TestHive_SujectDoAs.java HIVE-5155 addresses the problem of kerberos authentication in a multi-user middleware server using a proxy user. In this mode the principal used by the middleware server has privileges to impersonate selected users in Hive/Hadoop. This enhancement is to support Subject.doAs() authentication in the Hive JDBC layer so that the end user's Kerberos Subject is passed through in the middleware server. With this improvement there won't be any additional setup in the server to grant proxy privileges to some users and there won't be a need to specify a proxy user in the JDBC client. This version should also be more secure since it won't require principals with the privileges to impersonate other users in the Hive/Hadoop setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6325) Enable using multiple concurrent sessions in tez
[ https://issues.apache.org/jira/browse/HIVE-6325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-6325: - Attachment: HIVE-6325.11.patch Rebasing patch. No tests affected. Enable using multiple concurrent sessions in tez Key: HIVE-6325 URL: https://issues.apache.org/jira/browse/HIVE-6325 Project: Hive Issue Type: Improvement Components: Tez Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-6325.1.patch, HIVE-6325.10.patch, HIVE-6325.11.patch, HIVE-6325.2.patch, HIVE-6325.3.patch, HIVE-6325.4.patch, HIVE-6325.5.patch, HIVE-6325.6.patch, HIVE-6325.7.patch, HIVE-6325.8.patch, HIVE-6325.9.patch We would like to enable multiple concurrent sessions in tez via hive server 2. This will enable users to make efficient use of the cluster when it has been partitioned using yarn queues. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6325) Enable using multiple concurrent sessions in tez
[ https://issues.apache.org/jira/browse/HIVE-6325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-6325: - Status: Patch Available (was: Open) Enable using multiple concurrent sessions in tez Key: HIVE-6325 URL: https://issues.apache.org/jira/browse/HIVE-6325 Project: Hive Issue Type: Improvement Components: Tez Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-6325.1.patch, HIVE-6325.10.patch, HIVE-6325.11.patch, HIVE-6325.2.patch, HIVE-6325.3.patch, HIVE-6325.4.patch, HIVE-6325.5.patch, HIVE-6325.6.patch, HIVE-6325.7.patch, HIVE-6325.8.patch, HIVE-6325.9.patch We would like to enable multiple concurrent sessions in tez via hive server 2. This will enable users to make efficient use of the cluster when it has been partitioned using yarn queues. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6539) Couple of issues in fs based stats collection
[ https://issues.apache.org/jira/browse/HIVE-6539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6539: --- Affects Version/s: 0.13.0 Couple of issues in fs based stats collection - Key: HIVE-6539 URL: https://issues.apache.org/jira/browse/HIVE-6539 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.13.0 Attachments: HIVE-6539.patch While testing on a cluster, found a couple of bugs: * NPE in a certain case. * map object reuse causing a problem -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6539) Couple of issues in fs based stats collection
[ https://issues.apache.org/jira/browse/HIVE-6539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6539: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Committed to trunk. Couple of issues in fs based stats collection - Key: HIVE-6539 URL: https://issues.apache.org/jira/browse/HIVE-6539 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.13.0 Attachments: HIVE-6539.patch While testing on a cluster, found a couple of bugs: * NPE in a certain case. * map object reuse causing a problem -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6545) analyze table throws NPE for non-existent tables.
[ https://issues.apache.org/jira/browse/HIVE-6545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920226#comment-13920226 ] Hive QA commented on HIVE-6545: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12632573/HIVE-6545.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5240 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_parallel_orderby {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1621/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1621/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12632573 analyze table throws NPE for non-existent tables. - Key: HIVE-6545 URL: https://issues.apache.org/jira/browse/HIVE-6545 Project: Hive Issue Type: Bug Components: Statistics Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-6545.patch Instead of NPE, we should give error message to user. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6492) limit partition number involved in a table scan
[ https://issues.apache.org/jira/browse/HIVE-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920237#comment-13920237 ] Selina Zhang commented on HIVE-6492: In the new test case limit_partition_2.q: select distinct hr from srcpart; should pass, because hr is the partition key. With the new patch, it is blocked: FAILED: SemanticException Number of partitions scanned (=4) on table srcpart exceeds limit (=1). This is controlled by hive.limit.query.max.table.partition. limit partition number involved in a table scan --- Key: HIVE-6492 URL: https://issues.apache.org/jira/browse/HIVE-6492 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.12.0 Reporter: Selina Zhang Fix For: 0.13.0 Attachments: HIVE-6492.1.patch.txt, HIVE-6492.2.patch.txt, HIVE-6492.3.patch.txt, HIVE-6492.4.patch.txt, HIVE-6492.4.patch_suggestion, HIVE-6492.5.patch.txt Original Estimate: 24h Remaining Estimate: 24h To protect the cluster, a new configuration variable hive.limit.query.max.table.partition is added to the hive configuration to limit the table partitions involved in a table scan. The default value will be set to -1, which means there is no limit by default. This variable will not affect metadata-only queries. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6529) Tez output files are out of date
[ https://issues.apache.org/jira/browse/HIVE-6529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920236#comment-13920236 ] Gunther Hagleitner commented on HIVE-6529: -- +1 Tez output files are out of date Key: HIVE-6529 URL: https://issues.apache.org/jira/browse/HIVE-6529 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Minor Attachments: HIVE-6529.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6545) analyze table throws NPE for non-existent tables.
[ https://issues.apache.org/jira/browse/HIVE-6545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6545: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Committed to trunk. analyze table throws NPE for non-existent tables. - Key: HIVE-6545 URL: https://issues.apache.org/jira/browse/HIVE-6545 Project: Hive Issue Type: Bug Components: Statistics Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.13.0 Attachments: HIVE-6545.patch Instead of NPE, we should give error message to user. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6548) Missing owner name and type fields in schema script for DBS table
Ashutosh Chauhan created HIVE-6548: -- Summary: Missing owner name and type fields in schema script for DBS table Key: HIVE-6548 URL: https://issues.apache.org/jira/browse/HIVE-6548 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.13.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan HIVE-6386 introduced new columns in DBS table, but those are missing from schema scripts. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6548) Missing owner name and type fields in schema script for DBS table
[ https://issues.apache.org/jira/browse/HIVE-6548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6548: --- Attachment: HIVE-6548.patch Patch to add missing columns in schema scripts. Missing owner name and type fields in schema script for DBS table -- Key: HIVE-6548 URL: https://issues.apache.org/jira/browse/HIVE-6548 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.13.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-6548.patch HIVE-6386 introduced new columns in DBS table, but those are missing from schema scripts. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6548) Missing owner name and type fields in schema script for DBS table
[ https://issues.apache.org/jira/browse/HIVE-6548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6548: --- Status: Patch Available (was: Open) Missing owner name and type fields in schema script for DBS table -- Key: HIVE-6548 URL: https://issues.apache.org/jira/browse/HIVE-6548 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.13.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-6548.patch HIVE-6386 introduced new columns in DBS table, but those are missing from schema scripts. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Timeline for the Hive 0.13 release?
I have two patches, still patch-available, that have +1s as well but are waiting on the pre-commit tests to pick them up before going into 0.13: https://issues.apache.org/jira/browse/HIVE-6507 (refactor of table property names from string constants to an enum in OrcFile) https://issues.apache.org/jira/browse/HIVE-6499 (fixes a bug where calls like create table and drop table can fail if metastore-side authorization is used in conjunction with custom inputformat/outputformat/serdes that are not loadable from the metastore-side)
[jira] [Updated] (HIVE-5843) Transaction manager for Hive
[ https://issues.apache.org/jira/browse/HIVE-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5843: --- Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Alan! Transaction manager for Hive Key: HIVE-5843 URL: https://issues.apache.org/jira/browse/HIVE-5843 Project: Hive Issue Type: Sub-task Affects Versions: 0.12.0 Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.13.0 Attachments: 5843.5-wip.patch, HIVE-5843-src-only.6.patch, HIVE-5843-src-only.patch, HIVE-5843.10.patch, HIVE-5843.2.patch, HIVE-5843.3-src.path, HIVE-5843.3.patch, HIVE-5843.4-src.patch, HIVE-5843.4.patch, HIVE-5843.6.patch, HIVE-5843.7.patch, HIVE-5843.8.patch, HIVE-5843.8.src-only.patch, HIVE-5843.9.patch, HIVE-5843.patch, HiveTransactionManagerDetailedDesign (1).pdf As part of the ACID work proposed in HIVE-5317 a transaction manager is required. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 15873: Query cancel should stop running MR tasks
On Feb. 27, 2014, 11:08 p.m., Thejas Nair wrote: ql/src/java/org/apache/hadoop/hive/ql/DriverContext.java, line 110 https://reviews.apache.org/r/15873/diff/3/?file=478815#file478815line110 When pollFinished is running, this shutdown() function will not be able to make progress, which means that the query cancellation will happen only after a task (which could be an MR task) is complete. It seems synchronizing around shutdown should be sufficient, either by making it volatile or having synchronized methods around it. Since thread-safe concurrent collection classes are being used here, I don't see other concurrency issues that would make it necessary to make all these functions synchronized. Navis Ryu wrote: It just polls the status of running tasks and goes into the wait state quite quickly, so it would not hinder the shutdown process. Furthermore, the two threads, polling and shutdown, have a race condition on both collections, runnable and running, so those should be guarded by something shared. Thejas Nair wrote: Yes, it will go into the wait state quickly. But I haven't understood how the wait helps here. There is no notify in this code, so the wait will always wait for 2 seconds. It will be no different from a sleep(2000). So it looks like the outer polling loop will continue until all the currently running jobs are complete. Per the javadoc for Object.wait(): "The current thread must own this object's monitor. The thread releases ownership of this monitor and waits until another thread notifies threads waiting on this object's monitor." In the wait state, any other thread can take the monitor (with sleep, that's not possible), so the shutdown thread does not need to wait for 2 seconds. The polling thread might notice shutdown up to 2 seconds late, as you said, because it's not notified. But I think that's not a big deal, is it? - Navis --- This is an automatically generated e-mail. 
To reply, visit: https://reviews.apache.org/r/15873/#review35625 --- On March 4, 2014, 8:02 a.m., Navis Ryu wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15873/ --- (Updated March 4, 2014, 8:02 a.m.) Review request for hive. Bugs: HIVE-5901 https://issues.apache.org/jira/browse/HIVE-5901 Repository: hive-git Description --- Currently, query canceling does not stop running MR job immediately. Diffs - ql/src/java/org/apache/hadoop/hive/ql/Driver.java 332cadb ql/src/java/org/apache/hadoop/hive/ql/DriverContext.java c51a9c8 ql/src/java/org/apache/hadoop/hive/ql/exec/ConditionalTask.java 854cd52 ql/src/java/org/apache/hadoop/hive/ql/exec/TaskRunner.java ead7b59 Diff: https://reviews.apache.org/r/15873/diff/ Testing --- Thanks, Navis Ryu
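The wait-versus-sleep point debated above can be demonstrated outside Hive. The following is a minimal standalone sketch (an illustration of the Java monitor semantics only, not Hive's actual DriverContext code; class and method names are invented for the example): a "poller" thread parks in Object.wait(2000) inside a synchronized block, and a "shutdown" thread shows it can still acquire the same monitor and notify almost immediately, which would not be possible if the poller were holding the lock inside Thread.sleep(2000).

```java
// Sketch: Object.wait() releases the monitor while waiting, so another thread
// can enter the same synchronized block right away; Thread.sleep() would not.
public class WaitVsSleepDemo {
    private static final Object lock = new Object();

    // Measures how long (ms) a "shutdown" thread needs to grab the monitor
    // while a "poller" thread is parked inside lock.wait(2000).
    static long timeShutdownAcquire() {
        Thread poller = new Thread(() -> {
            synchronized (lock) {
                try {
                    lock.wait(2000);   // releases the monitor while waiting
                } catch (InterruptedException ignored) { }
            }
        });
        try {
            poller.start();
            Thread.sleep(100);         // give the poller time to enter wait()
            long start = System.nanoTime();
            synchronized (lock) {      // acquired immediately: wait() freed the lock
                lock.notifyAll();      // also wakes the poller early
            }
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            poller.join();
            return elapsedMs;
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("shutdown thread acquired the monitor in ~"
                + timeShutdownAcquire() + " ms");
    }
}
```

This is Navis's point in the thread: the shutdown thread is not blocked for the 2-second interval; only the poller's wake-up may lag by up to 2 seconds because nothing notifies it in the real code.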
[jira] [Updated] (HIVE-6523) Tests with -Phadoop-2 and MiniMRCluster error if it doesn't find yarn-site.xml
[ https://issues.apache.org/jira/browse/HIVE-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6523: --- Resolution: Won't Fix Status: Resolved (was: Patch Available) Tests with -Phadoop-2 and MiniMRCluster error if it doesn't find yarn-site.xml -- Key: HIVE-6523 URL: https://issues.apache.org/jira/browse/HIVE-6523 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Environment: Hadoop 2.4.* Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-6523.patch With the newer hadoop versions (2.4+) in tests, MiniMRCluster throws an error loading resources if it can't find a yarn-site.xml in its classpath, which affects test runs with -Phadoop-2 and minimrclusters. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6523) Tests with -Phadoop-2 and MiniMRCluster error if it doesn't find yarn-site.xml
[ https://issues.apache.org/jira/browse/HIVE-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920277#comment-13920277 ] Sushanth Sowmyan commented on HIVE-6523: Closing as WONTFIX, since YARN-1758 has been fixed. This can be reopened at some time if this issue is observed again. Tests with -Phadoop-2 and MiniMRCluster error if it doesn't find yarn-site.xml -- Key: HIVE-6523 URL: https://issues.apache.org/jira/browse/HIVE-6523 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Environment: Hadoop 2.4.* Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-6523.patch With the newer hadoop versions (2.4+) in tests, MiniMRCluster throws an error loading resources if it can't find a yarn-site.xml in its classpath, which affects test runs with -Phadoop-2 and minimrclusters. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6549) removed templeton.jar from webhcat-default.xml
Eugene Koifman created HIVE-6549: Summary: removed templeton.jar from webhcat-default.xml Key: HIVE-6549 URL: https://issues.apache.org/jira/browse/HIVE-6549 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman This property is no longer used; also removed the corresponding AppConfig.TEMPLETON_JAR_NAME. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6549) removed templeton.jar from webhcat-default.xml
[ https://issues.apache.org/jira/browse/HIVE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6549: - Priority: Minor (was: Major) removed templeton.jar from webhcat-default.xml -- Key: HIVE-6549 URL: https://issues.apache.org/jira/browse/HIVE-6549 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Minor this property is no longer used also removed corresponding AppConfig.TEMPLETON_JAR_NAME -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Timeline for the Hive 0.13 release?
branching now. Will be changing the pom files on trunk. Will send another email when the branch and trunk changes are in. On Mar 4, 2014, at 4:03 PM, Sushanth Sowmyan khorg...@gmail.com wrote: I have two patches still as patch-available, that have had +1s as well, but are waiting on pre-commit tests picking them up go in to 0.13: https://issues.apache.org/jira/browse/HIVE-6507 (refactor of table property names from string constants to an enum in OrcFile) https://issues.apache.org/jira/browse/HIVE-6499 (fixes bug where calls like create table and drop table can fail if metastore-side authorization is used in conjunction with custom inputformat/outputformat/serdes that are not loadable from the metastore-side) -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
Re: Timeline for the Hive 0.13 release?
Hi, I have https://issues.apache.org/jira/browse/HIVE-6325 in patch available. It is awaiting pre-commit tests to run. I would like for it to go in as well. Thanks Vikram. On Tue, Mar 4, 2014 at 5:05 PM, Harish Butani hbut...@hortonworks.com wrote: branching now. Will be changing the pom files on trunk. Will send another email when the branch and trunk changes are in. On Mar 4, 2014, at 4:03 PM, Sushanth Sowmyan khorg...@gmail.com wrote: I have two patches still as patch-available, that have had +1s as well, but are waiting on pre-commit tests picking them up go in to 0.13: https://issues.apache.org/jira/browse/HIVE-6507 (refactor of table property names from string constants to an enum in OrcFile) https://issues.apache.org/jira/browse/HIVE-6499 (fixes bug where calls like create table and drop table can fail if metastore-side authorization is used in conjunction with custom inputformat/outputformat/serdes that are not loadable from the metastore-side)
[jira] [Commented] (HIVE-5888) group by after join operation product no result when hive.optimize.skewjoin = true
[ https://issues.apache.org/jira/browse/HIVE-5888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920354#comment-13920354 ] Jian Fang commented on HIVE-5888: - We had the same problem for Hive 0.11. Ankit created a JIRA at https://issues.apache.org/jira/browse/HIVE-6520. Based on Ankit's observation, here is the root cause of the problem: -- Hive's Skew join optimization is a physical optimization that changes the operator DAG (At compile time, Hive first creates a basic operator DAG and then various optimizations optimize it). After compile time skew join optimization, the skew join related nodes will look like: (MR job with Reduce Join Operator (Stage-1))-(Conditional Skew Join Task that performs Map Join (Stage-2)). When Skew Join optimization kicks in at compile time, it sets a flag handleSkewJoin in Stage-1. At run time, Stage-1 performs following (provided handleSkewJoin flag was set): 1. Join unskewed keys through normal MR job. 2. Copies data with skewed keys (from all tables) in a specific directory structure in hdfs. Stage-2 then picks the skewed data and performs Map Join. The Map Join of the skewed keys is the real optimization because it saves running reducer which has to copy intermediate data from mappers. Hive also has Map Join Optimization and this is the cause of the problem. A normal map-reduce join is converted to map join if (n-1) small tables can fit in memory. If this happens, at compile time, after both map join and skew join optimization, nodes will look like: (MR job with Map Join Operator (Stage-1))-(Conditional Skew Join Task that performs Map Join (Stage-2)). Now the problem is that Skew Join optimization sets handleSkewJoin only for Reduce Join Operator in Stage-1 (it assumes there will be a reducer). So, in case there is Map Join Operator, handleSkewJoin flag is not set and Stage-1 doesn't copy skewed keys in hdfs. 
When Stage-2 runs, it is not able to find the skewed key directory and it gets eliminated at run time. Therefore, no results are displayed. - I tried to set hive.optimize.skewjoin=true and hive.auto.convert.join=false so that stage-1 would not be converted to a mapjoin, to work around this problem. But the reduce phase in stage-1 took an extremely long time. We had 200 reducers; most of them only had 5 or 6 input keys, and all the remaining keys were distributed to two reducers. It seems the two reducers created very big RowContainer files on local disk, for example: -rwxrwxrwx 1 hadoop hadoop 334G Mar 5 00:56 RowContainer6650985529012862786.[129].tmp -rw-r--r-- 1 hadoop hadoop 2.7G Mar 5 00:56 .RowContainer6650985529012862786.[129].tmp.crc This behavior is really weird. The inconsistent results caused a lot of trouble for us. Is there any way to work around this problem? group by after join operation product no result when hive.optimize.skewjoin = true Key: HIVE-5888 URL: https://issues.apache.org/jira/browse/HIVE-5888 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: cyril liao -- This message was sent by Atlassian JIRA (v6.2#6252)
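The workaround described in the comment above can be written out as a short HiveQL fragment (a sketch only: the table names t1/t2 and key column k are hypothetical, not the reporter's actual job):

```sql
-- Force the reduce-side join so the compile-time skew-join branch applies,
-- working around the map-join + skew-join interaction described above.
set hive.optimize.skewjoin=true;
set hive.auto.convert.join=false;

-- Hypothetical query shape: group-by after a join on a skewed key t1.k.
select t1.k, count(*)
from t1 join t2 on (t1.k = t2.k)
group by t1.k;
```

As the comment notes, this trades performance for correctness: with auto-conversion disabled, the skewed keys pile up on a handful of reducers instead of being map-joined, which is exactly the slow reduce phase the reporter observed.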
[jira] [Commented] (HIVE-6492) limit partition number involved in a table scan
[ https://issues.apache.org/jira/browse/HIVE-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920358#comment-13920358 ] Selina Zhang commented on HIVE-6492: The test case in limit_partition_3.q should also pass: set hive.compute.query.using.stats=true; set hive.limit.query.max.table.partition=1; select count(*) from part; since it does not need a table scan. limit partition number involved in a table scan --- Key: HIVE-6492 URL: https://issues.apache.org/jira/browse/HIVE-6492 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.12.0 Reporter: Selina Zhang Fix For: 0.13.0 Attachments: HIVE-6492.1.patch.txt, HIVE-6492.2.patch.txt, HIVE-6492.3.patch.txt, HIVE-6492.4.patch.txt, HIVE-6492.4.patch_suggestion, HIVE-6492.5.patch.txt Original Estimate: 24h Remaining Estimate: 24h To protect the cluster, a new configuration variable hive.limit.query.max.table.partition is added to the hive configuration to limit the table partitions involved in a table scan. The default value will be set to -1, which means there is no limit by default. This variable will not affect metadata-only queries. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6392) Hive (and HCatalog) don't allow super-users to add partitions to tables.
[ https://issues.apache.org/jira/browse/HIVE-6392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920359#comment-13920359 ] Hive QA commented on HIVE-6392: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12632575/HIVE-6392.patch {color:green}SUCCESS:{color} +1 5244 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1622/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1622/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12632575 Hive (and HCatalog) don't allow super-users to add partitions to tables. Key: HIVE-6392 URL: https://issues.apache.org/jira/browse/HIVE-6392 Project: Hive Issue Type: Bug Components: Authorization Affects Versions: 0.12.0, 0.13.0 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Attachments: HIVE-6392.branch-0.12.patch, HIVE-6392.patch HDFS allows for users to be added to a supergroup (identified by the dfs.permissions.superusergroup key in hdfs-site.xml). Users in this group are allowed to modify HDFS contents regardless of the path's ogw permissions. However, Hive's StorageBasedAuthProvider disallows such a superuser from adding partitions to any table that doesn't explicitly grant write permissions to said superuser. This causes the odd scenario where the superuser writes data to a partition-directory (under the table's path), but can't register the appropriate partition. I have a patch that brings the Metastore's behaviour in line with what the HDFS allows. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Timeline for the Hive 0.13 release?
the branch is created. have changed the poms in both branches. Planning to setup a wikipage to track jiras that will get ported to 0.13 regards, Harish. On Mar 4, 2014, at 5:05 PM, Harish Butani hbut...@hortonworks.com wrote: branching now. Will be changing the pom files on trunk. Will send another email when the branch and trunk changes are in. On Mar 4, 2014, at 4:03 PM, Sushanth Sowmyan khorg...@gmail.com wrote: I have two patches still as patch-available, that have had +1s as well, but are waiting on pre-commit tests picking them up go in to 0.13: https://issues.apache.org/jira/browse/HIVE-6507 (refactor of table property names from string constants to an enum in OrcFile) https://issues.apache.org/jira/browse/HIVE-6499 (fixes bug where calls like create table and drop table can fail if metastore-side authorization is used in conjunction with custom inputformat/outputformat/serdes that are not loadable from the metastore-side)
[jira] [Commented] (HIVE-6548) Missing owner name and type fields in schema script for DBS table
[ https://issues.apache.org/jira/browse/HIVE-6548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920421#comment-13920421 ] Thejas M Nair commented on HIVE-6548: - +1 Missing owner name and type fields in schema script for DBS table -- Key: HIVE-6548 URL: https://issues.apache.org/jira/browse/HIVE-6548 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.13.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-6548.patch HIVE-6386 introduced new columns in DBS table, but those are missing from schema scripts. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-5931) SQL std auth - add metastore get_role_participants api - to support DESCRIBE ROLE
[ https://issues.apache.org/jira/browse/HIVE-5931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920423#comment-13920423 ] Ashutosh Chauhan commented on HIVE-5931: Few comments on proposed api: * Better name for 1st method : get_principals_in_role() ? * Better name for 2nd method : get_roles_granted_to_principal() ? * Also struct needs better name. Also, put explanation for struct, since it carries redundant info, depending on method it is used in. * principalType in struct should be enum SQL std auth - add metastore get_role_participants api - to support DESCRIBE ROLE - Key: HIVE-5931 URL: https://issues.apache.org/jira/browse/HIVE-5931 Project: Hive Issue Type: Sub-task Components: Authorization Reporter: Thejas M Nair Attachments: HIVE-5931.thriftapi.followup.patch, HIVE-5931.thriftapi.patch Original Estimate: 24h Remaining Estimate: 24h This is necessary for DESCRIBE ROLE role statement. This will list all users and roles that participate in a role. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6460) Need new show functionality for transactions
[ https://issues.apache.org/jira/browse/HIVE-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-6460: - Status: Patch Available (was: Open) Need new show functionality for transactions -- Key: HIVE-6460 URL: https://issues.apache.org/jira/browse/HIVE-6460 Project: Hive Issue Type: Sub-task Components: SQL Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.13.0 Attachments: 6460.wip.patch, HIVE-6460.patch With the addition of transactions and compactions for delta files some new show commands are required. * show transactions to show currently open or aborted transactions * show compactions to show currently waiting or running compactions * show locks needs to work with the new db style of locks as well. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6460) Need new show functionality for transactions
[ https://issues.apache.org/jira/browse/HIVE-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-6460: - Attachment: HIVE-6460.patch Need new show functionality for transactions -- Key: HIVE-6460 URL: https://issues.apache.org/jira/browse/HIVE-6460 Project: Hive Issue Type: Sub-task Components: SQL Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.13.0 Attachments: 6460.wip.patch, HIVE-6460.patch With the addition of transactions and compactions for delta files some new show commands are required. * show transactions to show currently open or aborted transactions * show compactions to show currently waiting or running compactions * show locks needs to work with the new db style of locks as well. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6392) Hive (and HCatalog) don't allow super-users to add partitions to tables.
[ https://issues.apache.org/jira/browse/HIVE-6392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-6392: Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Patch committed to trunk. Thanks for the contribution Mithun! Hive (and HCatalog) don't allow super-users to add partitions to tables. Key: HIVE-6392 URL: https://issues.apache.org/jira/browse/HIVE-6392 Project: Hive Issue Type: Bug Components: Authorization Affects Versions: 0.12.0, 0.13.0 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Fix For: 0.13.0 Attachments: HIVE-6392.branch-0.12.patch, HIVE-6392.patch HDFS allows for users to be added to a supergroup (identified by the dfs.permissions.superusergroup key in hdfs-site.xml). Users in this group are allowed to modify HDFS contents regardless of the path's ogw permissions. However, Hive's StorageBasedAuthProvider disallows such a superuser from adding partitions to any table that doesn't explicitly grant write permissions to said superuser. This causes the odd scenario where the superuser writes data to a partition-directory (under the table's path), but can't register the appropriate partition. I have a patch that brings the Metastore's behaviour in line with what the HDFS allows. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6549) removed templeton.jar from webhcat-default.xml
[ https://issues.apache.org/jira/browse/HIVE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920425#comment-13920425 ] Lefty Leverenz commented on HIVE-6549: -- When this gets committed, the wiki needs to be edited (with version information): * [WebHCat Configuration: Configuration Variables |https://cwiki.apache.org/confluence/display/Hive/WebHCat+Configure#WebHCatConfigure-ConfigurationVariables] The existing table shows configuration defaults for Hive 0.11.0, so they ought to be updated too. But if the only changes are 11 or 12 or 13 in file names and paths, then a note could explain that in the intro to the table. removed templeton.jar from webhcat-default.xml -- Key: HIVE-6549 URL: https://issues.apache.org/jira/browse/HIVE-6549 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Minor this property is no longer used also removed corresponding AppConfig.TEMPLETON_JAR_NAME -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6550) SemanticAnalyzer.reset() doesn't clear all the state
Laljo John Pullokkaran created HIVE-6550: Summary: SemanticAnalyzer.reset() doesn't clear all the state Key: HIVE-6550 URL: https://issues.apache.org/jira/browse/HIVE-6550 Project: Hive Issue Type: Bug Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Timeline for the Hive 0.13 release?
Tracking jiras to be applied to branch 0.13 here: https://cwiki.apache.org/confluence/display/Hive/Hive+0.13+release+status On Mar 4, 2014, at 5:45 PM, Harish Butani hbut...@hortonworks.com wrote: the branch is created. have changed the poms in both branches. Planning to setup a wikipage to track jiras that will get ported to 0.13 regards, Harish. On Mar 4, 2014, at 5:05 PM, Harish Butani hbut...@hortonworks.com wrote: branching now. Will be changing the pom files on trunk. Will send another email when the branch and trunk changes are in. On Mar 4, 2014, at 4:03 PM, Sushanth Sowmyan khorg...@gmail.com wrote: I have two patches still as patch-available, that have had +1s as well, but are waiting on pre-commit tests picking them up go in to 0.13: https://issues.apache.org/jira/browse/HIVE-6507 (refactor of table property names from string constants to an enum in OrcFile) https://issues.apache.org/jira/browse/HIVE-6499 (fixes bug where calls like create table and drop table can fail if metastore-side authorization is used in conjunction with custom inputformat/outputformat/serdes that are not loadable from the metastore-side)
[jira] [Commented] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.
[ https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920437#comment-13920437 ] Lefty Leverenz commented on HIVE-6486: -- Here's the user doc for HiveServer2 JDBC clients: * [HiveServer2 Clients: JDBC |https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-JDBC] Administration doc is here: * [Setting Up HiveServer2 |https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2] In particular: * [Setting Up HiveServer2: Authentication/Security Configuration |https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2#SettingUpHiveServer2-Authentication/SecurityConfiguration] Support secure Subject.doAs() in HiveServer2 JDBC client. - Key: HIVE-6486 URL: https://issues.apache.org/jira/browse/HIVE-6486 Project: Hive Issue Type: Improvement Components: Authentication, HiveServer2, JDBC Affects Versions: 0.11.0, 0.12.0 Reporter: Shivaraju Gowda Assignee: Shivaraju Gowda Fix For: 0.13.0 Attachments: HIVE-6486.1.patch, HIVE-6486.2.patch, Hive_011_Support-Subject_doAS.patch, TestHive_SujectDoAs.java HIVE-5155 addresses the problem of kerberos authentication in multi-user middleware server using proxy user. In this mode the principal used by the middle ware server has privileges to impersonate selected users in Hive/Hadoop. This enhancement is to support Subject.doAs() authentication in Hive JDBC layer so that the end users Kerberos Subject is passed through in the middle ware server. With this improvement there won't be any additional setup in the server to grant proxy privileges to some users and there won't be need to specify a proxy user in the JDBC client. This version should also be more secure since it won't require principals with the privileges to impersonate other users in Hive/Hadoop setup. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6433) SQL std auth - allow grant/revoke roles if user has ADMIN OPTION
[ https://issues.apache.org/jira/browse/HIVE-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920439#comment-13920439 ] Thejas M Nair commented on HIVE-6433: - I will comb through the issues and create a consolidated doc for parent HIVE-5837. This change is specific to HIVE-5837. SQL std auth - allow grant/revoke roles if user has ADMIN OPTION Key: HIVE-6433 URL: https://issues.apache.org/jira/browse/HIVE-6433 Project: Hive Issue Type: Sub-task Reporter: Thejas M Nair Assignee: Ashutosh Chauhan Fix For: 0.13.0 Attachments: HIVE-6433.1.patch, HIVE-6433.2.patch, HIVE-6433.patch Follow up jira for HIVE-5952. If a user/role has admin option on a role, then user should be able to grant /revoke other users to/from the role. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HIVE-6432) Remove deprecated methods in HCatalog
[ https://issues.apache.org/jira/browse/HIVE-6432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan reassigned HIVE-6432: -- Assignee: Sushanth Sowmyan Remove deprecated methods in HCatalog - Key: HIVE-6432 URL: https://issues.apache.org/jira/browse/HIVE-6432 Project: Hive Issue Type: Task Components: HCatalog Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan There are a lot of methods in HCatalog that have been deprecated in HCatalog 0.5, and some that were recently deprecated in Hive 0.11 (joint release with HCatalog). The goal for HCatalog deprecation is that in general, after something has been deprecated, it is expected to stay around for 2 releases, which means hive-0.13 will be the last release to ship with all the methods that were deprecated in hive-0.11 (the org.apache.hcatalog.* files should all be removed afterwards), and it is also good for us to clean out and nuke all other older deprecated methods. We should take this on early in a dev/release cycle to allow us time to resolve all fallout, so I propose that we remove all HCatalog deprecated methods after we branch out 0.13 and 0.14 becomes trunk. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6433) SQL std auth - allow grant/revoke roles if user has ADMIN OPTION
[ https://issues.apache.org/jira/browse/HIVE-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-6433: Release Note: If a user/role has admin option on a role, then user should be able to grant /revoke other users to/from the role. SQL std auth - allow grant/revoke roles if user has ADMIN OPTION Key: HIVE-6433 URL: https://issues.apache.org/jira/browse/HIVE-6433 Project: Hive Issue Type: Sub-task Reporter: Thejas M Nair Assignee: Ashutosh Chauhan Fix For: 0.13.0 Attachments: HIVE-6433.1.patch, HIVE-6433.2.patch, HIVE-6433.patch Follow up jira for HIVE-5952. If a user/role has admin option on a role, then user should be able to grant /revoke other users to/from the role. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6432) Remove deprecated methods in HCatalog
[ https://issues.apache.org/jira/browse/HIVE-6432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6432: --- Attachment: HIVE-6432.wip.1.patch Now that 0.13 has forked out, and 0.14 is trunk, it's time for mass destruction! I'm uploading a first attempt work-in-progress patch, which removes all org.apache.hcatalog entries. This is not backward-compatible, and removes the storage-handlers directory in hcat altogether. I still need to remove deprecated functions and api points in various classes. Remove deprecated methods in HCatalog - Key: HIVE-6432 URL: https://issues.apache.org/jira/browse/HIVE-6432 Project: Hive Issue Type: Task Components: HCatalog Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-6432.wip.1.patch There are a lot of methods in HCatalog that have been deprecated in HCatalog 0.5, and some that were recently deprecated in Hive 0.11 (joint release with HCatalog). The goal for HCatalog deprecation is that in general, after something has been deprecated, it is expected to stay around for 2 releases, which means hive-0.13 will be the last release to ship with all the methods that were deprecated in hive-0.11 (the org.apache.hcatalog.* files should all be removed afterwards), and it is also good for us to clean out and nuke all other older deprecated methods. We should take this on early in a dev/release cycle to allow us time to resolve all fallout, so I propose that we remove all HCatalog deprecated methods after we branch out 0.13 and 0.14 becomes trunk. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 18179: Support more generic way of using composite key for HBaseHandler
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18179/ --- (Updated March 5, 2014, 3:47 a.m.) Review request for hive. Changes --- Merged functionality of HIVE-2599 Bugs: HIVE-6411 https://issues.apache.org/jira/browse/HIVE-6411 Repository: hive-git Description --- HIVE-2599 introduced using custom object for the row key. But it forces key objects to extend HBaseCompositeKey, which is again extension of LazyStruct. If user provides proper Object and OI, we can replace internal key and keyOI with those. Initial implementation is based on factory interface. {code} public interface HBaseKeyFactory { void init(SerDeParameters parameters, Properties properties) throws SerDeException; ObjectInspector createObjectInspector(TypeInfo type) throws SerDeException; LazyObjectBase createObject(ObjectInspector inspector) throws SerDeException; } {code} Diffs (updated) - hbase-handler/pom.xml 7c3524c hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseCompositeKey.java 5008f15 hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseKeyFactory.java PRE-CREATION hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseLazyObjectFactory.java PRE-CREATION hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseScanRange.java PRE-CREATION hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java 2cd65cb hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java 29e5da5 hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseWritableKeyFactory.java PRE-CREATION hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java 704fcb9 hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseRow.java fc40195 hbase-handler/src/test/org/apache/hadoop/hive/hbase/HBaseTestCompositeKey.java 13c344b hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestHBaseKeyFactory.java PRE-CREATION hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestHBaseKeyFactory2.java PRE-CREATION 
hbase-handler/src/test/queries/positive/hbase_custom_key.q PRE-CREATION hbase-handler/src/test/queries/positive/hbase_custom_key2.q PRE-CREATION hbase-handler/src/test/results/positive/hbase_custom_key.q.out PRE-CREATION hbase-handler/src/test/results/positive/hbase_custom_key2.q.out PRE-CREATION itests/util/pom.xml 9885c53 ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 5995c14 ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java d39ee2e ql/src/java/org/apache/hadoop/hive/ql/index/IndexSearchCondition.java 5f1329c ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 647a9a6 ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStoragePredicateHandler.java 9f35575 ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java e50026b ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 10bae4d ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 40298e1 serde/src/java/org/apache/hadoop/hive/serde2/StructObject.java PRE-CREATION serde/src/java/org/apache/hadoop/hive/serde2/StructObjectBaseInspector.java PRE-CREATION serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarStructBase.java 1fd6853 serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyObject.java 10f4c05 serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyObjectBase.java 3334dff serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyStruct.java 8a1ea46 serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/LazySimpleStructObjectInspector.java 8a5386a serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryObject.java 598683f serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryStruct.java caf3517 Diff: https://reviews.apache.org/r/18179/diff/ Testing --- Thanks, Navis Ryu
[jira] [Updated] (HIVE-6411) Support more generic way of using composite key for HBaseHandler
[ https://issues.apache.org/jira/browse/HIVE-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-6411: Attachment: HIVE-6411.4.patch.txt Support more generic way of using composite key for HBaseHandler Key: HIVE-6411 URL: https://issues.apache.org/jira/browse/HIVE-6411 Project: Hive Issue Type: Improvement Components: HBase Handler Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6411.1.patch.txt, HIVE-6411.2.patch.txt, HIVE-6411.3.patch.txt, HIVE-6411.4.patch.txt HIVE-2599 introduced using custom object for the row key. But it forces key objects to extend HBaseCompositeKey, which is again extension of LazyStruct. If user provides proper Object and OI, we can replace internal key and keyOI with those. Initial implementation is based on factory interface.
{code}
public interface HBaseKeyFactory {
  void init(SerDeParameters parameters, Properties properties) throws SerDeException;
  ObjectInspector createObjectInspector(TypeInfo type) throws SerDeException;
  LazyObjectBase createObject(ObjectInspector inspector) throws SerDeException;
}
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
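The factory idea described above — let the user plug in an object that turns a raw HBase row key into struct fields — can be illustrated with a small, self-contained sketch. The `KeyFactory` and `DelimitedKeyFactory` types below are simplified stand-ins invented for illustration only; the real `HBaseKeyFactory` in the patch works with SerDe parameters and ObjectInspectors, not plain strings.

```java
import java.util.Arrays;
import java.util.List;

// Toy model of a pluggable composite-key factory (NOT the real Hive interface).
public class CompositeKeySketch {

    // Stand-in for the factory contract: parse a raw row key into struct fields.
    interface KeyFactory {
        List<String> parse(byte[] rowKey);
    }

    // One possible user implementation: key fields separated by a fixed delimiter.
    static class DelimitedKeyFactory implements KeyFactory {
        private final String delim;

        DelimitedKeyFactory(String delim) {
            this.delim = delim;
        }

        @Override
        public List<String> parse(byte[] rowKey) {
            return Arrays.asList(new String(rowKey).split(delim));
        }
    }

    public static void main(String[] args) {
        KeyFactory factory = new DelimitedKeyFactory("_");
        // A row key like "us_2014_hive" becomes three struct fields.
        System.out.println(factory.parse("us_2014_hive".getBytes())); // [us, 2014, hive]
    }
}
```

The point of the factory indirection is that the storage handler never needs to know the key layout; only the user-supplied factory does.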
[jira] [Updated] (HIVE-6455) Scalable dynamic partitioning and bucketing optimization
[ https://issues.apache.org/jira/browse/HIVE-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-6455: - Attachment: HIVE-6455.11.patch Addressed [~hagleitn]'s code review comments. This is intermediate checkin to look for precommit test failures. Scalable dynamic partitioning and bucketing optimization Key: HIVE-6455 URL: https://issues.apache.org/jira/browse/HIVE-6455 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: optimization Attachments: HIVE-6455.1.patch, HIVE-6455.1.patch, HIVE-6455.10.patch, HIVE-6455.10.patch, HIVE-6455.11.patch, HIVE-6455.2.patch, HIVE-6455.3.patch, HIVE-6455.4.patch, HIVE-6455.4.patch, HIVE-6455.5.patch, HIVE-6455.6.patch, HIVE-6455.7.patch, HIVE-6455.8.patch, HIVE-6455.9.patch, HIVE-6455.9.patch The current implementation of dynamic partition works by keeping at least one record writer open per dynamic partition directory. In case of bucketing there can be multispray file writers which further adds up to the number of open record writers. The record writers of column oriented file format (like ORC, RCFile etc.) keeps some sort of in-memory buffers (value buffer or compression buffers) open all the time to buffer up the rows and compress them before flushing it to disk. Since these buffers are maintained per column basis the amount of constant memory that will required at runtime increases as the number of partitions and number of columns per partition increases. This often leads to OutOfMemory (OOM) exception in mappers or reducers depending on the number of open record writers. Users often tune the JVM heapsize (runtime memory) to get over such OOM issues. With this optimization, the dynamic partition columns and bucketing columns (in case of bucketed tables) are sorted before being fed to the reducers. 
Since the partitioning and bucketing columns are sorted, each reducers can keep only one record writer open at any time thereby reducing the memory pressure on the reducers. This optimization is highly scalable as the number of partition and number of columns per partition increases at the cost of sorting the columns. -- This message was sent by Atlassian JIRA (v6.2#6252)
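The memory argument above — sorting rows by partition key lets each reducer hold only one open record writer — can be made concrete with a toy model. This is not Hive code: `peakOpenWriters` is a hypothetical helper that assumes a writer must stay open from a partition's first row to its last, and computes how many writers are open simultaneously at the worst moment.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of record-writer memory pressure in dynamic partition inserts.
public class WriterPressure {

    // A writer for a partition is open from that partition's first row to its
    // last row; the peak is the maximum number of overlapping open intervals.
    static int peakOpenWriters(List<String> partitionKeys) {
        Map<String, Integer> first = new HashMap<>();
        Map<String, Integer> last = new HashMap<>();
        for (int i = 0; i < partitionKeys.size(); i++) {
            first.putIfAbsent(partitionKeys.get(i), i);
            last.put(partitionKeys.get(i), i);
        }
        int peak = 0;
        for (int i = 0; i < partitionKeys.size(); i++) {
            int open = 0;
            for (String k : first.keySet()) {
                if (first.get(k) <= i && i <= last.get(k)) open++;
            }
            peak = Math.max(peak, open);
        }
        return peak;
    }

    public static void main(String[] args) {
        // Unsorted rows: every partition's interval overlaps -> 3 writers open at once.
        System.out.println(peakOpenWriters(Arrays.asList("a", "b", "c", "a", "b", "c"))); // 3
        // Rows sorted by partition key: intervals are disjoint -> 1 writer at a time.
        System.out.println(peakOpenWriters(Arrays.asList("a", "a", "b", "b", "c", "c"))); // 1
    }
}
```

With unsorted input the peak grows with the number of partitions (and, for columnar formats, with columns per partition); with sorted input it stays constant, which is exactly the OOM relief the jira describes.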
[jira] [Commented] (HIVE-6541) Need to write documentation for ACID work
[ https://issues.apache.org/jira/browse/HIVE-6541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920456#comment-13920456 ] Lefty Leverenz commented on HIVE-6541: -- bq. Should I just post it in here in text format Sounds good to me. Is it pretty much the same as InsertUpdatesinHive.pdf (attached to HIVE-5317)? Need to write documentation for ACID work - Key: HIVE-6541 URL: https://issues.apache.org/jira/browse/HIVE-6541 Project: Hive Issue Type: Sub-task Components: Documentation Affects Versions: 0.13.0 Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.13.0 ACID introduces a number of new config file options, tables in the metastore, keywords in the grammar, and a new interface for use of tools like storm and flume. These need to be documented. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-4293) Predicates following UDTF operator are removed by PPD
[ https://issues.apache.org/jira/browse/HIVE-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-4293: Attachment: HIVE-4293.11.patch.txt Predicates following UDTF operator are removed by PPD - Key: HIVE-4293 URL: https://issues.apache.org/jira/browse/HIVE-4293 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Critical Attachments: D9933.6.patch, HIVE-4293.10.patch, HIVE-4293.11.patch.txt, HIVE-4293.7.patch.txt, HIVE-4293.8.patch.txt, HIVE-4293.9.patch.txt, HIVE-4293.D9933.1.patch, HIVE-4293.D9933.2.patch, HIVE-4293.D9933.3.patch, HIVE-4293.D9933.4.patch, HIVE-4293.D9933.5.patch For example, {noformat} explain SELECT value from ( select explode(array(key, value)) as (value) from ( select * FROM src WHERE key 200 ) A ) B WHERE value 300 ; {noformat} Makes plan like this, removing last predicates {noformat} TableScan alias: src Filter Operator predicate: expr: (key 200.0) type: boolean Select Operator expressions: expr: array(key,value) type: arraystring outputColumnNames: _col0 UDTF Operator function name: explode Select Operator expressions: expr: col type: string outputColumnNames: _col0 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-4293) Predicates following UDTF operator are removed by PPD
[ https://issues.apache.org/jira/browse/HIVE-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920459#comment-13920459 ] Navis commented on HIVE-4293: - Merged your patch with partial fix in HIVE-4598. Let's see the test result. Predicates following UDTF operator are removed by PPD - Key: HIVE-4293 URL: https://issues.apache.org/jira/browse/HIVE-4293 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Critical Attachments: D9933.6.patch, HIVE-4293.10.patch, HIVE-4293.11.patch.txt, HIVE-4293.7.patch.txt, HIVE-4293.8.patch.txt, HIVE-4293.9.patch.txt, HIVE-4293.D9933.1.patch, HIVE-4293.D9933.2.patch, HIVE-4293.D9933.3.patch, HIVE-4293.D9933.4.patch, HIVE-4293.D9933.5.patch For example, {noformat} explain SELECT value from ( select explode(array(key, value)) as (value) from ( select * FROM src WHERE key 200 ) A ) B WHERE value 300 ; {noformat} Makes plan like this, removing last predicates {noformat} TableScan alias: src Filter Operator predicate: expr: (key 200.0) type: boolean Select Operator expressions: expr: array(key,value) type: arraystring outputColumnNames: _col0 UDTF Operator function name: explode Select Operator expressions: expr: col type: string outputColumnNames: _col0 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6549) removed templeton.jar from webhcat-default.xml
[ https://issues.apache.org/jira/browse/HIVE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920461#comment-13920461 ] Eugene Koifman commented on HIVE-6549: -- I'm not sure it's useful to maintain Configuration Variables section. Each variable is/should be documented in webhcat-default.xml (there is a special 'description' xml element there for it). Copying it to wiki only adds maintenance effort. The rest of the page is useful. removed templeton.jar from webhcat-default.xml -- Key: HIVE-6549 URL: https://issues.apache.org/jira/browse/HIVE-6549 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Minor this property is no longer used also removed corresponding AppConfig.TEMPLETON_JAR_NAME -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 16281: Predicates following UDTF operator are removed by PPD
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/16281/ --- (Updated March 5, 2014, 4:07 a.m.) Review request for hive. Changes --- + Works of Harish + Partial fix in HIVE-4598 Bugs: HIVE-4293 https://issues.apache.org/jira/browse/HIVE-4293 Repository: hive-git Description --- For example, {noformat} explain SELECT value from ( select explode(array(key, value)) as (value) from ( select * FROM src WHERE key 200 ) A ) B WHERE value 300 ; {noformat} Makes plan like this, removing last predicates {noformat} TableScan alias: src Filter Operator predicate: expr: (key 200.0) type: boolean Select Operator expressions: expr: array(key,value) type: arraystring outputColumnNames: _col0 UDTF Operator function name: explode Select Operator expressions: expr: col type: string outputColumnNames: _col0 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat {noformat} Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/exec/LateralViewJoinOperator.java 2fbb81b ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java c378dc7 ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java 326654f ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java 0798470 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 89d2a9c ql/src/java/org/apache/hadoop/hive/ql/plan/LateralViewJoinDesc.java ebfcfc8 ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerInfo.java 6a3dd99 ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 40298e1 ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicatePushDown.java cd5ae51 ql/src/test/queries/clientpositive/lateral_view_ppd.q 7be86a6 ql/src/test/queries/clientpositive/ppd_join4.q PRE-CREATION ql/src/test/queries/clientpositive/ppd_transform.q 65a498d ql/src/test/queries/clientpositive/ppd_udtf.q PRE-CREATION 
ql/src/test/results/clientpositive/cluster.q.out 0cd0886 ql/src/test/results/clientpositive/ctas_colname.q.out 3d568ab ql/src/test/results/clientpositive/lateral_view_ppd.q.out da77f75 ql/src/test/results/clientpositive/ppd2.q.out 2f2c558 ql/src/test/results/clientpositive/ppd_gby.q.out 68092e0 ql/src/test/results/clientpositive/ppd_gby2.q.out a8ccace ql/src/test/results/clientpositive/ppd_join4.q.out PRE-CREATION ql/src/test/results/clientpositive/ppd_transform.q.out e7c07ed ql/src/test/results/clientpositive/ppd_udtf.q.out PRE-CREATION ql/src/test/results/clientpositive/udtf_json_tuple.q.out f151740 ql/src/test/results/clientpositive/udtf_parse_url_tuple.q.out 74d9e96 ql/src/test/results/compiler/plan/join1.q.xml 12b01ce ql/src/test/results/compiler/plan/join2.q.xml ed5bbb8 ql/src/test/results/compiler/plan/join3.q.xml 5437afa ql/src/test/results/compiler/plan/join4.q.xml aa69ada ql/src/test/results/compiler/plan/join5.q.xml ef0c69d ql/src/test/results/compiler/plan/join6.q.xml da528f5 ql/src/test/results/compiler/plan/join7.q.xml fcacc6d ql/src/test/results/compiler/plan/join8.q.xml c7591a4 Diff: https://reviews.apache.org/r/16281/diff/ Testing --- Thanks, Navis Ryu
[jira] [Commented] (HIVE-5761) Implement vectorized support for the DATE data type
[ https://issues.apache.org/jira/browse/HIVE-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920464#comment-13920464 ] Lefty Leverenz commented on HIVE-5761: -- Does this need any user documentation? Implement vectorized support for the DATE data type --- Key: HIVE-5761 URL: https://issues.apache.org/jira/browse/HIVE-5761 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Teddy Choi Attachments: HIVE-5761.1.patch, HIVE-5761.2.patch, HIVE-5761.3.patch, HIVE-5761.4.patch, HIVE-5761.5.patch, HIVE-5761.6.patch, HIVE-5761.6.patch Add support to allow queries referencing DATE columns and expression results to run efficiently in vectorized mode. This should re-use the code for the the integer/timestamp types to the extent possible and beneficial. Include unit tests and end-to-end tests. Consider re-using or extending existing end-to-end tests for vectorized integer and/or timestamp operations. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6549) removed templeton.jar from webhcat-default.xml
[ https://issues.apache.org/jira/browse/HIVE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920483#comment-13920483 ] Lefty Leverenz commented on HIVE-6549: -- Readability is the main advantage of putting config variables in the wiki. Some readers might also like seeing all the variables along with general configuration information, without having to hunt for webhcat-default.xml. But you're right about the maintenance problem. I'd say go ahead and remove the table from the wiki, but perhaps we need a few more opinions. removed templeton.jar from webhcat-default.xml -- Key: HIVE-6549 URL: https://issues.apache.org/jira/browse/HIVE-6549 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Minor this property is no longer used also removed corresponding AppConfig.TEMPLETON_JAR_NAME -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-5888) group by after join operation product no result when hive.optimize.skewjoin = true
[ https://issues.apache.org/jira/browse/HIVE-5888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920493#comment-13920493 ] Navis commented on HIVE-5888: - I believe this is fixed by HIVE-6041, which is included in hive-0.13.0. There remains a minor issue in explain result. But it makes valid result now. group by after join operation product no result when hive.optimize.skewjoin = true Key: HIVE-5888 URL: https://issues.apache.org/jira/browse/HIVE-5888 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: cyril liao -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6290) Add support for hbase filters for composite keys
[ https://issues.apache.org/jira/browse/HIVE-6290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920496#comment-13920496 ] Navis commented on HIVE-6290: - I've regarded that as a following issue but I've merged your patch into HIVE-6411. Add support for hbase filters for composite keys Key: HIVE-6290 URL: https://issues.apache.org/jira/browse/HIVE-6290 Project: Hive Issue Type: Sub-task Components: HBase Handler Affects Versions: 0.12.0 Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni Attachments: HIVE-6290.1.patch.txt, HIVE-6290.2.patch.txt, HIVE-6290.3.patch.txt Add support for filters to be provided via the composite key class -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6520) Skew Join optimization doesn't work if parent gets converted to MapJoin task
[ https://issues.apache.org/jira/browse/HIVE-6520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920505#comment-13920505 ] Navis commented on HIVE-6520: - MapJoinOperator cannot handle skew join, which requires knowing the total row count for each join key. We could disable converting to MapJoin when it is for a skew join, but if the join can be converted to MapJoin, that would be faster than handling it via the classical skew join path. Skew Join optimization doesn't work if parent gets converted to MapJoin task Key: HIVE-6520 URL: https://issues.apache.org/jira/browse/HIVE-6520 Project: Hive Issue Type: Bug Affects Versions: 0.11.0 Reporter: Ankit Kamboj Skew join optimization (GenMRSkewJoinProcessor.java) assumes that its parent stage (that will create directory structure for skewed keys) will have a Reduce Join Operator. GenMRSkewJoinProcessor sets the handleSkewJoin flag only in that case. But it is possible that parent stage gets converted to MapJoin task (because of hive.auto.convert.join flag). In that case handleSkewJoin is not set for parent stage and it will not create directory structure for skewed keys in hdfs. This eventually leads to elimination of skew join conditional task (and its children) because the conditional task is not able to find the skewed key directories. Shouldn't the MapJoinOperator also handle skew join and create directory structure for skewed keys in addition to performing map join for the non-skewed keys? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6519) Allow optional as in subquery definition
[ https://issues.apache.org/jira/browse/HIVE-6519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920506#comment-13920506 ] Lefty Leverenz commented on HIVE-6519: -- Documented in the wiki here: * [SubQueries: Subqueries in the FROM Clause |https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SubQueries#LanguageManualSubQueries-SubqueriesintheFROMClause] by adding a second line of syntax: {code} SELECT ... FROM (subquery) name ... SELECT ... FROM (subquery) AS name ... (Note: Only valid starting with Hive 0.13.0) {code} and this text: bq. The optional keyword AS can be included before the subquery name in Hive 0.13.0 and later versions (HIVE-6519). Allow optional as in subquery definition -- Key: HIVE-6519 URL: https://issues.apache.org/jira/browse/HIVE-6519 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Priority: Minor Fix For: 0.13.0 Attachments: HIVE-6519.1.patch Allow both: select * from (select * from foo) bar select * from (select * from foo) as bar -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6551) group by after join with skew join optimization references invalid task sometimes
Navis created HIVE-6551: --- Summary: group by after join with skew join optimization references invalid task sometimes Key: HIVE-6551 URL: https://issues.apache.org/jira/browse/HIVE-6551 Project: Hive Issue Type: Bug Reporter: Navis Assignee: Navis Priority: Trivial For example, {noformat} hive set hive.auto.convert.join = true; hive set hive.optimize.skewjoin = true; hive set hive.skewjoin.key = 3; hive EXPLAIN FROM (SELECT src.* FROM src) x JOIN (SELECT src.* FROM src) Y ON (x.key = Y.key) SELECT sum(hash(Y.key)), sum(hash(Y.value)); OK STAGE DEPENDENCIES: Stage-8 is a root stage Stage-6 depends on stages: Stage-8 Stage-5 depends on stages: Stage-6 , consists of Stage-4, Stage-2 Stage-4 Stage-2 depends on stages: Stage-4, Stage-1 Stage-0 is a root stage ... {noformat} Stage-2 references not-existing Stage-1 -- This message was sent by Atlassian JIRA (v6.2#6252)
Getting difficulty in the way to work on HiveQL (Appache Hadoop, Big Data Analytic Platform)
Dear Sir, With due respect I would like to mention that I, Gaurav Kumar, am a M.Tech. student in Computer Science at Jawaharlal Nehru University, New Delhi. I am currently working on my dissertation titled 'Optimization of SQL query on Hive Platform (Apache Hive)'. I am unable to carry forward my work properly due to lack of valuable guidance and experience, leading to enormously increasing stress. As such, it would be very helpful if you would provide me precious and constructive suggestions/guidance on my dissertation work. Kindly please spare a few moments of your time to help me with your valuable knowledge and experience garnered while working on this subject. Hope to get a positive reply at the earliest. Thanking You. Yours sincerely Gaurav Kumar gauravsp1...@yahoo.com
[jira] [Updated] (HIVE-6060) Define API for RecordUpdater and UpdateReader
[ https://issues.apache.org/jira/browse/HIVE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-6060: Attachment: h-6060.patch This patch puts everything together: * Defines AcidInputFormat and AcidOutputFormat. * Extends OrcInputFormat and OrcOutputFormat to implement them. * Creates AcidUtils to figure out which base and deltas need to be read. * Provides raw interfaces that the compactor uses to re-write small files. * Moves ValidTxnList and ValidTxnListImpl to common where they can be used by code in mapreduce tasks and the metastore. * Adds an interface to Orc Writers that provides callbacks when stripes are being written. * Adds a method to Orc Writers that allow the client to write the current stripe to disk and writes a temporary footer before the writer continues to write new stripes. Define API for RecordUpdater and UpdateReader - Key: HIVE-6060 URL: https://issues.apache.org/jira/browse/HIVE-6060 Project: Hive Issue Type: Sub-task Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: acid-io.patch, h-5317.patch, h-5317.patch, h-5317.patch, h-6060.patch, h-6060.patch We need to define some new APIs for how Hive interacts with the file formats since it needs to be much richer than the current RecordReader and RecordWriter. -- This message was sent by Atlassian JIRA (v6.2#6252)
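The AcidUtils responsibility described in the patch summary — figuring out which base and deltas need to be read — can be sketched roughly. The `base_N` / `delta_lo_hi` directory-name convention and the `filesToRead` helper below are assumptions made for illustration (a base holding all rows through transaction N, and deltas holding transactions lo..hi), not the patch's actual code or on-disk format.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hedged sketch of base/delta selection for an ACID read (assumed naming scheme).
public class AcidDirSketch {

    // Pick the newest base, then every delta whose transactions are not
    // already compacted into that base.
    static List<String> filesToRead(List<String> dirs) {
        long bestBase = -1;
        for (String d : dirs) {
            if (d.startsWith("base_")) {
                bestBase = Math.max(bestBase, Long.parseLong(d.substring(5)));
            }
        }
        List<String> result = new ArrayList<>();
        if (bestBase >= 0) result.add("base_" + bestBase);
        for (String d : dirs) {
            if (d.startsWith("delta_")) {
                String[] parts = d.split("_");
                long lo = Long.parseLong(parts[1]);
                if (lo > bestBase) result.add(d); // not yet covered by the base
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList("base_5", "base_10",
                "delta_6_10", "delta_11_15", "delta_16_20");
        System.out.println(filesToRead(dirs));
        // -> [base_10, delta_11_15, delta_16_20]
    }
}
```

This also suggests why the raw compactor interfaces mentioned in the patch exist: once deltas are merged into a new base, older bases and covered deltas can simply be dropped from the read set.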
[jira] [Updated] (HIVE-6060) Define API for RecordUpdater and UpdateReader
[ https://issues.apache.org/jira/browse/HIVE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-6060: Status: Patch Available (was: Open) Define API for RecordUpdater and UpdateReader - Key: HIVE-6060 URL: https://issues.apache.org/jira/browse/HIVE-6060 Project: Hive Issue Type: Sub-task Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: acid-io.patch, h-5317.patch, h-5317.patch, h-5317.patch, h-6060.patch, h-6060.patch We need to define some new APIs for how Hive interacts with the file formats since it needs to be much richer than the current RecordReader and RecordWriter.
Re: Getting difficulty in the way to work on HiveQL (Appache Hadoop, Big Data Analytic Platform)
Hey Gaurav, what is the problem you are facing? On 5 Mar 2014 10:41, Gaurav Kumar gauravsp1...@yahoo.com wrote: [original message quoted above]
[jira] [Commented] (HIVE-6492) limit partition number involved in a table scan
[ https://issues.apache.org/jira/browse/HIVE-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920579#comment-13920579 ] Hive QA commented on HIVE-6492:
---
{color:red}Overall{color}: -1, at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12632689/HIVE-6492.5.patch.txt

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5358 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin6
org.apache.hive.beeline.TestSchemaTool.testSchemaInit
org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade
{noformat}

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1623/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1623/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated. ATTACHMENT ID: 12632689

limit partition number involved in a table scan --- Key: HIVE-6492 URL: https://issues.apache.org/jira/browse/HIVE-6492 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.12.0 Reporter: Selina Zhang Fix For: 0.13.0 Attachments: HIVE-6492.1.patch.txt, HIVE-6492.2.patch.txt, HIVE-6492.3.patch.txt, HIVE-6492.4.patch.txt, HIVE-6492.4.patch_suggestion, HIVE-6492.5.patch.txt Original Estimate: 24h Remaining Estimate: 24h To protect the cluster, a new configuration variable hive.limit.query.max.table.partition is added to limit the number of table partitions involved in a table scan. The default value is -1, which means there is no limit. This variable does not affect metadata-only queries.
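The guard that hive.limit.query.max.table.partition describes might look roughly like this. This is an illustrative sketch only, not Hive's implementation: the function name and the error message are invented; only the semantics (a per-scan partition cap, with -1 meaning unlimited) come from the issue description above.

```python
# Illustrative sketch of the per-scan partition cap described above.
# A value of -1 (the stated default) disables the check entirely.
def check_partition_limit(num_partitions, max_partitions=-1):
    if max_partitions >= 0 and num_partitions > max_partitions:
        raise ValueError(
            f"Query scans {num_partitions} partitions, exceeding the "
            f"configured limit of {max_partitions}"
        )
```

The planner would run such a check per table scan after partition pruning, so that pruned-away partitions do not count against the limit.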
[jira] [Updated] (HIVE-6551) group by after join with skew join optimization references invalid task sometimes
[ https://issues.apache.org/jira/browse/HIVE-6551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-6551: Attachment: HIVE-6551.1.patch.txt

group by after join with skew join optimization references invalid task sometimes - Key: HIVE-6551 URL: https://issues.apache.org/jira/browse/HIVE-6551 Project: Hive Issue Type: Bug Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-6551.1.patch.txt

For example:
{noformat}
hive> set hive.auto.convert.join = true;
hive> set hive.optimize.skewjoin = true;
hive> set hive.skewjoin.key = 3;
hive> EXPLAIN
    > FROM (SELECT src.* FROM src) x JOIN (SELECT src.* FROM src) Y ON (x.key = Y.key)
    > SELECT sum(hash(Y.key)), sum(hash(Y.value));
OK
STAGE DEPENDENCIES:
  Stage-8 is a root stage
  Stage-6 depends on stages: Stage-8
  Stage-5 depends on stages: Stage-6 , consists of Stage-4, Stage-2
  Stage-4
  Stage-2 depends on stages: Stage-4, Stage-1
  Stage-0 is a root stage
...
{noformat}
Stage-2 references the non-existent Stage-1.
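The bug above amounts to a stage depending on a stage that was never defined in the plan. A validation pass over such a dependency listing could be sketched as follows; this is illustrative only, not Hive's planner code, and the dictionary encoding of the EXPLAIN output is an assumption.

```python
# Illustrative sketch: flag any stage whose dependency list names a stage
# never defined in the plan (like Stage-2 -> Stage-1 in the EXPLAIN above).
def missing_dependencies(deps):
    defined = set(deps)
    return {
        stage: sorted(set(parents) - defined)
        for stage, parents in deps.items()
        if set(parents) - defined
    }

# The stage graph from the EXPLAIN output above, as {stage: [dependencies]}.
plan = {
    "Stage-8": [],
    "Stage-6": ["Stage-8"],
    "Stage-5": ["Stage-6"],
    "Stage-4": [],
    "Stage-2": ["Stage-4", "Stage-1"],  # Stage-1 is never defined
    "Stage-0": [],
}
```

Running such a check over the plan for this query would report Stage-2's dangling reference to Stage-1, which is exactly the symptom the issue describes.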