[jira] [Updated] (HIVE-10165) Improve hive-hcatalog-streaming extensibility and support updates and deletes.
[ https://issues.apache.org/jira/browse/HIVE-10165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Elliot West updated HIVE-10165:
---
Attachment: HIVE-10165.9.patch

Improve hive-hcatalog-streaming extensibility and support updates and deletes.
--
Key: HIVE-10165
URL: https://issues.apache.org/jira/browse/HIVE-10165
Project: Hive
Issue Type: Improvement
Components: HCatalog
Affects Versions: 1.2.0
Reporter: Elliot West
Assignee: Elliot West
Labels: streaming_api
Attachments: HIVE-10165.0.patch, HIVE-10165.4.patch, HIVE-10165.5.patch, HIVE-10165.6.patch, HIVE-10165.7.patch, HIVE-10165.9.patch, mutate-system-overview.png

h3. Overview

I'd like to extend the [hive-hcatalog-streaming|https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest] API so that it also supports the writing of record updates and deletes in addition to the already supported inserts.

h3. Motivation

We have many Hadoop processes outside of Hive that merge changed facts into existing datasets. Traditionally we achieve this by reading in a ground-truth dataset and a modified dataset, grouping by a key, sorting by a sequence, and then applying a function to determine inserted, updated, and deleted rows. However, in our current scheme we must rewrite all partitions that may potentially contain changes. In practice the number of mutated records is very small compared with the number of records contained in a partition. This approach results in a number of operational issues:
* An excessive amount of write activity is required for small data changes.
* Downstream applications cannot robustly read these datasets while they are being updated.
* Due to the scale of the updates (hundreds of partitions), the scope for contention is high.

I believe we can address this problem by instead writing only the changed records to a Hive transactional table. This should drastically reduce the amount of data that we need to write and also provide a means for managing concurrent access to the data. Our existing merge processes can read and retain each record's {{ROW_ID}}/{{RecordIdentifier}} and pass this through to an updated form of the hive-hcatalog-streaming API, which will then have the required data to perform an update or insert in a transactional manner.

h3. Benefits

* Enables the creation of large-scale dataset merge processes.
* Opens up Hive transactional functionality in an accessible manner to processes that operate outside of Hive.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
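The classification step described above (group by key, then decide whether each row is an insert, update, or delete) can be sketched as follows. All class and method names here are illustrative only; this is not part of any Hive or HCatalog API, and real change-data merges would also carry the retained {{RecordIdentifier}} alongside each key:

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class MergeClassifier {
    enum Op { INSERT, UPDATE, DELETE }

    // Keys present only in the modified set are INSERTs, keys in both sets are
    // UPDATEs, and ground-truth keys absent from the modified set are DELETEs.
    static Map<String, Op> classify(Set<String> groundTruthKeys, Set<String> modifiedKeys) {
        Map<String, Op> ops = new TreeMap<>();
        for (String key : modifiedKeys) {
            ops.put(key, groundTruthKeys.contains(key) ? Op.UPDATE : Op.INSERT);
        }
        for (String key : groundTruthKeys) {
            if (!modifiedKeys.contains(key)) {
                ops.put(key, Op.DELETE);
            }
        }
        return ops;
    }

    public static void main(String[] args) {
        // "a" only in ground truth, "b" in both, "c" only in the modified set.
        System.out.println(classify(Set.of("a", "b"), Set.of("b", "c")));
        // prints {a=DELETE, b=UPDATE, c=INSERT}
    }
}
```

Only the rows classified here would then be written to the transactional table, instead of rewriting every affected partition.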
[jira] [Commented] (HIVE-11062) Remove Exception stacktrace from Log.info when ACL is not supported.
[ https://issues.apache.org/jira/browse/HIVE-11062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14596300#comment-14596300 ]

Hive QA commented on HIVE-11062:

{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12741052/HIVE-11062.1.patch

{color:green}SUCCESS:{color} +1 9013 tests passed

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4335/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4335/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4335/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12741052 - PreCommit-HIVE-TRUNK-Build

Remove Exception stacktrace from Log.info when ACL is not supported.
--
Key: HIVE-11062
URL: https://issues.apache.org/jira/browse/HIVE-11062
Project: Hive
Issue Type: Bug
Components: Logging
Affects Versions: 1.1.0
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen
Priority: Minor
Attachments: HIVE-11062.1.patch

When logging is set to INFO, Extended ACL is enabled, and the file system does not support ACLs, there are a lot of exception stack traces in the log file. Although the condition is benign, it can easily frustrate users. We should log the exception at DEBUG level instead.
Currently, the exception in the log looks like:
{noformat}
2015-06-19 05:09:59,376 INFO org.apache.hadoop.hive.shims.HadoopShimsSecure: Skipping ACL inheritance: File system for path s3a://yibing/hive does not support ACLs but dfs.namenode.acls.enabled is set to true: java.lang.UnsupportedOperationException: S3AFileSystem doesn't support getAclStatus
java.lang.UnsupportedOperationException: S3AFileSystem doesn't support getAclStatus
    at org.apache.hadoop.fs.FileSystem.getAclStatus(FileSystem.java:2429)
    at org.apache.hadoop.hive.shims.Hadoop23Shims.getFullFileStatus(Hadoop23Shims.java:729)
    at org.apache.hadoop.hive.ql.metadata.Hive.inheritFromTable(Hive.java:2786)
    at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:2694)
    at org.apache.hadoop.hive.ql.metadata.Table.replaceFiles(Table.java:640)
    at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1587)
    at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:297)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1638)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1397)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1181)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1047)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1042)
    at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:145)
    at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:70)
    at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:197)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:209)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
{noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
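A minimal sketch of the change being proposed, written here with java.util.logging rather than Hive's actual logging facade (class and method names are illustrative assumptions): the one-line summary stays at INFO, while the full stack trace is demoted to the debug level so it no longer floods INFO-level logs.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class AclLogging {
    private static final Logger LOG = Logger.getLogger("shims");

    // Builds the single INFO line; the stack trace is no longer part of it.
    static String infoLine(Exception e, String path) {
        return "Skipping ACL inheritance: file system for path " + path
                + " does not support ACLs: " + e.getMessage();
    }

    static void logAclFailure(Exception e, String path) {
        LOG.info(infoLine(e, path));
        // The full stack trace only appears when debug-level logging is on.
        LOG.log(Level.FINE, "getAclStatus failed", e);
    }

    public static void main(String[] args) {
        logAclFailure(new UnsupportedOperationException(
                "S3AFileSystem doesn't support getAclStatus"), "s3a://bucket/warehouse");
    }
}
```

With this shape, an operator running at INFO sees one line per skipped path instead of a 26-frame trace.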
[jira] [Commented] (HIVE-10594) Remote Spark client doesn't use Kerberos keytab to authenticate [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14596295#comment-14596295 ]

Chao Sun commented on HIVE-10594:
-
+1

Remote Spark client doesn't use Kerberos keytab to authenticate [Spark Branch]
--
Key: HIVE-10594
URL: https://issues.apache.org/jira/browse/HIVE-10594
Project: Hive
Issue Type: Sub-task
Components: Spark
Affects Versions: 1.1.0
Reporter: Chao Sun
Assignee: Xuefu Zhang
Attachments: HIVE-10594.1-spark.patch

Reporting a problem found by one of the HoS users. Currently, if a user is running Beeline on a different host than HS2 and didn't do kinit on the HS2 host, then they may get the following error:
{code}
2015-04-29 15:49:34,614 INFO org.apache.hive.spark.client.SparkClientImpl: 15/04/29 15:49:34 WARN UserGroupInformation: PriviledgedActionException as:hive (auth:KERBEROS) cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
2015-04-29 15:49:34,652 INFO org.apache.hive.spark.client.SparkClientImpl: Exception in thread main java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: secure-hos-1.ent.cloudera.com/10.20.77.79; destination host is: secure-hos-1.ent.cloudera.com:8032;
2015-04-29 15:49:34,653 INFO org.apache.hive.spark.client.SparkClientImpl: at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
2015-04-29 15:49:34,653 INFO org.apache.hive.spark.client.SparkClientImpl: at org.apache.hadoop.ipc.Client.call(Client.java:1472)
2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl: at org.apache.hadoop.ipc.Client.call(Client.java:1399)
2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl: at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl: at com.sun.proxy.$Proxy11.getClusterMetrics(Unknown Source)
2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl: at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:202)
2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl: at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl: at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl: at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl: at java.lang.reflect.Method.invoke(Method.java:606)
2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl: at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: at com.sun.proxy.$Proxy12.getClusterMetrics(Unknown Source)
2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getYarnClusterMetrics(YarnClientImpl.java:461)
2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:91)
2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:91)
2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: at org.apache.spark.Logging$class.logInfo(Logging.scala:59)
2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: at org.apache.spark.deploy.yarn.Client.logInfo(Client.scala:49)
2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:90)
2015-04-29 15:49:34,658 INFO org.apache.hive.spark.client.SparkClientImpl: at
[jira] [Updated] (HIVE-11073) ORC FileDump utility ignores errors when writing output
[ https://issues.apache.org/jira/browse/HIVE-11073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Elliot West updated HIVE-11073:
---
Attachment: HIVE-11073.1.patch

ORC FileDump utility ignores errors when writing output
---
Key: HIVE-11073
URL: https://issues.apache.org/jira/browse/HIVE-11073
Project: Hive
Issue Type: Bug
Components: Hive
Affects Versions: 1.2.0
Reporter: Elliot West
Assignee: Elliot West
Priority: Minor
Labels: cli, orc
Attachments: HIVE-11073.1.patch

The Hive command line provides the {{--orcfiledump}} utility for dumping data contained within ORC files, specifically when using the {{-d}} option. Generally, it is useful to be able to pipe the extracted data into other commands and utilities to transform and control it so that it is more manageable for the CLI user. A classic example is {{less}}. When such command pipelines are constructed today, the underlying implementation in {{org.apache.hadoop.hive.ql.io.orc.FileDump#printJsonData}} is oblivious to errors occurring when writing to its output stream. Such errors are commonplace when a user issues {{Ctrl+C}} to kill the leaf process. In this event the leaf process terminates immediately, but the Hive CLI process continues to execute until the full contents of the ORC file have been read. By making {{FileDump}} considerate of output stream errors, the process will terminate as soon as the destination process exits (i.e. when the user kills {{less}}) and control will be returned to the user as expected.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
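One possible shape for the fix, sketched with hypothetical names (the actual patch may differ): {{java.io.PrintStream}} swallows write failures and records them internally, so a dump loop can poll {{checkError()}} after each row and stop as soon as the pipe breaks, instead of scanning the whole file.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.util.List;

public class DumpLoop {
    // Prints rows until the sink reports an error; returns the number of rows
    // successfully written. PrintStream.checkError() returns true once any
    // IOException has occurred on the underlying stream (e.g. a broken pipe
    // after the user quits `less`), which is our cue to stop reading.
    static int printAll(Iterable<String> rows, PrintStream out) {
        int written = 0;
        for (String row : rows) {
            out.println(row);
            if (out.checkError()) {
                break;
            }
            written++;
        }
        return written;
    }

    public static void main(String[] args) {
        // An in-memory sink never fails, so all rows are written.
        PrintStream sink = new PrintStream(new ByteArrayOutputStream());
        System.out.println(printAll(List.of("a", "b", "c"), sink)); // prints 3
    }
}
```

The same polling pattern lets control return to the user immediately after the destination process exits, which is exactly the behaviour the issue asks for.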
[jira] [Commented] (HIVE-11073) ORC FileDump utility ignores errors when writing output
[ https://issues.apache.org/jira/browse/HIVE-11073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14596292#comment-14596292 ]

Alan Gates commented on HIVE-11073:
---
+1

ORC FileDump utility ignores errors when writing output
---
Key: HIVE-11073
URL: https://issues.apache.org/jira/browse/HIVE-11073
Project: Hive
Issue Type: Bug
Components: Hive
Affects Versions: 1.2.0
Reporter: Elliot West
Assignee: Elliot West
Priority: Minor
Labels: cli, orc
Attachments: HIVE-11073.1.patch

The Hive command line provides the {{--orcfiledump}} utility for dumping data contained within ORC files, specifically when using the {{-d}} option. Generally, it is useful to be able to pipe the extracted data into other commands and utilities to transform and control it so that it is more manageable for the CLI user. A classic example is {{less}}. When such command pipelines are constructed today, the underlying implementation in {{org.apache.hadoop.hive.ql.io.orc.FileDump#printJsonData}} is oblivious to errors occurring when writing to its output stream. Such errors are commonplace when a user issues {{Ctrl+C}} to kill the leaf process. In this event the leaf process terminates immediately, but the Hive CLI process continues to execute until the full contents of the ORC file have been read. By making {{FileDump}} considerate of output stream errors, the process will terminate as soon as the destination process exits (i.e. when the user kills {{less}}) and control will be returned to the user as expected.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HIVE-11062) Remove Exception stacktrace from Log.info when ACL is not supported.
[ https://issues.apache.org/jira/browse/HIVE-11062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yongzhi Chen updated HIVE-11062:
Attachment: HIVE-11062.1.patch

Remove Exception stacktrace from Log.info when ACL is not supported.
--
Key: HIVE-11062
URL: https://issues.apache.org/jira/browse/HIVE-11062
Project: Hive
Issue Type: Bug
Components: Logging
Affects Versions: 1.1.0
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen
Priority: Minor
Attachments: HIVE-11062.1.patch

When logging is set to INFO, Extended ACL is enabled, and the file system does not support ACLs, there are a lot of exception stack traces in the log file. Although the condition is benign, it can easily frustrate users. We should log the exception at DEBUG level instead. Currently, the exception in the log looks like:
{noformat}
2015-06-19 05:09:59,376 INFO org.apache.hadoop.hive.shims.HadoopShimsSecure: Skipping ACL inheritance: File system for path s3a://yibing/hive does not support ACLs but dfs.namenode.acls.enabled is set to true: java.lang.UnsupportedOperationException: S3AFileSystem doesn't support getAclStatus
java.lang.UnsupportedOperationException: S3AFileSystem doesn't support getAclStatus
    at org.apache.hadoop.fs.FileSystem.getAclStatus(FileSystem.java:2429)
    at org.apache.hadoop.hive.shims.Hadoop23Shims.getFullFileStatus(Hadoop23Shims.java:729)
    at org.apache.hadoop.hive.ql.metadata.Hive.inheritFromTable(Hive.java:2786)
    at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:2694)
    at org.apache.hadoop.hive.ql.metadata.Table.replaceFiles(Table.java:640)
    at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1587)
    at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:297)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1638)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1397)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1181)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1047)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1042)
    at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:145)
    at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:70)
    at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:197)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:209)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
{noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HIVE-11073) ORC FileDump utility ignores errors when writing output
[ https://issues.apache.org/jira/browse/HIVE-11073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Elliot West updated HIVE-11073:
---
Summary: ORC FileDump utility ignores errors when writing output (was: ORC FileDump utility ignore errors when writing output)

ORC FileDump utility ignores errors when writing output
---
Key: HIVE-11073
URL: https://issues.apache.org/jira/browse/HIVE-11073
Project: Hive
Issue Type: Bug
Components: Hive
Affects Versions: 1.2.0
Reporter: Elliot West
Assignee: Elliot West
Priority: Minor
Labels: cli, orc

The Hive command line provides the {{--orcfiledump}} utility for dumping data contained within ORC files, specifically when using the {{-d}} option. Generally, it is useful to be able to pipe the extracted data into other commands and utilities to transform and control it so that it is more manageable for the CLI user. A classic example is {{less}}. When such command pipelines are constructed today, the underlying implementation in {{org.apache.hadoop.hive.ql.io.orc.FileDump#printJsonData}} is oblivious to errors occurring when writing to its output stream. Such errors are commonplace when a user issues {{Ctrl+C}} to kill the leaf process. In this event the leaf process terminates immediately, but the Hive CLI process continues to execute until the full contents of the ORC file have been read. By making {{FileDump}} considerate of output stream errors, the process will terminate as soon as the destination process exits (i.e. when the user kills {{less}}) and control will be returned to the user as expected.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-11060) Make test windowing.q robust
[ https://issues.apache.org/jira/browse/HIVE-11060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14596195#comment-14596195 ]

Jesus Camacho Rodriguez commented on HIVE-11060:
[~ashutoshc], the test failures are not related to this patch. Thanks

Make test windowing.q robust
--
Key: HIVE-11060
URL: https://issues.apache.org/jira/browse/HIVE-11060
Project: Hive
Issue Type: Bug
Components: Tests
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
Attachments: HIVE-11060.01.patch, HIVE-11060.patch

Add partition / order by clauses in the over clause to make the result set deterministic.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HIVE-7193) Hive should support additional LDAP authentication parameters
[ https://issues.apache.org/jira/browse/HIVE-7193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Naveen Gangam updated HIVE-7193:
Attachment: HIVE-7193.6.patch

Attaching a new patch with a doc change (description of a parameter in HiveConf), per Lefty's suggestion. No real code changes.

Hive should support additional LDAP authentication parameters
-
Key: HIVE-7193
URL: https://issues.apache.org/jira/browse/HIVE-7193
Project: Hive
Issue Type: Bug
Affects Versions: 0.10.0
Reporter: Mala Chikka Kempanna
Assignee: Naveen Gangam
Attachments: HIVE-7193.2.patch, HIVE-7193.3.patch, HIVE-7193.4.patch, HIVE-7193.5.patch, HIVE-7193.6.patch, HIVE-7193.patch, LDAPAuthentication_Design_Doc.docx, LDAPAuthentication_Design_Doc_V2.docx

Currently Hive has only the following parameters for LDAP authentication in HiveServer2:
{code:xml}
<property>
  <name>hive.server2.authentication</name>
  <value>LDAP</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.url</name>
  <value>ldap://our_ldap_address</value>
</property>
{code}
We need to include other LDAP properties as part of Hive LDAP authentication, like the following:
{noformat}
a group search base - dc=domain,dc=com
a group search filter - member={0}
a user search base - dc=domain,dc=com
a user search filter - sAMAccountName={0}
a list of valid user groups - group1,group2,group3
{noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
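A hive-site.xml sketch of what such a configuration might look like. The first two properties are the existing ones quoted above; the remaining property names are assumptions illustrating the requested parameters, not confirmed names from the patch:

```xml
<!-- Existing LDAP authentication settings. -->
<property>
  <name>hive.server2.authentication</name>
  <value>LDAP</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.url</name>
  <value>ldap://our_ldap_address</value>
</property>

<!-- Illustrative names for the requested additions. -->
<property>
  <name>hive.server2.authentication.ldap.baseDN</name>
  <value>dc=domain,dc=com</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.userFilter</name>
  <value>user1,user2</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.groupFilter</name>
  <value>group1,group2,group3</value>
</property>
```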
[jira] [Commented] (HIVE-10895) ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources
[ https://issues.apache.org/jira/browse/HIVE-10895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14596013#comment-14596013 ]

Vaibhav Gumashta commented on HIVE-10895:
-
[~aihuaxu] I'm OOO for the next few days. Please feel free to take over if you have a potential solution. If you can wait, I plan to work on this after 07/05. Thanks for checking back.

ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources
---
Key: HIVE-10895
URL: https://issues.apache.org/jira/browse/HIVE-10895
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13
Reporter: Takahiko Saito
Assignee: Vaibhav Gumashta
Attachments: HIVE-10895.1.patch

During testing, we've noticed the Oracle db running out of cursors. Might be related to this.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-11062) Remove Exception stacktrace from Log.info when ACL is not supported.
[ https://issues.apache.org/jira/browse/HIVE-11062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14596382#comment-14596382 ] Yongzhi Chen commented on HIVE-11062: - [~spena], could you review the change? Thanks Remove Exception stacktrace from Log.info when ACL is not supported. Key: HIVE-11062 URL: https://issues.apache.org/jira/browse/HIVE-11062 Project: Hive Issue Type: Bug Components: Logging Affects Versions: 1.1.0 Reporter: Yongzhi Chen Assignee: Yongzhi Chen Priority: Minor Attachments: HIVE-11062.1.patch When logging set to info, Extended ACL Enabled and the file system does not support ACL, there are a lot of Exception stack trace in the log file. Although it is benign, it can easily make users frustrated. We should set the level to show the Exception in debug. Current, the Exception in the log looks like: {noformat} 2015-06-19 05:09:59,376 INFO org.apache.hadoop.hive.shims.HadoopShimsSecure: Skipping ACL inheritance: File system for path s3a://yibing/hive does not support ACLs but dfs.namenode.acls.enabled is set to true: java.lang.UnsupportedOperationException: S3AFileSystem doesn't support getAclStatus java.lang.UnsupportedOperationException: S3AFileSystem doesn't support getAclStatus at org.apache.hadoop.fs.FileSystem.getAclStatus(FileSystem.java:2429) at org.apache.hadoop.hive.shims.Hadoop23Shims.getFullFileStatus(Hadoop23Shims.java:729) at org.apache.hadoop.hive.ql.metadata.Hive.inheritFromTable(Hive.java:2786) at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:2694) at org.apache.hadoop.hive.ql.metadata.Table.replaceFiles(Table.java:640) at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1587) at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:297) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1638) at 
org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1397) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1181) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1047) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1042) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:145) at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:70) at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:197) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:209) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7193) Hive should support additional LDAP authentication parameters
[ https://issues.apache.org/jira/browse/HIVE-7193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chaoyu Tang updated HIVE-7193:
--
Labels: TODOC1.3 TODOC2.0 (was: )

Hive should support additional LDAP authentication parameters
-
Key: HIVE-7193
URL: https://issues.apache.org/jira/browse/HIVE-7193
Project: Hive
Issue Type: Bug
Affects Versions: 0.10.0
Reporter: Mala Chikka Kempanna
Assignee: Naveen Gangam
Labels: TODOC1.3, TODOC2.0
Fix For: 1.3.0, 2.0.0
Attachments: HIVE-7193.2.patch, HIVE-7193.3.patch, HIVE-7193.4.patch, HIVE-7193.5.patch, HIVE-7193.6.patch, HIVE-7193.patch, LDAPAuthentication_Design_Doc.docx, LDAPAuthentication_Design_Doc_V2.docx

Currently Hive has only the following parameters for LDAP authentication in HiveServer2:
{code:xml}
<property>
  <name>hive.server2.authentication</name>
  <value>LDAP</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.url</name>
  <value>ldap://our_ldap_address</value>
</property>
{code}
We need to include other LDAP properties as part of Hive LDAP authentication, like the following:
{noformat}
a group search base - dc=domain,dc=com
a group search filter - member={0}
a user search base - dc=domain,dc=com
a user search filter - sAMAccountName={0}
a list of valid user groups - group1,group2,group3
{noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HIVE-11074) Update tests for HIVE-9302 after removing binaries
[ https://issues.apache.org/jira/browse/HIVE-11074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesus Camacho Rodriguez updated HIVE-11074:
---
Component/s: Tests

Update tests for HIVE-9302 after removing binaries
--
Key: HIVE-11074
URL: https://issues.apache.org/jira/browse/HIVE-11074
Project: Hive
Issue Type: Bug
Components: Tests
Affects Versions: 1.2.0
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-10165) Improve hive-hcatalog-streaming extensibility and support updates and deletes.
[ https://issues.apache.org/jira/browse/HIVE-10165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14596613#comment-14596613 ]

Hive QA commented on HIVE-10165:

{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12741056/HIVE-10165.9.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9104 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join28
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4337/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4337/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4337/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12741056 - PreCommit-HIVE-TRUNK-Build

Improve hive-hcatalog-streaming extensibility and support updates and deletes.
--
Key: HIVE-10165
URL: https://issues.apache.org/jira/browse/HIVE-10165
Project: Hive
Issue Type: Improvement
Components: HCatalog
Affects Versions: 1.2.0
Reporter: Elliot West
Assignee: Elliot West
Labels: streaming_api
Attachments: HIVE-10165.0.patch, HIVE-10165.4.patch, HIVE-10165.5.patch, HIVE-10165.6.patch, HIVE-10165.7.patch, HIVE-10165.9.patch, mutate-system-overview.png

h3. Overview

I'd like to extend the [hive-hcatalog-streaming|https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest] API so that it also supports the writing of record updates and deletes in addition to the already supported inserts.

h3. Motivation

We have many Hadoop processes outside of Hive that merge changed facts into existing datasets. Traditionally we achieve this by reading in a ground-truth dataset and a modified dataset, grouping by a key, sorting by a sequence, and then applying a function to determine inserted, updated, and deleted rows. However, in our current scheme we must rewrite all partitions that may potentially contain changes. In practice the number of mutated records is very small compared with the number of records contained in a partition. This approach results in a number of operational issues:
* An excessive amount of write activity is required for small data changes.
* Downstream applications cannot robustly read these datasets while they are being updated.
* Due to the scale of the updates (hundreds of partitions), the scope for contention is high.

I believe we can address this problem by instead writing only the changed records to a Hive transactional table. This should drastically reduce the amount of data that we need to write and also provide a means for managing concurrent access to the data. Our existing merge processes can read and retain each record's {{ROW_ID}}/{{RecordIdentifier}} and pass this through to an updated form of the hive-hcatalog-streaming API, which will then have the required data to perform an update or insert in a transactional manner.

h3. Benefits

* Enables the creation of large-scale dataset merge processes.
* Opens up Hive transactional functionality in an accessible manner to processes that operate outside of Hive.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Resolved] (HIVE-8190) LDAP user match for authentication on hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam resolved HIVE-8190. - Resolution: Fixed Fix Version/s: 2.0.0 1.3.0 Hadoop Flags: Reviewed A more general fix for this issue has been included in HIVE-7193, which adds filter support for LDAP users and groups. Users can configure the following properties with multiple colon-separated patterns for the DNs under which users/groups can be located in LDAP. hive.server2.authentication.ldap.groupDNPattern hive.server2.authentication.ldap.userDNPattern ex: uid=%s,ou=Users,DC=domain,DC=com:CN=%s,CN=Users,DC=domain,DC=com uid=%s,ou=Groups,DC=domain,DC=com:CN=%s,CN=Groups,DC=domain,DC=com Please provide any feedback you have on the new features. Thanks. LDAP user match for authentication on hiveserver2 - Key: HIVE-8190 URL: https://issues.apache.org/jira/browse/HIVE-8190 Project: Hive Issue Type: Improvement Components: Authorization, Clients Affects Versions: 0.13.1 Environment: Centos 6.5 Reporter: LINTE Assignee: Naveen Gangam Fix For: 1.3.0, 2.0.0 Some LDAP directories use CN rather than UID as the user component. So when you try to authenticate, the LDAP authentication module of Hive attempts to bind with the following string: uid=$login,basedn. Some AD deployments have user objects identified by cn rather than uid, so it is important to be able to customize the kind of objects that the authentication module looks for in LDAP. As an example, in the Knox LDAP module configuration the parameter main.ldapRealm.userDnTemplate can be configured to look for uid ('uid={0}, basedn') or cn ('cn={0}, basedn'). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
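The colon-separated DN pattern mechanism described above amounts to substituting the login name into each {{%s}} placeholder in turn. A minimal sketch of that expansion (illustrative only, not Hive's actual implementation class):

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch of expanding colon-separated DN patterns with a login name. */
public class DnPatternExpander {

    /**
     * Substitutes the given user name for the %s placeholder in each
     * colon-separated DN pattern, producing the candidate bind DNs to try
     * in order. DNs themselves contain commas, which is why the pattern
     * list separator is a colon rather than a comma.
     */
    public static List<String> candidateDns(String patterns, String user) {
        List<String> dns = new ArrayList<>();
        for (String pattern : patterns.split(":")) {
            dns.add(String.format(pattern, user));
        }
        return dns;
    }
}
```

For example, expanding the userDNPattern value shown above with login {{bob}} would yield {{uid=bob,ou=Users,DC=domain,DC=com}} and {{CN=bob,CN=Users,DC=domain,DC=com}} as the bind candidates.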
[jira] [Updated] (HIVE-11074) Update tests for HIVE-9302 after removing binaries
[ https://issues.apache.org/jira/browse/HIVE-11074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11074: --- Attachment: HIVE-11074.patch Update tests for HIVE-9302 after removing binaries -- Key: HIVE-11074 URL: https://issues.apache.org/jira/browse/HIVE-11074 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 1.2.0 Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11074.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10895) ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources
[ https://issues.apache.org/jira/browse/HIVE-10895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14596679#comment-14596679 ] Aihua Xu commented on HIVE-10895: - [~vgumashta] I was looking into two solutions. 1. A wrapper class for the JDO query result which would dispose of the resources automatically when the wrapper is garbage collected. However, this approach could still cause issues because resources such as cursors may not be released soon enough. 2. Copy the query result into a new list (which forces iteration of the result), return that new list, and then call query.closeAll() immediately. I will work on the second approach. ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources --- Key: HIVE-10895 URL: https://issues.apache.org/jira/browse/HIVE-10895 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13 Reporter: Takahiko Saito Assignee: Aihua Xu Attachments: HIVE-10895.1.patch During testing, we've noticed Oracle db running out of cursors. Might be related to this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
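The second approach (materialise the lazily-iterated result, then release the query eagerly) can be sketched generically. {{CloseableQuery}} below is a stand-in interface invented for the example, not the actual JDO {{Query}} class; the point is the copy-then-close shape, which releases database cursors immediately instead of waiting for garbage collection.

```java
import java.util.ArrayList;
import java.util.List;

/** Generic sketch of approach 2: copy results, then release the resource eagerly. */
public class EagerClose {

    /** Stand-in for a JDO Query holding database resources such as cursors. */
    public interface CloseableQuery<T> {
        Iterable<T> execute();
        void closeAll();   // releases cursors and other db resources
    }

    /**
     * Copies the lazily-iterated result into a plain list so that closeAll()
     * can be called immediately. The finally block guarantees release even
     * if iteration fails part-way through.
     */
    public static <T> List<T> executeAndClose(CloseableQuery<T> query) {
        List<T> copy = new ArrayList<>();
        try {
            for (T row : query.execute()) {
                copy.add(row);
            }
        } finally {
            query.closeAll();
        }
        return copy;
    }
}
```

The returned list is detached from the query, so callers can keep using it after the underlying cursors are gone.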
[jira] [Commented] (HIVE-11037) HiveOnTez: make explain user level = true as default
[ https://issues.apache.org/jira/browse/HIVE-11037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14596692#comment-14596692 ] Laljo John Pullokkaran commented on HIVE-11037: --- Add a note explaining why the vertex comparator uses string comparison. Otherwise looks good. +1 conditional on QA run. HiveOnTez: make explain user level = true as default Key: HIVE-11037 URL: https://issues.apache.org/jira/browse/HIVE-11037 Project: Hive Issue Type: Improvement Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11037.01.patch, HIVE-11037.02.patch, HIVE-11037.03.patch, HIVE-11037.04.patch, HIVE-11037.05.patch, HIVE-11037.06.patch In HIVE-9780, we introduced a new level of explain for Hive on Tez. We would like to make it run by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11074) Update tests for HIVE-9302 after removing binaries
[ https://issues.apache.org/jira/browse/HIVE-11074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11074: --- Attachment: HIVE-11074.patch [~hsubramaniyan], could you check the patch? It seems I added two test files in the wrong place in HIVE-11041, which is causing TestSessionState to fail. Thanks Update tests for HIVE-9302 after removing binaries -- Key: HIVE-11074 URL: https://issues.apache.org/jira/browse/HIVE-11074 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 1.2.0 Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11074) Update tests for HIVE-9302 after removing binaries
[ https://issues.apache.org/jira/browse/HIVE-11074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11074: --- Attachment: (was: HIVE-11074.patch) Update tests for HIVE-9302 after removing binaries -- Key: HIVE-11074 URL: https://issues.apache.org/jira/browse/HIVE-11074 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 1.2.0 Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11037) HiveOnTez: make explain user level = true as default
[ https://issues.apache.org/jira/browse/HIVE-11037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11037: --- Attachment: HIVE-11037.06.patch Addresses the vertex ordering issue. [~jpullokkaran], could you please take a look? HiveOnTez: make explain user level = true as default Key: HIVE-11037 URL: https://issues.apache.org/jira/browse/HIVE-11037 Project: Hive Issue Type: Improvement Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11037.01.patch, HIVE-11037.02.patch, HIVE-11037.03.patch, HIVE-11037.04.patch, HIVE-11037.05.patch, HIVE-11037.06.patch In HIVE-9780, we introduced a new level of explain for Hive on Tez. We would like to make it run by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-9880) Support configurable username attribute for HiveServer2 LDAP authentication
[ https://issues.apache.org/jira/browse/HIVE-9880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam resolved HIVE-9880. - Resolution: Fixed Fix Version/s: 2.0.0 1.3.0 Hadoop Flags: Reviewed A more general fix for this issue has been included in HIVE-7193, which adds filter support for LDAP users and groups. Users can configure the following properties with multiple colon-separated patterns for the DNs under which users/groups can be located in LDAP. hive.server2.authentication.ldap.groupDNPattern hive.server2.authentication.ldap.userDNPattern ex: uid=%s,ou=Users,DC=domain,DC=com:CN=%s,CN=Users,DC=domain,DC=com uid=%s,ou=Groups,DC=domain,DC=com:CN=%s,CN=Groups,DC=domain,DC=com Please provide any feedback you have on the new features. Thanks. Support configurable username attribute for HiveServer2 LDAP authentication --- Key: HIVE-9880 URL: https://issues.apache.org/jira/browse/HIVE-9880 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Jaime Murillo Assignee: Naveen Gangam Fix For: 1.3.0, 2.0.0 Attachments: HIVE-9880-1.patch OpenLDAP requires that, when bind authenticating, the DN supplied must be the DN with which the account was created. Since OpenLDAP allows any attribute to be used when creating a DN for an account, organizations that don't use the hardcoded *uid* attribute won't be able to use HiveServer2 LDAP authentication. HiveServer2 should support a configurable username attribute when constructing the bind DN. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11037) HiveOnTez: make explain user level = true as default
[ https://issues.apache.org/jira/browse/HIVE-11037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11037: --- Attachment: HIVE-11037.07.patch HiveOnTez: make explain user level = true as default Key: HIVE-11037 URL: https://issues.apache.org/jira/browse/HIVE-11037 Project: Hive Issue Type: Improvement Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11037.01.patch, HIVE-11037.02.patch, HIVE-11037.03.patch, HIVE-11037.04.patch, HIVE-11037.05.patch, HIVE-11037.06.patch, HIVE-11037.07.patch In HIVE-9780, we introduced a new level of explain for Hive on Tez. We would like to make it run by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11071) Fix the output of beeline dbinfo command
[ https://issues.apache.org/jira/browse/HIVE-11071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14596834#comment-14596834 ] Xuefu Zhang commented on HIVE-11071: +1 Fix the output of beeline dbinfo command Key: HIVE-11071 URL: https://issues.apache.org/jira/browse/HIVE-11071 Project: Hive Issue Type: Bug Components: Beeline Reporter: Shinichi Yamashita Assignee: Shinichi Yamashita Attachments: HIVE-11071-001-output.txt, HIVE-11071-001.patch When !dbinfo is executed in Beeline, its output is displayed as follows:
{code}
0: jdbc:hive2://localhost:10001/> !dbinfo
Error: Method not supported (state=,code=0)
allTablesAreSelectable          true
Error: Method not supported (state=,code=0)
Error: Method not supported (state=,code=0)
Error: Method not supported (state=,code=0)
getCatalogSeparator             .
getCatalogTerm                  instance
getDatabaseProductName          Apache Hive
getDatabaseProductVersion       2.0.0-SNAPSHOT
getDefaultTransactionIsolation  0
getDriverMajorVersion           1
getDriverMinorVersion           1
getDriverName                   Hive JDBC
...
{code}
The name of the method that produced each Error line is not shown, so the output is hard to interpret. I will fix this output. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10996) Aggregation / Projection over Multi-Join Inner Query producing incorrect results
[ https://issues.apache.org/jira/browse/HIVE-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14596729#comment-14596729 ] Laljo John Pullokkaran commented on HIVE-10996: --- +1 Conditional on QA run. Aggregation / Projection over Multi-Join Inner Query producing incorrect results Key: HIVE-10996 URL: https://issues.apache.org/jira/browse/HIVE-10996 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0 Reporter: Gautam Kowshik Assignee: Jesus Camacho Rodriguez Priority: Critical Attachments: HIVE-10996.01.patch, HIVE-10996.02.patch, HIVE-10996.03.patch, HIVE-10996.04.patch, HIVE-10996.05.patch, HIVE-10996.06.patch, HIVE-10996.07.patch, HIVE-10996.08.patch, HIVE-10996.09.patch, HIVE-10996.patch, explain_q1.txt, explain_q2.txt We see the following problem on 1.1.0 and 1.2.0 but not on 0.13, which looks like a regression. The following query (Q1) produces no results:
{code}
select s from (
  select last.*, action.st2, action.n from (
    select purchase.s, purchase.timestamp, max(mevt.timestamp) as last_stage_timestamp
    from (select * from purchase_history) purchase
    join (select * from cart_history) mevt on purchase.s = mevt.s
    where purchase.timestamp > mevt.timestamp
    group by purchase.s, purchase.timestamp
  ) last
  join (select * from events) action
    on last.s = action.s and last.last_stage_timestamp = action.timestamp
) list;
{code}
While this one (Q2) does produce results:
{code}
select * from (
  select last.*, action.st2, action.n from (
    select purchase.s, purchase.timestamp, max(mevt.timestamp) as last_stage_timestamp
    from (select * from purchase_history) purchase
    join (select * from cart_history) mevt on purchase.s = mevt.s
    where purchase.timestamp > mevt.timestamp
    group by purchase.s, purchase.timestamp
  ) last
  join (select * from events) action
    on last.s = action.s and last.last_stage_timestamp = action.timestamp
) list;

1  21  20  Bob   1234
1  31  30  Bob   1234
3  51  50  Jeff  1234
{code}
The setup to test this is:
{code}
create table purchase_history (s string, product string, price double, timestamp int);
insert into purchase_history values ('1', 'Belt', 20.00, 21);
insert into purchase_history values ('1', 'Socks', 3.50, 31);
insert into purchase_history values ('3', 'Belt', 20.00, 51);
insert into purchase_history values ('4', 'Shirt', 15.50, 59);

create table cart_history (s string, cart_id int, timestamp int);
insert into cart_history values ('1', 1, 10);
insert into cart_history values ('1', 2, 20);
insert into cart_history values ('1', 3, 30);
insert into cart_history values ('1', 4, 40);
insert into cart_history values ('3', 5, 50);
insert into cart_history values ('4', 6, 60);

create table events (s string, st2 string, n int, timestamp int);
insert into events values ('1', 'Bob', 1234, 20);
insert into events values ('1', 'Bob', 1234, 30);
insert into events values ('1', 'Bob', 1234, 25);
insert into events values ('2', 'Sam', 1234, 30);
insert into events values ('3', 'Jeff', 1234, 50);
insert into events values ('4', 'Ted', 1234, 60);
{code}
I realize select * and select s are not all that interesting in this context, but what led us to this issue was that select count(distinct s) was not returning results. The above queries are the simplified queries that reproduce the issue. I will note that if I convert the inner join to a table and select from that, the issue does not appear. Update: Found that turning off hive.optimize.remove.identity.project fixes this issue. This optimization was introduced in https://issues.apache.org/jira/browse/HIVE-8435 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11073) ORC FileDump utility ignores errors when writing output
[ https://issues.apache.org/jira/browse/HIVE-11073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14596752#comment-14596752 ] Hive QA commented on HIVE-11073: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12741071/HIVE-11073.1.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9014 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join28 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4338/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4338/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4338/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12741071 - PreCommit-HIVE-TRUNK-Build ORC FileDump utility ignores errors when writing output --- Key: HIVE-11073 URL: https://issues.apache.org/jira/browse/HIVE-11073 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.2.0 Reporter: Elliot West Assignee: Elliot West Priority: Minor Labels: cli, orc Attachments: HIVE-11073.1.patch The Hive command line provides the {{--orcfiledump}} utility for dumping data contained within ORC files, specifically when using the {{-d}} option. Generally, it is useful to be able to pipe the data extracted into other commands and utilities to transform and control the data so that it is more manageable by the CLI user. A classic example is {{less}}. 
When such command pipelines are currently constructed, the underlying implementation in {{org.apache.hadoop.hive.ql.io.orc.FileDump#printJsonData}} is oblivious to errors occurring when writing to its output stream. Such errors are commonplace when a user issues {{Ctrl+C}} to kill the leaf process. In this event the leaf process terminates immediately, but the Hive CLI process continues to execute until the full contents of the ORC file have been read. By making {{FileDump}} considerate of output stream errors, the process will terminate as soon as the destination process exits (i.e. when the user kills {{less}}) and control will be returned to the user as expected. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
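One way to make such a dump loop considerate of output errors is to poll {{java.io.PrintStream#checkError()}}, which returns true once any underlying write has failed (PrintStream swallows IOExceptions and only records an internal flag). The sketch below shows the loop shape only; it is not FileDump's actual code, and the class name is invented for the example.

```java
import java.io.PrintStream;
import java.util.Iterator;

/** Sketch: stop writing as soon as the output stream reports an error. */
public class PipeAwareDump {

    /**
     * Writes rows until either the input is exhausted or the PrintStream has
     * seen a write failure (e.g. EPIPE after the user kills the pager).
     * Returns the number of rows for which println was attempted.
     */
    public static int dump(Iterator<String> rows, PrintStream out) {
        int written = 0;
        // checkError() flushes and reports whether any previous write failed,
        // so the loop exits on the first iteration after the pipe breaks.
        while (rows.hasNext() && !out.checkError()) {
            out.println(rows.next());
            written++;
        }
        return written;
    }
}
```

With this shape, killing the downstream {{less}} causes the next {{checkError()}} call to return true and the dump stops instead of reading the ORC file to the end.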
[jira] [Commented] (HIVE-11076) Explicitly set hive.cbo.enable=true for some tests
[ https://issues.apache.org/jira/browse/HIVE-11076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14596769#comment-14596769 ] Pengcheng Xiong commented on HIVE-11076: [~ashutoshc], could you please take a look? Thanks. Explicitly set hive.cbo.enable=true for some tests -- Key: HIVE-11076 URL: https://issues.apache.org/jira/browse/HIVE-11076 Project: Hive Issue Type: Improvement Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11076.01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11076) Explicitly set hive.cbo.enable=true for some tests
[ https://issues.apache.org/jira/browse/HIVE-11076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11076: --- Attachment: HIVE-11076.01.patch Explicitly set hive.cbo.enable=true for some tests -- Key: HIVE-11076 URL: https://issues.apache.org/jira/browse/HIVE-11076 Project: Hive Issue Type: Improvement Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11076.01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11062) Remove Exception stacktrace from Log.info when ACL is not supported.
[ https://issues.apache.org/jira/browse/HIVE-11062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14596579#comment-14596579 ] Sergio Peña commented on HIVE-11062: Thanks [~ychena] +1 Remove Exception stacktrace from Log.info when ACL is not supported. Key: HIVE-11062 URL: https://issues.apache.org/jira/browse/HIVE-11062 Project: Hive Issue Type: Bug Components: Logging Affects Versions: 1.1.0 Reporter: Yongzhi Chen Assignee: Yongzhi Chen Priority: Minor Attachments: HIVE-11062.1.patch When logging is set to INFO, extended ACLs are enabled, and the file system does not support ACLs, there are a lot of exception stack traces in the log file. Although they are benign, they can easily frustrate users. We should change the level so that the exception is only shown at DEBUG. Currently, the exception in the log looks like:
{noformat}
2015-06-19 05:09:59,376 INFO org.apache.hadoop.hive.shims.HadoopShimsSecure: Skipping ACL inheritance: File system for path s3a://yibing/hive does not support ACLs but dfs.namenode.acls.enabled is set to true: java.lang.UnsupportedOperationException: S3AFileSystem doesn't support getAclStatus
java.lang.UnsupportedOperationException: S3AFileSystem doesn't support getAclStatus
	at org.apache.hadoop.fs.FileSystem.getAclStatus(FileSystem.java:2429)
	at org.apache.hadoop.hive.shims.Hadoop23Shims.getFullFileStatus(Hadoop23Shims.java:729)
	at org.apache.hadoop.hive.ql.metadata.Hive.inheritFromTable(Hive.java:2786)
	at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:2694)
	at org.apache.hadoop.hive.ql.metadata.Table.replaceFiles(Table.java:640)
	at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1587)
	at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:297)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1638)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1397)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1181)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1047)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1042)
	at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:145)
	at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:70)
	at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:197)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
	at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:209)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
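The fix described above amounts to keeping a one-line message at INFO and attaching the stack trace only at DEBUG. A minimal sketch using {{java.util.logging}} (Hive itself uses a different logging facade, and the class and method names here are invented for illustration):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Sketch: keep INFO output to one line; attach the stack trace only at debug level. */
public class QuietAclLogger {

    private static final Logger LOG = Logger.getLogger(QuietAclLogger.class.getName());

    /** Logs a benign, expected failure without flooding INFO logs with stack traces. */
    public static void logUnsupportedAcl(String path, Exception e) {
        // One-line summary at INFO: message only, no throwable attached.
        LOG.info("Skipping ACL inheritance: file system for path " + path
            + " does not support ACLs: " + e.getMessage());
        // Full stack trace remains available, but only when debug logging is on
        // (FINE is java.util.logging's rough equivalent of DEBUG).
        LOG.log(Level.FINE, "Full stack trace for unsupported ACL operation", e);
    }
}
```

At the default INFO level only the summary line appears; raising the logger to FINE restores the complete trace for troubleshooting.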
[jira] [Commented] (HIVE-7193) Hive should support additional LDAP authentication parameters
[ https://issues.apache.org/jira/browse/HIVE-7193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14596472#comment-14596472 ] Hive QA commented on HIVE-7193: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12741055/HIVE-7193.6.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9013 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join28 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4336/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4336/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4336/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12741055 - PreCommit-HIVE-TRUNK-Build Hive should support additional LDAP authentication parameters - Key: HIVE-7193 URL: https://issues.apache.org/jira/browse/HIVE-7193 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Mala Chikka Kempanna Assignee: Naveen Gangam Attachments: HIVE-7193.2.patch, HIVE-7193.3.patch, HIVE-7193.4.patch, HIVE-7193.5.patch, HIVE-7193.6.patch, HIVE-7193.patch, LDAPAuthentication_Design_Doc.docx, LDAPAuthentication_Design_Doc_V2.docx Currently Hive has only the following authentication parameters for LDAP authentication for HiveServer2:
{code:xml}
<property>
  <name>hive.server2.authentication</name>
  <value>LDAP</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.url</name>
  <value>ldap://our_ldap_address</value>
</property>
{code}
We need to include other LDAP properties as part of Hive LDAP authentication, such as:
{noformat}
a group search base         - dc=domain,dc=com
a group search filter       - member={0}
a user search base          - dc=domain,dc=com
a user search filter        - sAMAccountName={0}
a list of valid user groups - group1,group2,group3
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7193) Hive should support additional LDAP authentication parameters
[ https://issues.apache.org/jira/browse/HIVE-7193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14596489#comment-14596489 ] Naveen Gangam commented on HIVE-7193: - This test seems flaky. It failed with patch v4, passed with patch v5, and failed again with patch v6, although there are no code changes between v4, v5, and v6 of the patch, just doc changes (javadocs and HiveConf parameter descriptions). The failure does not appear to be related to my fix. Hive should support additional LDAP authentication parameters - Key: HIVE-7193 URL: https://issues.apache.org/jira/browse/HIVE-7193 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Mala Chikka Kempanna Assignee: Naveen Gangam Attachments: HIVE-7193.2.patch, HIVE-7193.3.patch, HIVE-7193.4.patch, HIVE-7193.5.patch, HIVE-7193.6.patch, HIVE-7193.patch, LDAPAuthentication_Design_Doc.docx, LDAPAuthentication_Design_Doc_V2.docx Currently Hive has only the following authentication parameters for LDAP authentication for HiveServer2:
{code:xml}
<property>
  <name>hive.server2.authentication</name>
  <value>LDAP</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.url</name>
  <value>ldap://our_ldap_address</value>
</property>
{code}
We need to include other LDAP properties as part of Hive LDAP authentication, such as:
{noformat}
a group search base         - dc=domain,dc=com
a group search filter       - member={0}
a user search base          - dc=domain,dc=com
a user search filter        - sAMAccountName={0}
a list of valid user groups - group1,group2,group3
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10673) Dynamically partitioned hash join for Tez
[ https://issues.apache.org/jira/browse/HIVE-10673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-10673: -- Attachment: HIVE-10673.5.patch Patch v5 - rebasing with trunk Dynamically partitioned hash join for Tez - Key: HIVE-10673 URL: https://issues.apache.org/jira/browse/HIVE-10673 Project: Hive Issue Type: Bug Components: Query Planning, Query Processor Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-10673.1.patch, HIVE-10673.2.patch, HIVE-10673.3.patch, HIVE-10673.4.patch, HIVE-10673.5.patch Reduce-side hash join (using MapJoinOperator), where the Tez inputs to the reducer are unsorted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10996) Aggregation / Projection over Multi-Join Inner Query producing incorrect results
[ https://issues.apache.org/jira/browse/HIVE-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14596562#comment-14596562 ] Laljo John Pullokkaran commented on HIVE-10996: --- Some more comments: 1. ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java
{code}
for (Operator<?> child : op.getChildOperators()) {
+  if (child instanceof SelectOperator || child instanceof ReduceSinkOperator) {
+    continue;
+  }
+  List<String> neededCols = cppCtx.genColLists(op, child);
+  if (neededCols.size() < op.getSchema().getSignature().size()) {
+    ArrayList<ExprNodeDesc> exprs = new ArrayList<ExprNodeDesc>();
+    ArrayList<String> outputs = new ArrayList<String>();
+    Map<String, ExprNodeDesc> colExprMap = new HashMap<String, ExprNodeDesc>();
+    ArrayList<ColumnInfo> outputRS = new ArrayList<ColumnInfo>();
+    for (String internalName : neededCols) {
+      ColumnInfo colInfo = op.getSchema().getColumnInfo(
{code}
Should preserve the order of cols as they appear in the GB. 2. Nit pick: change the name of OP to GBOP. 3. Nit pick: change the name of output to outputColNames. Aggregation / Projection over Multi-Join Inner Query producing incorrect results Key: HIVE-10996 URL: https://issues.apache.org/jira/browse/HIVE-10996 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0 Reporter: Gautam Kowshik Assignee: Jesus Camacho Rodriguez Priority: Critical Attachments: HIVE-10996.01.patch, HIVE-10996.02.patch, HIVE-10996.03.patch, HIVE-10996.04.patch, HIVE-10996.05.patch, HIVE-10996.06.patch, HIVE-10996.07.patch, HIVE-10996.08.patch, HIVE-10996.patch, explain_q1.txt, explain_q2.txt We see the following problem on 1.1.0 and 1.2.0 but not on 0.13, which looks like a regression. The following query (Q1) produces no results:
{code}
select s from (
  select last.*, action.st2, action.n from (
    select purchase.s, purchase.timestamp, max(mevt.timestamp) as last_stage_timestamp
    from (select * from purchase_history) purchase
    join (select * from cart_history) mevt on purchase.s = mevt.s
    where purchase.timestamp > mevt.timestamp
    group by purchase.s, purchase.timestamp
  ) last
  join (select * from events) action
    on last.s = action.s and last.last_stage_timestamp = action.timestamp
) list;
{code}
While this one (Q2) does produce results:
{code}
select * from (
  select last.*, action.st2, action.n from (
    select purchase.s, purchase.timestamp, max(mevt.timestamp) as last_stage_timestamp
    from (select * from purchase_history) purchase
    join (select * from cart_history) mevt on purchase.s = mevt.s
    where purchase.timestamp > mevt.timestamp
    group by purchase.s, purchase.timestamp
  ) last
  join (select * from events) action
    on last.s = action.s and last.last_stage_timestamp = action.timestamp
) list;

1  21  20  Bob   1234
1  31  30  Bob   1234
3  51  50  Jeff  1234
{code}
The setup to test this is:
{code}
create table purchase_history (s string, product string, price double, timestamp int);
insert into purchase_history values ('1', 'Belt', 20.00, 21);
insert into purchase_history values ('1', 'Socks', 3.50, 31);
insert into purchase_history values ('3', 'Belt', 20.00, 51);
insert into purchase_history values ('4', 'Shirt', 15.50, 59);

create table cart_history (s string, cart_id int, timestamp int);
insert into cart_history values ('1', 1, 10);
insert into cart_history values ('1', 2, 20);
insert into cart_history values ('1', 3, 30);
insert into cart_history values ('1', 4, 40);
insert into cart_history values ('3', 5, 50);
insert into cart_history values ('4', 6, 60);

create table events (s string, st2 string, n int, timestamp int);
insert into events values ('1', 'Bob', 1234, 20);
insert into events values ('1', 'Bob', 1234, 30);
insert into events values ('1', 'Bob', 1234, 25);
insert into events values ('2', 'Sam', 1234, 30);
insert into events values ('3', 'Jeff', 1234, 50);
insert into events values ('4', 'Ted', 1234, 60);
{code}
I realize select * and select s are not all that interesting in this context, but what led us to this issue was that select count(distinct s) was not returning results. The above queries are the simplified queries that reproduce the issue. I will note that if I convert the inner join to a table and select from that, the issue does not appear. Update: Found that turning off hive.optimize.remove.identity.project fixes this issue. This optimization was introduced in https://issues.apache.org/jira/browse/HIVE-8435 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10233) Hive on tez: memory manager for grace hash join
[ https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14597097#comment-14597097 ] Gunther Hagleitner commented on HIVE-10233: --- .11 is a simplified version. I threw out all the memory computation with respect to edges and basically just kept the adjustment for grace hash joins. [~vikram.dixit]/[~wzheng] could you take a look? Hive on tez: memory manager for grace hash join --- Key: HIVE-10233 URL: https://issues.apache.org/jira/browse/HIVE-10233 Project: Hive Issue Type: Bug Components: Tez Affects Versions: llap, 2.0.0 Reporter: Vikram Dixit K Assignee: Gunther Hagleitner Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, HIVE-10233.09.patch, HIVE-10233.10.patch, HIVE-10233.11.patch We need a memory manager in llap/tez to manage the usage of memory across threads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10233) Hive on tez: memory manager for grace hash join
[ https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-10233: -- Attachment: HIVE-10233.11.patch Hive on tez: memory manager for grace hash join --- Key: HIVE-10233 URL: https://issues.apache.org/jira/browse/HIVE-10233 Project: Hive Issue Type: Bug Components: Tez Affects Versions: llap, 2.0.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, HIVE-10233.09.patch, HIVE-10233.10.patch, HIVE-10233.11.patch We need a memory manager in llap/tez to manage the usage of memory across threads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-5457) Concurrent calls to getTable() result in: MetaException: org.datanucleus.exceptions.NucleusException: Invalid index 1 for DataStoreMapping. NucleusException: Invalid in
[ https://issues.apache.org/jira/browse/HIVE-5457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14597085#comment-14597085 ] 陈典贵 commented on HIVE-5457: --- Hi! Did you resolve this problem? It happens when we use Hue to execute HQL. We set datanucleus.autoStartMechanism=SchemaTable, as referenced in the fix for 4762, but it did not help. Concurrent calls to getTable() result in: MetaException: org.datanucleus.exceptions.NucleusException: Invalid index 1 for DataStoreMapping. NucleusException: Invalid index 1 for DataStoreMapping --- Key: HIVE-5457 URL: https://issues.apache.org/jira/browse/HIVE-5457 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.10.0 Reporter: Lenni Kuff Priority: Critical Concurrent calls to getTable() result in: MetaException: org.datanucleus.exceptions.NucleusException: Invalid index 1 for DataStoreMapping. NucleusException: Invalid index 1 for DataStoreMapping This happens when using a Hive Metastore Service directly connecting to the backend metastore db. I have been able to hit this with as few as 2 concurrent calls. When I update my app to serialize all calls to getTable() this problem is resolved. Stack Trace:
{code}
Caused by: org.datanucleus.exceptions.NucleusException: Invalid index 1 for DataStoreMapping.
	at org.datanucleus.store.mapped.mapping.PersistableMapping.getDatastoreMapping(PersistableMapping.java:307)
	at org.datanucleus.store.rdbms.scostore.RDBMSElementContainerStoreSpecialization.getSizeStmt(RDBMSElementContainerStoreSpecialization.java:407)
	at org.datanucleus.store.rdbms.scostore.RDBMSElementContainerStoreSpecialization.getSize(RDBMSElementContainerStoreSpecialization.java:257)
	at org.datanucleus.store.rdbms.scostore.RDBMSJoinListStoreSpecialization.getSize(RDBMSJoinListStoreSpecialization.java:46)
	at org.datanucleus.store.mapped.scostore.ElementContainerStore.size(ElementContainerStore.java:440)
	at org.datanucleus.sco.backed.List.size(List.java:557)
	at org.apache.hadoop.hive.metastore.ObjectStore.convertToSkewedValues(ObjectStore.java:1029)
	at org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(ObjectStore.java:1007)
	at org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(ObjectStore.java:1017)
	at org.apache.hadoop.hive.metastore.ObjectStore.convertToTable(ObjectStore.java:872)
	at org.apache.hadoop.hive.metastore.ObjectStore.getTable(ObjectStore.java:743)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
	at $Proxy6.getTable(Unknown Source)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1349)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10233) Hive on tez: memory manager for grace hash join
[ https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14597126#comment-14597126 ] Mostafa Mokhtar commented on HIVE-10233: [~hagleitn] [~vikram.dixit] [~wzheng] It would make sense to annotate the explain plan with the memory assigned to each Hash table, as in:
{code}
DagName: jenkins_20150622122318_f770d9ab-0ddd-43cf-b950-32f38e2f17e1:1
Vertices:
  Map 1
    Map Operator Tree:
        TableScan
          alias: store_sales
          filterExpr: (ss_item_sk is not null and ss_sold_date_sk BETWEEN 2450816 AND 2451500) (type: boolean)
          Statistics: Num rows: 28878719387 Data size: 2405805439460 Basic stats: COMPLETE Column stats: COMPLETE
          Filter Operator
            predicate: ss_item_sk is not null (type: boolean)
            Statistics: Num rows: 28878719387 Data size: 231029755096 Basic stats: COMPLETE Column stats: COMPLETE
            Map Join Operator
              condition map:
                   Inner Join 0 to 1
              keys:
                0 ss_item_sk (type: int)
                1 i_item_sk (type: int)
              outputColumnNames: _col1, _col22, _col26
              input vertices:
                1 Map 3
              Statistics: Num rows: 28878719387 Data size: 346544632644 Basic stats: COMPLETE Column stats: COMPLETE
              HybridGraceHashJoin: true
              Hash table memory : 1848000 Bytes
              Filter Operator
                predicate: ((_col26 = _col1) and _col22 BETWEEN 2450816 AND 2451500) (type: boolean)
                Statistics: Num rows: 7219679846 Data size: 86636158152 Basic stats: COMPLETE Column stats: COMPLETE
                Select Operator
                  Statistics: Num rows: 7219679846 Data size: 86636158152 Basic stats: COMPLETE Column stats: COMPLETE
                  Group By Operator
                    aggregations: count()
                    mode: hash
                    outputColumnNames: _col0
                    Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
                    Reduce Output Operator
                      sort order:
                      Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
                      value expressions: _col0 (type: bigint)
    Execution mode: vectorized
  Map 3
    Map Operator Tree:
        TableScan
          alias: item
          filterExpr: i_item_sk is not null (type: boolean)
          Statistics: Num rows: 462000 Data size: 663560457 Basic stats: COMPLETE Column stats: COMPLETE
          Filter Operator
            predicate: i_item_sk is not null (type: boolean)
            Statistics: Num rows: 462000 Data size: 1848000 Basic stats: COMPLETE Column stats: COMPLETE
            Reduce Output Operator
              key expressions: i_item_sk (type: int)
              sort order: +
              Map-reduce partition columns: i_item_sk (type: int)
              Statistics: Num rows: 462000 Data size: 1848000 Basic stats: COMPLETE Column stats: COMPLETE
    Execution mode: vectorized
{code}
Hive on tez: memory manager for grace hash join --- Key: HIVE-10233 URL: https://issues.apache.org/jira/browse/HIVE-10233 Project: Hive Issue Type: Bug Components: Tez Affects Versions: llap, 2.0.0 Reporter: Vikram Dixit K Assignee: Gunther Hagleitner Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, HIVE-10233.09.patch, HIVE-10233.10.patch, HIVE-10233.11.patch We need a memory manager in llap/tez to manage the usage of memory across threads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10233) Hive on tez: memory manager for grace hash join
[ https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14597141#comment-14597141 ] Mostafa Mokhtar commented on HIVE-10233: [~hagleitn] Please check comment above. Hive on tez: memory manager for grace hash join --- Key: HIVE-10233 URL: https://issues.apache.org/jira/browse/HIVE-10233 Project: Hive Issue Type: Bug Components: Tez Affects Versions: llap, 2.0.0 Reporter: Vikram Dixit K Assignee: Gunther Hagleitner Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, HIVE-10233.09.patch, HIVE-10233.10.patch, HIVE-10233.11.patch We need a memory manager in llap/tez to manage the usage of memory across threads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11076) Explicitly set hive.cbo.enable=true for some tests
[ https://issues.apache.org/jira/browse/HIVE-11076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14597149#comment-14597149 ] Hive QA commented on HIVE-11076: {color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12741152/HIVE-11076.01.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9013 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_merge_multi_expressions
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_merge_multi_expressions
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4343/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4343/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4343/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated. ATTACHMENT ID: 12741152 - PreCommit-HIVE-TRUNK-Build

Explicitly set hive.cbo.enable=true for some tests -- Key: HIVE-11076 URL: https://issues.apache.org/jira/browse/HIVE-11076 Project: Hive Issue Type: Improvement Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11076.01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10233) Hive on tez: memory manager for grace hash join
[ https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14597139#comment-14597139 ] Mostafa Mokhtar commented on HIVE-10233: [~gunther] Should totalAvailableMemory be HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD? resourceAvailable.getMemory() would basically return the container size, which will result in a lot of GC if all memory gets used up. Change LOG.debug to LOG.info and print the input size. {code} for (MapJoinOperator mj : mapJoins) { mj.getConf().setMemoryNeeded(minMemory); LOG.info("Setting " + minMemory + " bytes needed for " + mj); } {code} Also I am not following the logic here; shouldn't the memory needed per operator be something like (estimated size) / (total input sizes) x memoryAvailable? {code} int numJoins = mapJoins.size(); long minMemory = totalAvailableMemory / ((numJoins > 0) ? numJoins : 1); minMemory = Math.min(minMemory, onePercentMemory); for (MapJoinOperator mj : mapJoins) { mj.getConf().setMemoryNeeded(minMemory); if (LOG.isDebugEnabled()) { LOG.debug("Setting " + minMemory + " bytes needed for " + mj); } } {code} Hive on tez: memory manager for grace hash join --- Key: HIVE-10233 URL: https://issues.apache.org/jira/browse/HIVE-10233 Project: Hive Issue Type: Bug Components: Tez Affects Versions: llap, 2.0.0 Reporter: Vikram Dixit K Assignee: Gunther Hagleitner Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, HIVE-10233.09.patch, HIVE-10233.10.patch, HIVE-10233.11.patch We need a memory manager in llap/tez to manage the usage of memory across threads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
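The size-proportional allocation the comment suggests ((estimated size) / (total input sizes) x memoryAvailable) can be sketched as below. `ProportionalMemory` and its method are illustrative names, not part of the HIVE-10233 patch:

```java
// Hedged sketch: give each map join a share of the available memory
// proportional to its estimated input size, instead of an even
// totalAvailableMemory / numJoins split.
public class ProportionalMemory {
    public static long share(long estimatedSize, long totalInputSize, long availableMemory) {
        if (totalInputSize <= 0) {
            // Nothing to apportion; fall back to the full budget.
            return availableMemory;
        }
        return (long) ((double) estimatedSize / totalInputSize * availableMemory);
    }
}
```

A join expected to consume a quarter of the total input would then get a quarter of the memory budget, rather than an equal slice.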
[jira] [Updated] (HIVE-11076) Explicitly set hive.cbo.enable=true for some tests
[ https://issues.apache.org/jira/browse/HIVE-11076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11076: --- Attachment: HIVE-11076.02.patch Explicitly set hive.cbo.enable=true for some tests -- Key: HIVE-11076 URL: https://issues.apache.org/jira/browse/HIVE-11076 Project: Hive Issue Type: Improvement Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11076.01.patch, HIVE-11076.02.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7193) Hive should support additional LDAP authentication parameters
[ https://issues.apache.org/jira/browse/HIVE-7193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-7193: - Labels: TODOC1.3 (was: TODOC1.3 TODOC2.0) Hive should support additional LDAP authentication parameters - Key: HIVE-7193 URL: https://issues.apache.org/jira/browse/HIVE-7193 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Mala Chikka Kempanna Assignee: Naveen Gangam Labels: TODOC1.3 Fix For: 1.3.0, 2.0.0 Attachments: HIVE-7193.2.patch, HIVE-7193.3.patch, HIVE-7193.4.patch, HIVE-7193.5.patch, HIVE-7193.6.patch, HIVE-7193.patch, LDAPAuthentication_Design_Doc.docx, LDAPAuthentication_Design_Doc_V2.docx Currently Hive has only the following authentication parameters for LDAP authentication for HiveServer2: {code:xml} <property> <name>hive.server2.authentication</name> <value>LDAP</value> </property> <property> <name>hive.server2.authentication.ldap.url</name> <value>ldap://our_ldap_address</value> </property> {code} We need to include other LDAP properties as part of Hive LDAP authentication, like below: {noformat} a group search base - dc=domain,dc=com a group search filter - member={0} a user search base - dc=domain,dc=com a user search filter - sAMAccountName={0} a list of valid user groups - group1,group2,group3 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
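A hedged hive-site.xml sketch of what the requested configuration might look like once the new parameters exist. The DN pattern, filter, and group values below are illustrative assumptions, not tested settings:

```xml
<!-- Existing LDAP settings -->
<property>
  <name>hive.server2.authentication</name>
  <value>LDAP</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.url</name>
  <value>ldap://our_ldap_address</value>
</property>
<!-- Requested additions; names/values illustrative -->
<property>
  <name>hive.server2.authentication.ldap.userDNPattern</name>
  <value>uid=%s,ou=People,dc=domain,dc=com</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.groupFilter</name>
  <value>group1,group2,group3</value>
</property>
```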
[jira] [Commented] (HIVE-7193) Hive should support additional LDAP authentication parameters
[ https://issues.apache.org/jira/browse/HIVE-7193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14597186#comment-14597186 ] Lefty Leverenz commented on HIVE-7193: -- Doc note: (Removed TODOC2.0 because we only need to document the initial version, which is 1.3.) This adds five configuration parameters, which need to be documented in the HiveServer2 section of Configuration Properties. * hive.server2.authentication.ldap.groupDNPattern * hive.server2.authentication.ldap.groupFilter * hive.server2.authentication.ldap.userDNPattern * hive.server2.authentication.ldap.userFilter * hive.server2.authentication.ldap.customLDAPQuery * [Configuration Properties -- HiveServer2 | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-HiveServer2] Ben Tse wrote up the general documentation here (thanks, Ben): * [User and Group Filter Support with LDAP Atn Provider in HiveServer2 | https://cwiki.apache.org/confluence/display/Hive/User+and+Group+Filter+Support+with+LDAP+Atn+Provider+in+HiveServer2] Hive should support additional LDAP authentication parameters - Key: HIVE-7193 URL: https://issues.apache.org/jira/browse/HIVE-7193 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Mala Chikka Kempanna Assignee: Naveen Gangam Labels: TODOC1.3 Fix For: 1.3.0, 2.0.0 Attachments: HIVE-7193.2.patch, HIVE-7193.3.patch, HIVE-7193.4.patch, HIVE-7193.5.patch, HIVE-7193.6.patch, HIVE-7193.patch, LDAPAuthentication_Design_Doc.docx, LDAPAuthentication_Design_Doc_V2.docx Currently Hive has only the following authentication parameters for LDAP authentication for HiveServer2: {code:xml} <property> <name>hive.server2.authentication</name> <value>LDAP</value> </property> <property> <name>hive.server2.authentication.ldap.url</name> <value>ldap://our_ldap_address</value> </property> {code} We need to include other LDAP properties as part of Hive LDAP authentication, like below: {noformat} a group search base - dc=domain,dc=com a group search filter - member={0} a user search base - dc=domain,dc=com a user search filter - sAMAccountName={0} a list of valid user groups - group1,group2,group3 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10895) ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources
[ https://issues.apache.org/jira/browse/HIVE-10895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14595962#comment-14595962 ] Aihua Xu commented on HIVE-10895: - Hi [~vgumashta] Any updates on that? Sorry to push you on that. Our customer is waiting on the fix. ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources --- Key: HIVE-10895 URL: https://issues.apache.org/jira/browse/HIVE-10895 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13 Reporter: Takahiko Saito Assignee: Vaibhav Gumashta Attachments: HIVE-10895.1.patch During testing, we've noticed Oracle db running out of cursors. Might be related to this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
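The leak pattern this issue targets, and its usual fix, can be sketched with a stand-in for javax.jdo.Query: a query that is executed but never closed keeps a server-side cursor open, which is consistent with the Oracle cursor exhaustion reported above. `MockQuery` is illustrative only, not the Hive or JDO API:

```java
// Sketch of the fix pattern: always release query resources in a finally
// block so cursors are freed even when execute() throws.
public class QueryCleanup {
    // Stand-in for javax.jdo.Query, for illustration only.
    static class MockQuery {
        boolean closed = false;
        Object execute() { return "row"; }
        void closeAll() { closed = true; } // releases db cursors/result sets
    }

    public static MockQuery runAndClose() {
        MockQuery query = new MockQuery();
        try {
            query.execute();
        } finally {
            // Without this, each call leaks a cursor and Oracle
            // eventually hits its open-cursor limit.
            query.closeAll();
        }
        return query;
    }
}
```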
[jira] [Updated] (HIVE-11071) Fix the output of beeline dbinfo command
[ https://issues.apache.org/jira/browse/HIVE-11071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichi Yamashita updated HIVE-11071: -- Attachment: HIVE-11071-001.patch I attach a patch file. Fix the output of beeline dbinfo command Key: HIVE-11071 URL: https://issues.apache.org/jira/browse/HIVE-11071 Project: Hive Issue Type: Bug Components: Beeline Reporter: Shinichi Yamashita Assignee: Shinichi Yamashita Attachments: HIVE-11071-001.patch When dbinfo is executed by beeline, it is displayed as follows. {code} 0: jdbc:hive2://localhost:10001/> !dbinfo Error: Method not supported (state=,code=0) allTablesAreSelectable true Error: Method not supported (state=,code=0) Error: Method not supported (state=,code=0) Error: Method not supported (state=,code=0) getCatalogSeparator . getCatalogTerm instance getDatabaseProductName Apache Hive getDatabaseProductVersion 2.0.0-SNAPSHOT getDefaultTransactionIsolation 0 getDriverMajorVersion 1 getDriverMinorVersion 1 getDriverName Hive JDBC ... {code} The name of the method that produced each Error is not shown. I will fix this output. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
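The fix described above amounts to printing the metadata method name next to every result, so an "Error: Method not supported" line can be attributed to a specific call. A hedged sketch follows; the proxy fakes a driver where most methods are unsupported and is not the Hive JDBC implementation:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.sql.DatabaseMetaData;
import java.sql.SQLException;

public class DbInfo {
    // Render one dbinfo row: always lead with the method name, even on error.
    public static String describe(DatabaseMetaData md, String methodName) {
        try {
            Method m = DatabaseMetaData.class.getMethod(methodName);
            return methodName + "\t" + m.invoke(md);
        } catch (Exception e) {
            return methodName + "\tError: Method not supported";
        }
    }

    // Fake DatabaseMetaData for illustration: one supported method,
    // everything else throws like an unsupported driver call.
    public static DatabaseMetaData fakeMetaData() {
        InvocationHandler h = (proxy, method, args) -> {
            if (method.getName().equals("getDatabaseProductName")) {
                return "Apache Hive";
            }
            throw new SQLException("Method not supported");
        };
        return (DatabaseMetaData) Proxy.newProxyInstance(
                DbInfo.class.getClassLoader(),
                new Class<?>[]{DatabaseMetaData.class}, h);
    }
}
```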
[jira] [Commented] (HIVE-10970) Investigate HIVE-10453: HS2 leaking open file descriptors when using UDFs
[ https://issues.apache.org/jira/browse/HIVE-10970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14595968#comment-14595968 ] Yongzhi Chen commented on HIVE-10970: - [~vgumashta], HIVE-10453 may not be the root cause of your test failures. Hive seems to have issues with how it handles threads that cross different sessions; HIVE-10453 may just accelerate exposing those issues. Investigate HIVE-10453: HS2 leaking open file descriptors when using UDFs - Key: HIVE-10970 URL: https://issues.apache.org/jira/browse/HIVE-10970 Project: Hive Issue Type: Bug Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11071) Fix the output of beeline dbinfo command
[ https://issues.apache.org/jira/browse/HIVE-11071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14595974#comment-14595974 ] Xuefu Zhang commented on HIVE-11071: [~yamashitasni], could you please post the output with your patch included? Thanks. Fix the output of beeline dbinfo command Key: HIVE-11071 URL: https://issues.apache.org/jira/browse/HIVE-11071 Project: Hive Issue Type: Bug Components: Beeline Reporter: Shinichi Yamashita Assignee: Shinichi Yamashita Attachments: HIVE-11071-001.patch When dbinfo is executed by beeline, it is displayed as follows. {code} 0: jdbc:hive2://localhost:10001/> !dbinfo Error: Method not supported (state=,code=0) allTablesAreSelectable true Error: Method not supported (state=,code=0) Error: Method not supported (state=,code=0) Error: Method not supported (state=,code=0) getCatalogSeparator . getCatalogTerm instance getDatabaseProductName Apache Hive getDatabaseProductVersion 2.0.0-SNAPSHOT getDefaultTransactionIsolation 0 getDriverMajorVersion 1 getDriverMinorVersion 1 getDriverName Hive JDBC ... {code} The name of the method that produced each Error is not shown. I will fix this output. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10142) Calculating formula based on difference between each row's value and current row's in Windowing function
[ https://issues.apache.org/jira/browse/HIVE-10142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14595951#comment-14595951 ] Aihua Xu commented on HIVE-10142: - Can you share some sample tables and queries for this type of model so I can try to see whether they are supported? Calculating formula based on difference between each row's value and current row's in Windowing function Key: HIVE-10142 URL: https://issues.apache.org/jira/browse/HIVE-10142 Project: Hive Issue Type: New Feature Components: PTF-Windowing Affects Versions: 1.0.0 Reporter: Yi Zhang Assignee: Aihua Xu For analytics with windowing functions, the calculation formula sometimes needs to be evaluated over each row's value against the current row's value. The decay value is a good example, such as a sum of values with a decay function based on the difference of timestamp between each row and the current row. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11025) In windowing spec, when the datatype is decimal, it's comparing the value against NULL value incorrectly
[ https://issues.apache.org/jira/browse/HIVE-11025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14595954#comment-14595954 ] Aihua Xu commented on HIVE-11025: - [~ashutoshc] Can you help submit the patch? Thanks. In windowing spec, when the datatype is decimal, it's comparing the value against NULL value incorrectly Key: HIVE-11025 URL: https://issues.apache.org/jira/browse/HIVE-11025 Project: Hive Issue Type: Sub-task Components: PTF-Windowing Affects Versions: 2.0.0 Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-11025.patch Given the following data and query, {noformat} deptno empno bonus salary 30 7698 NULL 2850.0 30 7900 NULL 950.0 30 7844 0 1500.0 select avg(salary) over (partition by deptno order by bonus range 200 preceding) from emp2; {noformat} it produces an incorrect result for the row in which bonus=0: 1900.0 1900.0 1766.7 -- This message was sent by Atlassian JIRA (v6.3.4#6332)