[jira] [Created] (HIVE-27173) Add method for Spark to be able to trigger DML events

2023-03-24 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-27173:


 Summary: Add method for Spark to be able to trigger DML events
 Key: HIVE-27173
 URL: https://issues.apache.org/jira/browse/HIVE-27173
 Project: Hive
  Issue Type: Improvement
Reporter: Naveen Gangam


Spark currently uses Hive.java from Hive as a convenient way to avoid dealing 
with the HMS client and the thrift objects directly. Hive already has support 
for DML events (it can generate events on DML operations) but does not expose a 
public method to trigger them. It has a private method that takes Hive objects 
like Table etc. It would be nice to have something that takes more primitive 
datatypes.
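
For illustration, a minimal sketch of the kind of primitive-typed entry point 
this asks for; the interface name, method name, and parameters below are 
assumptions for discussion, not an existing Hive API.

{noformat}
import java.util.List;

/**
 * Illustrative sketch only: a primitive-typed hook that an external engine such
 * as Spark could call to trigger a DML (insert) event, without having to
 * construct Hive's internal Table/Partition objects first.
 */
public interface DmlEventNotifier {
  void fireInsertEvent(String dbName,
                       String tableName,
                       List<String> partitionValues,  // null or empty for unpartitioned tables
                       List<String> newFiles)         // files added by the DML operation
      throws Exception;
}
{noformat}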




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-27063) LDAP+JWT auth forms not supported

2023-02-09 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-27063:


 Summary: LDAP+JWT auth forms not supported
 Key: HIVE-27063
 URL: https://issues.apache.org/jira/browse/HIVE-27063
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 4.0.0
Reporter: Naveen Gangam


In HIVE-25875, support for multiple authentication forms was added for 
HiveServer2. In HIVE-25575, support for JWT authentication was added. However, 
setting hive.server2.authentication="JWT,LDAP" will fail with the following 
validation error.


{noformat}
<12>1 2023-02-03T09:32:11.018Z hiveserver2-0 hiveserver2 1 
0393cf91-48f7-49e3-b2b1-b983000d4cd6 [mdc@18060 class="server.HiveServer2" 
level="WARN" thread="main"] Error starting HiveServer2 on attempt 2, will retry 
in 6ms\rorg.apache.hive.service.ServiceException: Failed to Start 
HiveServer2\r at 
org.apache.hive.service.CompositeService.start(CompositeService.java:80)\r at 
org.apache.hive.service.server.HiveServer2.start(HiveServer2.java:692)\r at 
org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:1154)\r
 at 
org.apache.hive.service.server.HiveServer2.access$1400(HiveServer2.java:145)\r 
at 
org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:1503)\r
 at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:1316)\r at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)\r at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\r
 at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\r
 at java.base/java.lang.reflect.Method.invoke(Method.java:566)\r at 
org.apache.hadoop.util.RunJar.run(RunJar.java:318)\r at 
org.apache.hadoop.util.RunJar.main(RunJar.java:232)\rCaused by: 
java.lang.RuntimeException: Failed to init HttpServer\r at 
org.apache.hive.service.cli.thrift.ThriftHttpCLIService.initServer(ThriftHttpCLIService.java:239)\r
 at 
org.apache.hive.service.cli.thrift.ThriftCLIService.start(ThriftCLIService.java:235)\r
 at org.apache.hive.service.CompositeService.start(CompositeService.java:70)\r 
... 11 more\rCaused by: java.lang.Exception: The authentication types have 
conflicts: LDAP,JWT\r at 
org.apache.hive.service.auth.AuthType.verifyTypes(AuthType.java:69)\r at 
org.apache.hive.service.auth.AuthType.<init>(AuthType.java:43)\r at 
org.apache.hive.service.cli.thrift.ThriftHttpServlet.<init>(ThriftHttpServlet.java:124)\r
 at 
org.apache.hive.service.cli.thrift.ThriftHttpCLIService.initServer(ThriftHttpCLIService.java:197)\r
 ... 13 more\r
{noformat}

We never fixed AuthType.validateTypes() to support this combination.
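
For illustration only, the kind of relaxed check that could treat LDAP and JWT 
as a compatible pair over the HTTP transport; this is a sketch, not the actual 
AuthType code, and the set of mechanisms allowed to coexist is an assumption.

{noformat}
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class AuthTypeCheckSketch {
  // Assumption: these header/token based mechanisms can be mixed over HTTP.
  private static final Set<String> HTTP_COMPATIBLE = Set.of("LDAP", "JWT", "SAML");

  static void verifyTypes(String configured) throws Exception {
    List<String> types = Arrays.stream(configured.split(","))
        .map(String::trim).map(String::toUpperCase).collect(Collectors.toList());
    // Reject only combinations that are not known to work together.
    if (types.size() > 1 && !HTTP_COMPATIBLE.containsAll(types)) {
      throw new Exception("The authentication types have conflicts: " + configured);
    }
  }
}
{noformat}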



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26568) Upgrade Log4j2 to 2.18.0 due to CVEs

2022-09-26 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-26568:


 Summary: Upgrade Log4j2 to 2.18.0 due to CVEs
 Key: HIVE-26568
 URL: https://issues.apache.org/jira/browse/HIVE-26568
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.1.2
Reporter: weidong
Assignee: Hankó Gergely
 Fix For: 4.0.0, 4.0.0-alpha-1


A high-severity security vulnerability (CVE-2021-44832) exists in the Log4j 
version bundled with Hive.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26566) Upgrade H2 database version to 2.1.214

2022-09-26 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-26566:


 Summary: Upgrade H2 database version to 2.1.214
 Key: HIVE-26566
 URL: https://issues.apache.org/jira/browse/HIVE-26566
 Project: Hive
  Issue Type: Task
  Components: Testing Infrastructure
Reporter: Stamatis Zampetakis
Assignee: Stamatis Zampetakis
 Fix For: 4.0.0, 4.0.0-alpha-1


The 1.3.166 version, which is in use in Hive, suffers from the following 
security vulnerabilities:
https://nvd.nist.gov/vuln/detail/CVE-2021-42392
https://nvd.nist.gov/vuln/detail/CVE-2022-23221

In the project, we use H2 only for testing purposes (inside the jdbc-handler 
module), so the H2 binaries are not present in the runtime classpath and these 
CVEs do not pose a problem for Hive or its users. Nevertheless, it would be 
good to upgrade to a more recent version so that Hive does not come up in 
vulnerability scans because of this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26502) Improve LDAP auth to support include generic user filters

2022-08-29 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-26502:


 Summary: Improve LDAP auth to support include generic user filters
 Key: HIVE-26502
 URL: https://issues.apache.org/jira/browse/HIVE-26502
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 4.0.0-alpha-1
Reporter: Naveen Gangam
Assignee: Naveen Gangam


Currently, Hive's LDAP user filtering is based on configuring a set of patterns 
in which wildcards are replaced by usernames and searched for. While this model 
supports advanced filtering options where a corporate LDAP can have users in 
different orgs and trees, it does not quite support generic LDAP searches like 
this:
(&(uid={0})(objectClass=person))

To be able to support this without changing the semantics of existing 
configuration params, and to remain backward compatible, we can enhance the 
existing custom query functionality.

With a configuration like the one below, we should be able to perform a search 
for the user whose uid matches the username being authenticated:

<property>
  <name>hive.server2.authentication.ldap.baseDN</name>
  <value>dc=apache,dc=org</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.customLDAPQuery</name>
  <value>(&(uid={0})(objectClass=person))</value>
</property>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26321) Upgrade commons-io to 2.11.0

2022-06-13 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-26321:


 Summary: Upgrade commons-io to 2.11.0
 Key: HIVE-26321
 URL: https://issues.apache.org/jira/browse/HIVE-26321
 Project: Hive
  Issue Type: Improvement
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


Upgrade commons-io to 2.11.0



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (HIVE-26118) [Standalone Beeline] Jar name mismatch between build and assembly

2022-04-05 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-26118:


 Summary: [Standalone Beeline] Jar name mismatch between build and 
assembly
 Key: HIVE-26118
 URL: https://issues.apache.org/jira/browse/HIVE-26118
 Project: Hive
  Issue Type: Sub-task
  Components: Beeline
Affects Versions: 3.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


The fix from HIVE-25750 has an issue where the beeline build produces a jar 
named "jar-with-dependencies.jar" but the assembly looks for a jar named 
"original-jar-with-dependencies.jar". As a result, this uber jar never gets 
included in the distribution.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (HIVE-26046) MySQL's bit datatype is default to void datatype in hive

2022-03-17 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-26046:


 Summary: MySQL's bit datatype is default to void datatype in hive
 Key: HIVE-26046
 URL: https://issues.apache.org/jira/browse/HIVE-26046
 Project: Hive
  Issue Type: Sub-task
  Components: Standalone Metastore
Affects Versions: 4.0.0
Reporter: Naveen Gangam


In describe output for a table that contains a "bit" datatype, the column gets 
mapped to void. We need explicit conversion logic in the MySQL 
ConnectorProvider to map it to a suitable datatype in Hive.

{noformat}
+---------------------+---------------+--------------------+
|      col_name       |   data_type   |      comment       |
+---------------------+---------------+--------------------+
| tbl_id              | bigint        | from deserializer  |
| create_time         | int           | from deserializer  |
| db_id               | bigint        | from deserializer  |
| last_access_time    | int           | from deserializer  |
| owner               | varchar(767)  | from deserializer  |
| owner_type          | varchar(10)   | from deserializer  |
| retention           | int           | from deserializer  |
| sd_id               | bigint        | from deserializer  |
| tbl_name            | varchar(256)  | from deserializer  |
| tbl_type            | varchar(128)  | from deserializer  |
| view_expanded_text  | string        | from deserializer  |
| view_original_text  | string        | from deserializer  |
| is_rewrite_enabled  | void          | from deserializer  |
| write_id            | bigint        | from deserializer  |
+---------------------+---------------+--------------------+
{noformat}
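
For illustration, a minimal sketch of an explicit JDBC-to-Hive type mapping of 
the kind the MySQL ConnectorProvider could apply so that BIT no longer falls 
through to void; this is not the actual provider code, and the chosen target 
types are assumptions.

{noformat}
import java.sql.Types;

public final class MySqlTypeMappingSketch {
  static String toHiveType(int jdbcType, int columnSize) {
    switch (jdbcType) {
      case Types.BIT:
        // Assumption: map BIT(1) to boolean and wider BIT(n) columns to an integral type.
        return columnSize <= 1 ? "boolean" : "bigint";
      case Types.TINYINT:  return "tinyint";
      case Types.INTEGER:  return "int";
      case Types.BIGINT:   return "bigint";
      case Types.VARCHAR:  return "varchar(" + columnSize + ")";
      default:             return "string"; // conservative fallback instead of void
    }
  }
}
{noformat}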




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (HIVE-26045) Detect timed out connections for providers and auto-reconnect

2022-03-17 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-26045:


 Summary: Detect timed out connections for providers and 
auto-reconnect
 Key: HIVE-26045
 URL: https://issues.apache.org/jira/browse/HIVE-26045
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Affects Versions: 4.0.0
Reporter: Naveen Gangam


For the connectors, we use a single connection with no pooling. When the 
connection is idle for an extended period, the JDBC connection times out. We 
need to check for closed connections (Connection.isClosed()?) and re-establish 
the connection; otherwise the connector is rendered fairly useless.
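
For illustration, a minimal sketch (plain JDBC, not the actual connector 
provider code) of re-validating the cached single connection before use and 
re-establishing it when it has timed out:

{noformat}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class ReconnectingConnectionHolder {
  private final String jdbcUrl;   // assumption: credentials are embedded or handled elsewhere
  private Connection connection;

  public ReconnectingConnectionHolder(String jdbcUrl) {
    this.jdbcUrl = jdbcUrl;
  }

  public synchronized Connection getConnection() throws SQLException {
    // isValid() also catches connections the server closed without notifying the client.
    if (connection == null || connection.isClosed() || !connection.isValid(5)) {
      connection = DriverManager.getConnection(jdbcUrl);
    }
    return connection;
  }
}
{noformat}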

{noformat}
2022-03-17T13:02:16,635  WARN [HiveServer2-Handler-Pool: Thread-116] 
thrift.ThriftCLIService: Error executing statement: 
org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: 
FAILED: SemanticException Unable to fetch table temp_dbs. Error retrieving 
remote 
table:com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: No 
operations allowed after connection closed.
at 
org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:373)
 ~[hive-service-3.1.3000.7.2.15.0-SNAPSHOT.jar:3.1.3000.7.2.15.0-SNAPSHOT]
at 
org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:211)
 ~[hive-service-3.1.3000.7.2.15.0-SNAPSHOT.jar:3.1.3000.7.2.15.0-SNAPSHOT]
at 
org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:265)
 ~[hive-service-3.1.3000.7.2.15.0-SNAPSHOT.jar:3.1.3000.7.2.15.0-SNAPSHOT]
at 
org.apache.hive.service.cli.operation.Operation.run(Operation.java:285) 
~[hive-service-3.1.3000.7.2.15.0-SNAPSHOT.jar:3.1.3000.7.2.15.0-SNAPSHOT]
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:576)
 ~[hive-service-3.1.3000.7.2.15.0-SNAPSHOT.jar:3.1.3000.7.2.15.0-SNAPSHOT]
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:562)
 ~[hive-service-3.1.3000.7.2.15.0-SNAPSHOT.jar:3.1.3000.7.2.15.0-SNAPSHOT]
at sun.reflect.GeneratedMethodAccessor52.invoke(Unknown Source) ~[?:?]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_231]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_231]
at 
org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
 ~[hive-service-3.1.3000.7.2.15.0-SNAPSHOT.jar:3.1.3000.7.2.15.0-SNAPSHOT]
at 
org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
 ~[hive-service-3.1.3000.7.2.15.0-SNAPSHOT.jar:3.1.3000.7.2.15.0-SNAPSHOT]
at 
org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
 ~[hive-service-3.1.3000.7.2.15.0-SNAPSHOT.jar:3.1.3000.7.2.15.0-SNAPSHOT]
at java.security.AccessController.doPrivileged(Native Method) 
~[?:1.8.0_231]
at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_231]
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
 ~[hadoop-common-3.1.0.jar:?]
at 
org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
 ~[hive-service-3.1.3000.7.2.15.0-SNAPSHOT.jar:3.1.3000.7.2.15.0-SNAPSHOT]
at com.sun.proxy.$Proxy44.executeStatementAsync(Unknown Source) ~[?:?]
at 
org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:315)
 ~[hive-service-3.1.3000.7.2.15.0-SNAPSHOT.jar:3.1.3000.7.2.15.0-SNAPSHOT]
at 
org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:567)
 ~[hive-service-3.1.3000.7.2.15.0-SNAPSHOT.jar:3.1.3000.7.2.15.0-SNAPSHOT]
at 
org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1550)
 ~[hive-exec-3.1.3000.7.2.15.0-SNAPSHOT.jar:3.1.3000.7.2.15.0-SNAPSHOT]
at 
org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1530)
 ~[hive-exec-3.1.3000.7.2.15.0-SNAPSHOT.jar:3.1.3000.7.2.15.0-SNAPSHOT]
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38) 
~[hive-exec-3.1.3000.7.2.15.0-SNAPSHOT.jar:3.1.3000.7.2.15.0-SNAPSHOT]
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38) 
~[hive-exec-3.1.3000.7.2.15.0-SNAPSHOT.jar:3.1.3000.7.2.15.0-SNAPSHOT]
at 
org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
 ~[hive-service-3.1.3000.7.2.15.0-SNAPSHOT.jar:3.1.3000.7.2.15.0-SNAPSHOT]
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:313)
 ~[hive-exec-3.1.3000.7.2.15.0-SNAPSHOT.jar:3.1.3000.7.2.15.0-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 

[jira] [Created] (HIVE-26012) HMS APIs to be enhanced for metadata replication

2022-03-07 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-26012:


 Summary: HMS APIs to be enhanced for metadata replication
 Key: HIVE-26012
 URL: https://issues.apache.org/jira/browse/HIVE-26012
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Affects Versions: 3.1.0
Reporter: Naveen Gangam


HMS currently has APIs like these that automatically create/delete the 
directories on the associated DFS. 
[create/drop]_database
[create/drop]_table*
[add/append/drop]_partition*

This is expected and should stay this way when query processors use these APIs. 
However, when tools that replicate Hive metadata use these APIs on the target 
cluster, the directories created on the target side cause the replication of 
DFS snapshots to fail.

So if we provide an option to bypass this creation of directories, DFS 
replications will be smoother. In the future we will need to restrict which 
users can use these APIs, so we will have some sort of an authorization policy.
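
For discussion only, one possible shape for such a metadata-only option; the 
method and flag below are assumptions, not an existing HMS API.

{noformat}
/**
 * Illustrative sketch only: a create call that a replication tool could use on
 * the target cluster to persist metadata without creating directories on the DFS.
 */
public interface MetadataOnlyCreateSketch {
  void createDatabase(String dbName, String locationUri, boolean createDirOnDfs)
      throws Exception;
}
{noformat}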



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (HIVE-25875) Support multiple authentication mechanisms simultaneously

2022-01-18 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-25875:


 Summary: Support multiple authentication mechanisms simultaneously 
 Key: HIVE-25875
 URL: https://issues.apache.org/jira/browse/HIVE-25875
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 3.1.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


Currently, HS2 supports a single form of auth on any given instance of 
HiveServer2. Hive should be able to support multiple auth mechanisms on a 
single instance, especially with the HTTP transport; for example, LDAP and 
SAML. In both cases, HS2 ends up receiving an Authorization header in the 
request. Similarly, we could support JWT or other forms of boundary 
authentication that are done outside of Hive.
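
For illustration, the kind of configuration this would enable, following the 
same hive.server2.authentication property shown elsewhere in this thread; treat 
the exact accepted combinations as an assumption until the feature lands.

{noformat}
<property>
  <name>hive.server2.authentication</name>
  <value>LDAP,SAML</value>
</property>
{noformat}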



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (HIVE-25855) Make a branch-3 release

2022-01-10 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-25855:


 Summary: Make a branch-3 release 
 Key: HIVE-25855
 URL: https://issues.apache.org/jira/browse/HIVE-25855
 Project: Hive
  Issue Type: Bug
Reporter: Naveen Gangam
Assignee: Naveen Gangam


This jira is to track commits for a hive release off branch-3



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (HIVE-25798) Update pom.xml

2021-12-11 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-25798:


 Summary: Update pom.xml
 Key: HIVE-25798
 URL: https://issues.apache.org/jira/browse/HIVE-25798
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Naveen Gangam
Assignee: Naveen Gangam






--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (HIVE-25214) Add hive authorization support for Data connectors.

2021-06-07 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-25214:


 Summary: Add hive authorization support for Data connectors.
 Key: HIVE-25214
 URL: https://issues.apache.org/jira/browse/HIVE-25214
 Project: Hive
  Issue Type: Sub-task
Reporter: Naveen Gangam
Assignee: Naveen Gangam


We need to add authorization support for data connectors in hive. The default 
behavior should be:
1) Connectors can be created/dropped by users in the admin role.
2) Connectors have READ and WRITE permissions.
*   READ permissions are required to fetch a connector object or fetch all 
connector names. So to create a REMOTE database using a connector, users will 
need READ permission on the connector. DDL queries like "show connectors" and 
"describe " will check for read access on the connector as well.
*   WRITE permissions are required to alter/drop a connector. DDL queries like 
"alter connector" and "drop connector" will need WRITE access on the connector.

Once this support is added, Ranger can integrate with it.
   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25213) Implement List getTables() for existing connectors.

2021-06-07 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-25213:


 Summary: Implement List getTables() for existing connectors.
 Key: HIVE-25213
 URL: https://issues.apache.org/jira/browse/HIVE-25213
 Project: Hive
  Issue Type: Sub-task
Reporter: Naveen Gangam
Assignee: Naveen Gangam


In the initial implementation, connector providers do not implement the 
getTables(string pattern) SPI; we had deferred it for later. Only 
getTableNames() and getTable() were implemented.
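
As a sketch of what the deferred call could look like, a default 
getTables(pattern) can be composed from the two calls the providers already 
implement; the interface below is a stand-in for discussion, not the actual 
connector provider SPI.

{noformat}
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

interface ConnectorProviderSketch<T> {
  List<String> getTableNames() throws Exception;
  T getTable(String tableName) throws Exception;

  default List<T> getTables(String pattern) throws Exception {
    Pattern p = Pattern.compile(pattern == null ? ".*" : pattern);
    List<T> result = new ArrayList<>();
    for (String name : getTableNames()) {
      if (p.matcher(name).matches()) {
        result.add(getTable(name)); // fetch full metadata only for matching names
      }
    }
    return result;
  }
}
{noformat}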



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24970) Reject location and managed locations in DDL for REMOTE databases.

2021-04-02 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-24970:


 Summary: Reject location and managed locations in DDL for REMOTE 
databases.
 Key: HIVE-24970
 URL: https://issues.apache.org/jira/browse/HIVE-24970
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


This was part of the review feedback from Yongzhi. Creating a follow-up jira to 
track this discussion.
So, using a DB connector for a DB will not create managed tables?

@nrg4878 nrg4878 1 hour ago Author Member
We don't support create/drop/alter in REMOTE databases at this point. The 
concept of managed vs external is not in the picture at this point. When we do 
support it, it will be applicable to the hive connectors only (or other 
hive-based connectors like AWS Glue).

@nrg4878 nrg4878 2 minutes ago Author Member
Will file a separate jira for this. Basically, instead of ignoring the location 
and managedlocation that may be specified for a remote database, the grammar 
needs to not accept any locations in the DDL at all.
The argument is fair: why accept something we do not honor or that is entirely 
irrelevant for such databases. However, this requires some thought when we have 
additional connectors for remote hive instances. It might have some relevance 
in terms of security with Ranger etc.
So will create a new jira for follow-up discussion.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24942) Consider use of lambda expressions in formatters.

2021-03-25 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-24942:


 Summary: Consider use of lambda expressions in formatters.
 Key: HIVE-24942
 URL: https://issues.apache.org/jira/browse/HIVE-24942
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Narayanan Venkateswaran


List<String> dcDescription = new ArrayList<>();

dcDescription.add(connector);
dcDescription.add(type);
dcDescription.add(ownerName);
dcDescription.add(ownerType);
dcDescription.add(HiveStringUtils.escapeJava(comment));
dcDescription.add(params.toString());

// A Consumer that writes each description field to the output stream.
Consumer<String> descriptionHandler = (param) -> {
  try {
    out.write(param.getBytes(StandardCharsets.UTF_8));
  } catch (IOException e) {
    throw new UncheckedIOException(e);
  }
};

dcDescription.forEach(descriptionHandler);





--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24941) [Evaluate] if ReplicationSpec is needed for DataConnectors.

2021-03-25 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-24941:


 Summary: [Evaluate] if ReplicationSpec is needed for 
DataConnectors.
 Key: HIVE-24941
 URL: https://issues.apache.org/jira/browse/HIVE-24941
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


We have a ReplicationSpec on Connector. Not sure whether this is needed if we do 
not want to replicate connectors.

  public ReplicationSpec getReplicationSpec() {
return replicationSpec;
  }



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24938) [Evaluate] Dataconnector URL validation on create

2021-03-25 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-24938:


 Summary: [Evaluate] Dataconnector URL validation on create
 Key: HIVE-24938
 URL: https://issues.apache.org/jira/browse/HIVE-24938
 Project: Hive
  Issue Type: Sub-task
Reporter: Naveen Gangam
Assignee: Naveen Gangam


From the review feedback, there was a comment about validating the URL 
specified for the connector when it is created. Currently, there is no 
validation except for checking for an empty/null value. This is by design and 
the desired behavior, IMHO. But filing this to be discussed with a wider 
audience.

{noformat}
I tried creating a connector without the mysql JDBC URL specified properly and 
it went through,

please see below,

CREATE CONNECTOR mysql_test_2
TYPE 'mysql'
URL 'jdbc://'
COMMENT 'test connector'
WITH DCPROPERTIES (
"hive.sql.dbcp.username"="hive1",
"hive.sql.dbcp.password"="hive1");

CREATE CONNECTOR mysql_test_3
TYPE 'mysql'
URL 'jdbc:derby://nightly1.apache.org:3306/hive1'
COMMENT 'test connector'
WITH DCPROPERTIES (
"hive.sql.dbcp.username"="hive1",
"hive.sql.dbcp.password"="hive1");

I am not saying they are wrong, but we should probably call this out in the 
documentation. Document that URLs are not verified.

Another thing I noticed is that the password is displayed in plain
text on the command line. This used to be considered a security problem
in a product I worked on in a past life. But I notice that an external
table can be created with these semantics. I guess it is acceptable
here.

It is also stored in plain text in the metastore, please see below,

CREATE TABLE DATACONNECTOR_PARAMS (
NAME VARCHAR(128) NOT NULL,
PARAM_KEY VARCHAR(180) NOT NULL,
PARAM_VALUE VARCHAR(4000),
PRIMARY KEY (NAME, PARAM_KEY),
CONSTRAINT DATACONNECTOR_NAME_FK1 FOREIGN KEY (NAME) REFERENCES DATACONNECTORS 
(NAME) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Again I am not saying this is a problem, but I thought I can call this out to 
you.

 
@nrg4878 nrg4878 24 minutes ago Author Member
We check for null/empty values for the URL and error out in those cases. Other 
than that, any non-empty value is accepted. I don't think we should check for 
correctness of the URL, or even can for that matter.
a) The URL is meant to be a freeform value against dozens of datasource types 
(mysql, postgres, hive, AWS Glue, Redshift etc). For each such source type, 
there could be dozens of variations of the url (including properties and other 
params specific to the source). So I don't think we can meaningfully detect 
incorrect URLs.
For example, with MySQL, though the URL might look fine syntactically, we 
cannot confirm that dbName1 or dbName2 exists without actually attempting to 
connect to the DB.
jdbc:mysql://:3306/
jdbc:mysql://:3306/
b) The format for the URLs could change over time as well. It is an unnecessary 
burden to maintain new formats in hive. We want to be able to plug in a new 
datasource type by simply adding a provider.

c) To be able to validate the URL, we would have to establish a connection to 
the datasource at the time of creation. We are trying to delay making that 
connection as long as possible, until an actual show tables is called. This 
avoids using up extra resources and leaking connections.

d) Users can do "create connector" .. followed by "alter connector set url". So 
any incorrect URLs can be modified using alter. Also, in this case, we would be 
checking the URL twice. Better to put the onus of configuring it correctly on 
the end user.

Passwords can be secured using jceks files as described in the "Securing 
Password" section of the doc below.
https://cwiki.apache.org/confluence/display/Hive/JDBC+Storage+Handler
So users have the option of using non-CTVs.
{noformat}




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24887) getDatabase() to call translation code even if client has no capabilities

2021-03-15 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-24887:


 Summary: getDatabase() to call translation code even if client has 
no capabilities
 Key: HIVE-24887
 URL: https://issues.apache.org/jira/browse/HIVE-24887
 Project: Hive
  Issue Type: Sub-task
Reporter: Naveen Gangam
Assignee: Naveen Gangam


We do this for other calls that go through the translation layer. For some 
reason, the current code only calls it when the client sets capabilities.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24844) Add implementation for a 'hive' connector provider

2021-03-04 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-24844:


 Summary: Add implementation for a 'hive' connector provider
 Key: HIVE-24844
 URL: https://issues.apache.org/jira/browse/HIVE-24844
 Project: Hive
  Issue Type: Sub-task
Reporter: Naveen Gangam


This connector implementation will allow HMS to communicate with remote HMS 
instances for metadata.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24770) Upgrade should update changed FQN in HMS DB.

2021-02-10 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-24770:


 Summary: Upgrade should update changed FQN in HMS DB.
 Key: HIVE-24770
 URL: https://issues.apache.org/jira/browse/HIVE-24770
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


While the parent change does not cause this directly, post upgrade the existing 
tables that use MultiDelimiterSerDe will be broken because the hive-contrib jar 
would no longer exist. Instead, if the Hive schema upgrade script can update 
the SERDES table to alter the classname to the new classname, the old tables 
would work automatically. That is a much better user experience.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24464) Evaluate the need to have directSQL implementation for data connectors

2020-12-01 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-24464:


 Summary: Evaluate the need to have directSQL implementation for 
data connectors
 Key: HIVE-24464
 URL: https://issues.apache.org/jira/browse/HIVE-24464
 Project: Hive
  Issue Type: Sub-task
Reporter: Naveen Gangam


I expect that there will be just a handful of connectors, not hundreds of them 
like databases. But I am creating a placeholder item to evaluate this at a 
future time.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24461) Provide CachedStore implementation for dataconnectors

2020-12-01 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-24461:


 Summary: Provide CachedStore implementation for dataconnectors
 Key: HIVE-24461
 URL: https://issues.apache.org/jira/browse/HIVE-24461
 Project: Hive
  Issue Type: Sub-task
Reporter: Naveen Gangam


Currently, none of the connectors are cached. They are all delegated to the 
ObjectStore for every call.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24452) Add a generic JDBC implementation that can be used to other JDBC DBs

2020-11-30 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-24452:


 Summary: Add a generic JDBC implementation that can be used to 
other JDBC DBs
 Key: HIVE-24452
 URL: https://issues.apache.org/jira/browse/HIVE-24452
 Project: Hive
  Issue Type: Sub-task
Reporter: Naveen Gangam


Currently, we have added a custom provider for each of the JDBC DBs supported 
by hive (MySQL, Postgres, MSSQL (pending), Oracle (pending) and Derby 
(pending)). But if there are other JDBC databases we want to add support for, a 
generic JDBC provider that hive can default to would be useful.
This means:
1) We have to support a means to indicate that a connector is for a JDBC 
datasource. So maybe add a property in DCPROPERTIES on the connector to 
indicate that the datasource supports JDBC.
2) If there is no custom connector for a data source, use the 
GenericJDBCDatasource connector that is to be added as part of this jira (see 
the sketch below).
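
For illustration, a minimal sketch of the fallback described above; the class 
names, the DCPROPERTIES key, and the factory shape are all assumptions, not the 
actual Hive code.

{noformat}
import java.util.Map;

public class ProviderFactorySketch {
  static String resolveProviderClass(String type, Map<String, String> dcProperties) {
    switch (type.toLowerCase()) {
      case "mysql":    return "MySQLConnectorProvider";       // existing custom provider
      case "postgres": return "PostgreSQLConnectorProvider";  // existing custom provider
      default:
        // Assumption: a DCPROPERTIES entry marks the datasource as JDBC-capable.
        if ("true".equalsIgnoreCase(dcProperties.get("hive.connector.jdbc"))) {
          return "GenericJDBCConnectorProvider"; // the generic provider proposed here
        }
        throw new IllegalArgumentException("No connector provider for type: " + type);
    }
  }
}
{noformat}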



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24451) Add schema changes for MSSQL

2020-11-30 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-24451:


 Summary: Add schema changes for MSSQL
 Key: HIVE-24451
 URL: https://issues.apache.org/jira/browse/HIVE-24451
 Project: Hive
  Issue Type: Sub-task
Reporter: Naveen Gangam


The current patch does not include schema changes for the MSSQL backend. These 
should be added right after the initial commit.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24449) Implement connector provider for Derby DB

2020-11-30 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-24449:


 Summary: Implement connector provider for Derby DB
 Key: HIVE-24449
 URL: https://issues.apache.org/jira/browse/HIVE-24449
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


Provide an implementation of Connector provider for Derby DB.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24448) Support case-sensitivity for tables in REMOTE database.

2020-11-30 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-24448:


 Summary: Support case-sensitivity for tables in REMOTE database.
 Key: HIVE-24448
 URL: https://issues.apache.org/jira/browse/HIVE-24448
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Affects Versions: 4.0.0
Reporter: Naveen Gangam


Hive tables are case-insensitive, so any case specified in user queries is 
converted to lower case for query planning, and all of the HMS metadata is also 
persisted as lower-case names.
However, with REMOTE data sources, certain data sources support 
case-sensitivity for tables.
So the HiveServer2 query planner needs to preserve the user-provided case when 
calling HMS APIs, for HMS to be able to fetch the metadata from a remote data 
source.
We now see something like this:

{noformat}
2020-11-25T16:45:36,402  WARN [HiveServer2-Handler-Pool: Thread-76] 
thrift.ThriftCLIService: Error executing statement: 
org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: 
FAILED: RuntimeException 
MetaException(message:org.apache.hadoop.hive.serde2.SerDeException 
org.apache.hive.storage.jdbc.exception.HiveJdbcDatabaseAccessException: Error 
while trying to get column names: Table 'hive1.txns' doesn't exist)
at 
org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:365)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:206)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:262)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hive.service.cli.operation.Operation.run(Operation.java:277) 
~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:560)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:545)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at sun.reflect.GeneratedMethodAccessor68.invoke(Unknown Source) ~[?:?]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_231]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_231]
at 
org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at java.security.AccessController.doPrivileged(Native Method) 
~[?:1.8.0_231]
at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_231]
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
 ~[hadoop-common-3.1.0.jar:?]
at 
org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at com.sun.proxy.$Proxy43.executeStatementAsync(Unknown Source) ~[?:?]
at 
org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:315)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:571)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1550)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1530)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:313)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
~[?:1.8.0_231]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
~[?:1.8.0_231]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_231]
Caused by: java.lang.RuntimeException: 

[jira] [Created] (HIVE-24447) Move create/drop/alter table to the provider interface

2020-11-30 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-24447:


 Summary: Move create/drop/alter table to the provider interface
 Key: HIVE-24447
 URL: https://issues.apache.org/jira/browse/HIVE-24447
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


The support for such operations on a table in a REMOTE database will be left to 
the discretion of the providers to support/implement.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24425) Create table in REMOTE db should fail

2020-11-24 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-24425:


 Summary: Create table in REMOTE db should fail
 Key: HIVE-24425
 URL: https://issues.apache.org/jira/browse/HIVE-24425
 Project: Hive
  Issue Type: Sub-task
Reporter: Naveen Gangam
Assignee: Naveen Gangam


Currently it creates the table in that DB, but show tables does not show 
anything. Preventing the creation of the table will resolve this inconsistency 
too.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24405) Missing datatype for table column in oracle

2020-11-19 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-24405:


 Summary: Missing datatype for table column in oracle
 Key: HIVE-24405
 URL: https://issues.apache.org/jira/browse/HIVE-24405
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Reporter: Naveen Gangam
Assignee: Naveen Gangam


The parent change introduces an issue in the oracle schema script.  No datatype 
is specified.
{noformat}
1 row created.

  CQ_COMMIT_TIME(19)
*
ERROR at line 19:
ORA-00902: invalid datatype
{noformat}






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24396) [New Feature] Add data connector support for remote datasources

2020-11-16 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-24396:


 Summary: [New Feature] Add data connector support for remote 
datasources
 Key: HIVE-24396
 URL: https://issues.apache.org/jira/browse/HIVE-24396
 Project: Hive
  Issue Type: Improvement
  Components: Hive
Reporter: Naveen Gangam
Assignee: Naveen Gangam


This feature work is to add support in the Hive Metastore for configuring data 
connectors for remote datasources and mapping databases. We currently have 
support for remote tables via StorageHandlers like JDBCStorageHandler and 
HBaseStorageHandler.

Data connectors are a natural extension to this, where we can map an entire 
database or catalog instead of individual tables. The tables within are 
automagically mapped at runtime. The metadata for these tables is not persisted 
in Hive; it is always mapped and built at runtime.

With this feature, we introduce a concept of type for databases in Hive: NATIVE 
vs REMOTE. All current databases are NATIVE. To create a REMOTE database, the 
following syntax is to be used:
CREATE REMOTE DATABASE remote_db USING  WITH DCPROPERTIES ();

Will attach a design doc to this jira. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24348) Beeline: Isolating dependencies and execution with java

2020-11-02 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-24348:


 Summary: Beeline: Isolating dependencies and execution with java
 Key: HIVE-24348
 URL: https://issues.apache.org/jira/browse/HIVE-24348
 Project: Hive
  Issue Type: Improvement
  Components: Beeline
Affects Versions: 3.1.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


Currently, beeline code, binaries and executables are somewhat tightly coupled 
with the hive product. Executing beeline from a node with just a JRE installed 
and some jars on the classpath is impossible today.
* The beeline.sh/hive scripts rely on HADOOP_HOME being set and are designed to 
use the "hadoop" executable to run beeline.
* Ideally, just the hive-beeline.jar and hive-jdbc-standalone jars should be 
enough, but sadly they aren't. The latter jar adds more problems than it solves 
because all the classfiles are shaded and some dependencies cannot be resolved.
* Beeline has many other dependencies like hive-exec, hive-common, 
hadoop-common, supercsv, jline, commons-cli, commons-io, commons-logging etc. 
While it may not be possible to eliminate some of these, we should at least 
have a self-contained jar that bundles all of them to make it work.
* The underlying script used to run beeline should use java as an alternate 
means of execution if HADOOP_HOME is not set.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24288) Files created by CompileProcessor have incorrect permissions

2020-10-19 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-24288:


 Summary: Files created by CompileProcessor have incorrect 
permissions
 Key: HIVE-24288
 URL: https://issues.apache.org/jira/browse/HIVE-24288
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 3.1.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


The CompileProcessor generates some temporary files as part of processing; 
these files are created with incorrect permissions.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24271) Create managed table relies on hive.create.as.acid settings.

2020-10-13 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-24271:


 Summary: Create managed table relies on hive.create.as.acid 
settings.
 Key: HIVE-24271
 URL: https://issues.apache.org/jira/browse/HIVE-24271
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


0: jdbc:hive2://ngangam-3.ngangam.root.hwx.si> set hive.create.as.acid;
+----------------------------+
|            set             |
+----------------------------+
| hive.create.as.acid=false  |
+----------------------------+
1 row selected (0.018 seconds)
0: jdbc:hive2://ngangam-3.ngangam.root.hwx.si> set hive.create.as.insert.only;
+-----------------------------------+
|                set                |
+-----------------------------------+
| hive.create.as.insert.only=false  |
+-----------------------------------+
1 row selected (0.013 seconds)
0: jdbc:hive2://ngangam-3.ngangam.root.hwx.si> create managed table mgd_table(a 
int);
INFO  : Compiling 
command(queryId=hive_20201014053526_9ba1ffa3-3aa2-47c3-8514-1fe58fe4f140): 
create managed table mgd_table(a int)
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Created Hive schema: Schema(fieldSchemas:null, properties:null)
INFO  : Completed compiling 
command(queryId=hive_20201014053526_9ba1ffa3-3aa2-47c3-8514-1fe58fe4f140); Time 
taken: 0.021 seconds
INFO  : Executing 
command(queryId=hive_20201014053526_9ba1ffa3-3aa2-47c3-8514-1fe58fe4f140): 
create managed table mgd_table(a int)
INFO  : Starting task [Stage-0:DDL] in serial mode
INFO  : Completed executing 
command(queryId=hive_20201014053526_9ba1ffa3-3aa2-47c3-8514-1fe58fe4f140); Time 
taken: 0.048 seconds
INFO  : OK
No rows affected (0.107 seconds)
0: jdbc:hive2://ngangam-3.ngangam.root.hwx.si> describe formatted mgd_table;
INFO  : Compiling 
command(queryId=hive_20201014053533_8919be7d-41b0-41e5-b9eb-847801a9d8c5): 
describe formatted mgd_table
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:col_name, 
type:string, comment:from deserializer), FieldSchema(name:data_type, 
type:string, comment:from deserializer), FieldSchema(name:comment, type:string, 
comment:from deserializer)], properties:null)
INFO  : Completed compiling 
command(queryId=hive_20201014053533_8919be7d-41b0-41e5-b9eb-847801a9d8c5); Time 
taken: 0.037 seconds
INFO  : Executing 
command(queryId=hive_20201014053533_8919be7d-41b0-41e5-b9eb-847801a9d8c5): 
describe formatted mgd_table
INFO  : Starting task [Stage-0:DDL] in serial mode
INFO  : Completed executing 
command(queryId=hive_20201014053533_8919be7d-41b0-41e5-b9eb-847801a9d8c5); Time 
taken: 0.03 seconds
INFO  : OK
+-------------------------------+--------------------------------------------------------------------------------------------------------------+------------+
|           col_name            | data_type                                                                                                    |  comment   |
+-------------------------------+--------------------------------------------------------------------------------------------------------------+------------+
| a                             | int                                                                                                          |            |
|                               | NULL                                                                                                         | NULL       |
| # Detailed Table Information  | NULL                                                                                                         | NULL       |
| Database:                     | bothfalseonhs2                                                                                               | NULL       |
| OwnerType:                    | USER                                                                                                         | NULL       |
| Owner:                        | hive                                                                                                         | NULL       |
| CreateTime:                   | Wed Oct 14 05:35:26 UTC 2020                                                                                 | NULL       |
| LastAccessTime:               | UNKNOWN                                                                                                      | NULL       |
| Retention:                    | 0                                                                                                            | NULL       |
| Location:                     | hdfs://ngangam-3.ngangam.root.hwx.site:8020/warehouse/tablespace/external/hive/bothfalseonhs2.db/mgd_table  | NULL       |
| Table Type:                   | EXTERNAL_TABLE                                                                                               | NULL       |
| Table Parameters:             | NULL                                                                                                         | NULL       |
|   

[jira] [Created] (HIVE-24175) Ease database managed location restrictions in HMS translation

2020-09-17 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-24175:


 Summary: Ease database managed location restrictions in HMS 
translation
 Key: HIVE-24175
 URL: https://issues.apache.org/jira/browse/HIVE-24175
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


Currently, the HMS translation layer restricts the path of a database's managed 
location to be within the hive warehouse, so a getDatabase call will return a 
managedlocation path that adheres to this restriction regardless of what has 
been set in the HMS DB. This leads to issues like inconsistent paths if 
hive-site.xml is not in sync across HMS and HS2 instances, or even across 
different HMS instances, as each instance has a different value of the 
warehouse root.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24152) Comment out test until it is investigated.

2020-09-11 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-24152:


 Summary: Comment out test until it is investigated.
 Key: HIVE-24152
 URL: https://issues.apache.org/jira/browse/HIVE-24152
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


Looks like this test was re-enabled between the time the precommits were run 
and when it was committed (a few hours later). This is blocking all other 
commits. Commenting it out for now.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24086) CTAS with HMS translation enabled returns empty results.

2020-08-27 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-24086:


 Summary: CTAS with HMS translation enabled returns empty results.
 Key: HIVE-24086
 URL: https://issues.apache.org/jira/browse/HIVE-24086
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Naveen Gangam
Assignee: Naveen Gangam


When you execute something like
create table ctas_table as select * from mgd_table;

if mgd_table is a managed table, the hive query planner creates a plan with 
ctas_table as a managed table, so the location is set to something in the 
managed warehouse directory.

However, with HMS translation enabled, non-acid MANAGED tables are converted to 
EXTERNAL with purge set to true, so the table location for this table is 
altered to be in the external warehouse directory.
After the table creation, the rest of the query executes, but the data is 
copied to the location set in the query plan. As a result, when you execute a 
select from ctas_table, it will not return any results because that location is 
empty.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24076) MetastoreDirectSql.getDatabase() needs a space in the query

2020-08-26 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-24076:


 Summary: MetastoreDirectSql.getDatabase() needs a space in the 
query
 Key: HIVE-24076
 URL: https://issues.apache.org/jira/browse/HIVE-24076
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


String queryTextDbSelector= "select "
  + "\"DB_ID\", \"NAME\", \"DB_LOCATION_URI\", \"DESC\", "
  + "\"OWNER_NAME\", \"OWNER_TYPE\", \"CTLG_NAME\" , \"CREATE_TIME\", 
\"DB_MANAGED_LOCATION_URI\""
  + "FROM "+ DBS

There needs to be a space before FROM for the query to be correct. Currently it 
falls back to JDO, so there is no lapse in functionality.
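
A minimal sketch of the corrected concatenation (the only change is the leading 
space before FROM); shown here for clarity, not copied from the actual patch.

{noformat}
String queryTextDbSelector = "select "
  + "\"DB_ID\", \"NAME\", \"DB_LOCATION_URI\", \"DESC\", "
  + "\"OWNER_NAME\", \"OWNER_TYPE\", \"CTLG_NAME\" , \"CREATE_TIME\", \"DB_MANAGED_LOCATION_URI\""
  + " FROM " + DBS   // leading space keeps the column list and FROM from running together
{noformat}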



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-23970) Reject database creation if managedlocation is incorrect

2020-07-31 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-23970:


 Summary: Reject database creation if managedlocation is incorrect
 Key: HIVE-23970
 URL: https://issues.apache.org/jira/browse/HIVE-23970
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


With some changes in HIVE-23387, the managed location check gets bypassed. This 
needs to be fixed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-23603) transformDatabase() should work with changes from HIVE-22995

2020-06-03 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-23603:


 Summary: transformDatabase() should work with changes from 
HIVE-22995
 Key: HIVE-23603
 URL: https://issues.apache.org/jira/browse/HIVE-23603
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Reporter: Naveen Gangam
Assignee: Naveen Gangam
 Fix For: 4.0.0


The translation layer alters the locationUri on Database based on the 
capabilities of the client. Now that we have separate managed and external 
locations for a database, the implementation should be adjusted to work with 
both locations; locationUri could already be the external location.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-23588) create table like tabletype should match source tabletype and proper location

2020-06-01 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-23588:


 Summary: create table like tabletype should match source tabletype 
and proper location
 Key: HIVE-23588
 URL: https://issues.apache.org/jira/browse/HIVE-23588
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-23562) Upgrade thrift version in hive

2020-05-28 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-23562:


 Summary: Upgrade thrift version in hive
 Key: HIVE-23562
 URL: https://issues.apache.org/jira/browse/HIVE-23562
 Project: Hive
  Issue Type: Improvement
  Components: Hive
Affects Versions: 4.0.0
Reporter: Naveen Gangam


Hive has been using thrift 0.9.3 for a long time. We might be able to take 
advantage of new features like deprecation support etc. in the newer releases 
of thrift. But this impacts interoperability between older clients and newer 
servers. We need to assess what can break, at least for the purposes of 
documenting it, before we make this change.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-23435) Full outer join result is missing rows

2020-05-11 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-23435:


 Summary: Full outer join result is missing rows 
 Key: HIVE-23435
 URL: https://issues.apache.org/jira/browse/HIVE-23435
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 3.1.0
Reporter: Naveen Gangam
Assignee: Jesus Camacho Rodriguez


The full outer join result has missing rows. This appears to be a bug in the 
full outer join logic. The expected output is received when we do a left and 
right outer join.

Steps to reproduce are mentioned below.

~~

SUPPORT ANALYSIS

Steps to Reproduce:

1. Create a table and insert data:

create table x (z char(5), x int, y int);

insert into x values ('one', 1, 50),
('two', 2, 30),
('three', 3, 30),
('four', 4, 60),
('five', 5, 70),
('six', 6, 80);

2. Try a full outer join with the below command. The result is incomplete; it 
is missing the row:

NULL   NULL  NULL  three  3     30.0
Full Outer Join:

select x1.`z`, x1.`x`, x1.`y`, x2.`z`,
x2.`x`, x2.`y`
from `x` x1 full outer join
`x` x2 on (x1.`x` > 3) and (x2.`x` < 4) and (x1.`x` =
x2.`x`);

Result:

x1.z   x1.x  x1.y  x2.z   x2.x  x2.y
-----  ----  ----  -----  ----  ----
one    1     50    NULL   NULL  NULL
NULL   NULL  NULL  one    1     50
two    2     30    NULL   NULL  NULL
NULL   NULL  NULL  two    2     30
three  3     30    NULL   NULL  NULL
four   4     60    NULL   NULL  NULL
NULL   NULL  NULL  four   4     60
five   5     70    NULL   NULL  NULL
NULL   NULL  NULL  five   5     70
six    6     80    NULL   NULL  NULL
NULL   NULL  NULL  six    6     80

3. Expected output is coming when we use left/right join + union:

select x1.`z`, x1.`x`, x1.`y`, x2.`z`,
x2.`x`, x2.`y`
from `x` x1 left outer join
`x` x2 on (x1.`x` > 3) and (x2.`x` < 4) and (x1.`x` =
x2.`x`)
union
select x1.`z`, x1.`x`, x1.`y`, x2.`z`,
x2.`x`, x2.`y`
from `x` x1 right outer join
`x` x2 on (x1.`x` > 3) and (x2.`x` < 4) and (x1.`x` =
x2.`x`);

Result:

z      x     y     _col3  _col4  _col5
-----  ----  ----  -----  -----  -----
NULL   NULL  NULL  five   5      70
NULL   NULL  NULL  four   4      60
NULL   NULL  NULL  one    1      50
four   4     60    NULL   NULL   NULL
one    1     50    NULL   NULL   NULL
six    6     80    NULL   NULL   NULL
three  3     30    NULL   NULL   NULL
two    2     30    NULL   NULL   NULL
NULL   NULL  NULL  six    6      80
NULL   NULL  NULL  three  3      30
NULL   NULL  NULL  two    2      30
five   5     70    NULL   NULL   NULL

~~

EXPECTED ENGINEERING ACTION

Confirm this is a bug. If so, is there any workaround, or should we just use a 
left+right outer join?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-23388) CTAS queries should use target's location for staging.

2020-05-06 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-23388:


 Summary: CTAS queries should use target's location for staging.
 Key: HIVE-23388
 URL: https://issues.apache.org/jira/browse/HIVE-23388
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


In cloud-based storage systems, renaming files across different root-level 
buckets seems to be disallowed. The S3AFileSystem throws the following 
exception. This appears to be a bug in the S3 filesystem implementation.

Failed with exception Wrong FS 
s3a://hive-managed/clusters/env-x/warehouse--/warehouse/tablespace/managed/hive/tpch.db/customer/delta_001_001_
 -expected s3a://hive-external
2020-04-27T19:34:27,573 INFO  [Thread-6] jdbc.TestDriver: 
java.lang.IllegalArgumentException: Wrong FS 
s3a://hive-managed//clusters/env-/warehouse--/warehouse/tablespace/managed/hive/tpch.db/customer/delta_001_001_
 -expected s3a://hive-external

But we should fix our query plans to use the target table's directory for 
staging as well. That should resolve this issue, and it is the right thing to 
do anyway (in case there are different encryption zones/keys for these buckets).

The fix in HIVE-22995 probably changed this behavior.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-23387) Flip the Warehouse.getDefaultTablePath() to return path from ext warehouse

2020-05-06 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-23387:


 Summary: Flip the Warehouse.getDefaultTablePath() to return path 
from ext warehouse
 Key: HIVE-23387
 URL: https://issues.apache.org/jira/browse/HIVE-23387
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


For backward compatibility, the initial fix returned the path that was set on 
the db. It could have been either from the managed warehouse or the external 
one, depending on what was set. There were tests relying on certain paths being 
returned. This fix is to address the tests.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-23260) Add support for unmodified_metadata capability

2020-04-20 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-23260:


 Summary: Add support for unmodified_metadata capability
 Key: HIVE-23260
 URL: https://issues.apache.org/jira/browse/HIVE-23260
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


Currently, the translator removes bucketing info for tables for clients that do 
not possess the HIVEBUCKET2 capability. While this is desirable, some clients 
that have write access to these tables can turn around and overwrite the 
metadata, thus corrupting the original bucketing info.

So adding support for a capability for clients that are capable of interpreting 
the original metadata would prevent such corruption.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-23192) "default" database locationUri should be external warehouse root.

2020-04-13 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-23192:


 Summary: "default" database locationUri should be external 
warehouse root.
 Key: HIVE-23192
 URL: https://issues.apache.org/jira/browse/HIVE-23192
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


When creating the default database, the database locationUri should be set to 
external warehouse.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-23121) Re-examine TestWarehouseExternalDir to see if it uses HMS translation.

2020-04-01 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-23121:


 Summary: Re-examine TestWarehouseExternalDir to see if it uses HMS 
translation.
 Key: HIVE-23121
 URL: https://issues.apache.org/jira/browse/HIVE-23121
 Project: Hive
  Issue Type: Sub-task
Reporter: Naveen Gangam
Assignee: Naveen Gangam


TestWarehouseExternalDir currently passes with just one change related to 
HIVE-22995. But that change assumed the test was using HMS Translation to 
convert a non-ACID managed table to an external table. 
Ensure that it still does.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-22995) Add support for location for managed tables on database

2020-03-06 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-22995:


 Summary: Add support for location for managed tables on database
 Key: HIVE-22995
 URL: https://issues.apache.org/jira/browse/HIVE-22995
 Project: Hive
  Issue Type: Improvement
  Components: Hive
Affects Versions: 3.1.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam
 Attachments: Hive Metastore Support for Tenant-based storage 
heirarchy.pdf

I have attached the initial spec to this jira.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-22874) Beeline unable to use credentials from URL.

2020-02-11 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-22874:


 Summary: Beeline unable to use credentials from URL.
 Key: HIVE-22874
 URL: https://issues.apache.org/jira/browse/HIVE-22874
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Reporter: Naveen Gangam
Assignee: Naveen Gangam
 Fix For: 4.0.0


Beeline is not using the password value from the URL. 
Using LDAP auth in this case, so the failure is on connect.
bin/beeline -u "jdbc:hive2://localhost:1/default;user=test1;password=test1" 

On the server side in LdapAuthenticator, the principals come out as (via some 
special debug logging)

2020-02-11T11:10:31,613  INFO [HiveServer2-Handler-Pool: Thread-67] 
auth.LdapAuthenticationProviderImpl: Connecting to ldap as 
user/password:test1:anonymous


This bug may have been introduced via
https://github.com/apache/hive/commit/749e831060381a8ae4775630efb72d5cd040652f

pass = "" ( an empty string on this line) 
https://github.com/apache/hive/blob/master/beeline/src/java/org/apache/hive/beeline/BeeLine.java#L848

but on this line of code, it checks to see if it is null, which will not be true, 
and hence it never picks up the password from the JDBC URL
https://github.com/apache/hive/blob/master/beeline/src/java/org/apache/hive/beeline/BeeLine.java#L900
It has another chance here, but pass != null will always be true, so the code 
never goes into the else branch.
https://github.com/apache/hive/blob/master/beeline/src/java/org/apache/hive/beeline/BeeLine.java#L909
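
Not BeeLine's actual code, but a minimal self-contained sketch of the fallback 
being described (the class and method names are invented for illustration): an 
empty command-line password should not shadow the password embedded in the JDBC 
URL.

{code}
public final class UrlPasswordFallback {
  static String resolvePassword(String cliPass, String jdbcUrl) {
    if (cliPass != null && !cliPass.isEmpty()) {
      return cliPass;                          // an explicit, non-empty password wins
    }
    for (String part : jdbcUrl.split(";")) {   // scan the ;key=value segments of the URL
      String[] kv = part.split("=", 2);
      if (kv.length == 2 && kv[0].equalsIgnoreCase("password")) {
        return kv[1];                          // fall back to the URL-provided password
      }
    }
    return cliPass;                            // nothing found; keep the original (possibly empty)
  }

  public static void main(String[] args) {
    String url = "jdbc:hive2://localhost/default;user=test1;password=test1";
    System.out.println(resolvePassword("", url)); // prints "test1" instead of an empty string
  }
}
{code}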



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-22853) Beeline should use HS2 server defaults for fetchSize

2020-02-07 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-22853:


 Summary: Beeline should use HS2 server defaults for fetchSize
 Key: HIVE-22853
 URL: https://issues.apache.org/jira/browse/HIVE-22853
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


Currently beeline uses a hard-coded default of 1000 rows for fetchSize. This 
default value is different from what the server has set. While the beeline user 
can reset the value via the set command, it is cumbersome to change the workloads.
Rather, it should default to the server-side value, and set should be used to 
override it within the session.
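
A tiny sketch of the proposed resolution order, with assumed names (not the 
actual Beeline or JDBC driver code): an explicit client-side override wins, 
otherwise the server-advertised default is used instead of a hard-coded 1000.

{code}
public class FetchSizeSketch {
  // clientOverride is null unless the user explicitly set a fetch size in the session.
  static int resolveFetchSize(Integer clientOverride, int serverDefault) {
    return clientOverride != null ? clientOverride : serverDefault;
  }

  public static void main(String[] args) {
    System.out.println(resolveFetchSize(null, 10000)); // server default wins
    System.out.println(resolveFetchSize(500, 10000));  // explicit override wins
  }
}
{code}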



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-22794) Disallow ACID table location outside hive warehouse

2020-01-30 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-22794:


 Summary: Disallow ACID table location outside hive warehouse
 Key: HIVE-22794
 URL: https://issues.apache.org/jira/browse/HIVE-22794
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 3.1.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


The co-location of managed tables enables Hive to govern them effectively, 
using common policies for security, S3Guard, quota support, etc.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-22708) To be updated later

2020-01-08 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-22708:


 Summary: To be updated later
 Key: HIVE-22708
 URL: https://issues.apache.org/jira/browse/HIVE-22708
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-22498) Schema tool enhancements to merge catalogs

2019-11-14 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-22498:


 Summary: Schema tool enhancements to merge catalogs
 Key: HIVE-22498
 URL: https://issues.apache.org/jira/browse/HIVE-22498
 Project: Hive
  Issue Type: Sub-task
Reporter: Naveen Gangam
Assignee: Naveen Gangam


Schema tool currently supports relocation of a database from one catalog to 
another, one at a time. While having to do this one database at a time is painful, 
it also lacks support for converting tables to external tables during migration, 
which is needed in light of the changes to the translation layer where a MANAGED 
table is strictly an ACID-only table.
Hence we also need to convert such tables to external tables during relocation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-22497) Remove default value for Capabilities from HiveConf

2019-11-14 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-22497:


 Summary: Remove default value for Capabilities from HiveConf
 Key: HIVE-22497
 URL: https://issues.apache.org/jira/browse/HIVE-22497
 Project: Hive
  Issue Type: Sub-task
Reporter: Naveen Gangam


This class is used and bundled into other jars by 3rd-party connectors like 
Teradata, etc. So it would be good to remove this default value from HiveConf 
and rely on it being set in hive-site.xml instead. HiveServer2 should still 
set this as part of HS2 initialization or via hiveserver2-site.xml.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-22406) TRUNCATE TABLE fails due MySQL limitations on limit value

2019-10-25 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-22406:


 Summary: TRUNCATE TABLE fails due MySQL limitations on limit value
 Key: HIVE-22406
 URL: https://issues.apache.org/jira/browse/HIVE-22406
 Project: Hive
  Issue Type: Bug
Reporter: Naveen Gangam


HMS currently has some APIs that accept an integer limit value. Prior to the 
change in HIVE-21734, HMS was silently converting this int to a short and thus we 
had not seen this issue. But semantically, it is incorrect to do so quietly.

{noformat}
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191]
Caused by: java.sql.SQLException: setMaxRows() out of range. 2147483647 > 
5000.
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:996) ~[mysql-
connector-java.jar:5.1.33]
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:935) ~[mysql-
connector-java.jar:5.1.33]
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:924) ~[mysql-
connector-java.jar:5.1.33]
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:870) ~[mysql-
connector-java.jar:5.1.33]
at com.mysql.jdbc.StatementImpl.setMaxRows(StatementImpl.java:2525) ~[mysql-
connector-java.jar:5.1.33]
at 
com.zaxxer.hikari.pool.HikariProxyPreparedStatement.setMaxRows(HikariProxyPreparedS
tatement.java) ~[HikariCP-2.6.1.jar:?]
{noformat}

We cannot change the RawStore API to accept shorts instead of ints. 
So we have to fix the caller to use a lower limit instead of Integer.MAX_VALUE.
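
An illustrative sketch of the clamping idea (the method and constant names are 
assumptions, not the actual HMS code): pick the smaller of the requested limit 
and whatever the JDBC connection will accept for setMaxRows().

{code}
public class LimitClampSketch {
  // Assumed cap; in the log above the MySQL connection allowed at most 5000.
  static final int DRIVER_MAX_ROWS = 5000;

  static int safeLimit(int requested) {
    // Never hand Integer.MAX_VALUE straight to setMaxRows(); clamp it first.
    return Math.min(requested, DRIVER_MAX_ROWS);
  }

  public static void main(String[] args) {
    System.out.println(safeLimit(Integer.MAX_VALUE)); // 5000
    System.out.println(safeLimit(100));               // 100
  }
}
{code}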


{noformat}
Caused by: org.apache.hadoop.hive.metastore.api.MetaException: Exception thrown 
when executing query : SELECT DISTINCT 
'org.apache.hadoop.hive.metastore.model.MPartition' AS 
`NUCLEUS_TYPE`,`A0`.`CREATE_TIME`,`A0`.`LAST_ACCESS_TIME`,`A0`.`PART_NAME`,`A0`.`WRITE_ID`,`A0`.`PART_ID`,`A0`.`PART_NAME`
 AS `NUCORDER0` FROM `PARTITIONS` `A0` LEFT OUTER JOIN `TBLS` `B0` ON 
`A0`.`TBL_ID` = `B0`.`TBL_ID` LEFT OUTER JOIN `DBS` `C0` ON `B0`.`DB_ID` = 
`C0`.`DB_ID` WHERE `B0`.`TBL_NAME` = ? AND `C0`.`NAME` = ? AND `C0`.`CTLG_NAME` 
= ? ORDER BY `NUCORDER0` LIMIT 0,2147483647
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$truncate_table_req_result$truncate_table_req_resultStandardScheme.read(ThriftHiveMetastore.java)
 ~[hive-exec-3.1.0.3.1.5.0-17.jar:3.1.0.3.1.5.0-17]
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$truncate_table_req_result$truncate_table_req_resultStandardScheme.read(ThriftHiveMetastore.java)
 ~[hive-exec-3.1.0.3.1.5.0-17.jar:3.1.0.3.1.5.0-17]
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$truncate_table_req_result.read(ThriftHiveMetastore.java)
 ~[hive-exec-3.1.0.3.1.5.0-17.jar:3.1.0.3.1.5.0-17]
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86) 
~[hive-exec-3.1.0.3.1.5.0-17.jar:3.1.0.3.1.5.0-17]
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_truncate_table_req(ThriftHiveMetastore.java:1999)
 ~[hive-exec-3.1.0.3.1.5.0-17.jar:3.1.0.3.1.5.0-17]
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.truncate_table_req(ThriftHiveMetastore.java:1986)
 ~[hive-exec-3.1.0.3.1.5.0-17.jar:3.1.0.3.1.5.0-17]
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.truncateTableInternal(HiveMetaStoreClient.java:1450)
 ~[hive-exec-3.1.0.3.1.5.0-17.jar:3.1.0.3.1.5.0-17]
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.truncateTable(HiveMetaStoreClient.java:1427)
 ~[hive-exec-3.1.0.3.1.5.0-17.jar:3.1.0.3.1.5.0-17]
at 
org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.truncateTable(SessionHiveMetaStoreClient.java:171)
 ~[hive-exec-3.1.0.3.1.5.0-17.jar:3.1.0.3.1.5.0-17]
at sun.reflect.GeneratedMethodAccessor112.invoke(Unknown Source) ~[?:?]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_191]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_191]
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212)
 ~[hive-exec-3.1.0.3.1.5.0-17.jar:3.1.0.3.1.5.0-17]
at com.sun.proxy.$Proxy59.truncateTable(Unknown Source) ~[?:?]
at sun.reflect.GeneratedMethodAccessor112.invoke(Unknown Source) ~[?:?]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_191]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_191]
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:3122)
 ~[hive-exec-3.1.0.3.1.5.0-17.jar:3.1.0.3.1.5.0-17]
at com.sun.proxy.$Proxy59.truncateTable(Unknown Source) ~[?:?]
at 
org.apache.hadoop.hive.ql.metadata.Hive.truncateTable(Hive.java:1277) 
~[hive-exec-3.1.0.3.1.5.0-17.jar:3.1.0.3.1.5.0-17]
at 
org.apache.hadoop.hive.ql.exec.DDLTask.truncateTable(DDLTask.java:5111) 

[jira] [Created] (HIVE-22342) HMS Translation: HIVE-22189 too strict with location for EXTERNAL tables

2019-10-14 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-22342:


 Summary: HMS Translation: HIVE-22189 too strict with location for 
EXTERNAL tables
 Key: HIVE-22342
 URL: https://issues.apache.org/jira/browse/HIVE-22342
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


HIVE-22189 restricts EXTERNAL tables being created to the 
EXTERNAL_WAREHOUSE_DIR. This might be too strict, as any other location should 
be allowed as long as it is outside the MANAGED warehouse directory.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-22291) HMS Translation: Limit translation to hive default catalog only

2019-10-04 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-22291:


 Summary: HMS Translation: Limit translation to hive default 
catalog only
 Key: HIVE-22291
 URL: https://issues.apache.org/jira/browse/HIVE-22291
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


HMS Translation should only be limited to a single catalog.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-22266) Addendum fix to have HS2 pom add explicit curator dependency

2019-09-27 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-22266:


 Summary: Addendum fix to have HS2 pom add explicit curator 
dependency
 Key: HIVE-22266
 URL: https://issues.apache.org/jira/browse/HIVE-22266
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


It might be better to add an explicit dependency on apache-curator in the 
service/pom.xml.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-22205) Upgrade zookeeper and curator versions

2019-09-13 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-22205:


 Summary: Upgrade zookeeper and curator versions
 Key: HIVE-22205
 URL: https://issues.apache.org/jira/browse/HIVE-22205
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


Other components like Hadoop have switched to using newer ZK versions. So these 
jars end up in the classpath for Hive services and could cause issues due to 
incompatible curator versions that Hive uses.

So it makes sense for Hive to upgrade the ZK and curator versions to try to 
keep up.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (HIVE-22189) HMS Translation: Enforce strict locations for managed vs external tables.

2019-09-10 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-22189:


 Summary: HMS Translation: Enforce strict locations for managed vs 
external tables.
 Key: HIVE-22189
 URL: https://issues.apache.org/jira/browse/HIVE-22189
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


Currently, HMS allows flexibility with the location of a table. External tables 
can be located within the Hive managed warehouse space, and managed tables can be 
located within the external warehouse directory if the user chooses to do so.

There are certain advantages to restricting such flexibility. We could have 
different encryption policies for different warehouses, different replication 
policies, etc.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (HIVE-22158) HMS Translation layer - Disallow non-ACID MANAGED tables.

2019-08-29 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-22158:


 Summary: HMS Translation layer - Disallow non-ACID MANAGED tables.
 Key: HIVE-22158
 URL: https://issues.apache.org/jira/browse/HIVE-22158
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


In the recent commits, we have allowed non-ACID MANAGED tables to be created by 
clients that have some form of ACID WRITE capabilities. 
I think it would make sense to disallow this entirely. MANAGED tables should be 
ACID tables only.




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (HIVE-22159) HMS Translation layer - Turn off HMS Translation by default.

2019-08-29 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-22159:


 Summary: HMS Translation layer - Turn off HMS Translation by 
default.
 Key: HIVE-22159
 URL: https://issues.apache.org/jira/browse/HIVE-22159
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


Because of certain backward incompatibilities in terms of behavior, I think it 
makes sense to turn off this translation in the Apache Hive codebase.
Consumers can selectively enable it and even plugin their own set of 
translation rules as well.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (HIVE-22123) Use GetDatabaseResponse to allow for future extension

2019-08-16 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-22123:


 Summary: Use GetDatabaseResponse to allow for future extension
 Key: HIVE-22123
 URL: https://issues.apache.org/jira/browse/HIVE-22123
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


As part of the review, it was suggested to use the GetDatabaseResponse object 
to allow for any potential future expansions for these requests.
https://reviews.apache.org/r/71267/#comment304501




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (HIVE-22109) Hive.renamePartition expects catalog name to be set instead of using default

2019-08-13 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-22109:


 Summary: Hive.renamePartition expects catalog name to be set 
instead of using default
 Key: HIVE-22109
 URL: https://issues.apache.org/jira/browse/HIVE-22109
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam






--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (HIVE-22095) Hive.get() resets the capabilities from HiveConf instead of set capabilities

2019-08-09 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-22095:


 Summary: Hive.get() resets the capabilities from HiveConf instead 
of set capabilities
 Key: HIVE-22095
 URL: https://issues.apache.org/jira/browse/HIVE-22095
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


Hive.get() resets the capabilities set on the HiveMetaStoreClient from what is 
set in HiveConf instead of preserving the capabilities that have already been 
set via setHMSClientCapabilties()



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (HIVE-22087) HMS Translation: Translate getDatabase() API to alter warehouse location

2019-08-07 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-22087:


 Summary: HMS Translation: Translate getDatabase() API to alter 
warehouse location
 Key: HIVE-22087
 URL: https://issues.apache.org/jira/browse/HIVE-22087
 Project: Hive
  Issue Type: Sub-task
Reporter: Naveen Gangam
Assignee: Naveen Gangam


It makes sense to translate getDatabase() calls as well, to alter the location 
for the Database based on whether or not the processor has capabilities to 
write to the managed warehouse directory. Every DB has 2 locations, one 
external and the other in the managed warehouse directory. If the processor has 
any AcidWrite capability, then the location remains unchanged for the database.
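
A rough, self-contained illustration of the rule described above (the Db class 
and field names here are invented for the sketch and are not the Thrift API): 
keep the managed location only when the processor carries an ACID-write 
capability, otherwise return the external location.

{code}
import java.util.Set;

public class DbLocationTranslationSketch {
  // Stand-in for the metastore Database object; fields are invented for illustration.
  static class Db {
    String managedLocation;
    String externalLocation;
    String locationUri;
  }

  static Db translate(Db db, Set<String> processorCapabilities) {
    boolean canWriteAcid =
        processorCapabilities.stream().anyMatch(c -> c.toUpperCase().contains("ACIDWRITE"));
    // Processors without any ACID-write capability only see the external location.
    db.locationUri = canWriteAcid ? db.managedLocation : db.externalLocation;
    return db;
  }

  public static void main(String[] args) {
    Db db = new Db();
    db.managedLocation = "hdfs:///warehouse/tablespace/managed/hive/demo.db";
    db.externalLocation = "hdfs:///warehouse/tablespace/external/hive/demo.db";
    System.out.println(translate(db, Set.of("EXTWRITE")).locationUri); // external location
  }
}
{code}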



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (HIVE-22069) joda-time binary conflict between druid-handler and phoenix-hive jars.

2019-08-01 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-22069:


 Summary: joda-time binary conflict between druid-handler and 
phoenix-hive jars.
 Key: HIVE-22069
 URL: https://issues.apache.org/jira/browse/HIVE-22069
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 3.1.0, 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


Hive's Druid storage handler uses version 2.8.1 of the joda-time library, whereas 
the phoenix-hive.jar uses version 1.6 of this library. When both jars are in 
the classpath, bad things happen.
Apache Phoenix has its own release cycle, and their uptaking a new version is not 
something Hive should count on. Besides, they could decide to move to a new 
version of this library and we would still have this problem.
So it is best we use shaded jars in Hive for the version we are on.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (HIVE-22002) Insert into table partition fails partially with stats.autogather is on.

2019-07-16 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-22002:


 Summary: Insert into table partition fails partially with 
stats.autogather is on.
 Key: HIVE-22002
 URL: https://issues.apache.org/jira/browse/HIVE-22002
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 4.0.0
Reporter: Naveen Gangam


create table test_double(id int) partitioned by (dbtest double); 
insert into test_double partition(dbtest) values (1,9.9); --> this works
insert into test_double partition(dbtest) values (1,10); --> this fails 

But if we change it to
insert into test_double partition(dbtest) values (1, cast (10 as double)); it 
succeeds 

-> The problem is only seen when trying to insert a whole number, i.e. 10, 10.0, 
15, 14.0, etc. The issue is not seen when inserting a number with a decimal part 
other than 0. So an insert of 10.1 goes through. 

The underlying exception from the HMS is 
{code}
2019-07-11T07:58:16,670 ERROR [pool-6-thread-196]: server.TThreadPoolServer 
(TThreadPoolServer.java:run(297)) - Error occurred during processing of 
message. java.lang.IndexOutOfBoundsException: Index: 0 at 
java.util.Collections$EmptyList.get(Collections.java:4454) ~[?:1.8.0_112] at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.updatePartColumnStatsWithMerge(HiveMetaStore.java:7808)
 ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78] at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.set_aggr_stats_for(HiveMetaStore.java:7769)
 ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78] 
{code}

With {{hive.stats.column.autogather=false}}, this exception does not occur with 
or without the explicit casting.

The issue stems from the fact that HS2 created a partition with value 
{{dbtest=10}} for the table, while the stats processor is attempting to add column 
statistics for the partition with value {{dbtest=10.0}}. Thus HMS 
{{getPartitionsByNames}} cannot find the partition with that value and 
fails to insert the stats. So while the failure initiates on the HMS side, the 
cause is in the HS2 query planning.
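
A tiny illustration of the mismatch (simplified; the real partition-name handling 
in HS2/HMS is more involved): the partition is created under the literal string 
the query supplied, while the stats path normalizes the double value, so the two 
spellings no longer match.

{code}
public class PartitionValueMismatchSketch {
  public static void main(String[] args) {
    String literalFromQuery = "10"; // value HS2 used when creating the partition
    String normalizedByStats = Double.toString(Double.parseDouble(literalFromQuery)); // "10.0"
    // The lookup by partition name fails because "dbtest=10" != "dbtest=10.0".
    System.out.println(("dbtest=" + literalFromQuery)
        .equals("dbtest=" + normalizedByStats)); // false
  }
}
{code}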

It makes sense that turning off {{hive.stats.column.autogather}} resolves the 
issue, because there is then no StatsTask in the query plan.

But {{SHOW PARTITIONS}} shows the partition as created, while the query planner 
does not include it in any plan because of the absence of stats on the partition.




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (HIVE-21816) HMS Translation: Refactor tests to work with ACID tables.

2019-05-31 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-21816:


 Summary: HMS Translation: Refactor tests to work with ACID tables.
 Key: HIVE-21816
 URL: https://issues.apache.org/jira/browse/HIVE-21816
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


1) The TestHiveMetaStore unit tests do not work for full ACID tables, as the 
TransactionalValidationListener enforces that such tables use AcidIO. The ORC IO 
files are only included in the hive-exec jars, which are not used by tests under 
the standalone-metastore module. Even adding a test-scoped dependency on hive-exec 
did not work. I had to relocate these tests into itests.

2) Implementation of logic that allows skipping of translation via the use of 
"MANAGERAWMETADATA" capability.

3) Fixed some test bugs as the test was not failing originally when the 
createTable failed because of the issue in #1. As a result, about 3 tests never 
ran fully and never failed. The tests now fail if there are issues.

4) Refactoring of the code in the DefaultTransformer to make static lists of 
capabilities. The returned capabilities now depend on the table 
capabilities, the processor capabilities, and the accessType assigned to the 
table.

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21804) HMS Translation: External tables with no capabilities returns duplicate entries/

2019-05-29 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-21804:


 Summary: HMS Translation: External tables with no capabilities 
returns duplicate entries/
 Key: HIVE-21804
 URL: https://issues.apache.org/jira/browse/HIVE-21804
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


2019-05-24T12:50:52,978  WARN [pool-6-thread-4] metastore.HiveMetaStore: 
Unexpected resultset size:2
2019-05-24T12:50:52,981 ERROR [pool-6-thread-4] metastore.RetryingHMSHandler: 
MetaException(message:Unexpected result from metadata transformer:return list 
size=2)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getTableInternal(HiveMetaStore.java:3154)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table_req(HiveMetaStore.java:3118)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
at com.sun.proxy.$Proxy28.get_table_req(Unknown Source)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_table_req.getResult(ThriftHiveMetastore.java:16497)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_table_req.getResult(ThriftHiveMetastore.java:16481)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at 
org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
at 
org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
at 
org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21744) Make hive side changes to enforce table access type on queries.

2019-05-16 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-21744:


 Summary: Make hive side changes to enforce table access type on 
queries.
 Key: HIVE-21744
 URL: https://issues.apache.org/jira/browse/HIVE-21744
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 4.0.0
Reporter: Naveen Gangam






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21734) HMS Translation: Pending items from code review

2019-05-15 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-21734:


 Summary: HMS Translation: Pending items from code review
 Key: HIVE-21734
 URL: https://issues.apache.org/jira/browse/HIVE-21734
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


A sub-task of HIVE-21663. Some items came from the review feedback and some 
were left out from the initial implementation.
1) Enforce limit being passed into get_tables_ext. Currently being ignored.
2) Filter out some capabilities being returned to the caller based on the 
capabilities possessed by the processor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21718) Improvement performance of UpdateInputAccessTimeHook

2019-05-10 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-21718:


 Summary: Improvement performance of UpdateInputAccessTimeHook
 Key: HIVE-21718
 URL: https://issues.apache.org/jira/browse/HIVE-21718
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 2.1.1
Reporter: Naveen Gangam
Assignee: Naveen Gangam


Currently, Hive does not update the lastAccessTime property for any entities 
when a query accesses them. Thus it is not possible to know when a table was 
last accessed.
Hive does provide a configurable hook to HS2 that is executed as a pre-query 
hook prior to the query being executed. However, this hook is inefficient 
because for each table or partition it is attempting to update the time for, it 
executes an "alter table ... " command internally. This is bad because 
1) For a query touching 1000's of partitions, this hook takes forever to update 
them.
2) Meanwhile, it is holding up the original query from executing.

So even though we do not recommend using the hook, because the reward is too 
little (having lastAccessTime updated), we realize there is no other means to 
achieve this.
Also, we can improve the performance of the hook significantly by adding a new 
thrift API on HMS to update the lastAccessTime on the database rows directly, 
instead of going through the HMS front end for one entity at a time (leading to 
1000's of HMS calls that lead to multiple 1000's of calls to the database).
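
A hypothetical sketch of what such a batched call could look like (this API does 
not exist in the source; the interface and method names are invented to 
illustrate the idea): one round trip carries the access time for many entities, 
instead of one alter-table call per entity from the hook.

{code}
import java.util.List;

public class BatchAccessTimeSketch {
  // Invented interface: a single metastore round trip updates many entities at once.
  interface AccessTimeUpdater {
    void updateLastAccessTime(List<String> fullyQualifiedNames, long accessTimeSeconds);
  }

  static void recordAccess(AccessTimeUpdater updater, List<String> entities) {
    // One call for the whole query's inputs, instead of N "alter table" commands.
    updater.updateLastAccessTime(entities, System.currentTimeMillis() / 1000L);
  }

  public static void main(String[] args) {
    recordAccess(
        (names, ts) -> System.out.println("update " + names.size() + " rows at " + ts),
        List.of("db.tbl", "db.tbl/part=2019-05-10"));
  }
}
{code}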



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21664) HMS Translation layer - Thrift API changes

2019-04-29 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-21664:


 Summary: HMS Translation layer - Thrift API changes
 Key: HIVE-21664
 URL: https://issues.apache.org/jira/browse/HIVE-21664
 Project: Hive
  Issue Type: Sub-task
  Components: Standalone Metastore
Reporter: Naveen Gangam
Assignee: Naveen Gangam


This jira is to track the HMS side changes of this feature.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21663) Hive Metastore Translation Layer

2019-04-29 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-21663:


 Summary: Hive Metastore Translation Layer
 Key: HIVE-21663
 URL: https://issues.apache.org/jira/browse/HIVE-21663
 Project: Hive
  Issue Type: New Feature
  Components: Standalone Metastore
Reporter: Naveen Gangam
Assignee: Naveen Gangam


This task is for the implementation of the default provider for translation, 
that is extensible if needed for a custom translator. Please refer the spec for 
additional details on the translation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21533) Nested CTE's with join does not return any data.

2019-03-28 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-21533:


 Summary: Nested CTE's with join does not return any data.
 Key: HIVE-21533
 URL: https://issues.apache.org/jira/browse/HIVE-21533
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 2.1.0
Reporter: Naveen Gangam
 Attachments: testcase.sql

Attached is the testcase to reproduce the issue. The join on CTE6 is causing 
the problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21363) Ldap auth issue: group filter match should be case insensitive

2019-02-28 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-21363:


 Summary: Ldap auth issue: group filter match should be case 
insensitive
 Key: HIVE-21363
 URL: https://issues.apache.org/jira/browse/HIVE-21363
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 3.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


Configure HiveServer2 with LDAP auth with (enable ldap, ldap URI, baseDN, 
userDNPattern, groupDNPattern and groupFilter). 

If the specified groupFilter case is different from the actual one in the 
directory, then Hive cannot find a match and errors out.

For example:
groupFilter value=
group name in directory server=grouptest.

A similar search works using other LDAP clients like ldapsearch (LDAP searches 
are case insensitive). 
While it is not a major issue, as the workaround would be to configure the exact 
name, it is an easy fix that we should support out of the box.
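
A minimal sketch of the case-insensitive comparison being suggested (the names 
are assumptions, not the actual LdapAuthenticationProviderImpl code):

{code}
import java.util.Collection;
import java.util.List;

public class GroupFilterMatchSketch {
  static boolean matchesGroupFilter(String resolvedGroup, Collection<String> groupFilter) {
    // Compare ignoring case, the way LDAP searches effectively behave.
    return groupFilter.stream().anyMatch(g -> g.equalsIgnoreCase(resolvedGroup));
  }

  public static void main(String[] args) {
    System.out.println(matchesGroupFilter("grouptest", List.of("GroupTest"))); // true
  }
}
{code}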



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21337) HMS Metadata migration from Postgres/Derby to other DBs fail

2019-02-27 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-21337:


 Summary: HMS Metadata migration from Postgres/Derby to other DBs 
fail
 Key: HIVE-21337
 URL: https://issues.apache.org/jira/browse/HIVE-21337
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Affects Versions: 3.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


A customer was recently migrating the HMS metastore from Postgres to Oracle. 
During import of the [exported] HMS metastore data from Postgres, failures 
are seen because COLUMNS_V2.COMMENT is 4000 bytes long there, whereas Oracle and 
other schemas define it to be 256 bytes.
This inconsistency in the schema makes the migration cumbersome and manual. 
This jira makes this column consistent in length across all databases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21336) HMS Index PCS_STATS_IDX too long for Oracle when NLS_LENGTH_SEMANTICS=char

2019-02-27 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-21336:


 Summary: HMS Index PCS_STATS_IDX too long for Oracle when 
NLS_LENGTH_SEMANTICS=char
 Key: HIVE-21336
 URL: https://issues.apache.org/jira/browse/HIVE-21336
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Affects Versions: 3.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


CREATE INDEX PCS_STATS_IDX ON PAR T_COL_STATS 
(DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME) 
Error: ORA-01450: maximum key length (6398) exceeded (state=72000,code=1450) 

The customer tried the same DDL in SQL Developer and got the same error. This could 
be a result of a combination of DB-level settings like db_block_size 
limiting the maximum key length, as per the doc below: 
http://www.dba-oracle.com/t_ora_01450_maximum_key_length_exceeded.htm 

Also, {{NLS_LENGTH_SEMANTICS}} is BYTE by default, but users can set this at the 
session level to CHAR, thus reducing the maximum allowed index key length. We have 
increased the size of COLUMN_NAME from 128 to 767 (it used to be 1000) and 
TABLE_NAME from 128 to 256. This is done via the following definitions: 

{code} 
CREATE TABLE PART_COL_STATS ( 
CS_ID NUMBER NOT NULL, 
DB_NAME VARCHAR2(128) NOT NULL, 
TABLE_NAME VARCHAR2(256) NOT NULL, 
PARTITION_NAME VARCHAR2(767) NOT NULL, 
COLUMN_NAME VARCHAR2(767) NOT NULL,  

CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
(DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 
{code} 

Reproducer: 

{code} 
SQL*Plus: Release 11.2.0.2.0 Production on Wed Feb 27 11:02:16 2019 Copyright 
(c) 1982, 2011, Oracle. All rights reserved. 
Connected to: Oracle Database 11g Express Edition Release 11.2.0.2.0 - 64bit 
Production 

SQL> select * from v$nls_parameters where parameter = 'NLS_LENGTH_SEMANTICS'; 
PARAMETER 
 
VALUE 
 
NLS_LENGTH_SEMANTICS 
BYTE 

SQL> alter session set NLS_LENGTH_SEMANTICS=CHAR; Session altered. 

SQL> commit; Commit complete. 

SQL> select * from v$nls_parameters where parameter = 'NLS_LENGTH_SEMANTICS'; 
PARAMETER 
 
VALUE 
 
NLS_LENGTH_SEMANTICS 
CHAR 

SQL> CREATE TABLE PART_COL_STATS (CS_ID NUMBER NOT NULL, DB_NAME VARCHAR2(128) 
NOT NULL, TABLE_NAME VARCHAR2(256) NOT NULL, PARTITION_NAME VARCHAR2(767) NOT 
NULL, COLUMN_NAME VARCHAR2(767) NOT NULL); 
Table created. 

SQL> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
(DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 

CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
(DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME) 
* ERROR at line 1: ORA-01450: maximum key length (6398) exceeded 

SQL> alter session set NLS_LENGTH_SEMANTICS=BYTE; 
Session altered. 

SQL> commit; 
Commit complete. 

SQL> drop table PART_COL_STATS; 
Table dropped. 

SQL> commit; 
Commit complete. 

SQL> CREATE TABLE PART_COL_STATS (CS_ID NUMBER NOT NULL, DB_NAME VARCHAR2(128) 
NOT NULL, TABLE_NAME VARCHAR2(256) NOT NULL, PARTITION_NAME VARCHAR2(767) NOT 
NULL, COLUMN_NAME VARCHAR2(767) NOT NULL); 
Table created. 

SQL> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
(DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 
Index created. 

SQL> commit; 
Commit complete. 

SQL> 
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21209) [Improvement] Exchange partitition to be metadata only change?

2019-02-04 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-21209:


 Summary: [Improvement] Exchange partitition to be metadata only 
change?
 Key: HIVE-21209
 URL: https://issues.apache.org/jira/browse/HIVE-21209
 Project: Hive
  Issue Type: Improvement
  Components: Hive
Affects Versions: 2.1.1
Reporter: Naveen Gangam


https://issues.apache.org/jira/browse/HIVE-14560
The current implementation of the above jira is a metadata change plus a "copy" of 
the partition data on the DFS. It could possibly take a long time to copy the data 
for large partitions, especially across different storage clusters. When 
exchanging a partition from HDFS to S3A or vice versa, the data is copied; this is 
a client-side copy operation and it can be very slow if the partition is very 
large.

The customer would like the "exchange partition" operation to be purely metadata. 
I would like to start a discussion on whether this improvement is to be made. 
Obviously, the current behavior will be supported, but an option for it to be a 
metadata-only operation needs to be evaluated.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20909) Just "MSCK" should throw SemanticException

2018-11-13 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-20909:


 Summary: Just "MSCK" should throw SemanticException
 Key: HIVE-20909
 URL: https://issues.apache.org/jira/browse/HIVE-20909
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 4.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


Per documentation, the syntax for MSCK command is 
{{MSCK [REPAIR] TABLE table_name [ADD/DROP/SYNC PARTITIONS];}}

So just submitting "MSCK" should throw a SemanticException like it does for 
other queries with incorrect syntax. But instead it appears to be attempting to 
do something.

$ hive --hiveconf hive.root.logger=INFO,console -e "msck;"

2018-11-08T15:21:25,016  INFO [main] SessionState: 
2018-11-08T15:21:26,203  INFO [main] session.SessionState: Created HDFS 
directory: /tmp/hive/hive/b1b62e04-5a1c-4c6a-babd-31b4f1d2bd78
2018-11-08T15:21:26,222  INFO [main] session.SessionState: Created local 
directory: /tmp/root/b1b62e04-5a1c-4c6a-babd-31b4f1d2bd78
2018-11-08T15:21:26,229  INFO [main] session.SessionState: Created HDFS 
directory: /tmp/hive/hive/b1b62e04-5a1c-4c6a-babd-31b4f1d2bd78/_tmp_space.db
2018-11-08T15:21:26,244  INFO [main] conf.HiveConf: Using the default value 
passed in for log id: b1b62e04-5a1c-4c6a-babd-31b4f1d2bd78
2018-11-08T15:21:26,246  INFO [main] session.SessionState: Updating thread name 
to b1b62e04-5a1c-4c6a-babd-31b4f1d2bd78 main
2018-11-08T15:21:26,246  INFO [b1b62e04-5a1c-4c6a-babd-31b4f1d2bd78 main] 
conf.HiveConf: Using the default value passed in for log id: 
b1b62e04-5a1c-4c6a-babd-31b4f1d2bd78
2018-11-08T15:21:26,548  INFO [b1b62e04-5a1c-4c6a-babd-31b4f1d2bd78 main] 
ql.Driver: Compiling 
command(queryId=root_20181108152126_3babeb6f-8396-4ef3-8f85-2cbf12ebe9c1): msck
2018-11-08T15:21:28,140  INFO [b1b62e04-5a1c-4c6a-babd-31b4f1d2bd78 main] 
hive.metastore: Trying to connect to metastore with URI 
thrift://nightly61x-1.vpc.cloudera.com:9083
2018-11-08T15:21:28,184  INFO [b1b62e04-5a1c-4c6a-babd-31b4f1d2bd78 main] 
hive.metastore: Opened a connection to metastore, current connections: 1
2018-11-08T15:21:28,185  INFO [b1b62e04-5a1c-4c6a-babd-31b4f1d2bd78 main] 
hive.metastore: Connected to metastore.
FAILED: SemanticException empty table creation??
2018-11-08T15:21:28,339 ERROR [b1b62e04-5a1c-4c6a-babd-31b4f1d2bd78 main] 
ql.Driver: FAILED: SemanticException empty table creation??
org.apache.hadoop.hive.ql.parse.SemanticException: empty table creation??
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.getTable(BaseSemanticAnalyzer.java:1670)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.getTable(BaseSemanticAnalyzer.java:1652)
at 
org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeMetastoreCheck(DDLSemanticAnalyzer.java:3118)
at 
org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:414)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:250)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:600)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1414)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1543)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1332)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1321)
at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:342)
at 
org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:802)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:774)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:701)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:313)
at org.apache.hadoop.util.RunJar.main(RunJar.java:227)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: empty table 
creation??
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1273)
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1234)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.getTable(BaseSemanticAnalyzer.java:1663)
... 22 more

2018-11-08T15:21:28,340  INFO [b1b62e04-5a1c-4c6a-babd-31b4f1d2bd78 main] 
ql.Driver: Completed compiling 
command(queryId=root_20181108152126_3babeb6f-8396-4ef3-8f85-2cbf12ebe9c1); Time 

[jira] [Created] (HIVE-20205) Upgrade HBase dependencies off alpha4 release

2018-07-18 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-20205:


 Summary: Upgrade HBase dependencies off alpha4 release
 Key: HIVE-20205
 URL: https://issues.apache.org/jira/browse/HIVE-20205
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 3.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


It appears Hive has dependencies on the HBase 2.0.0-alpha4 release. HBase 2.0.0 
and 2.0.1 have been released. The HBase team recommends 2.0.1 and says there 
shouldn't be any API surprises (but we never know).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19700) Workaround for JLine issue with UnsupportedTerminal

2018-05-24 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-19700:


 Summary: Workaround for JLine issue with UnsupportedTerminal
 Key: HIVE-19700
 URL: https://issues.apache.org/jira/browse/HIVE-19700
 Project: Hive
  Issue Type: Bug
Reporter: Naveen Gangam
Assignee: Naveen Gangam
 Fix For: 2.2.1


From JLine's ConsoleReader, readLine(prompt, mask) calls the following 
beforeReadLine() method.
{code}
try {
// System.out.println("is terminal supported " + 
terminal.isSupported());
if (!terminal.isSupported()) {
beforeReadLine(prompt, mask);
}
{code}

So specifically when using UnsupportedTerminal ({{-Djline.terminal}}) with 
{{prompt=null}} and {{mask!=null}}, a "null" string gets printed to the console 
before and after the query result. {{UnsupportedTerminal}} is required when 
running beeline as a background process; it hangs otherwise.

{code}
private void beforeReadLine(final String prompt, final Character mask) {
if (mask != null && maskThread == null) {
final String fullPrompt = "\r" + prompt
+ " "
+ " "
+ " "
+ "\r" + prompt;

maskThread = new Thread()
{
public void run() {
while (!interrupted()) {
try {
Writer out = getOutput();
out.write(fullPrompt);
{code}

So the {{prompt}} is null and the {{mask}} is NOT null in at least 2 scenarios in 
beeline:
when beeline's silent=true, prompt is null
* 
https://github.com/apache/hive/blob/master/beeline/src/java/org/apache/hive/beeline/BeeLine.java#L1264
when running multiline queries
* 
https://github.com/apache/hive/blob/master/beeline/src/java/org/apache/hive/beeline/Commands.java#L1093

When executing beeline in script mode (commands in a file), there should not be 
any masking while reading lines from the script file, i.e., the entire line should 
be a beeline command or part of a multi-line hive query.

So it should be safe to use a null mask instead of {{ConsoleReader.NULL_MASK}} 
when using UnsupportedTerminal as jline terminal.
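
A small sketch of the workaround idea (simplified, not the actual beeline 
change), using the JLine ConsoleReader API mentioned above: pass a null mask 
when the terminal is unsupported so the masking thread in beforeReadLine() is 
never started.

{code}
import jline.console.ConsoleReader;

public class MaskWorkaroundSketch {
  public static void main(String[] args) throws Exception {
    ConsoleReader reader = new ConsoleReader();
    // Under UnsupportedTerminal, use a null mask so beforeReadLine() never
    // spawns the thread that writes the "null" prompt string to the output.
    Character mask = reader.getTerminal().isSupported() ? ConsoleReader.NULL_MASK : null;
    String line = reader.readLine(null, mask);
    System.out.println(line);
  }
}
{code}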



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19250) Schema column definitions inconsistencies in MySQL

2018-04-19 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-19250:


 Summary: Schema column definitions inconsistencies in MySQL
 Key: HIVE-19250
 URL: https://issues.apache.org/jira/browse/HIVE-19250
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 2.1.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


There are some inconsistencies in column definitions in MySQL between a schema 
that was upgraded to 2.1 (from an older release) vs installing the 2.1.0 schema 
directly.
>   `CQ_TBLPROPERTIES` varchar(2048) DEFAULT NULL,
117d117
<   `CQ_TBLPROPERTIES` varchar(2048) DEFAULT NULL,
135a136
>   `CC_TBLPROPERTIES` varchar(2048) DEFAULT NULL,
143d143
<   `CC_TBLPROPERTIES` varchar(2048) DEFAULT NULL,
156c156
<   `CTC_TXNID` bigint(20) DEFAULT NULL,
---
>   `CTC_TXNID` bigint(20) NOT NULL,
158c158
<   `CTC_TABLE` varchar(256) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT 
NULL,
---
>   `CTC_TABLE` varchar(256) DEFAULT NULL,
476c476
<   `TBL_NAME` varchar(256) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT 
NULL,
---
>   `TBL_NAME` varchar(256) DEFAULT NULL,
664c664
<   KEY `PCS_STATS_IDX` (`DB_NAME`,`TABLE_NAME`,`COLUMN_NAME`,`PARTITION_NAME`),
---
>   KEY `PCS_STATS_IDX` (`DB_NAME`,`TABLE_NAME`,`COLUMN_NAME`,`PARTITION_NAME`) 
> USING BTREE,
768c768
<   `PARAM_VALUE` mediumtext,
---
>   `PARAM_VALUE` mediumtext CHARACTER SET latin1 COLLATE latin1_bin,
814c814
<   `PARAM_VALUE` mediumtext,
---
>   `PARAM_VALUE` mediumtext CHARACTER SET latin1 COLLATE latin1_bin,
934c934
<   `PARAM_VALUE` mediumtext,
---
>   `PARAM_VALUE` mediumtext CHARACTER SET latin1 COLLATE latin1_bin,
1066d1065
<   `TXN_HEARTBEAT_COUNT` int(11) DEFAULT NULL,
1067a1067
>   `TXN_HEARTBEAT_COUNT` int(11) DEFAULT NULL,
1080c1080
<   `TC_TXNID` bigint(20) DEFAULT NULL,
---
>   `TC_TXNID` bigint(20) NOT NULL,
1082c1082
<   `TC_TABLE` varchar(128) DEFAULT NULL,
---
>   `TC_TABLE` varchar(128) NOT NULL,
1084c1084
<   `TC_OPERATION_TYPE` char(1) DEFAULT NULL,
---
>   `TC_OPERATION_TYPE` char(1) NOT NULL,



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19231) Beeline generates garbled output when using UnsupportedTerminal

2018-04-17 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-19231:


 Summary: Beeline generates garbled output when using 
UnsupportedTerminal
 Key: HIVE-19231
 URL: https://issues.apache.org/jira/browse/HIVE-19231
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 2.1.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


We had a customer that was using some sort of front end that would invoke 
beeline commands with some query files on a node that is remote to the HS2 
node.

So beeline runs locally on this edge node but connects to a remote HS2. Since the 
fix made in HIVE-14342, beeline started producing garbled lines in the 
output. Something like
{code:java}
^Mnull   ^Mnull^Mnull   
^Mnull00-   All Occupations 
135185230   42270
11- Management occupations  6152650 100310{code}
 

I haven't been able to reproduce the issue locally as I do not have their 
system, but with some additional instrumentation I have been able to get some 
info regarding the beeline process.

Essentially, such an invocation causes the beeline process to run with 
{{-Djline.terminal=jline.UnsupportedTerminal}} all the time and thus causes the 
issue. They can run the same beeline command directly in the shell on the same 
host and it does not cause this issue.

PID    S   TTY  TIME COMMAND
44107  S    S  ?    00:00:00 bash beeline -u ...

PID  S TTY  TIME COMMAND
48453  S+   S pts/4    00:00:00 bash beeline -u ...

Somehow that process wasn't attached to any local terminal. So the check made 
for /dev/stdin wouldn't work.

 

Instead, an additional check of the TTY session of the process before 
using UnsupportedTerminal (which really should only be used for 
backgrounded beeline sessions) seems to resolve the issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19230) Schema column width inconsistency in Oracle

2018-04-17 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-19230:


 Summary: Schema column width inconsistency in Oracle 
 Key: HIVE-19230
 URL: https://issues.apache.org/jira/browse/HIVE-19230
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 2.1.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


This is for Oracle only; it does not appear to be an issue with other DBs. When 
you upgrade the hive schema from 2.1.0 to 3.0.0, the width of 
TXN_COMPONENTS.TC_TABLE is 256 and COMPLETED_TXN_COMPONENTS.CTC_TABLE is 128.

But if you install the hive 3.0 schema directly, their widths are 128 and 256 
respectively, which is consistent with the schemas for other databases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-18829) Inputs/Outputs are not propagated to SA hooks for explain commands.

2018-02-28 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-18829:


 Summary: Inputs/Outputs are not propagated to SA hooks for explain 
commands. 
 Key: HIVE-18829
 URL: https://issues.apache.org/jira/browse/HIVE-18829
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 2.1.1
Reporter: Naveen Gangam
Assignee: Naveen Gangam


With Sentry enabled, commands like {{explain drop table foo}} fail with
{code:java}
explain drop table foo;
Error: Error while compiling statement: FAILED: SemanticException No valid 
privileges
 Required privilege( Table) not available in input privileges
 The required privileges: (state=42000,code=4)
{code}

Sentry fails to authorize because the ExplainSemanticAnalyzer uses an instance 
of DDLSemanticAnalyzer to analyze the explain query.
{code}
BaseSemanticAnalyzer sem = SemanticAnalyzerFactory.get(conf, input);
sem.analyze(input, ctx);
sem.validate()
{code}

The inputs/outputs entities for this query are set in the above code. However, 
these are never set on the instance of ExplainSemanticAnalyzer itself and thus 
are not propagated into the HookContext in the calling Driver code.
{code}
sem.analyze(tree, ctx); --> this results in calling the above code that uses 
DDLSA
hookCtx.update(sem); --> sem is an instance of ExplainSemanticAnalyzer, this 
code attempts to update the HookContext with the input/output info from ESA 
which is never set.
{code}
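
A toy, self-contained illustration of the propagation gap described above (these 
are not Hive classes; the names are invented): the outer analyzer delegates to an 
inner one but never copies the inner analyzer's entity sets, so callers that 
inspect the outer analyzer see empty sets. The fix idea is to copy them up before 
the hook context is built.

{code}
import java.util.HashSet;
import java.util.Set;

class ToyAnalyzer {
  final Set<String> inputs = new HashSet<>();
  final Set<String> outputs = new HashSet<>();
}

public class ExplainPropagationSketch {
  public static void main(String[] args) {
    ToyAnalyzer inner = new ToyAnalyzer();   // plays the role of DDLSemanticAnalyzer
    inner.inputs.add("default@foo");         // entity resolved while analyzing "drop table foo"

    ToyAnalyzer outer = new ToyAnalyzer();   // plays the role of ExplainSemanticAnalyzer
    // Copy the delegate's entities up so a hook-context update sees them.
    outer.inputs.addAll(inner.inputs);
    outer.outputs.addAll(inner.outputs);

    System.out.println(outer.inputs);        // [default@foo] instead of []
  }
}
{code}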




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-18501) Typo in beeline code

2018-01-19 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-18501:


 Summary: Typo in beeline code
 Key: HIVE-18501
 URL: https://issues.apache.org/jira/browse/HIVE-18501
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 3.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


[https://github.com/apache/hive/blob/master/beeline/src/java/org/apache/hive/beeline/BeeLine.java#L744]

the string literal used here should be "silent", not "slient". There is no 
functional bug here, just a silly typo.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-18459) hive-exec.jar leaks contents fb303.jar into classpath

2018-01-16 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-18459:


 Summary: hive-exec.jar leaks contents fb303.jar into classpath
 Key: HIVE-18459
 URL: https://issues.apache.org/jira/browse/HIVE-18459
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 2.1.0
 Environment: thrift classes are now in the hive classpath in the 
hive-exec.jar (HIVE-11553). This makes it hard to test with other versions of 
this library. This library is already a declared dependency and is not required 
to be included in the hive-exec.jar.

I am proposing that we not include these classes, as was the case in past 
releases.
Reporter: Naveen Gangam
Assignee: Naveen Gangam






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-18328) Improve schematool validator to report duplicate rows for column statistics

2017-12-21 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-18328:


 Summary: Improve schematool validator to report duplicate rows for 
column statistics
 Key: HIVE-18328
 URL: https://issues.apache.org/jira/browse/HIVE-18328
 Project: Hive
  Issue Type: Improvement
  Components: Hive
Affects Versions: 2.1.1
Reporter: Naveen Gangam
Assignee: Naveen Gangam


By design, in the {{TAB_COL_STATS}} table of the HMS schema, there should be 
ONE AND ONLY ONE row, representing its statistics, for each column defined in 
hive. A combination of DB_NAME, TABLE_NAME and COLUMN_NAME constitutes a primary 
key/unique row.
Each time the statistics are computed for a column, this row is updated. 
However, if somehow, via a BDR/replication process, we end up with multiple rows 
in this table for a given column, the HMS server is unable to update the 
statistics thereafter.
So it would be good to detect this data anomaly via the schema validation tool.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HIVE-17333) Schema changes in HIVE-12274 for Oracle may not work for upgrade

2017-08-16 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-17333:


 Summary: Schema changes in HIVE-12274 for Oracle may not work for 
upgrade
 Key: HIVE-17333
 URL: https://issues.apache.org/jira/browse/HIVE-17333
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 3.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


According to 
https://asktom.oracle.com/pls/asktom/f?p=100:11:0P11_QUESTION_ID:1770086700346491686
 (reported in HIVE-12274)
The alter table command to change the column datatype from {{VARCHAR}} to 
{{CLOB}} may not work. So the correct way to accomplish this is to add a new 
temp column, copy the value from the current column, drop the current column, 
and rename the new column to the old column name.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HIVE-16974) Change the sort key for the schema tool validator to be

2017-06-27 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-16974:


 Summary: Change the sort key for the schema tool validator to be 

 Key: HIVE-16974
 URL: https://issues.apache.org/jira/browse/HIVE-16974
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Affects Versions: 3.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


In HIVE-16729, we introduced ordering of the results/failures returned by 
schematool's validators. This allows fault injection testing to expect results 
that can be verified. However, they were sorted on NAME values, which in the HMS 
schema can be NULL. So if the introduced fault has a NULL/blank name column 
value, the result could be different depending on the backend database (whether 
it sorts NULLs first or last).
So I think it is better to sort on a non-null column value.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HIVE-16912) Improve table validator's performance against Oracle

2017-06-15 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-16912:


 Summary: Improve table validator's performance against Oracle
 Key: HIVE-16912
 URL: https://issues.apache.org/jira/browse/HIVE-16912
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 3.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam
Priority: Minor


Currently, this validator uses DatabaseMetaData.getTables(), which takes on the 
order of minutes to return because of the number of SYSTEM tables present in 
Oracle.
Providing a schema name via a system property would limit the number of tables 
being returned and thus improve performance.
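
A small JDBC sketch of the idea (the connection URL, credentials, and the 
system property name below are placeholders for illustration, not the actual 
schematool configuration): restricting getTables() to one schema avoids 
scanning Oracle's SYSTEM schemas.

{code}
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;

public class TableValidatorSketch {
  static void listTables(Connection conn, String schemaOwner) throws Exception {
    DatabaseMetaData md = conn.getMetaData();
    // Passing a schema pattern (e.g. the HMS schema owner) instead of null keeps
    // Oracle from returning every SYSTEM table, which is what makes the call slow.
    try (ResultSet rs = md.getTables(null, schemaOwner, "%", new String[] {"TABLE"})) {
      while (rs.next()) {
        System.out.println(rs.getString("TABLE_NAME"));
      }
    }
  }

  public static void main(String[] args) throws Exception {
    // Placeholder JDBC URL, credentials, and property name for illustration only.
    try (Connection conn =
        DriverManager.getConnection("jdbc:oracle:thin:@//host:1521/XE", "hive", "hive")) {
      listTables(conn, System.getProperty("hms.schema.owner", "HIVE"));
    }
  }
}
{code}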



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HIVE-16729) Improve location validator to check for blank paths.

2017-05-22 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-16729:


 Summary: Improve location validator to check for blank paths.
 Key: HIVE-16729
 URL: https://issues.apache.org/jira/browse/HIVE-16729
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 3.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam
Priority: Minor


Currently, the schema tool location validator succeeds even when the locations 
for Hive tables/partitions are paths like
hdfs://myhost.com:8020/
hdfs://myhost.com:8020

where there is actually no "real" path. Having the validator report such paths 
would be beneficial in preventing runtime errors.
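Although the validator itself runs in Java, the condition is easy to see as a 
query against the metastore backend; a minimal sketch in MySQL syntax, assuming 
the standard SDS table holds the locations:
{code}
-- Flag locations that consist only of a scheme and authority,
-- e.g. hdfs://myhost.com:8020 or hdfs://myhost.com:8020/, with no real path.
SELECT SD_ID, LOCATION
FROM SDS
WHERE LOCATION REGEXP '^[a-z][a-z0-9+.-]*://[^/]+/?$';
{code}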



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HIVE-16697) Schema table validator should return a sorted list of missing tables

2017-05-17 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-16697:


 Summary: Schema table validator should return a sorted list of 
missing tables 
 Key: HIVE-16697
 URL: https://issues.apache.org/jira/browse/HIVE-16697
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 3.0.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam
Priority: Minor


SchemaTool's validate feature has a schema table validator that checks whether 
the HMS schema is missing tables. This validator reports a list of tables that 
are deemed to be missing. This list is currently unsorted (it depends on the 
order of CREATE TABLE statements in the schema file, which differs across DB 
schema files). This makes it hard to write a unit test that parses the results.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HIVE-16316) Prepare master branch for 3.0.0 development.

2017-03-28 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-16316:


 Summary: Prepare master branch for 3.0.0 development.
 Key: HIVE-16316
 URL: https://issues.apache.org/jira/browse/HIVE-16316
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 2.3.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


branch-2 is now being used for 2.3.0 development. The build files will need to 
reflect this change.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HIVE-16301) Prepare branch-2 for 2.3 development.

2017-03-26 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-16301:


 Summary: Prepare branch-2 for 2.3 development.
 Key: HIVE-16301
 URL: https://issues.apache.org/jira/browse/HIVE-16301
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 2.3.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam


branch-2 is now being used for 2.3.0 development. The build files will need to 
reflect this change.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HIVE-16257) Intermittent issue with incorrect resultset with Spark

2017-03-20 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-16257:


 Summary: Intermittent issue with incorrect resultset with Spark
 Key: HIVE-16257
 URL: https://issues.apache.org/jira/browse/HIVE-16257
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.1.0
Reporter: Naveen Gangam


This issue is highly intermittent and only seems to occur with the Spark 
engine. The following is the test case.
{code}
drop table if exists test_hos_sample;
create table test_hos_sample (name string, val1 decimal(18,2), val2 
decimal(20,3));
insert into test_hos_sample values 
('test1',101.12,102.123),('test1',101.12,102.123),('test2',102.12,103.234),('test1',101.12,102.123),('test3',103.52,102.345),('test3',103.52,102.345),('test3',103.52,102.345),('test3',103.52,102.345),('test3',103.52,102.345),('test4',104.52,104.456),('test4',104.52,104.456),('test5',105.52,105.567),('test3',103.52,102.345),('test5',105.52,105.567);

set hive.execution.engine=spark;
select  name, val1,val2 from test_hos_sample group by name, val1, val2;
{code}

Expected Results:
{code}
name    val1    val2
test5   105.52  105.567
test3   103.52  102.345
test1   101.12  102.123
test4   104.52  104.456
test2   102.12  103.234
{code}

Incorrect results once in a while:
{code}
name    val1    val2
test5   105.52  105.567
test3   103.52  102.345
test1   104.52  102.123
test4   104.52  104.456
test2   102.12  103.234
{code}

1) Not reproducible with HoMR.
2) Not an issue when running from spark-shell.
3) Occurs with both parquet and text file formats (haven't tried other formats).
4) Occurs both when the table data is within an encryption zone and when it is 
outside one.
5) Even in clusters where this is reproducible, it occurs only about once in 20 
runs or more.
6) Occurs with both beeline and the Hive CLI.




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

