[jira] [Commented] (HIVE-8065) Support HDFS encryption functionality on Hive

2015-05-06 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529998#comment-14529998
 ] 

Brock Noland commented on HIVE-8065:


In that case the results of the query are staged in ez1. 

 Support HDFS encryption functionality on Hive
 -

 Key: HIVE-8065
 URL: https://issues.apache.org/jira/browse/HIVE-8065
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.13.1
Reporter: Sergio Peña
Assignee: Sergio Peña
  Labels: Hive-Scrum

 The new encryption support in HDFS makes Hive incompatible and unusable when 
 this feature is used.
 HDFS encryption is designed so that a user can configure different 
 encryption zones (or directories) for multi-tenant environments. Each 
 encryption zone has its own encryption key, such as an AES-128 or AES-256 key. 
 For security compliance, HDFS does not allow files to be moved or renamed 
 between encryption zones. Renames are allowed only inside the same encryption 
 zone. A copy is allowed between encryption zones.
 See HDFS-6134 for more details about the HDFS encryption design.
 Hive currently uses a scratch directory (like /tmp/$user/$random). This 
 scratch directory is used for the output of intermediate data (between MR 
 jobs) and for the final output of the Hive query, which is later moved to the 
 table directory location.
 If Hive tables are in different encryption zones than the scratch directory, 
 then Hive won't be able to rename those files/directories, which makes 
 Hive unusable.
 To handle this problem, we can change the scratch directory of the 
 query/statement to be inside the same encryption zone as the table directory 
 location. This way, the rename will succeed. 
 Also, for statements that move files between encryption zones (e.g. LOAD 
 DATA), a copy may be executed instead of a rename. This adds 
 overhead when copying large data files, but it won't break encryption on 
 Hive.
 Another security consideration is joins in SELECT statements. If Hive joins 
 tables with different encryption key strengths, then the results of 
 the select might break the security compliance of the tables. Say two 
 tables with 128-bit and 256-bit encryption are joined; the temporary 
 results might then be stored in the 128-bit encryption zone, which conflicts 
 with the requirements of the table encrypted with 256 bits.
 To fix this, Hive should select the scratch directory with the 
 strongest encryption, so that intermediate data is stored temporarily with no 
 compliance issues.
 For instance:
 {noformat}
 SELECT * FROM table-aes128 t1 JOIN table-aes256 t2 WHERE t1.id == t2.id;
 {noformat}
 - This should use a scratch directory (or staging directory) inside the 
 table-aes256 table location.
 {noformat}
 INSERT OVERWRITE TABLE table-unencrypted SELECT * FROM table-aes1;
 {noformat}
 - This should use a scratch directory inside the table-aes1 location.
 {noformat}
 FROM table-unencrypted
 INSERT OVERWRITE TABLE table-aes128 SELECT id, name
 INSERT OVERWRITE TABLE table-aes256 SELECT id, name
 {noformat}
 - This should use a scratch directory in each of the tables' locations.
 - The first SELECT will have its scratch directory in the table-aes128 directory.
 - The second SELECT will have its scratch directory in the table-aes256 directory.
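
The scratch-directory rule for joins can be sketched as below. This is only an illustration under stated assumptions, not Hive's implementation: the class StagingDirChooser, the TableLocation holder, the paths, and the ".hive-staging" directory name are all hypothetical, and an unencrypted table is modeled as a key length of 0 bits.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: choose where a query stages its intermediate data.
final class StagingDirChooser {

    // Hypothetical holder for a table directory and the key length (in bits)
    // of its encryption zone; 0 models an unencrypted table.
    static final class TableLocation {
        final String path;
        final int keyBits;

        TableLocation(String path, int keyBits) {
            this.path = path;
            this.keyBits = keyBits;
        }
    }

    // Returns a staging directory inside the location of the table with the
    // strongest encryption, so temporary data is never kept under a weaker key.
    static String choose(List<TableLocation> tables) {
        TableLocation strongest = tables.stream()
                .max(Comparator.comparingInt((TableLocation t) -> t.keyBits))
                .orElseThrow(() -> new IllegalArgumentException("no tables given"));
        // ".hive-staging" is an assumed name used only for this example.
        return strongest.path + "/.hive-staging";
    }
}
```

For the join example, passing both table locations yields a directory under the AES-256 table. For an INSERT from an encrypted table into an unencrypted one, the encrypted source location is chosen. The multi-insert case, where each SELECT stages under its own target table, is not covered by this sketch.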



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9644) CASE comparison operator rotation optimization

2015-05-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-9644:
---
Attachment: HIVE-9644.2.patch

Extended patch with folding of when udf. 

 CASE comparison operator rotation optimization
 --

 Key: HIVE-9644
 URL: https://issues.apache.org/jira/browse/HIVE-9644
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 0.14.0, 1.0.0, 1.1.0
Reporter: Gopal V
Assignee: Ashutosh Chauhan
 Attachments: HIVE-9644.1.patch, HIVE-9644.2.patch, HIVE-9644.patch


 Constant folding doesn't kick in for some automatically generated 
 query patterns that look like this:
 {code}
 hive> explain select count(1) from store_sales where (case ss_sold_date when 
 '1998-01-01' then 1 else null end)=1;
 {code}
 This should get rewritten by pushing the equality into the case branches.
 {code}
 select count(1) from store_sales where (case ss_sold_date when '1998-01-01' 
 then 1=1 else null=1 end);
 {code}
 Ending up with a simplified filter condition, resolving itself as 
 {code}
 select count(1) from store_sales where ss_sold_date= '1998-01-01' ;
 {code}
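
The rewrite steps above can be sketched in a few lines. This is an illustrative model under stated assumptions, not the optimizer code from the patch: the class CaseFolding and its methods are hypothetical, constants are limited to nullable integers, and the result is only valid as a WHERE filter, where NULL and FALSE both reject the row.

```java
// Hypothetical sketch of pushing "= rhs" into the branches of a simple CASE.
final class CaseFolding {

    // SQL three-valued equality on nullable integer constants:
    // the result is null (UNKNOWN) if either side is NULL.
    static Boolean sqlEquals(Integer left, Integer right) {
        if (left == null || right == null) {
            return null;
        }
        return left.equals(right);
    }

    // Folds  (CASE col WHEN lit THEN thenVal ELSE elseVal END) = rhs
    // into a simpler filter string where this sketch knows how to.
    static String foldFilter(String col, String lit,
                             Integer thenVal, Integer elseVal, Integer rhs) {
        // Step 1: push the equality into both branches and fold the constants.
        boolean thenPasses = Boolean.TRUE.equals(sqlEquals(thenVal, rhs));
        boolean elsePasses = Boolean.TRUE.equals(sqlEquals(elseVal, rhs));

        // Step 2: simplify the CASE in filter context.
        if (thenPasses && !elsePasses) {
            return col + " = " + lit;   // only the WHEN branch lets rows through
        }
        if (!thenPasses && !elsePasses) {
            return "false";             // no branch lets rows through
        }
        // Other combinations are left unfolded in this sketch.
        return "(case " + col + " when " + lit + " then " + thenVal
                + " else " + elseVal + " end) = " + rhs;
    }
}
```

Calling foldFilter("ss_sold_date", "'1998-01-01'", 1, null, 1) follows the same path as the example: the THEN branch folds to true, the ELSE branch folds to NULL, and the filter reduces to a plain equality on ss_sold_date.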





[jira] [Updated] (HIVE-10623) Implement hive cli options using beeline functionality

2015-05-06 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-10623:

Attachment: HIVE-10623.patch

Hi [~xuefuz], could you help review this jira? Thank you!

 Implement hive cli options using beeline functionality
 --

 Key: HIVE-10623
 URL: https://issues.apache.org/jira/browse/HIVE-10623
 Project: Hive
  Issue Type: Sub-task
  Components: CLI
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
 Attachments: HIVE-10623.patch


 We need to support the original hive cli options for the purpose of backwards 
 compatibility. 





[jira] [Commented] (HIVE-10625) Handle Authorization for 'select expr' hive queries in SQL Standard Authorization

2015-05-06 Thread Nemon Lou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530251#comment-14530251
 ] 

Nemon Lou commented on HIVE-10625:
--

And the error log in Hive Server:
{code}
2015-05-06 17:09:51,935 | ERROR | HiveServer2-Handler-Pool: Thread-114 | 
FAILED: HiveAuthzPluginException Error getting object from metastore for Object 
[type=TABLE_OR_VIEW, name=_dummy_database._dummy_table]
org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthzPluginException:
 Error getting object from metastore for Object [type=TABLE_OR_VIEW, 
name=_dummy_database._dummy_table]
at 
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLAuthorizationUtils.throwGetObjErr(SQLAuthorizationUtils.java:310)
at 
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLAuthorizationUtils.isOwner(SQLAuthorizationUtils.java:272)
at 
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLAuthorizationUtils.getPrivilegesFromMetaStore(SQLAuthorizationUtils.java:212)
at 
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizationValidator.checkPrivileges(SQLStdHiveAuthorizationValidator.java:141)
at 
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizationValidator.checkPrivileges(SQLStdHiveAuthorizationValidator.java:93)
at 
org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthorizerImpl.checkPrivileges(HiveAuthorizerImpl.java:85)
at org.apache.hadoop.hive.ql.Driver.doAuthorizationV2(Driver.java:770)
at org.apache.hadoop.hive.ql.Driver.doAuthorization(Driver.java:565)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:467)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1112)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1106)
at 
org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:102)
at 
org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:202)
at 
org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:379)
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:366)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at 
org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
at 
org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
at 
org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1672)
at 
org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
at com.sun.proxy.$Proxy18.executeStatementAsync(Unknown Source)
at 
org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:271)
at 
org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:415)
at 
org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
at 
org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at 
org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:692)
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: NoSuchObjectException(message:_dummy_database._dummy_table table not 
found)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_result$get_table_resultStandardScheme.read(ThriftHiveMetastore.java:32085)
at 

[jira] [Updated] (HIVE-10625) Handle Authorization for 'select expr' hive queries in SQL Standard Authorization

2015-05-06 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-10625:
-
Description: 
Hive internally rewrites this 'select expression' query into 'select 
expression from _dummy_database._dummy_table', where these dummy db and table 
are temp entities for the current query.
SQL Standard Authorization needs to handle these special objects.

Typing select reverse(123); in beeline produces this error:
{code}
Error: Error while compiling statement: FAILED: HiveAuthzPluginException Error 
getting object from metastore for Object [type=TABLE_OR_VIEW, 
name=_dummy_database._dummy_table] (state=42000,code=4)
{code}

  was:
Hive internally rewrites this 'select expression' query into 'select 
expression from _dummy_database._dummy_table', where these dummy db and table 
are temp entities for the current query.
SQL Standard Authorization needs to handle these special objects.

Typing select reverse(123); in beeline : 
,will get this error :
{code}
Error: Error while compiling statement: FAILED: HiveAuthzPluginException Error 
getting object from metastore for Object [type=TABLE_OR_VIEW, 
name=_dummy_database._dummy_table] (state=42000,code=4)
{code}


 Handle Authorization for  'select expr' hive queries in  SQL Standard 
 Authorization
 -

 Key: HIVE-10625
 URL: https://issues.apache.org/jira/browse/HIVE-10625
 Project: Hive
  Issue Type: Bug
  Components: Authorization, SQLStandardAuthorization
Affects Versions: 1.1.0
Reporter: Nemon Lou

 Hive internally rewrites this 'select expression' query into 'select 
 expression from _dummy_database._dummy_table', where these dummy db and 
 table are temp entities for the current query.
 SQL Standard Authorization needs to handle these special objects.
 Typing select reverse(123); in beeline produces this error:
 {code}
 Error: Error while compiling statement: FAILED: HiveAuthzPluginException 
 Error getting object from metastore for Object [type=TABLE_OR_VIEW, 
 name=_dummy_database._dummy_table] (state=42000,code=4)
 {code}
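
One way an authorizer could handle these special objects is to skip them before looking anything up in the metastore. The sketch below is a guess at that idea, not the fix adopted for this issue: the class DummyObjectFilter and its methods are hypothetical, and only the two dummy names are taken from the error message above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: drop the internal dummy table from the list of
// objects that need a privilege check.
final class DummyObjectFilter {

    // Names as they appear in the error message.
    static final String DUMMY_DB = "_dummy_database";
    static final String DUMMY_TABLE = "_dummy_table";

    // True if the qualified name refers to the internal placeholder table.
    static boolean isDummy(String qualifiedName) {
        return (DUMMY_DB + "." + DUMMY_TABLE).equals(qualifiedName);
    }

    // Returns the inputs that still need authorization, in their original order.
    static List<String> objectsToAuthorize(List<String> inputs) {
        List<String> result = new ArrayList<>();
        for (String name : inputs) {
            if (!isDummy(name)) {
                result.add(name);
            }
        }
        return result;
    }
}
```

With this filter, a query such as select reverse(123) would produce an empty list of objects to check, so no metastore lookup for the dummy table would be attempted.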





[jira] [Commented] (HIVE-9456) Make Hive support unicode with MSSQL as Metastore backend

2015-05-06 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530263#comment-14530263
 ] 

Sushanth Sowmyan commented on HIVE-9456:


Precommit tests skipped this saying that attachment id 12729762 had also 
already been tested. As I prepared to upload a .3.patch identical to the 
.2.patch, I got to thinking - there's no point in the precommit tests running 
on this patch - this patch affects only mssql, which the precommit tests do not 
use.

[~thejas], what do you think? Should we go ahead and submit this as-is?

 Make Hive support unicode with MSSQL as Metastore backend
 -

 Key: HIVE-9456
 URL: https://issues.apache.org/jira/browse/HIVE-9456
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0
Reporter: Xiaobing Zhou
Assignee: Xiaobing Zhou
 Attachments: HIVE-9456.1.patch, HIVE-9456.2.patch, 
 HIVE-9456.branch-1.2.patch


 There are significant issues with Unicode support when Hive uses MSSQL as the 
 metastore backend, since MSSQL handles the varchar and nvarchar datatypes 
 differently. The Hive 0.14 metastore MSSQL script DDL used varchar as the 
 datatype, which can't handle multi-byte/Unicode characters, e.g., Chinese 
 characters. This JIRA tracks the implementation of Unicode support in that 
 case.





[jira] [Commented] (HIVE-9582) HCatalog should use IMetaStoreClient interface

2015-05-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530169#comment-14530169
 ] 

Hive QA commented on HIVE-9582:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12730401/HIVE-9582.8.patch

{color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 8900 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_parts
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_unencrypted_tbl
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_load_data_to_encrypted_tables
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_select_read_only_encrypted_tbl
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_disallow_transform
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_droppartition
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_sba_drop_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_alterpart_loc
org.apache.hadoop.hive.ql.security.TestStorageBasedClientSideAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropDatabase
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropTable
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropView
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProviderWithACL.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbFailure
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbSuccess
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableFailure
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableSuccess
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessing
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessingCustomSetWhitelistAppend
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3751/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3751/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3751/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 24 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12730401 - PreCommit-HIVE-TRUNK-Build

 HCatalog should use IMetaStoreClient interface
 --

 Key: HIVE-9582
 URL: https://issues.apache.org/jira/browse/HIVE-9582
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore
Affects Versions: 0.14.0, 0.13.1
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
  Labels: hcatalog, metastore, rolling_upgrade
 Fix For: 1.2.0

 Attachments: HIVE-9582.1.patch, HIVE-9582.2.patch, HIVE-9582.3.patch, 
 HIVE-9582.4.patch, HIVE-9582.5.patch, HIVE-9582.6.patch, HIVE-9582.7.patch, 
 HIVE-9582.8.patch, HIVE-9583.1.patch


 Hive uses IMetaStoreClient and it makes using RetryingMetaStoreClient easy. 
 Hence during a failure, the client retries and possibly succeeds. But 
 HCatalog has long been using HiveMetaStoreClient directly and hence failures 
 are costly, especially if they are during the commit stage of a job. It's also 
 not possible to do rolling upgrade of MetaStore Server.
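
The benefit of coding against an interface can be sketched with a generic retrying wrapper. This is an illustration of the pattern only, not the code of RetryingMetaStoreClient or HCatalog: the class RetryingProxy, the Client interface, and FlakyClient are hypothetical stand-ins, and the sketch retries immediately with no delay or reconnect.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical sketch of retrying behind an interface; not Hive code.
final class RetryingProxy {

    // Hypothetical stand-in for a metastore client interface.
    interface Client {
        String getTable(String name);
    }

    // Test double that fails a fixed number of times before succeeding.
    static final class FlakyClient implements Client {
        int calls = 0;
        private final int failures;

        FlakyClient(int failures) {
            this.failures = failures;
        }

        @Override
        public String getTable(String name) {
            calls++;
            if (calls <= failures) {
                throw new IllegalStateException("transient failure " + calls);
            }
            return "table:" + name;
        }
    }

    // Wraps target in a dynamic proxy that re-invokes a failed call up to
    // maxAttempts times and rethrows the last failure if every attempt fails.
    @SuppressWarnings("unchecked")
    static <T> T wrap(final T target, Class<T> iface, final int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        InvocationHandler handler = new InvocationHandler() {
            @Override
            public Object invoke(Object proxy, Method method, Object[] args)
                    throws Throwable {
                Throwable last = null;
                for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                    try {
                        return method.invoke(target, args);
                    } catch (InvocationTargetException e) {
                        last = e.getCause(); // the exception thrown by the target
                    }
                }
                throw last;
            }
        };
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }
}
```

Because callers hold only the Client interface, the retrying wrapper can be substituted without changing them; code that instantiates a concrete client class directly has no such seam.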





[jira] [Updated] (HIVE-10484) Vectorization : RuntimeException Big Table Retained Mapping duplicate column

2015-05-06 Thread Damien Carol (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damien Carol updated HIVE-10484:

Description: 
With vectorization and tez enabled TPC-DS Q70 fails with 
{code}
Caused by: java.lang.RuntimeException: Big Table Retained Mapping duplicate 
column 6 in ordered column map {6=(value column: 6, type name: int), 21=(value 
column: 21, type name: float), 22=(value column: 22, type name: int)} when 
adding value column 6, type int
at 
org.apache.hadoop.hive.ql.exec.vector.VectorColumnOrderedMap.add(VectorColumnOrderedMap.java:97)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorColumnOutputMapping.add(VectorColumnOutputMapping.java:40)
at 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.determineCommonInfo(VectorMapJoinCommonOperator.java:320)
at 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.init(VectorMapJoinCommonOperator.java:254)
at 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.init(VectorMapJoinGenerateResultOperator.java:89)
at 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.init(VectorMapJoinInnerGenerateResultOperator.java:97)
at 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.init(VectorMapJoinInnerLongOperator.java:79)
... 49 more
{code}

Query 
{code:sql}
 select s_state
   from  (select s_state as s_state, sum(ss_net_profit),
 rank() over ( partition by s_state order by 
sum(ss_net_profit) desc) as ranking
  from   store_sales, store, date_dim
  where  d_month_seq between 1193 and 1193+11
and date_dim.d_date_sk = store_sales.ss_sold_date_sk
and store.s_store_sk  = store_sales.ss_store_sk
  group by s_state
 ) tmp1
   where ranking = 5
{code}

  was:
With vectorization and tez enabled TPC-DS Q70 fails with 
{code}
Caused by: java.lang.RuntimeException: Big Table Retained Mapping duplicate 
column 6 in ordered column map {6=(value column: 6, type name: int), 21=(value 
column: 21, type name: float), 22=(value column: 22, type name: int)} when 
adding value column 6, type int
at 
org.apache.hadoop.hive.ql.exec.vector.VectorColumnOrderedMap.add(VectorColumnOrderedMap.java:97)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorColumnOutputMapping.add(VectorColumnOutputMapping.java:40)
at 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.determineCommonInfo(VectorMapJoinCommonOperator.java:320)
at 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.init(VectorMapJoinCommonOperator.java:254)
at 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.init(VectorMapJoinGenerateResultOperator.java:89)
at 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.init(VectorMapJoinInnerGenerateResultOperator.java:97)
at 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.init(VectorMapJoinInnerLongOperator.java:79)
... 49 more
{code}

Query 
{code}
 select s_state
   from  (select s_state as s_state, sum(ss_net_profit),
 rank() over ( partition by s_state order by 
sum(ss_net_profit) desc) as ranking
  from   store_sales, store, date_dim
  where  d_month_seq between 1193 and 1193+11
and date_dim.d_date_sk = store_sales.ss_sold_date_sk
and store.s_store_sk  = store_sales.ss_store_sk
  group by s_state
 ) tmp1
   where ranking = 5
{code}


 Vectorization : RuntimeException Big Table Retained Mapping duplicate column
 --

 Key: HIVE-10484
 URL: https://issues.apache.org/jira/browse/HIVE-10484
 Project: Hive
  Issue Type: Bug
  Components: Tez, Vectorization
Affects Versions: 1.2.0
Reporter: Mostafa Mokhtar
Assignee: Matt McCline
 Fix For: 1.2.0

 Attachments: HIVE-10484.01.patch


 With vectorization and tez enabled TPC-DS Q70 fails with 
 {code}
 Caused by: java.lang.RuntimeException: Big Table Retained Mapping duplicate 
 column 6 in ordered column map {6=(value column: 6, type name: int), 
 21=(value column: 21, type name: float), 22=(value column: 22, type name: 
 int)} when adding value column 6, type int
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorColumnOrderedMap.add(VectorColumnOrderedMap.java:97)
   at 
 

[jira] [Commented] (HIVE-9644) CASE comparison operator rotation optimization

2015-05-06 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530188#comment-14530188
 ] 

Gopal V commented on HIVE-9644:
---

The patch does not seem to include the comparison operator tree rotation. Perhaps 
we should leave this JIRA open and open another one to hold the CASE/WHEN 
folding?

 CASE comparison operator rotation optimization
 --

 Key: HIVE-9644
 URL: https://issues.apache.org/jira/browse/HIVE-9644
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 0.14.0, 1.0.0, 1.1.0
Reporter: Gopal V
Assignee: Ashutosh Chauhan
 Attachments: HIVE-9644.1.patch, HIVE-9644.2.patch, HIVE-9644.patch


 Constant folding doesn't kick in for some automatically generated 
 query patterns that look like this:
 {code}
 hive> explain select count(1) from store_sales where (case ss_sold_date when 
 '1998-01-01' then 1 else null end)=1;
 {code}
 This should get rewritten by pushing the equality into the case branches.
 {code}
 select count(1) from store_sales where (case ss_sold_date when '1998-01-01' 
 then 1=1 else null=1 end);
 {code}
 Ending up with a simplified filter condition, resolving itself as 
 {code}
 select count(1) from store_sales where ss_sold_date= '1998-01-01' ;
 {code}





[jira] [Commented] (HIVE-10308) Vectorization execution throws java.lang.IllegalArgumentException: Unsupported complex type: MAP

2015-05-06 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530238#comment-14530238
 ] 

Matt McCline commented on HIVE-10308:
-

I ended up fixing this problem with one of my other changes because it caused 
some Q file failures.

 Vectorization execution throws java.lang.IllegalArgumentException: 
 Unsupported complex type: MAP
 

 Key: HIVE-10308
 URL: https://issues.apache.org/jira/browse/HIVE-10308
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Affects Versions: 0.14.0, 0.13.1, 1.2.0, 1.1.0
Reporter: Selina Zhang
Assignee: Matt McCline
 Attachments: HIVE-10308.1.patch


 Steps to reproduce:
 
 CREATE TABLE test_orc (a INT, b MAP<INT, STRING>) STORED AS ORC;
 INSERT OVERWRITE TABLE test_orc SELECT 1, MAP(1, "one", 2, "two") FROM src 
 LIMIT 1;
 CREATE TABLE test(key INT) ;
 INSERT OVERWRITE TABLE test SELECT 1 FROM src LIMIT 1;
 set hive.vectorized.execution.enabled=true;
 set hive.auto.convert.join=false;
 select l.key from test l left outer join test_orc r on (l.key= r.a) where r.a 
 is not null;
 Stack trace:
 
 Caused by: java.lang.IllegalArgumentException: Unsupported complex type: MAP
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory.genVectorExpressionWritable(VectorExpressionWriterFactory.java:456)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory.processVectorInspector(VectorExpressionWriterFactory.java:1191)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.initializeOp(VectorReduceSinkOperator.java:58)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:442)
   at 
 org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:198)





[jira] [Commented] (HIVE-9582) HCatalog should use IMetaStoreClient interface

2015-05-06 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530266#comment-14530266
 ] 

Sushanth Sowmyan commented on HIVE-9582:


Committed to master and branch-1.2. Thanks Thiruvel & Thejas!

 HCatalog should use IMetaStoreClient interface
 --

 Key: HIVE-9582
 URL: https://issues.apache.org/jira/browse/HIVE-9582
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore
Affects Versions: 0.14.0, 0.13.1
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
  Labels: hcatalog, metastore, rolling_upgrade
 Fix For: 1.2.0

 Attachments: HIVE-9582.1.patch, HIVE-9582.2.patch, HIVE-9582.3.patch, 
 HIVE-9582.4.patch, HIVE-9582.5.patch, HIVE-9582.6.patch, HIVE-9582.7.patch, 
 HIVE-9582.8.patch, HIVE-9583.1.patch


 Hive uses IMetaStoreClient and it makes using RetryingMetaStoreClient easy. 
 Hence during a failure, the client retries and possibly succeeds. But 
 HCatalog has long been using HiveMetaStoreClient directly and hence failures 
 are costly, especially if they are during the commit stage of a job. It's also 
 not possible to do rolling upgrade of MetaStore Server.





[jira] [Commented] (HIVE-9845) HCatSplit repeats information making input split data size huge

2015-05-06 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530279#comment-14530279
 ] 

Sushanth Sowmyan commented on HIVE-9845:


Note: the precommit link, when it runs, will be at 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3761

 HCatSplit repeats information making input split data size huge
 ---

 Key: HIVE-9845
 URL: https://issues.apache.org/jira/browse/HIVE-9845
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Reporter: Rohini Palaniswamy
Assignee: Mithun Radhakrishnan
 Attachments: HIVE-9845.1.patch, HIVE-9845.3.patch, HIVE-9845.4.patch, 
 HIVE-9845.5.patch


 Pig on Tez jobs with larger tables hit PIG-4443. Running on HDFS data that 
 has even triple the number of splits (100K+ splits and tasks) does not hit 
 that issue.
 {code}
 HCatBaseInputFormat.java:
   // Call getSplit on the InputFormat, create an
   // HCatSplit for each underlying split
   // NumSplits is 0 for our purposes
   org.apache.hadoop.mapred.InputSplit[] baseSplits =
       inputFormat.getSplits(jobConf, 0);
   for (org.apache.hadoop.mapred.InputSplit split : baseSplits) {
     splits.add(new HCatSplit(partitionInfo, split, allCols));
   }
 {code}
 Each HCatSplit duplicates the partition schema and the table schema.
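
A rough size estimate shows why the duplication matters at this split count. The sketch below is plain arithmetic under assumed numbers, not a measurement and not the fix in the attached patches: the class SplitSizeEstimate is hypothetical, and the "shared" variant models a hypothetical layout in which the schema is serialized once instead of once per split.

```java
// Hypothetical back-of-the-envelope model of serialized split data size.
final class SplitSizeEstimate {

    // Every split carries its own copy of the schema bytes.
    static long duplicated(long numSplits, long schemaBytes, long splitBytes) {
        return numSplits * (schemaBytes + splitBytes);
    }

    // The schema is stored once and each split carries only its own bytes.
    static long shared(long numSplits, long schemaBytes, long splitBytes) {
        return schemaBytes + numSplits * splitBytes;
    }
}
```

With assumed values of 100,000 splits, 10,240 schema bytes, and 200 bytes of split-specific data, the duplicated layout comes to 1,044,000,000 bytes, while the shared layout comes to 20,010,240 bytes.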





[jira] [Commented] (HIVE-10597) Relative path doesn't work with CREATE TABLE LOCATION 'relative/path'

2015-05-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530294#comment-14530294
 ] 

Hive QA commented on HIVE-10597:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12730515/HIVE-10597.02.patch

{color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 8902 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_parts
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_unencrypted_tbl
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_load_data_to_encrypted_tables
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_select_read_only_encrypted_tbl
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_disallow_transform
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_droppartition
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_sba_drop_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_alterpart_loc
org.apache.hadoop.hive.ql.security.TestStorageBasedClientSideAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropDatabase
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropTable
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropView
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProviderWithACL.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbFailure
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbSuccess
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableFailure
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableSuccess
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessing
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessingCustomSetWhitelistAppend
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3752/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3752/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3752/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 24 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12730515 - PreCommit-HIVE-TRUNK-Build

 Relative path doesn't work with CREATE TABLE LOCATION 'relative/path'
 -

 Key: HIVE-10597
 URL: https://issues.apache.org/jira/browse/HIVE-10597
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Reuben Kuhnert
Assignee: Reuben Kuhnert
Priority: Minor
 Attachments: HIVE-10597.01.patch, HIVE-10597.02.patch


 {code}
 0: jdbc:hive2://a2110.halxg.cloudera.com:1000> CREATE EXTERNAL TABLE IF NOT 
 EXISTS mydb.employees3 like mydb.employees LOCATION 'data/stock';
 Error: Error while processing statement: FAILED: Execution Error, return code 
 1 from org.apache.hadoop.hive.ql.exec.DDLTask. 
 MetaException(message:java.lang.NullPointerException) (state=08S01,code=1)
 0: jdbc:hive2://a2110.halxg.cloudera.com:1000> CREATE EXTERNAL TABLE IF NOT 
 EXISTS mydb.employees3 like mydb.employees LOCATION '/user/hive/data/stock';
 No rows affected (0.369 seconds)
 {code}





[jira] [Commented] (HIVE-10539) set default value of hive.repl.task.factory

2015-05-06 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14530264#comment-14530264
 ] 

Sushanth Sowmyan commented on HIVE-10539:
-

None of the precommit test failures here are related. Committing.

 set default value of hive.repl.task.factory
 ---

 Key: HIVE-10539
 URL: https://issues.apache.org/jira/browse/HIVE-10539
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-10539.1.patch, HIVE-10539.2.patch, 
 HIVE-10539.3.patch


 hive.repl.task.factory does not have a default value set. It should be set to 
 org.apache.hive.hcatalog.api.repl.exim.EximReplicationTaskFactory.





[jira] [Commented] (HIVE-10484) Vectorization : RuntimeException Big Table Retained Mapping duplicate column

2015-05-06 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14530198#comment-14530198
 ] 

Matt McCline commented on HIVE-10484:
-

Found the problem -- relatively simple fix.

 Vectorization : RuntimeException Big Table Retained Mapping duplicate column
 --

 Key: HIVE-10484
 URL: https://issues.apache.org/jira/browse/HIVE-10484
 Project: Hive
  Issue Type: Bug
  Components: Tez, Vectorization
Affects Versions: 1.2.0
Reporter: Mostafa Mokhtar
Assignee: Matt McCline
 Fix For: 1.2.0

 Attachments: HIVE-10484.01.patch


 With vectorization and tez enabled TPC-DS Q70 fails with 
 {code}
 Caused by: java.lang.RuntimeException: Big Table Retained Mapping duplicate 
 column 6 in ordered column map {6=(value column: 6, type name: int), 
 21=(value column: 21, type name: float), 22=(value column: 22, type name: 
 int)} when adding value column 6, type int
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorColumnOrderedMap.add(VectorColumnOrderedMap.java:97)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorColumnOutputMapping.add(VectorColumnOutputMapping.java:40)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.determineCommonInfo(VectorMapJoinCommonOperator.java:320)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.init(VectorMapJoinCommonOperator.java:254)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.init(VectorMapJoinGenerateResultOperator.java:89)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.init(VectorMapJoinInnerGenerateResultOperator.java:97)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.init(VectorMapJoinInnerLongOperator.java:79)
   ... 49 more
 {code}
 Query 
 {code:sql}
  select s_state
from  (select s_state as s_state, sum(ss_net_profit),
  rank() over ( partition by s_state order by 
 sum(ss_net_profit) desc) as ranking
   from   store_sales, store, date_dim
   where  d_month_seq between 1193 and 1193+11
 and date_dim.d_date_sk = 
 store_sales.ss_sold_date_sk
 and store.s_store_sk  = store_sales.ss_store_sk
   group by s_state
  ) tmp1
where ranking = 5
 {code}





[jira] [Commented] (HIVE-9582) HCatalog should use IMetaStoreClient interface

2015-05-06 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14530234#comment-14530234
 ] 

Sushanth Sowmyan commented on HIVE-9582:


The test failures don't seem related to this patch. Will go ahead and commit.

 HCatalog should use IMetaStoreClient interface
 --

 Key: HIVE-9582
 URL: https://issues.apache.org/jira/browse/HIVE-9582
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore
Affects Versions: 0.14.0, 0.13.1
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
  Labels: hcatalog, metastore, rolling_upgrade
 Fix For: 1.2.0

 Attachments: HIVE-9582.1.patch, HIVE-9582.2.patch, HIVE-9582.3.patch, 
 HIVE-9582.4.patch, HIVE-9582.5.patch, HIVE-9582.6.patch, HIVE-9582.7.patch, 
 HIVE-9582.8.patch, HIVE-9583.1.patch


 Hive uses IMetaStoreClient, which makes it easy to use RetryingMetaStoreClient. 
 Hence, during a failure, the client retries and possibly succeeds. But 
 HCatalog has long been using HiveMetaStoreClient directly, so failures 
 are costly, especially if they occur during the commit stage of a job. It's also 
 not possible to do a rolling upgrade of the MetaStore Server.
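The retry behaviour described above can be illustrated with a minimal sketch. This is not Hive's RetryingMetaStoreClient: `RetryingClient`, `FlakyClient`, `max_retries` and `delay_seconds` are hypothetical names chosen for illustration, and the code only shows the general idea of wrapping a client so that a transient failure is retried instead of surfacing immediately.

```python
import time


class RetryingClient:
    """Hypothetical wrapper: forwards method calls to a client and retries on failure."""

    def __init__(self, client, max_retries=3, delay_seconds=0.0):
        self._client = client
        self._max_retries = max_retries
        self._delay_seconds = delay_seconds

    def __getattr__(self, name):
        # Only reached for attributes not defined on the wrapper itself.
        target = getattr(self._client, name)
        if not callable(target):
            return target

        def call_with_retries(*args, **kwargs):
            last_error = None
            for attempt in range(self._max_retries + 1):
                try:
                    return target(*args, **kwargs)
                except ConnectionError as error:
                    last_error = error
                    if attempt < self._max_retries:
                        time.sleep(self._delay_seconds)
            raise last_error

        return call_with_retries


class FlakyClient:
    """Hypothetical stand-in for a metastore client: fails twice, then succeeds."""

    def __init__(self):
        self.calls = 0

    def get_table(self, name):
        self.calls += 1
        if self.calls < 3:
            raise ConnectionError("metastore unavailable")
        return {"table": name}


flaky = FlakyClient()
client = RetryingClient(flaky, max_retries=3)
result = client.get_table("employees")
```

A caller that used `FlakyClient` directly would see the first `ConnectionError`; through the wrapper the same call succeeds on the third attempt.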





[jira] [Commented] (HIVE-10592) ORC file dump in JSON format

2015-05-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14529991#comment-14529991
 ] 

Hive QA commented on HIVE-10592:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12730381/HIVE-10592.3.patch

{color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 8901 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_parts
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_unencrypted_tbl
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_load_data_to_encrypted_tables
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_select_read_only_encrypted_tbl
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_disallow_transform
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_droppartition
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_sba_drop_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_alterpart_loc
org.apache.hadoop.hive.ql.security.TestStorageBasedClientSideAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropDatabase
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropTable
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropView
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProviderWithACL.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbFailure
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbSuccess
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableFailure
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableSuccess
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessing
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessingCustomSetWhitelistAppend
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3747/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3747/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3747/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 24 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12730381 - PreCommit-HIVE-TRUNK-Build

 ORC file dump in JSON format
 

 Key: HIVE-10592
 URL: https://issues.apache.org/jira/browse/HIVE-10592
 Project: Hive
  Issue Type: New Feature
Affects Versions: 1.3.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
 Attachments: HIVE-10592.1.patch, HIVE-10592.2.patch, 
 HIVE-10592.3.patch, HIVE-10592.4.patch


 The ORC file dump uses a custom format. It would be useful to dump ORC metadata in JSON 
 format so that other tools can be built on top of it. 





[jira] [Updated] (HIVE-10568) Select count(distinct()) can have more optimal execution plan

2015-05-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-10568:

Attachment: HIVE-10568.1.patch

Rebased on HIVE-10607.  Patch is ready for review.
[~jpullokkaran] can you take a look?

 Select count(distinct()) can have more optimal execution plan
 -

 Key: HIVE-10568
 URL: https://issues.apache.org/jira/browse/HIVE-10568
 Project: Hive
  Issue Type: Improvement
  Components: CBO, Logical Optimizer
Affects Versions: 0.6.0, 0.7.0, 0.8.0, 0.9.0, 0.10.0, 0.11.0, 0.12.0, 
 0.13.0, 0.14.0, 1.0.0, 1.1.0
Reporter: Mostafa Mokhtar
Assignee: Ashutosh Chauhan
 Attachments: HIVE-10568.1.patch, HIVE-10568.patch, HIVE-10568.patch


 {code:sql}
 select count(distinct ss_ticket_number) from store_sales;
 {code}
 can be rewritten as
 {code:sql}
 select count(1) from (select distinct ss_ticket_number from store_sales) a;
 {code}
 which may run up to 3x faster.
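The equivalence of the two query forms can be checked on a small data set. The sketch below uses SQLite through Python's `sqlite3` module rather than Hive, and the sample ticket numbers are made up. It also shows one caveat that follows from standard SQL semantics: `count(distinct x)` ignores NULLs, while `select distinct x` keeps NULL as one row, so the two forms agree only when the column contains no NULLs. The source does not say how the proposed optimization handles that case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE store_sales (ss_ticket_number INTEGER)")
conn.executemany(
    "INSERT INTO store_sales VALUES (?)",
    [(1,), (1,), (2,), (3,), (3,), (3,)],  # made-up sample data
)

ORIGINAL = "SELECT count(DISTINCT ss_ticket_number) FROM store_sales"
REWRITTEN = (
    "SELECT count(1) FROM "
    "(SELECT DISTINCT ss_ticket_number FROM store_sales) a"
)

# Without NULLs, both forms count the same distinct values.
original = conn.execute(ORIGINAL).fetchone()[0]
rewritten = conn.execute(REWRITTEN).fetchone()[0]

# With a NULL present, the subquery form counts one extra row.
conn.execute("INSERT INTO store_sales VALUES (NULL)")
original_with_null = conn.execute(ORIGINAL).fetchone()[0]
rewritten_with_null = conn.execute(REWRITTEN).fetchone()[0]

print(original, rewritten, original_with_null, rewritten_with_null)
```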





[jira] [Commented] (HIVE-9743) Incorrect result set for vectorized left outer join

2015-05-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14530084#comment-14530084
 ] 

Hive QA commented on HIVE-9743:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12730688/HIVE-9743.09.patch

{color:red}ERROR:{color} -1 due to 27 failed/errored test(s), 8905 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_parts
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables_compact
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_leftsemi_mapjoin_orig
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_unencrypted_tbl
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_load_data_to_encrypted_tables
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_select_read_only_encrypted_tbl
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_disallow_transform
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_droppartition
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_sba_drop_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_alterpart_loc
org.apache.hadoop.hive.ql.security.TestStorageBasedClientSideAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropDatabase
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropTable
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropView
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProviderWithACL.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbFailure
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbSuccess
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableFailure
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableSuccess
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessing
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessingCustomSetWhitelistAppend
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3750/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3750/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3750/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 27 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12730688 - PreCommit-HIVE-TRUNK-Build

 Incorrect result set for vectorized left outer join
 ---

 Key: HIVE-9743
 URL: https://issues.apache.org/jira/browse/HIVE-9743
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.14.0
Reporter: N Campbell
Assignee: Matt McCline
 Attachments: HIVE-9743.01.patch, HIVE-9743.02.patch, 
 HIVE-9743.03.patch, HIVE-9743.04.patch, HIVE-9743.05.patch, 
 HIVE-9743.06.patch, HIVE-9743.08.patch, HIVE-9743.09.patch


 This query is supposed to return 3 rows, and does when run without Tez, but 
 returns 2 rows when run with Tez.
 select tjoin1.rnum, tjoin1.c1, tjoin1.c2, tjoin2.c2 as c2j2 from tjoin1 left 
 outer join tjoin2 on ( tjoin1.c1 = tjoin2.c1 and tjoin1.c2  15 )
 tjoin1.rnum   tjoin1.c1   tjoin1.c2   c2j2
 1 20  25  null
 2 null  50  null
 instead of
 tjoin1.rnum   tjoin1.c1   tjoin1.c2   c2j2
 0 10  15  null
 1 

[jira] [Assigned] (HIVE-10515) Create tests to cover existing (supported) Hive CLI functionality

2015-05-06 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu reassigned HIVE-10515:
---

Assignee: Ferdinand Xu

 Create tests to cover existing (supported) Hive CLI functionality
 -

 Key: HIVE-10515
 URL: https://issues.apache.org/jira/browse/HIVE-10515
 Project: Hive
  Issue Type: Sub-task
  Components: CLI
Affects Versions: 0.10.0
Reporter: Xuefu Zhang
Assignee: Ferdinand Xu

 After removing HiveServer1, Hive CLI's functionality is reduced to its 
 original use case, a thick client application. Let's identify this functionality so that we 
 maintain it when the implementation is changed.





[jira] [Updated] (HIVE-9743) Incorrect result set for vectorized left outer join

2015-05-06 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-9743:
---
Attachment: HIVE-9743.091.patch

 Incorrect result set for vectorized left outer join
 ---

 Key: HIVE-9743
 URL: https://issues.apache.org/jira/browse/HIVE-9743
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.14.0
Reporter: N Campbell
Assignee: Matt McCline
 Attachments: HIVE-9743.01.patch, HIVE-9743.02.patch, 
 HIVE-9743.03.patch, HIVE-9743.04.patch, HIVE-9743.05.patch, 
 HIVE-9743.06.patch, HIVE-9743.08.patch, HIVE-9743.09.patch, 
 HIVE-9743.091.patch


 This query is supposed to return 3 rows, and does when run without Tez, but 
 returns 2 rows when run with Tez.
 select tjoin1.rnum, tjoin1.c1, tjoin1.c2, tjoin2.c2 as c2j2 from tjoin1 left 
 outer join tjoin2 on ( tjoin1.c1 = tjoin2.c1 and tjoin1.c2  15 )
 tjoin1.rnum   tjoin1.c1   tjoin1.c2   c2j2
 1 20  25  null
 2 null  50  null
 instead of
 tjoin1.rnum   tjoin1.c1   tjoin1.c2   c2j2
 0 10  15  null
 1 20  25  null
 2 null  50  null
 create table  if not exists TJOIN1 (RNUM int , C1 int, C2 int)
  STORED AS orc ;
 0|10|15
 1|20|25
 2|\N|50
 create table  if not exists TJOIN2 (RNUM int , C1 int, C2 char(2))
 ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
  STORED AS TEXTFILE ;
 0|10|BB
 1|15|DD
 2|\N|EE
 3|10|FF
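The expected three-row result can be reproduced outside Hive. The sketch below loads the rows listed above into SQLite through Python's `sqlite3` module and runs the same left outer join. The comparison operator in the ON clause is missing from the text above, so `>` is used here purely as an assumption. Whatever the operator, a left outer join has to keep every row of `tjoin1`, which is why a two-row result is wrong.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tjoin1 (rnum INTEGER, c1 INTEGER, c2 INTEGER)")
conn.execute("CREATE TABLE tjoin2 (rnum INTEGER, c1 INTEGER, c2 TEXT)")
conn.executemany(
    "INSERT INTO tjoin1 VALUES (?, ?, ?)",
    [(0, 10, 15), (1, 20, 25), (2, None, 50)],
)
conn.executemany(
    "INSERT INTO tjoin2 VALUES (?, ?, ?)",
    [(0, 10, "BB"), (1, 15, "DD"), (2, None, "EE"), (3, 10, "FF")],
)

rows = conn.execute(
    "SELECT tjoin1.rnum, tjoin1.c1, tjoin1.c2, tjoin2.c2 AS c2j2 "
    "FROM tjoin1 LEFT OUTER JOIN tjoin2 "
    "ON (tjoin1.c1 = tjoin2.c1 AND tjoin1.c2 > 15) "  # '>' is an assumption
    "ORDER BY tjoin1.rnum"
).fetchall()

for row in rows:
    print(row)
```

Row 0 fails the assumed `c2 > 15` test, row 1 has no matching `c1` in `tjoin2`, and row 2 has a NULL `c1`, so all three rows come back with a NULL `c2j2`.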





[jira] [Updated] (HIVE-10548) Remove dependency to s3 repository in root pom

2015-05-06 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-10548:
-
Attachment: HIVE-10548.2.patch

 Remove dependency to s3 repository in root pom
 --

 Key: HIVE-10548
 URL: https://issues.apache.org/jira/browse/HIVE-10548
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Reporter: Szehon Ho
Assignee: Szehon Ho
 Attachments: HIVE-10548.2.patch, HIVE-10548.patch








[jira] [Updated] (HIVE-10484) Vectorization : RuntimeException Big Table Retained Mapping duplicate column

2015-05-06 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-10484:

Attachment: HIVE-10484.01.patch

 Vectorization : RuntimeException Big Table Retained Mapping duplicate column
 --

 Key: HIVE-10484
 URL: https://issues.apache.org/jira/browse/HIVE-10484
 Project: Hive
  Issue Type: Bug
  Components: Tez, Vectorization
Affects Versions: 1.2.0
Reporter: Mostafa Mokhtar
Assignee: Matt McCline
 Fix For: 1.2.0

 Attachments: HIVE-10484.01.patch


 With vectorization and tez enabled TPC-DS Q70 fails with 
 {code}
 Caused by: java.lang.RuntimeException: Big Table Retained Mapping duplicate 
 column 6 in ordered column map {6=(value column: 6, type name: int), 
 21=(value column: 21, type name: float), 22=(value column: 22, type name: 
 int)} when adding value column 6, type int
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorColumnOrderedMap.add(VectorColumnOrderedMap.java:97)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorColumnOutputMapping.add(VectorColumnOutputMapping.java:40)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.determineCommonInfo(VectorMapJoinCommonOperator.java:320)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.init(VectorMapJoinCommonOperator.java:254)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.init(VectorMapJoinGenerateResultOperator.java:89)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.init(VectorMapJoinInnerGenerateResultOperator.java:97)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.init(VectorMapJoinInnerLongOperator.java:79)
   ... 49 more
 {code}
 Query 
 {code}
  select s_state
from  (select s_state as s_state, sum(ss_net_profit),
  rank() over ( partition by s_state order by 
 sum(ss_net_profit) desc) as ranking
   from   store_sales, store, date_dim
   where  d_month_seq between 1193 and 1193+11
 and date_dim.d_date_sk = 
 store_sales.ss_sold_date_sk
 and store.s_store_sk  = store_sales.ss_store_sk
   group by s_state
  ) tmp1
where ranking = 5
 {code}





[jira] [Updated] (HIVE-10435) Make HiveSession implementation pluggable through configuration

2015-05-06 Thread Akshay Goyal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akshay Goyal updated HIVE-10435:

Attachment: HIVE-10435.1.patch

 Make HiveSession implementation pluggable through configuration
 ---

 Key: HIVE-10435
 URL: https://issues.apache.org/jira/browse/HIVE-10435
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Amareshwari Sriramadasu
Assignee: Akshay Goyal
 Attachments: HIVE-10435.1.patch


 SessionManager in CLIService creates and keeps track of HiveSession instances. 
 Right now, it creates HiveSessionImpl, which is one implementation of 
 HiveSession. This improvement request is to make the implementation pluggable 
 through configuration so that other implementations can be used.
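One common way to make an implementation pluggable is to read a class name from configuration and instantiate it dynamically. The sketch below illustrates that general pattern only; it is not the attached patch. The configuration key `example.session.impl` and the function `create_session` are hypothetical, and standard-library classes stand in for session implementations so that the example is self-contained.

```python
import importlib

# Hypothetical configuration key, not an actual Hive property.
SESSION_IMPL_KEY = "example.session.impl"

# Stand-in for the default implementation.
DEFAULT_IMPL = "types.SimpleNamespace"


def create_session(conf, **kwargs):
    """Instantiate the class named in conf, falling back to the default."""
    class_name = conf.get(SESSION_IMPL_KEY, DEFAULT_IMPL)
    module_name, _, attr = class_name.rpartition(".")
    cls = getattr(importlib.import_module(module_name), attr)
    return cls(**kwargs)


# No override configured: the default implementation is used.
default_session = create_session({}, user="alice")

# Override configured: a different class is loaded by name.
custom_session = create_session(
    {SESSION_IMPL_KEY: "collections.OrderedDict"}, user="alice"
)
```

The caller never names a concrete class, so swapping implementations only requires changing the configured value.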





[jira] [Updated] (HIVE-10453) HS2 leaking open file descriptors when using UDFs

2015-05-06 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-10453:

Attachment: HIVE-10453.2.patch

 HS2 leaking open file descriptors when using UDFs
 -

 Key: HIVE-10453
 URL: https://issues.apache.org/jira/browse/HIVE-10453
 Project: Hive
  Issue Type: Bug
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen
 Attachments: HIVE-10453.1.patch, HIVE-10453.2.patch


 1. Create a custom function:
 CREATE FUNCTION myfunc AS 'someudfclass' using jar 'hdfs:///tmp/myudf.jar';
 2. Create a simple JDBC client that connects and 
 runs a simple query that uses the function, such as:
 select myfunc(col1) from sometable
 3. Disconnect.
 Check the open files for HiveServer2 with:
 lsof -p HSProcID | grep myudf.jar
 You will see the leak as:
 {noformat}
 java  28718 ychen  txt  REG1,4741 212977666 
 /private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar
 java  28718 ychen  330r REG1,4741 212977666 
 /private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar
 {noformat}
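The manual `lsof | grep` check above could be scripted so that handle counts are compared before and after a client disconnects. The sketch below is one way to do that, under stated assumptions: `count_open_handles` is a hypothetical helper, the sample lines are made up for illustration (they are not the output shown above), and paths containing spaces are not handled.

```python
def count_open_handles(lsof_output, file_name):
    """Count lsof output lines whose last field is a path ending in file_name."""
    count = 0
    for line in lsof_output.splitlines():
        fields = line.split()
        if fields and fields[-1].endswith("/" + file_name):
            count += 1
    return count


# Made-up sample in the general shape of lsof output.
sample = """\
java 28718 ychen txt REG 1,4 741 212977666 /tmp/abc_resources/myudf.jar
java 28718 ychen 330r REG 1,4 741 212977666 /tmp/abc_resources/myudf.jar
java 28718 ychen 12r REG 1,4 100 212977000 /tmp/other.jar
"""

leaked = count_open_handles(sample, "myudf.jar")
print(leaked)
```

If the count for the UDF jar keeps growing across connect and disconnect cycles, descriptors are being leaked.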





[jira] [Updated] (HIVE-10626) Spark paln need to be updated [Spark Branch]

2015-05-06 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-10626:

Attachment: HIVE-10626-spark.patch

The patch contains the diff between the basic patch and the latest patch.

 Spark paln need to be updated [Spark Branch]
 

 Key: HIVE-10626
 URL: https://issues.apache.org/jira/browse/HIVE-10626
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: spark-branch
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-10626-spark.patch


 The basic patch for [HIVE-8858] was committed; the latest patch still needs to be committed.





[jira] [Commented] (HIVE-10435) Make HiveSession implementation pluggable through configuration

2015-05-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14530621#comment-14530621
 ] 

Hive QA commented on HIVE-10435:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12730823/HIVE-10435.1.patch

{color:red}ERROR:{color} -1 due to 28 failed/errored test(s), 8901 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_parts
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_unencrypted_tbl
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_load_data_to_encrypted_tables
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_select_read_only_encrypted_tbl
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_disallow_transform
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_droppartition
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_sba_drop_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_alterpart_loc
org.apache.hadoop.hive.ql.security.TestStorageBasedClientSideAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropDatabase
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropTable
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropView
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProviderWithACL.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbFailure
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbSuccess
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableFailure
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableSuccess
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessing
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessingCustomSetWhitelistAppend
org.apache.hive.hcatalog.streaming.TestStreaming.testAddPartition
org.apache.hive.hcatalog.streaming.TestStreaming.testRemainingTransactions
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3779/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3779/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3779/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 28 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12730823 - PreCommit-HIVE-TRUNK-Build

 Make HiveSession implementation pluggable through configuration
 ---

 Key: HIVE-10435
 URL: https://issues.apache.org/jira/browse/HIVE-10435
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Amareshwari Sriramadasu
Assignee: Akshay Goyal
 Attachments: HIVE-10435.1.patch


 SessionManager in CLIService creates and keeps track of HiveSession instances. 
 Right now, it creates HiveSessionImpl, which is one implementation of 
 HiveSession. This improvement request is to make the implementation pluggable 
 through configuration so that other implementations can be used.





[jira] [Updated] (HIVE-10308) Vectorization execution throws java.lang.IllegalArgumentException: Unsupported complex type: MAP

2015-05-06 Thread Damien Carol (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damien Carol updated HIVE-10308:

Description: 
Steps to reproduce:
{code:sql}
CREATE TABLE test_orc (a INT, b MAP<INT, STRING>) STORED AS ORC;
INSERT OVERWRITE TABLE test_orc SELECT 1, MAP(1, "one", 2, "two") FROM src 
LIMIT 1;
CREATE TABLE test(key INT) ;
INSERT OVERWRITE TABLE test SELECT 1 FROM src LIMIT 1;

set hive.vectorized.execution.enabled=true;
set hive.auto.convert.join=false;

select l.key from test l left outer join test_orc r on (l.key= r.a) where r.a 
is not null;
{code}
Stack trace:
{noformat}
Caused by: java.lang.IllegalArgumentException: Unsupported complex type: MAP
    at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory.genVectorExpressionWritable(VectorExpressionWriterFactory.java:456)
    at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory.processVectorInspector(VectorExpressionWriterFactory.java:1191)
    at org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.initializeOp(VectorReduceSinkOperator.java:58)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
    at org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:442)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:198)
{noformat}


  was:
Steps to reproduce:

CREATE TABLE test_orc (a INT, b MAP<INT, STRING>) STORED AS ORC;
INSERT OVERWRITE TABLE test_orc SELECT 1, MAP(1, "one", 2, "two") FROM src LIMIT 1;
CREATE TABLE test(key INT) ;
INSERT OVERWRITE TABLE test SELECT 1 FROM src LIMIT 1;

set hive.vectorized.execution.enabled=true;
set hive.auto.convert.join=false;

select l.key from test l left outer join test_orc r on (l.key= r.a) where r.a is not null;

Stack trace:

Caused by: java.lang.IllegalArgumentException: Unsupported complex type: MAP
    at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory.genVectorExpressionWritable(VectorExpressionWriterFactory.java:456)
    at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory.processVectorInspector(VectorExpressionWriterFactory.java:1191)
    at org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.initializeOp(VectorReduceSinkOperator.java:58)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
    at org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:442)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:198)



 Vectorization execution throws java.lang.IllegalArgumentException: 
 Unsupported complex type: MAP
 

 Key: HIVE-10308
 URL: https://issues.apache.org/jira/browse/HIVE-10308
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Affects Versions: 0.14.0, 0.13.1, 1.2.0, 1.1.0
Reporter: Selina Zhang
Assignee: Matt McCline
 Attachments: HIVE-10308.1.patch


 Steps to reproduce:
 {code:sql}
 CREATE TABLE test_orc (a INT, b MAP<INT, STRING>) STORED AS ORC;
 INSERT OVERWRITE TABLE test_orc SELECT 1, MAP(1, "one", 2, "two") FROM src LIMIT 1;
 CREATE TABLE test(key INT) ;
 INSERT OVERWRITE TABLE test SELECT 1 FROM src LIMIT 1;
 set hive.vectorized.execution.enabled=true;
 set hive.auto.convert.join=false;
 select l.key from test l left outer join test_orc r on (l.key= r.a) where r.a is not null;
 {code}
 Stack trace:
 {noformat}
 Caused by: java.lang.IllegalArgumentException: Unsupported complex type: MAP
   at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory.genVectorExpressionWritable(VectorExpressionWriterFactory.java:456)
   at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory.processVectorInspector(VectorExpressionWriterFactory.java:1191)
   at org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.initializeOp(VectorReduceSinkOperator.java:58)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
   at 
 

[jira] [Commented] (HIVE-10484) Vectorization : RuntimeException Big Table Retained Mapping duplicate column

2015-05-06 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531019#comment-14531019
 ] 

Matt McCline commented on HIVE-10484:
-

I just tried creating a smaller repro and did not succeed.  I'll try taking the 
monster query and making a Q file...

 Vectorization : RuntimeException Big Table Retained Mapping duplicate column
 --

 Key: HIVE-10484
 URL: https://issues.apache.org/jira/browse/HIVE-10484
 Project: Hive
  Issue Type: Bug
  Components: Tez, Vectorization
Affects Versions: 1.2.0
Reporter: Mostafa Mokhtar
Assignee: Matt McCline
 Fix For: 1.2.0

 Attachments: HIVE-10484.01.patch, HIVE-10484.02.patch


 With vectorization and Tez enabled, TPC-DS Q70 fails with:
 {code}
 Caused by: java.lang.RuntimeException: Big Table Retained Mapping duplicate column 6 in ordered column map {6=(value column: 6, type name: int), 21=(value column: 21, type name: float), 22=(value column: 22, type name: int)} when adding value column 6, type int
   at org.apache.hadoop.hive.ql.exec.vector.VectorColumnOrderedMap.add(VectorColumnOrderedMap.java:97)
   at org.apache.hadoop.hive.ql.exec.vector.VectorColumnOutputMapping.add(VectorColumnOutputMapping.java:40)
   at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.determineCommonInfo(VectorMapJoinCommonOperator.java:320)
   at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.init(VectorMapJoinCommonOperator.java:254)
   at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.init(VectorMapJoinGenerateResultOperator.java:89)
   at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.init(VectorMapJoinInnerGenerateResultOperator.java:97)
   at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.init(VectorMapJoinInnerLongOperator.java:79)
   ... 49 more
 {code}
 Query 
 {code:sql}
 select s_state
 from  (select s_state as s_state, sum(ss_net_profit),
               rank() over (partition by s_state order by sum(ss_net_profit) desc) as ranking
        from   store_sales, store, date_dim
        where  d_month_seq between 1193 and 1193+11
               and date_dim.d_date_sk = store_sales.ss_sold_date_sk
               and store.s_store_sk  = store_sales.ss_store_sk
        group by s_state
       ) tmp1
 where ranking = 5
 {code}





[jira] [Commented] (HIVE-8696) HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.

2015-05-06 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531523#comment-14531523
 ] 

Sushanth Sowmyan commented on HIVE-8696:


Also, on a related note, this section is way too brittle, and will likely break 
because of another exception chaining change sometime in the future, and is 
also very difficult to maintain from a readability perspective.

{code}
-  assertTrue(e.getCause().getMessage().contains(
+  assertTrue(((InvocationTargetException)e.getCause().getCause().getCause()).getTargetException().getMessage().contains(
   "Could not connect to meta store using any of the URIs provided"));
{code}

I'm not going to ask for that change now, given that this patch is itself very 
important and a blocker, but could you please file a follow-up JIRA to clean 
up this test case? :)
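
As a rough sketch of a less brittle check, assuming the test only cares that the expected message appears somewhere in the cause chain, a helper could walk the chain instead of hard-coding three getCause() calls and a cast. The CauseChain class and its method name are hypothetical, not part of the patch or of HIVE-10637.

```java
import java.lang.reflect.InvocationTargetException;

final class CauseChain {
    // Upper bound on the walk so a cyclic cause chain cannot loop forever.
    private static final int MAX_DEPTH = 64;

    // Returns true if any throwable in the cause chain has a message containing the fragment.
    static boolean anyMessageContains(Throwable top, String fragment) {
        Throwable current = top;
        for (int depth = 0; current != null && depth < MAX_DEPTH; depth++) {
            String message = current.getMessage();
            if (message != null && message.contains(fragment)) {
                return true;
            }
            // InvocationTargetException.getCause() returns the target exception,
            // so no special casing is needed for it here.
            current = current.getCause();
        }
        return false;
    }

    public static void main(String[] args) {
        Throwable root = new java.io.IOException(
            "Could not connect to meta store using any of the URIs provided");
        Throwable wrapped = new RuntimeException("outer",
            new RuntimeException(new InvocationTargetException(root)));
        System.out.println(anyMessageContains(wrapped, "Could not connect to meta store"));
    }
}
```

With such a helper the assertion no longer depends on how many layers wrap the original exception, which is the part that tends to change.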

 HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.
 -

 Key: HIVE-8696
 URL: https://issues.apache.org/jira/browse/HIVE-8696
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore
Affects Versions: 0.12.0, 0.13.1
Reporter: Mithun Radhakrishnan
Assignee: Thiruvel Thirumoolan
 Fix For: 1.2.0

 Attachments: HIVE-8696.1.patch, HIVE-8696.2.patch, HIVE-8696.3.patch, 
 HIVE-8696.4.patch, HIVE-8696.poc.patch


 The HCatClientHMSImpl doesn't use a RetryingHiveMetastoreClient. Users of the 
 HCatClient API that log in through keytabs will fail without retry, when 
 their TGTs expire.
 The fix is inbound. 





[jira] [Commented] (HIVE-8696) HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.

2015-05-06 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531598#comment-14531598
 ] 

Thiruvel Thirumoolan commented on HIVE-8696:



Agree. Raised HIVE-10637 to fix it.

 HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.
 -

 Key: HIVE-8696
 URL: https://issues.apache.org/jira/browse/HIVE-8696
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore
Affects Versions: 0.12.0, 0.13.1
Reporter: Mithun Radhakrishnan
Assignee: Thiruvel Thirumoolan
 Fix For: 1.2.0

 Attachments: HIVE-8696.1.patch, HIVE-8696.2.patch, HIVE-8696.3.patch, 
 HIVE-8696.4.patch, HIVE-8696.5.patch, HIVE-8696.poc.patch


 The HCatClientHMSImpl doesn't use a RetryingHiveMetastoreClient. Users of the 
 HCatClient API that log in through keytabs will fail without retry, when 
 their TGTs expire.
 The fix is inbound. 





[jira] [Commented] (HIVE-9743) Incorrect result set for vectorized left outer join

2015-05-06 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531628#comment-14531628
 ] 

Vikram Dixit K commented on HIVE-9743:
--

+1

 Incorrect result set for vectorized left outer join
 ---

 Key: HIVE-9743
 URL: https://issues.apache.org/jira/browse/HIVE-9743
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.14.0
Reporter: N Campbell
Assignee: Matt McCline
 Attachments: HIVE-9743.01.patch, HIVE-9743.02.patch, 
 HIVE-9743.03.patch, HIVE-9743.04.patch, HIVE-9743.05.patch, 
 HIVE-9743.06.patch, HIVE-9743.08.patch, HIVE-9743.09.patch, 
 HIVE-9743.091.patch


 This query is supposed to return 3 rows and will when run without Tez but 
 returns 2 rows when run with Tez.
 select tjoin1.rnum, tjoin1.c1, tjoin1.c2, tjoin2.c2 as c2j2 from tjoin1 left 
 outer join tjoin2 on ( tjoin1.c1 = tjoin2.c1 and tjoin1.c2  15 )
 tjoin1.rnum   tjoin1.c1   tjoin1.c2   c2j2
 1 20  25  null
 2 null  50  null
 instead of
 tjoin1.rnum   tjoin1.c1   tjoin1.c2   c2j2
 0 10  15  null
 1 20  25  null
 2 null  50  null
 create table  if not exists TJOIN1 (RNUM int , C1 int, C2 int)
  STORED AS orc ;
 0|10|15
 1|20|25
 2|\N|50
 create table  if not exists TJOIN2 (RNUM int , C1 int, C2 char(2))
 ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
  STORED AS TEXTFILE ;
 0|10|BB
 1|15|DD
 2|\N|EE
 3|10|FF





[jira] [Commented] (HIVE-10453) HS2 leaking open file descriptors when using UDFs

2015-05-06 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531646#comment-14531646
 ] 

Szehon Ho commented on HIVE-10453:
--

OK, as long as it's not a typo and means something :)  No problem.

 HS2 leaking open file descriptors when using UDFs
 -

 Key: HIVE-10453
 URL: https://issues.apache.org/jira/browse/HIVE-10453
 Project: Hive
  Issue Type: Bug
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen
 Attachments: HIVE-10453.1.patch, HIVE-10453.2.patch


 1. Create a custom function with:
 CREATE FUNCTION myfunc AS 'someudfclass' using jar 'hdfs:///tmp/myudf.jar';
 2. Create a simple JDBC client that connects and runs a simple query that 
 uses the function, such as:
 select myfunc(col1) from sometable
 3. Disconnect.
 Check the open files for HiveServer2 with:
 lsof -p HSProcID | grep myudf.jar
 You will see the leak as:
 {noformat}
 java  28718 ychen  txt  REG1,4741 212977666 
 /private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar
 java  28718 ychen  330r REG1,4741 212977666 
 /private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar
 {noformat}





[jira] [Commented] (HIVE-9644) Fold case when udfs

2015-05-06 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531543#comment-14531543
 ] 

Ashutosh Chauhan commented on HIVE-9644:


Created HIVE-10636 as follow-up

 Fold case  when udfs
 -

 Key: HIVE-9644
 URL: https://issues.apache.org/jira/browse/HIVE-9644
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 0.14.0, 1.0.0, 1.1.0
Reporter: Gopal V
Assignee: Ashutosh Chauhan
 Attachments: HIVE-9644.1.patch, HIVE-9644.2.patch, HIVE-9644.3.patch, 
 HIVE-9644.patch


 Constant folding doesn't kick in for some automatically generated query 
 patterns that look like this:
 {code}
 hive> explain select count(1) from store_sales where (case ss_sold_date when 
 '1998-01-01' then 1 else null end)=1;
 {code}
 This should get rewritten by pushing the equality into the case branches.
 {code}
 select count(1) from store_sales where (case ss_sold_date when '1998-01-01' 
 then 1=1 else null=1 end);
 {code}
 Ending up with a simplified filter condition, resolving itself as 
 {code}
 select count(1) from store_sales where ss_sold_date= '1998-01-01' ;
 {code}
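
 To see why the rewrite is safe as a filter, the two predicates can be modelled with SQL three-valued logic. The sketch below is illustrative only and assumes the usual rule that a WHERE clause keeps a row only when its predicate is TRUE; the class and method names are hypothetical and unrelated to Hive's optimizer code. Note that the predicates are not identical as values (the CASE form yields NULL where the folded form yields FALSE), but they keep exactly the same rows.

```java
final class CaseFoldingDemo {
    // SQL booleans are modelled as Boolean: TRUE, FALSE, or null for UNKNOWN.

    // (CASE d WHEN c THEN 1 ELSE NULL END) = 1
    static Boolean originalPredicate(String d, String c) {
        boolean whenMatches = d != null && c != null && d.equals(c);
        Integer caseValue = whenMatches ? Integer.valueOf(1) : null;
        // Comparing NULL with 1 yields UNKNOWN.
        return caseValue == null ? null : Boolean.valueOf(caseValue == 1);
    }

    // d = c
    static Boolean foldedPredicate(String d, String c) {
        if (d == null || c == null) {
            return null;
        }
        return d.equals(c);
    }

    // A WHERE clause keeps a row only when the predicate evaluates to TRUE.
    static boolean keepsRow(Boolean predicate) {
        return Boolean.TRUE.equals(predicate);
    }

    public static void main(String[] args) {
        String constant = "1998-01-01";
        String[] samples = {"1998-01-01", "1998-01-02", null};
        for (String d : samples) {
            System.out.println(d + " original=" + keepsRow(originalPredicate(d, constant))
                + " folded=" + keepsRow(foldedPredicate(d, constant)));
        }
    }
}
```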





[jira] [Commented] (HIVE-8696) HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.

2015-05-06 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531607#comment-14531607
 ] 

Sushanth Sowmyan commented on HIVE-8696:


Awesome, thanks!

 HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.
 -

 Key: HIVE-8696
 URL: https://issues.apache.org/jira/browse/HIVE-8696
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore
Affects Versions: 0.12.0, 0.13.1
Reporter: Mithun Radhakrishnan
Assignee: Thiruvel Thirumoolan
 Fix For: 1.2.0

 Attachments: HIVE-8696.1.patch, HIVE-8696.2.patch, HIVE-8696.3.patch, 
 HIVE-8696.4.patch, HIVE-8696.5.patch, HIVE-8696.poc.patch


 The HCatClientHMSImpl doesn't use a RetryingHiveMetastoreClient. Users of the 
 HCatClient API that log in through keytabs will fail without retry, when 
 their TGTs expire.
 The fix is inbound. 





[jira] [Commented] (HIVE-9743) Incorrect result set for vectorized left outer join

2015-05-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531621#comment-14531621
 ] 

Hive QA commented on HIVE-9743:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12730767/HIVE-9743.091.patch

{color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 8904 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_parts
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_unencrypted_tbl
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_load_data_to_encrypted_tables
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_select_read_only_encrypted_tbl
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_disallow_transform
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_droppartition
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_sba_drop_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_alterpart_loc
org.apache.hadoop.hive.ql.security.TestStorageBasedClientSideAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropDatabase
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropTable
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropView
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProviderWithACL.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbFailure
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbSuccess
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableFailure
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableSuccess
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessing
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessingCustomSetWhitelistAppend
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3785/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3785/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3785/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 24 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12730767 - PreCommit-HIVE-TRUNK-Build

 Incorrect result set for vectorized left outer join
 ---

 Key: HIVE-9743
 URL: https://issues.apache.org/jira/browse/HIVE-9743
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.14.0
Reporter: N Campbell
Assignee: Matt McCline
 Attachments: HIVE-9743.01.patch, HIVE-9743.02.patch, 
 HIVE-9743.03.patch, HIVE-9743.04.patch, HIVE-9743.05.patch, 
 HIVE-9743.06.patch, HIVE-9743.08.patch, HIVE-9743.09.patch, 
 HIVE-9743.091.patch


 This query is supposed to return 3 rows and will when run without Tez but 
 returns 2 rows when run with Tez.
 select tjoin1.rnum, tjoin1.c1, tjoin1.c2, tjoin2.c2 as c2j2 from tjoin1 left 
 outer join tjoin2 on ( tjoin1.c1 = tjoin2.c1 and tjoin1.c2  15 )
 tjoin1.rnum   tjoin1.c1   tjoin1.c2   c2j2
 1 20  25  null
 2 null  50  null
 instead of
 tjoin1.rnum   tjoin1.c1   tjoin1.c2   c2j2
 0 10  15  null
 1 20  25  null
 2 null  50  null
 create table  if not exists TJOIN1 (RNUM int , C1 int, C2 int)
  STORED AS orc ;
 0|10|15
 1|20|25
 2|\N|50
 create table  if not exists TJOIN2 (RNUM int , C1 int, C2 char(2))
 

[jira] [Commented] (HIVE-9743) Incorrect result set for vectorized left outer join

2015-05-06 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531638#comment-14531638
 ] 

Matt McCline commented on HIVE-9743:


None of the test failures are related to my changes.

 Incorrect result set for vectorized left outer join
 ---

 Key: HIVE-9743
 URL: https://issues.apache.org/jira/browse/HIVE-9743
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.14.0
Reporter: N Campbell
Assignee: Matt McCline
 Attachments: HIVE-9743.01.patch, HIVE-9743.02.patch, 
 HIVE-9743.03.patch, HIVE-9743.04.patch, HIVE-9743.05.patch, 
 HIVE-9743.06.patch, HIVE-9743.08.patch, HIVE-9743.09.patch, 
 HIVE-9743.091.patch


 This query is supposed to return 3 rows and will when run without Tez but 
 returns 2 rows when run with Tez.
 select tjoin1.rnum, tjoin1.c1, tjoin1.c2, tjoin2.c2 as c2j2 from tjoin1 left 
 outer join tjoin2 on ( tjoin1.c1 = tjoin2.c1 and tjoin1.c2  15 )
 tjoin1.rnum   tjoin1.c1   tjoin1.c2   c2j2
 1 20  25  null
 2 null  50  null
 instead of
 tjoin1.rnum   tjoin1.c1   tjoin1.c2   c2j2
 0 10  15  null
 1 20  25  null
 2 null  50  null
 create table  if not exists TJOIN1 (RNUM int , C1 int, C2 int)
  STORED AS orc ;
 0|10|15
 1|20|25
 2|\N|50
 create table  if not exists TJOIN2 (RNUM int , C1 int, C2 char(2))
 ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
  STORED AS TEXTFILE ;
 0|10|BB
 1|15|DD
 2|\N|EE
 3|10|FF





[jira] [Commented] (HIVE-9743) Incorrect result set for vectorized left outer join

2015-05-06 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531735#comment-14531735
 ] 

Vikram Dixit K commented on HIVE-9743:
--

Thanks Matt and Jason.

 Incorrect result set for vectorized left outer join
 ---

 Key: HIVE-9743
 URL: https://issues.apache.org/jira/browse/HIVE-9743
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.14.0
Reporter: N Campbell
Assignee: Matt McCline
 Fix For: 1.2.0

 Attachments: HIVE-9743.01.patch, HIVE-9743.02.patch, 
 HIVE-9743.03.patch, HIVE-9743.04.patch, HIVE-9743.05.patch, 
 HIVE-9743.06.patch, HIVE-9743.08.patch, HIVE-9743.09.patch, 
 HIVE-9743.091.patch


 This query is supposed to return 3 rows and will when run without Tez but 
 returns 2 rows when run with Tez.
 select tjoin1.rnum, tjoin1.c1, tjoin1.c2, tjoin2.c2 as c2j2 from tjoin1 left 
 outer join tjoin2 on ( tjoin1.c1 = tjoin2.c1 and tjoin1.c2  15 )
 tjoin1.rnum   tjoin1.c1   tjoin1.c2   c2j2
 1 20  25  null
 2 null  50  null
 instead of
 tjoin1.rnum   tjoin1.c1   tjoin1.c2   c2j2
 0 10  15  null
 1 20  25  null
 2 null  50  null
 create table  if not exists TJOIN1 (RNUM int , C1 int, C2 int)
  STORED AS orc ;
 0|10|15
 1|20|25
 2|\N|50
 create table  if not exists TJOIN2 (RNUM int , C1 int, C2 char(2))
 ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
  STORED AS TEXTFILE ;
 0|10|BB
 1|15|DD
 2|\N|EE
 3|10|FF





[jira] [Commented] (HIVE-10628) Incorrect result when vectorized native mapjoin is enabled

2015-05-06 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531739#comment-14531739
 ] 

Sushanth Sowmyan commented on HIVE-10628:
-

After discussion with Gunther, marking this for 1.2.1 instead of 1.2.0: 
vectorization is optional, so there is a workaround. If this makes it in 
before we're done with the RC process, we will include it in 1.2.0 itself, 
but we will not consider it a blocker for the 1.2.0 release.

 Incorrect result when vectorized native mapjoin is enabled
 --

 Key: HIVE-10628
 URL: https://issues.apache.org/jira/browse/HIVE-10628
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.2.0
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Fix For: 1.2.0, 1.3.0


 Incorrect results for this query:
 {noformat}
 select count(*) from store_sales ss join store_returns sr on (sr.sr_item_sk 
 = ss.ss_item_sk and sr.sr_customer_sk = ss.ss_customer_sk and 
 sr.sr_item_sk = ss.ss_item_sk) where ss.ss_net_paid  1000;
 {noformat}





[jira] [Updated] (HIVE-9644) CASE comparison operator rotation optimization

2015-05-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-9644:
---
Attachment: HIVE-9644.3.patch

Updated patch.

 CASE comparison operator rotation optimization
 --

 Key: HIVE-9644
 URL: https://issues.apache.org/jira/browse/HIVE-9644
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 0.14.0, 1.0.0, 1.1.0
Reporter: Gopal V
Assignee: Ashutosh Chauhan
 Attachments: HIVE-9644.1.patch, HIVE-9644.2.patch, HIVE-9644.3.patch, 
 HIVE-9644.patch


 Constant folding doesn't kick in for some automatically generated query 
 patterns that look like this:
 {code}
 hive> explain select count(1) from store_sales where (case ss_sold_date when 
 '1998-01-01' then 1 else null end)=1;
 {code}
 This should get rewritten by pushing the equality into the case branches.
 {code}
 select count(1) from store_sales where (case ss_sold_date when '1998-01-01' 
 then 1=1 else null=1 end);
 {code}
 Ending up with a simplified filter condition, resolving itself as 
 {code}
 select count(1) from store_sales where ss_sold_date= '1998-01-01' ;
 {code}





[jira] [Updated] (HIVE-10559) IndexOutOfBoundsException with RemoveDynamicPruningBySize

2015-05-06 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-10559:
-
Attachment: HIVE-10559.03.patch

Upload 3rd patch for testing.

[~hagleitn] Can you take another look?

 IndexOutOfBoundsException with RemoveDynamicPruningBySize
 -

 Key: HIVE-10559
 URL: https://issues.apache.org/jira/browse/HIVE-10559
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: 1.2.0, 1.3.0
Reporter: Wei Zheng
Assignee: Wei Zheng
 Attachments: HIVE-10559.01.patch, HIVE-10559.02.patch, 
 HIVE-10559.03.patch, q85.q


 The problem can be reproduced by running the script attached.
 Backtrace
 {code}
 2015-04-29 10:34:36,390 ERROR [main]: ql.Driver 
 (SessionState.java:printError(956)) - FAILED: IndexOutOfBoundsException 
 Index: 0, Size: 0
 java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
   at java.util.ArrayList.rangeCheck(ArrayList.java:635)
   at java.util.ArrayList.get(ArrayList.java:411)
   at 
 org.apache.hadoop.hive.ql.optimizer.RemoveDynamicPruningBySize.process(RemoveDynamicPruningBySize.java:61)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
   at 
 org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:77)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
   at 
 org.apache.hadoop.hive.ql.parse.TezCompiler.runStatsDependentOptimizations(TezCompiler.java:281)
   at 
 org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeOperatorPlan(TezCompiler.java:123)
   at 
 org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:102)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10092)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9932)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
   at 
 org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
   at 
 org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1026)
   at 
 org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1000)
   at 
 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.runTest(TestMiniTezCliDriver.java:139)
   at 
 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_q85(TestMiniTezCliDriver.java:123)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at junit.framework.TestCase.runTest(TestCase.java:176)
   at junit.framework.TestCase.runBare(TestCase.java:141)
   at junit.framework.TestResult$1.protect(TestResult.java:122)
   at junit.framework.TestResult.runProtected(TestResult.java:142)
   at junit.framework.TestResult.run(TestResult.java:125)
   at junit.framework.TestCase.run(TestCase.java:129)
   at junit.framework.TestSuite.runTest(TestSuite.java:255)
   at junit.framework.TestSuite.run(TestSuite.java:250)
   at 
 org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
   at 
 

[jira] [Updated] (HIVE-8696) HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.

2015-05-06 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-8696:
---
Attachment: HIVE-8696.5.patch

Sorry, updating with patch 5, which should be complete. I went through many 
iterations testing it locally, and it looks like I missed the complete patch.

 HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.
 -

 Key: HIVE-8696
 URL: https://issues.apache.org/jira/browse/HIVE-8696
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore
Affects Versions: 0.12.0, 0.13.1
Reporter: Mithun Radhakrishnan
Assignee: Thiruvel Thirumoolan
 Fix For: 1.2.0

 Attachments: HIVE-8696.1.patch, HIVE-8696.2.patch, HIVE-8696.3.patch, 
 HIVE-8696.4.patch, HIVE-8696.5.patch, HIVE-8696.poc.patch


 The HCatClientHMSImpl doesn't use a RetryingHiveMetastoreClient. Users of the 
 HCatClient API that log in through keytabs will fail without retry, when 
 their TGTs expire.
 The fix is inbound. 





[jira] [Commented] (HIVE-10453) HS2 leaking open file descriptors when using UDFs

2015-05-06 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531637#comment-14531637
 ] 

Yongzhi Chen commented on HIVE-10453:
-

Thanks [~szehon] for reviewing it. The name CUDFLoader refers to the 
classloader used to load UDF jars; the C stands for class. I am not good with names. 

 HS2 leaking open file descriptors when using UDFs
 -

 Key: HIVE-10453
 URL: https://issues.apache.org/jira/browse/HIVE-10453
 Project: Hive
  Issue Type: Bug
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen
 Attachments: HIVE-10453.1.patch, HIVE-10453.2.patch


 1. Create a custom function with:
 CREATE FUNCTION myfunc AS 'someudfclass' using jar 'hdfs:///tmp/myudf.jar';
 2. Create a simple JDBC client that does the following:
 connect,
 run a simple query that uses the function, such as:
 select myfunc(col1) from sometable
 3. Disconnect.
 Check the open files for HiveServer2 with:
 lsof -p HSProcID | grep myudf.jar
 You will see the leak as:
 {noformat}
 java  28718 ychen  txt  REG1,4741 212977666 
 /private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar
 java  28718 ychen  330r REG1,4741 212977666 
 /private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar
 {noformat}
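
 The thread does not show the patch itself, so the following is only a sketch of one general way a JVM process can keep a jar open: a `URLClassLoader` holds the jar files it has opened until `close()` is called on it. The class `UdfLoaderSketch` and its methods are hypothetical and are not Hive code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;

public class UdfLoaderSketch implements AutoCloseable {
    private final URLClassLoader loader;
    private boolean closed = false;

    public UdfLoaderSketch(URL[] jarUrls) {
        // Delegates to the current class loader first, then searches the given jars.
        this.loader = new URLClassLoader(jarUrls, UdfLoaderSketch.class.getClassLoader());
    }

    /** Loads a class through the session's loader. */
    public Class<?> load(String className) {
        try {
            return Class.forName(className, true, loader);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("class not found: " + className, e);
        }
    }

    public boolean isClosed() {
        return closed;
    }

    /** Releases the jar files the loader has opened; intended to run at disconnect. */
    @Override
    public void close() {
        try {
            loader.close();
            closed = true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

 Under this assumption, a loader created per session and never closed at step 3 would leave entries like the ones in the lsof output above.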





[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.

2015-05-06 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531704#comment-14531704
 ] 

Sergio Peña commented on HIVE-9736:
---

Hi [~mithun]

This patch is causing the above tests to fail due to the change on 
{{Hadoop23Shims.checkFileAccess(FileSystem fs, Iterator<FileStatus> statuses, 
EnumSet<FsAction> actions)}}. 

The line that fails is {{accessMethod.invoke(fs, statuses.next(), 
combine(actions));}}

I am running hadoop 2.6.0, and the FileSystem.access() method accepts a Path 
and FsAction. When I run the code that checks path permissions, I get this 
error: 
{noformat}
hive> explain select * from a join b on a.id = b.id;
FAILED: SemanticException Unable to determine if 
hdfs://localhost:9000/user/hive/warehouse/a is read only: 
java.lang.IllegalArgumentException: argument type mismatch
{noformat}

Is there a follow-up jira for this error?
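
"argument type mismatch" is the message `java.lang.reflect.Method.invoke` typically reports when an argument is not an instance of the declared parameter type. If the reflected `access` method takes a `Path`, then passing `statuses.next()` (a `FileStatus`) would fit that pattern. This is a reading of the report, not a confirmed diagnosis. The sketch below reproduces the mechanism with hypothetical stand-in classes (`FakePath`, `FakeStatus`, `FakeFs`) rather than the Hadoop types.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

public class AccessMismatchSketch {
    // Hypothetical stand-ins for Path, FileStatus, and FileSystem.
    public static class FakePath { }

    public static class FakeStatus {
        public FakePath getPath() {
            return new FakePath();
        }
    }

    public static class FakeFs {
        public void access(FakePath path, String action) {
            // A permission check would go here.
        }
    }

    /** Returns true if reflection rejects firstArg as the first parameter of access(). */
    public static boolean isRejected(Object firstArg) {
        try {
            Method access = FakeFs.class.getMethod("access", FakePath.class, String.class);
            access.invoke(new FakeFs(), firstArg, "READ");
            return false;
        } catch (IllegalArgumentException e) {
            // Thrown when the argument is not an instance of the declared parameter type.
            return true;
        } catch (NoSuchMethodException | IllegalAccessException | InvocationTargetException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In this sketch, passing the status object is rejected, while passing `status.getPath()` is accepted.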






 StorageBasedAuthProvider should batch namenode-calls where possible.
 

 Key: HIVE-9736
 URL: https://issues.apache.org/jira/browse/HIVE-9736
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Security
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Fix For: 1.2.0

 Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, 
 HIVE-9736.4.patch, HIVE-9736.5.patch, HIVE-9736.6.patch


 Consider a table partitioned by 2 keys (dt, region). Say a dt partition could 
 have 1 associated regions. Consider that the user does:
 {code:sql}
 ALTER TABLE my_table DROP PARTITION (dt='20150101');
 {code}
 As things stand now, {{StorageBasedAuthProvider}} will make individual 
 {{DistributedFileSystem.listStatus()}} calls for each partition-directory, 
 and authorize each one separately. It'd be faster to batch the calls, and 
 examine multiple FileStatus objects at once.





[jira] [Commented] (HIVE-10564) webhcat should use webhcat-site.xml properties for controller job submission

2015-05-06 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531707#comment-14531707
 ] 

Thejas M Nair commented on HIVE-10564:
--

[~ekoifman] Thanks a lot for identifying the issue, reviewing the patch and 
verifying the fix! 

 webhcat should use webhcat-site.xml properties for controller job submission
 

 Key: HIVE-10564
 URL: https://issues.apache.org/jira/browse/HIVE-10564
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair
  Labels: TODOC1.2
 Fix For: 1.2.0

 Attachments: HIVE-10564.1.patch, HIVE-10564.2.patch


 webhcat should use webhcat-site.xml in configuration for the 
 TempletonController map-only job that it launches. This will allow users to 
 set any MR/hdfs properties that they want to see used for the controller job.
 NO PRECOMMIT TESTS





[jira] [Commented] (HIVE-10556) ORC PPD schema on read related changes

2015-05-06 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531539#comment-14531539
 ] 

Sushanth Sowmyan commented on HIVE-10556:
-

Per discussion with [~hagleitn], this is okay to defer to 1.2.1, so I am doing 
so. I am still marking it in the tentative list: if it is reviewed and committed 
by the time we finish our RC process, it will be in; otherwise it will track for 1.2.1.

 ORC PPD schema on read related changes
 --

 Key: HIVE-10556
 URL: https://issues.apache.org/jira/browse/HIVE-10556
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0, 1.3.0
Reporter: Prasanth Jayachandran
Assignee: Gopal V

 Follow up for HIVE-10286. Some fixes need to be made for schema on read. 
 For example, Predicate.STRING with value 15 and integer min/max stats of 10,100 
 should return the YES_NO truth value.





[jira] [Commented] (HIVE-10559) IndexOutOfBoundsException with RemoveDynamicPruningBySize

2015-05-06 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531537#comment-14531537
 ] 

Sushanth Sowmyan commented on HIVE-10559:
-

Per discussion with [~hagleitn], this is okay to defer to 1.2.1, so I am doing 
so. I am still marking it in the tentative list: if it is reviewed and committed 
by the time we finish our RC process, it will be in; otherwise it will track for 1.2.1.

 IndexOutOfBoundsException with RemoveDynamicPruningBySize
 -

 Key: HIVE-10559
 URL: https://issues.apache.org/jira/browse/HIVE-10559
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: 1.2.0, 1.3.0
Reporter: Wei Zheng
Assignee: Wei Zheng
 Attachments: HIVE-10559.01.patch, HIVE-10559.02.patch, 
 HIVE-10559.03.patch, q85.q


 The problem can be reproduced by running the script attached.
 Backtrace
 {code}
 2015-04-29 10:34:36,390 ERROR [main]: ql.Driver 
 (SessionState.java:printError(956)) - FAILED: IndexOutOfBoundsException 
 Index: 0, Size: 0
 java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
   at java.util.ArrayList.rangeCheck(ArrayList.java:635)
   at java.util.ArrayList.get(ArrayList.java:411)
   at 
 org.apache.hadoop.hive.ql.optimizer.RemoveDynamicPruningBySize.process(RemoveDynamicPruningBySize.java:61)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
   at 
 org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:77)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
   at 
 org.apache.hadoop.hive.ql.parse.TezCompiler.runStatsDependentOptimizations(TezCompiler.java:281)
   at 
 org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeOperatorPlan(TezCompiler.java:123)
   at 
 org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:102)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10092)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9932)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
   at 
 org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
   at 
 org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1026)
   at 
 org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1000)
   at 
 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.runTest(TestMiniTezCliDriver.java:139)
   at 
 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_q85(TestMiniTezCliDriver.java:123)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at junit.framework.TestCase.runTest(TestCase.java:176)
   at junit.framework.TestCase.runBare(TestCase.java:141)
   at junit.framework.TestResult$1.protect(TestResult.java:122)
   at junit.framework.TestResult.runProtected(TestResult.java:142)
   at junit.framework.TestResult.run(TestResult.java:125)
   at junit.framework.TestCase.run(TestCase.java:129)
   at junit.framework.TestSuite.runTest(TestSuite.java:255)
   at junit.framework.TestSuite.run(TestSuite.java:250)
   at 
 org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
   at 
 

[jira] [Commented] (HIVE-10564) webhcat should use webhcat-site.xml properties for controller job submission

2015-05-06 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531588#comment-14531588
 ] 

Eugene Koifman commented on HIVE-10564:
---

[~thejas] I tested patch 2 - it runs clean.
+1

 webhcat should use webhcat-site.xml properties for controller job submission
 

 Key: HIVE-10564
 URL: https://issues.apache.org/jira/browse/HIVE-10564
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair
  Labels: TODOC1.2
 Fix For: 1.2.0

 Attachments: HIVE-10564.1.patch, HIVE-10564.2.patch


 webhcat should use webhcat-site.xml in configuration for the 
 TempletonController map-only job that it launches. This will allow users to 
 set any MR/hdfs properties that they want to see used for the controller job.
 NO PRECOMMIT TESTS





[jira] [Commented] (HIVE-8696) HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.

2015-05-06 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531670#comment-14531670
 ] 

Thiruvel Thirumoolan commented on HIVE-8696:


I ran all hcatalog tests locally and they passed. Will wait for precommit build 
to run. Hopefully no surprises.

I also updated the review board entry with the latest patch.

 HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.
 -

 Key: HIVE-8696
 URL: https://issues.apache.org/jira/browse/HIVE-8696
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore
Affects Versions: 0.12.0, 0.13.1
Reporter: Mithun Radhakrishnan
Assignee: Thiruvel Thirumoolan
 Fix For: 1.2.0

 Attachments: HIVE-8696.1.patch, HIVE-8696.2.patch, HIVE-8696.3.patch, 
 HIVE-8696.4.patch, HIVE-8696.5.patch, HIVE-8696.poc.patch


 The HCatClientHMSImpl doesn't use a RetryingHiveMetastoreClient. Users of the 
 HCatClient API that log in through keytabs will fail without retry, when 
 their TGTs expire.
 The fix is inbound. 





[jira] [Commented] (HIVE-10638) HIVE-9736 introduces issues with Hadoop23Shims.checkFileAccess

2015-05-06 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531723#comment-14531723
 ] 

Sushanth Sowmyan commented on HIVE-10638:
-

Marking as blocker for branch-1.2.

 HIVE-9736 introduces issues with Hadoop23Shims.checkFileAccess
 --

 Key: HIVE-10638
 URL: https://issues.apache.org/jira/browse/HIVE-10638
 Project: Hive
  Issue Type: Bug
Reporter: Sushanth Sowmyan

 Copy-pasting [~spena]'s comment in HIVE-9736:
 Hi [~mithun]
 This patch is causing the above tests to fail due to the change on 
 {{Hadoop23Shims.checkFileAccess(FileSystem fs, Iterator<FileStatus> statuses, 
 EnumSet<FsAction> actions)}}. 
 The line that fails is {{accessMethod.invoke(fs, statuses.next(), 
 combine(actions));}}
 I am running hadoop 2.6.0, and the FileSystem.access() method accepts a Path 
 and FsAction. When I run the code that checks path permissions, I get this 
 error: 
 {noformat}
 hive> explain select * from a join b on a.id = b.id;
 FAILED: SemanticException Unable to determine if 
 hdfs://localhost:9000/user/hive/warehouse/a is read only: 
 java.lang.IllegalArgumentException: argument type mismatch
 {noformat}
 Is there a follow-up jira for this error?





[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.

2015-05-06 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531722#comment-14531722
 ] 

Sushanth Sowmyan commented on HIVE-9736:


Hi Sergio, thanks for the catch, have filed 
https://issues.apache.org/jira/browse/HIVE-10638 for the same. [~mithun], could 
you please look at that issue? I will look through it too.

 StorageBasedAuthProvider should batch namenode-calls where possible.
 

 Key: HIVE-9736
 URL: https://issues.apache.org/jira/browse/HIVE-9736
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Security
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Fix For: 1.2.0

 Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch, 
 HIVE-9736.4.patch, HIVE-9736.5.patch, HIVE-9736.6.patch


 Consider a table partitioned by 2 keys (dt, region). Say a dt partition could 
 have 1 associated regions. Consider that the user does:
 {code:sql}
 ALTER TABLE my_table DROP PARTITION (dt='20150101');
 {code}
 As things stand now, {{StorageBasedAuthProvider}} will make individual 
 {{DistributedFileSystem.listStatus()}} calls for each partition-directory, 
 and authorize each one separately. It'd be faster to batch the calls, and 
 examine multiple FileStatus objects at once.





[jira] [Commented] (HIVE-9644) CASE comparison operator rotation optimization

2015-05-06 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531519#comment-14531519
 ] 

Ashutosh Chauhan commented on HIVE-9644:


I will update the description of this jira (about folding udfs) since it has 
captured a bit of discussion. Will open another jira for operator rotation.

 CASE comparison operator rotation optimization
 --

 Key: HIVE-9644
 URL: https://issues.apache.org/jira/browse/HIVE-9644
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 0.14.0, 1.0.0, 1.1.0
Reporter: Gopal V
Assignee: Ashutosh Chauhan
 Attachments: HIVE-9644.1.patch, HIVE-9644.2.patch, HIVE-9644.patch


 Constant folding for queries doesn't kick in for some automatically generated 
 query patterns which look like this.
 {code}
 hive> explain select count(1) from store_sales where (case ss_sold_date when 
 '1998-01-01' then 1 else null end)=1;
 {code}
 This should get rewritten by pushing the equality into the case branches.
 {code}
 select count(1) from store_sales where (case ss_sold_date when '1998-01-01' 
 then 1=1 else null=1 end);
 {code}
 Ending up with a simplified filter condition, resolving itself as 
 {code}
 select count(1) from store_sales where ss_sold_date= '1998-01-01' ;
 {code}
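
 The three forms above can be checked against each other with a small model of SQL three-valued logic. This is only an illustrative sketch, not the optimizer code: the class `CaseFoldSketch` and its methods are hypothetical, and a Java `null` stands in for SQL NULL/unknown. The forms do not always produce the same truth value (unknown versus false for a non-matching date), but a WHERE clause keeps a row only when the condition is TRUE, so they keep the same rows.

```java
public class CaseFoldSketch {
    private static final String DATE = "1998-01-01";

    /** (case d when DATE then 1 else null end) = 1 */
    public static Boolean original(String d) {
        Integer caseValue = DATE.equals(d) ? Integer.valueOf(1) : null;
        // Comparing NULL with 1 yields unknown, modelled here as null.
        return caseValue == null ? null : Boolean.valueOf(caseValue == 1);
    }

    /** case d when DATE then 1=1 else null=1 end */
    public static Boolean pushed(String d) {
        // 1=1 is TRUE; null=1 is unknown.
        return DATE.equals(d) ? Boolean.TRUE : null;
    }

    /** d = DATE */
    public static Boolean simplified(String d) {
        return d == null ? null : Boolean.valueOf(DATE.equals(d));
    }

    /** A WHERE clause keeps a row only when the condition is TRUE. */
    public static boolean keeps(Boolean condition) {
        return Boolean.TRUE.equals(condition);
    }
}
```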





[jira] [Updated] (HIVE-9644) Fold case when udfs

2015-05-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-9644:
---
Summary: Fold case  when udfs  (was: CASE comparison operator rotation 
optimization)

 Fold case  when udfs
 -

 Key: HIVE-9644
 URL: https://issues.apache.org/jira/browse/HIVE-9644
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 0.14.0, 1.0.0, 1.1.0
Reporter: Gopal V
Assignee: Ashutosh Chauhan
 Attachments: HIVE-9644.1.patch, HIVE-9644.2.patch, HIVE-9644.3.patch, 
 HIVE-9644.patch


 Constant folding for queries doesn't kick in for some automatically generated 
 query patterns which look like this.
 {code}
 hive> explain select count(1) from store_sales where (case ss_sold_date when 
 '1998-01-01' then 1 else null end)=1;
 {code}
 This should get rewritten by pushing the equality into the case branches.
 {code}
 select count(1) from store_sales where (case ss_sold_date when '1998-01-01' 
 then 1=1 else null=1 end);
 {code}
 Ending up with a simplified filter condition, resolving itself as 
 {code}
 select count(1) from store_sales where ss_sold_date= '1998-01-01' ;
 {code}





[jira] [Commented] (HIVE-10565) LLAP: Native Vector Map Join doesn't handle filtering and matching on LEFT OUTER JOIN repeated key correctly

2015-05-06 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531536#comment-14531536
 ] 

Sushanth Sowmyan commented on HIVE-10565:
-

Per discussion with [~hagleitn], this is okay to defer to 1.2.1, so I am doing 
so. I am still marking it in the tentative list: if it is reviewed and committed 
by the time we finish our RC process, it will be in; otherwise it will track for 1.2.1.

 LLAP: Native Vector Map Join doesn't handle filtering and matching on LEFT 
 OUTER JOIN repeated key correctly
 

 Key: HIVE-10565
 URL: https://issues.apache.org/jira/browse/HIVE-10565
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 1.2.0
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Fix For: 1.2.0, 1.3.0

 Attachments: HIVE-10565.01.patch, HIVE-10565.02.patch, 
 HIVE-10565.03.patch, HIVE-10565.04.patch, HIVE-10565.05.patch, 
 HIVE-10565.06.patch, HIVE-10565.07.patch, HIVE-10565.08.patch


 Filtering can knock out some of the rows for a repeated key, but those 
 knocked out rows need to be included in the LEFT OUTER JOIN result and are 
 currently not when only some rows are filtered out.
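
 A row-at-a-time sketch of the expected semantics, for illustration only. It is not the vectorized map join code: `LeftOuterJoinSketch` is hypothetical, left rows are modelled as `{key, value}` pairs, and the right side is simplified to at most one match per key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.IntPredicate;

public class LeftOuterJoinSketch {
    /**
     * leftRows holds {key, value} pairs. A left row that passes the filter and has a
     * match is joined; every other left row is still emitted, with NULL for the right side.
     */
    public static List<String> join(List<int[]> leftRows,
                                    Map<Integer, String> right,
                                    IntPredicate filter) {
        List<String> out = new ArrayList<>();
        for (int[] row : leftRows) {
            // A filtered-out row behaves like a row with no match; it is not dropped.
            String match = filter.test(row[1]) ? right.get(row[0]) : null;
            out.add(row[0] + "," + row[1] + "," + (match == null ? "NULL" : match));
        }
        return out;
    }
}
```

 With three left rows sharing key 7 and a filter that rejects only the second one, the expected result still has three rows, the second with NULL on the right side.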





[jira] [Updated] (HIVE-10530) Aggregate stats cache: bug fixes for RDBMS path

2015-05-06 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-10530:

Attachment: HIVE-10530.2.patch

 Aggregate stats cache: bug fixes for RDBMS path
 ---

 Key: HIVE-10530
 URL: https://issues.apache.org/jira/browse/HIVE-10530
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 1.2.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 1.2.0

 Attachments: HIVE-10530.1.patch, HIVE-10530.2.patch








[jira] [Commented] (HIVE-10526) CBO (Calcite Return Path): HiveCost epsilon comparison should take row count in to account

2015-05-06 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531344#comment-14531344
 ] 

Ashutosh Chauhan commented on HIVE-10526:
-

+1 LGTM

 CBO (Calcite Return Path): HiveCost epsilon comparison should take row count 
 in to account
 --

 Key: HIVE-10526
 URL: https://issues.apache.org/jira/browse/HIVE-10526
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Affects Versions: 0.12.0
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Fix For: 1.2.0

 Attachments: HIVE-10526.1.patch, HIVE-10526.2.patch, 
 HIVE-10526.3.patch, HIVE-10526.patch








[jira] [Updated] (HIVE-10626) Spark paln need to be updated [Spark Branch]

2015-05-06 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-10626:

Attachment: HIVE-10626.2-spark.patch

If it is not used outside this class's scope, it makes sense. Updated the patch 
with a local variable.

 Spark paln need to be updated [Spark Branch]
 

 Key: HIVE-10626
 URL: https://issues.apache.org/jira/browse/HIVE-10626
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: spark-branch
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-10626-spark.patch, HIVE-10626.1-spark.patch, 
 HIVE-10626.2-spark.patch


 The basic patch for [HIVE-8858] was committed; the latest patch needs to be committed.





[jira] [Commented] (HIVE-6679) HiveServer2 should support configurable the server side socket timeout and keepalive for various transports types where applicable

2015-05-06 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531463#comment-14531463
 ] 

Sushanth Sowmyan commented on HIVE-6679:


Per discussion with Vaibhav, confirming deferring out of branch-1.2

 HiveServer2 should support configurable the server side socket timeout and 
 keepalive for various transports types where applicable
 --

 Key: HIVE-6679
 URL: https://issues.apache.org/jira/browse/HIVE-6679
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0
Reporter: Prasad Mujumdar
Assignee: Navis
  Labels: TODOC1.0, TODOC15
 Fix For: 1.3.0

 Attachments: HIVE-6679.1.patch.txt, HIVE-6679.2.patch.txt, 
 HIVE-6679.3.patch, HIVE-6679.4.patch, HIVE-6679.5.patch, HIVE-6679.6.patch


 HiveServer2 should support a configurable server-side socket read timeout 
 and TCP keep-alive option. The Metastore server already supports this (and so 
 does the old hive server). 
 We now have multiple client connectivity options like Kerberos, Delegation 
 Token (Digest-MD5), Plain SASL, Plain SASL with SSL and raw sockets. The 
 configuration should be applicable to all types (if possible).
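
 At the plain `java.net.Socket` level, the two options named above correspond to `setSoTimeout` and `setKeepAlive`. The sketch below is an assumption-laden illustration, not HiveServer2 code: `SocketOptionsSketch` and `configure` are hypothetical, and how the values would be wired through each transport type is not shown in this thread.

```java
import java.io.UncheckedIOException;
import java.net.Socket;
import java.net.SocketException;

public class SocketOptionsSketch {
    /** Applies the two options to a socket and returns the values read back from it. */
    public static String configure(Socket socket, int readTimeoutMs, boolean keepAlive) {
        try {
            // Blocking reads time out after this many milliseconds; 0 means wait forever.
            socket.setSoTimeout(readTimeoutMs);
            // Enables or disables SO_KEEPALIVE on the connection.
            socket.setKeepAlive(keepAlive);
            return "soTimeout=" + socket.getSoTimeout() + " keepAlive=" + socket.getKeepAlive();
        } catch (SocketException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

 A server would apply this to each accepted client socket, with the timeout and keep-alive values read from configuration.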





[jira] [Comment Edited] (HIVE-10611) Mini tez tests wait for 5 minutes before shutting down

2015-05-06 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531462#comment-14531462
 ] 

Vikram Dixit K edited comment on HIVE-10611 at 5/6/15 9:31 PM:
---

Committed to trunk to alleviate some of the pressure. Thanks Ashutosh for the 
review.


was (Author: vikram.dixit):
Committed to trunk to alleviate some of the pressure.

 Mini tez tests wait for 5 minutes before shutting down
 --

 Key: HIVE-10611
 URL: https://issues.apache.org/jira/browse/HIVE-10611
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.3.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-10611.1.patch


 Currently, at shutdown, the tez mini cluster waits for the session to close 
 before shutting down the cluster. This ends up being 5 minutes - the default 
 value. We can shut down the session to alleviate this situation.





[jira] [Updated] (HIVE-10612) HIVE-10578 broke TestSQLStdHiveAccessControllerHS2 tests

2015-05-06 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-10612:

Attachment: HIVE-10612.2.patch

Attaching duplicate HIVE-10612.2.patch in case the precommit break yesterday 
night marks the .1.patch as processed.

 HIVE-10578 broke TestSQLStdHiveAccessControllerHS2 tests
 

 Key: HIVE-10612
 URL: https://issues.apache.org/jira/browse/HIVE-10612
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-10612.1.patch, HIVE-10612.2.patch


 The change in HIVE-10578 has broken two tests in 
 TestSQLStdHiveAccessControllerHS2 - testConfigProcessing and 
 testConfigProcessingCustomSetWhitelistAppend.





[jira] [Commented] (HIVE-9845) HCatSplit repeats information making input split data size huge

2015-05-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531306#comment-14531306
 ] 

Hive QA commented on HIVE-9845:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12730874/HIVE-9845.6.patch

{color:red}ERROR:{color} -1 due to 25 failed/errored test(s), 8900 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_parts
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_unencrypted_tbl
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_load_data_to_encrypted_tables
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_select_read_only_encrypted_tbl
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_disallow_transform
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_droppartition
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_sba_drop_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_alterpart_loc
org.apache.hadoop.hive.ql.security.TestStorageBasedClientSideAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropDatabase
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropTable
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropView
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProviderWithACL.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbFailure
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbSuccess
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableFailure
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableSuccess
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessing
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessingCustomSetWhitelistAppend
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3783/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3783/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3783/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 25 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12730874 - PreCommit-HIVE-TRUNK-Build

 HCatSplit repeats information making input split data size huge
 ---

 Key: HIVE-9845
 URL: https://issues.apache.org/jira/browse/HIVE-9845
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Reporter: Rohini Palaniswamy
Assignee: Mithun Radhakrishnan
 Attachments: HIVE-9845.1.patch, HIVE-9845.3.patch, HIVE-9845.4.patch, 
 HIVE-9845.5.patch, HIVE-9845.6.patch


 Pig on Tez jobs with larger tables hit PIG-4443. Running on HDFS data which 
 has even triple the number of splits(100K+ splits and tasks) does not hit 
 that issue.
 {code}
 HCatBaseInputFormat.java:
   //Call getSplit on the InputFormat, create an
   //HCatSplit for each underlying split
   //NumSplits is 0 for our purposes
   org.apache.hadoop.mapred.InputSplit[] baseSplits =
       inputFormat.getSplits(jobConf, 0);
   for (org.apache.hadoop.mapred.InputSplit split : baseSplits) {
     splits.add(new HCatSplit(
         partitionInfo,
         split, allCols));
   }
 {code}
 Each hcatSplit duplicates partition schema and table schema.




[jira] [Commented] (HIVE-10626) Spark paln need to be updated [Spark Branch]

2015-05-06 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531320#comment-14531320
 ] 

Jimmy Xiang commented on HIVE-10626:


I see. In order to use toString, we have to add an extra field, which may not 
be good. Probably it is better to create a StringBuilder local variable in 
logSparkPlan. What do you think?

 Spark paln need to be updated [Spark Branch]
 

 Key: HIVE-10626
 URL: https://issues.apache.org/jira/browse/HIVE-10626
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: spark-branch
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-10626-spark.patch, HIVE-10626.1-spark.patch


 The basic patch for [HIVE-8858] was committed; the latest patch needs to be committed.





[jira] [Commented] (HIVE-10563) MiniTezCliDriver tests ordering issues

2015-05-06 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531330#comment-14531330
 ] 

Ashutosh Chauhan commented on HIVE-10563:
-

I have not verified it in code, but that's my understanding too. Only once in the 
q file should be sufficient to cover all queries in it. 

 MiniTezCliDriver tests ordering issues
 --

 Key: HIVE-10563
 URL: https://issues.apache.org/jira/browse/HIVE-10563
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10563.1.patch, HIVE-10563.2.patch


 There are a bunch of tests related to TestMiniTezCliDriver which give 
 ordering issues when run on CentOS/Windows/OSX





[jira] [Commented] (HIVE-10614) schemaTool upgrade from 0.14.0 to 1.3.0 causes failure

2015-05-06 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531360#comment-14531360
 ] 

Sushanth Sowmyan commented on HIVE-10614:
-

Cool, will go ahead and commit HIVE-10614.1.patch to master and 
HIVE-10614.1.branch-0.12.patch to branch-1.2.

[~hsubramaniyan], could you please file another jira and link it to HIVE-7018 
and this jira as a follow-up jira to address removal of LINK_TARGET_ID in a 
manner that is schematool compliant?

 schemaTool upgrade from 0.14.0 to 1.3.0 causes failure
 --

 Key: HIVE-10614
 URL: https://issues.apache.org/jira/browse/HIVE-10614
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
Priority: Critical
 Attachments: HIVE-10614.1-branch-0.12.patch, HIVE-10614.1.patch


 ./schematool -dbType mysql -upgradeSchemaFrom 0.14.0 -verbose
 {code}
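 +--------------------------------------------------------------------------------------------------------------+
 |                                                                                                              |
 +--------------------------------------------------------------------------------------------------------------+
 |  HIVE-7018 Remove Table and Partition tables column LINK_TARGET_ID from Mysql for other DBs do not have it   |
 +--------------------------------------------------------------------------------------------------------------+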
 1 row selected (0.004 seconds)
 0: jdbc:mysql://node-1.example.com/hive> DROP PROCEDURE IF EXISTS RM_TLBS_LINKID
 No rows affected (0.005 seconds)
 0: jdbc:mysql://node-1.example.com/hive> DROP PROCEDURE IF EXISTS RM_PARTITIONS_LINKID
 No rows affected (0.006 seconds)
 0: jdbc:mysql://node-1.example.com/hive> DROP PROCEDURE IF EXISTS RM_LINKID
 No rows affected (0.002 seconds)
 0: jdbc:mysql://node-1.example.com/hive> CREATE PROCEDURE RM_TLBS_LINKID() 
 BEGIN IF EXISTS (SELECT * FROM `INFORMATION_SCHEMA`.`COLUMNS` WHERE 
 `TABLE_NAME` = 'TBLS' AND `COLUMN_NAME` = 'LINK_TARGET_ID') THEN ALTER TABLE 
 `TBLS` DROP FOREIGN KEY `TBLS_FK3` ; ALTER TABLE `TBLS` DROP KEY `TBLS_N51` ; 
 ALTER TABLE `TBLS` DROP COLUMN `LINK_TARGET_ID` ; END IF; END
 Error: You have an error in your SQL syntax; check the manual that 
 corresponds to your MySQL server version for the right syntax to use near '' 
 at line 1 (state=42000,code=1064)
 Closing: 0: jdbc:mysql://node-1.example.com/hive?createDatabaseIfNotExist=true
 org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore 
 state would be inconsistent !!
 org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore 
 state would be inconsistent !!
   at 
 org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:229)
   at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:468)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
 Caused by: java.io.IOException: Schema script failed, errorcode 2
   at 
 org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:355)
   at 
 org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:326)
   at 
 org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:224)
 {code}
 Looks like HIVE-7018 has introduced a stored procedure as part of the MySQL 
 upgrade script, and it is causing issues with the schematool upgrade.





[jira] [Updated] (HIVE-8696) HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.

2015-05-06 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-8696:
---
Attachment: HIVE-8696.3.patch

I think TestPassProperties would fail because the error message changed in the 
stack frame. Updating with test changes included.

 HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.
 -

 Key: HIVE-8696
 URL: https://issues.apache.org/jira/browse/HIVE-8696
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore
Affects Versions: 0.12.0, 0.13.1
Reporter: Mithun Radhakrishnan
Assignee: Thiruvel Thirumoolan
 Fix For: 1.2.0

 Attachments: HIVE-8696.1.patch, HIVE-8696.2.patch, HIVE-8696.3.patch, 
 HIVE-8696.poc.patch


 The HCatClientHMSImpl doesn't use a RetryingHiveMetastoreClient. Users of the 
 HCatClient API that log in through keytabs will fail without retry, when 
 their TGTs expire.
 The fix is inbound. 
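 The general shape of a retrying wrapper can be sketched as below. This is only 
 an illustration of retry-on-failure, not the attached patch and not Hive's 
 retrying metastore client; RetrySketch, callWithRetry and maxAttempts are 
 hypothetical names, and a real client would also re-login or reconnect 
 between attempts.

```java
import java.util.function.Supplier;

// Hypothetical helper for illustration only; it makes no claim about how
// Hive's retrying metastore client is implemented.
final class RetrySketch {
  // Runs the call up to maxAttempts times and returns the first successful
  // result. If every attempt fails, the last failure is rethrown.
  static <T> T callWithRetry(Supplier<T> call, int maxAttempts) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    RuntimeException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return call.get();
      } catch (RuntimeException e) {
        // A real client would refresh credentials or reconnect here.
        last = e;
      }
    }
    throw last;
  }
}
```

 With a wrapper of this kind, a call that fails once because a ticket expired 
 gets another chance instead of surfacing the error to the API user.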





[jira] [Updated] (HIVE-10563) MiniTezCliDriver tests ordering issues

2015-05-06 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10563:
-
Attachment: HIVE-10563.3.patch

 MiniTezCliDriver tests ordering issues
 --

 Key: HIVE-10563
 URL: https://issues.apache.org/jira/browse/HIVE-10563
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10563.1.patch, HIVE-10563.2.patch, 
 HIVE-10563.3.patch


 There are a bunch of tests related to TestMiniTezCliDriver which give 
 ordering issues when run on CentOS/Windows/OSX





[jira] [Commented] (HIVE-9845) HCatSplit repeats information making input split data size huge

2015-05-06 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531404#comment-14531404
 ] 

Sushanth Sowmyan commented on HIVE-9845:


None of the test failures appear related; will go ahead and commit. +1.

 HCatSplit repeats information making input split data size huge
 ---

 Key: HIVE-9845
 URL: https://issues.apache.org/jira/browse/HIVE-9845
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Reporter: Rohini Palaniswamy
Assignee: Mithun Radhakrishnan
 Attachments: HIVE-9845.1.patch, HIVE-9845.3.patch, HIVE-9845.4.patch, 
 HIVE-9845.5.patch, HIVE-9845.6.patch


 Pig on Tez jobs with larger tables hit PIG-4443. Running on HDFS data which 
 has even triple the number of splits (100K+ splits and tasks) does not hit 
 that issue.
 {code}
 HCatBaseInputFormat.java:
   // Call getSplit on the InputFormat, create an
   // HCatSplit for each underlying split
   // NumSplits is 0 for our purposes
   org.apache.hadoop.mapred.InputSplit[] baseSplits =
       inputFormat.getSplits(jobConf, 0);
   for (org.apache.hadoop.mapred.InputSplit split : baseSplits) {
     splits.add(new HCatSplit(partitionInfo, split, allCols));
   }
 {code}
 Each HCatSplit duplicates the partition schema and the table schema.





[jira] [Commented] (HIVE-10484) Vectorization : RuntimeException Big Table Retained Mapping duplicate column

2015-05-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531467#comment-14531467
 ] 

Hive QA commented on HIVE-10484:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12730881/HIVE-10484.02.patch

{color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 8900 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_parts
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_unencrypted_tbl
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_load_data_to_encrypted_tables
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_select_read_only_encrypted_tbl
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_disallow_transform
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_droppartition
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_sba_drop_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_alterpart_loc
org.apache.hadoop.hive.ql.security.TestStorageBasedClientSideAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropDatabase
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropTable
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropView
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProviderWithACL.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbFailure
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbSuccess
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableFailure
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableSuccess
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessing
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.TestSQLStdHiveAccessControllerHS2.testConfigProcessingCustomSetWhitelistAppend
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3784/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3784/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3784/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 24 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12730881 - PreCommit-HIVE-TRUNK-Build

 Vectorization : RuntimeException Big Table Retained Mapping duplicate column
 --

 Key: HIVE-10484
 URL: https://issues.apache.org/jira/browse/HIVE-10484
 Project: Hive
  Issue Type: Bug
  Components: Tez, Vectorization
Affects Versions: 1.2.0
Reporter: Mostafa Mokhtar
Assignee: Matt McCline
 Fix For: 1.2.0

 Attachments: HIVE-10484.01.patch, HIVE-10484.02.patch


 With vectorization and tez enabled TPC-DS Q70 fails with 
 {code}
 Caused by: java.lang.RuntimeException: Big Table Retained Mapping duplicate 
 column 6 in ordered column map {6=(value column: 6, type name: int), 
 21=(value column: 21, type name: float), 22=(value column: 22, type name: 
 int)} when adding value column 6, type int
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorColumnOrderedMap.add(VectorColumnOrderedMap.java:97)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorColumnOutputMapping.add(VectorColumnOutputMapping.java:40)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.determineCommonInfo(VectorMapJoinCommonOperator.java:320)
   at 
 

[jira] [Commented] (HIVE-10521) TxnHandler.timeOutTxns only times out some of the expired transactions

2015-05-06 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531481#comment-14531481
 ] 

Sushanth Sowmyan commented on HIVE-10521:
-

Given that precommit tests have now run, [~ekoifman]/[~alangates], could you 
please verify and commit this patch?

 TxnHandler.timeOutTxns only times out some of the expired transactions
 --

 Key: HIVE-10521
 URL: https://issues.apache.org/jira/browse/HIVE-10521
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 0.14.0, 1.0.0, 1.1.0
Reporter: Alan Gates
Assignee: Alan Gates
 Attachments: HIVE-10521.2.patch, HIVE-10521.3.patch, 
 HIVE-10521.4.patch, HIVE-10521.patch


 {code}
   for (int i = 0; i < 20 && rs.next(); i++) deadTxns.add(rs.getLong(1));
   // We don't care whether all of the transactions get deleted or not,
   // if some didn't it most likely means someone else deleted them in the
   // interim
   if (deadTxns.size() > 0) abortTxns(dbConn, deadTxns);
 {code}
 While it makes sense to limit the number of transactions aborted in one pass 
 (since this gets translated to an IN clause) we should still make sure all 
 are timed out.  Also, 20 seems pretty small as a batch size.
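 One way to keep the IN clause bounded while still timing out every expired 
 transaction is to abort in fixed-size batches until none remain. The following 
 is a minimal sketch under that assumption, not the attached patch; TxnBatcher, 
 toBatches and the batch size parameter are hypothetical names chosen for 
 illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of TxnHandler: splits the ids of all expired
// transactions into bounded batches so each abort call maps to a small IN clause.
final class TxnBatcher {
  static List<List<Long>> toBatches(List<Long> deadTxns, int batchSize) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    List<List<Long>> batches = new ArrayList<>();
    for (int start = 0; start < deadTxns.size(); start += batchSize) {
      int end = Math.min(start + batchSize, deadTxns.size());
      // Copy the slice so each batch is independent of the source list.
      batches.add(new ArrayList<>(deadTxns.subList(start, end)));
    }
    return batches;
  }
}
```

 Each batch could then be handed to abortTxns(dbConn, batch) in turn, so that 
 no expired transaction is skipped just because the first pass was full.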





[jira] [Updated] (HIVE-8696) HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.

2015-05-06 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-8696:
---
Attachment: HIVE-8696.4.patch

 HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.
 -

 Key: HIVE-8696
 URL: https://issues.apache.org/jira/browse/HIVE-8696
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore
Affects Versions: 0.12.0, 0.13.1
Reporter: Mithun Radhakrishnan
Assignee: Thiruvel Thirumoolan
 Fix For: 1.2.0

 Attachments: HIVE-8696.1.patch, HIVE-8696.2.patch, HIVE-8696.3.patch, 
 HIVE-8696.4.patch, HIVE-8696.poc.patch


 The HCatClientHMSImpl doesn't use a RetryingHiveMetastoreClient. Users of the 
 HCatClient API that log in through keytabs will fail without retry, when 
 their TGTs expire.
 The fix is inbound. 





[jira] [Updated] (HIVE-7018) Table and Partition tables have column LINK_TARGET_ID in Mysql scripts but not others

2015-05-06 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-7018:
---
Fix Version/s: (was: 1.2.0)

 Table and Partition tables have column LINK_TARGET_ID in Mysql scripts but 
 not others
 -

 Key: HIVE-7018
 URL: https://issues.apache.org/jira/browse/HIVE-7018
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Yongzhi Chen
 Attachments: HIVE-7018.1.patch, HIVE-7018.2.patch


 It appears that at least postgres and oracle do not have the LINK_TARGET_ID 
 column while mysql does.





[jira] [Updated] (HIVE-10635) Redo HIVE-7018 in a schematool compatible manner

2015-05-06 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10635:
-
Description: In HIVE-10614, we had to revert HIVE-7018 because it was not 
schematool compatible and it would prevent upgrade from 0.14.0 to 1.3.0 when 
run via schematool. We need to redo HIVE-7018 work once the script introduced 
for HIVE-7018 is schematool compliant.  (was: In HIVE-10614, we had to revert 
HIVE-7018 because it was not schematool compatible and it would prevent upgrade 
from 0.14.0 to 1.3.0 when run via schematool.)

 Redo HIVE-7018 in a schematool compatible manner
 

 Key: HIVE-10635
 URL: https://issues.apache.org/jira/browse/HIVE-10635
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Hari Sankar Sivarama Subramaniyan

 In HIVE-10614, we had to revert HIVE-7018 because it was not schematool 
 compatible and it would prevent upgrade from 0.14.0 to 1.3.0 when run via 
 schematool. We need to redo HIVE-7018 work once the script introduced for 
 HIVE-7018 is schematool compliant.





[jira] [Commented] (HIVE-10626) Spark paln need to be updated [Spark Branch]

2015-05-06 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531382#comment-14531382
 ] 

Jimmy Xiang commented on HIVE-10626:


Thanks for making the change. +1 pending on test.

 Spark paln need to be updated [Spark Branch]
 

 Key: HIVE-10626
 URL: https://issues.apache.org/jira/browse/HIVE-10626
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: spark-branch
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-10626-spark.patch, HIVE-10626.1-spark.patch, 
 HIVE-10626.2-spark.patch


 [HIVE-8858] basic patch was committed; the latest patch needs to be committed.





[jira] [Commented] (HIVE-8065) Support HDFS encryption functionality on Hive

2015-05-06 Thread Sergio Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531443#comment-14531443
 ] 

Sergio Peña commented on HIVE-8065:
---

Hey [~thejas]

Here are some answers about the issues:

1. If the encrypted zone where the results will be written is read-only, then 
Hive will try to use the directory set by {{hive.exec.scratchdir}} only if the 
scratch directory is encrypted as well (see HIVE-8945). This might create a 
performance issue if the encrypted scratch directory is in a different 
encryption zone. The user may change that directory to a writable directory 
inside the same encryption zone to make the move faster. This might be a little 
tedious for users, but it is the only way to protect their data.

2. This is a little tricky. Currently, Hive selects the encryption zone that 
has the strongest cipher (aes128 vs aes256), and uses that location to 
store all final and intermediate results. This avoids writing intermediate data 
to a weaker zone (aes256 to aes128) and then writing the final result back to 
aes256. Here we have another performance issue: final result files would be 
copied (and not renamed) to the destination table, as the encryption zones 
might be different.

We did not do any work to deny access to stored results in another encryption 
zone. The solution only prevents encrypted data from touching non-encrypted 
zones or more weakly encrypted zones. Maybe other solutions, like Sentry, may 
work on this access control. But without an access control mechanism, this 
issue exists on the scratch directory, doesn't it?
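
The zone selection described in point 2 can be pictured with the following 
minimal sketch. It assumes each candidate location is reduced to a path and a 
key length in bits (0 meaning unencrypted); ZoneChooser, Zone and strongest are 
hypothetical names and do not reflect the actual Hive implementation.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical model for illustration; not Hive code.
final class ZoneChooser {
  // A candidate staging location and the key length of its encryption zone.
  // keyBits == 0 stands for a location outside any encryption zone.
  static final class Zone {
    final String path;
    final int keyBits;

    Zone(String path, int keyBits) {
      this.path = path;
      this.keyBits = keyBits;
    }
  }

  // Picks the location whose zone has the longest key, so intermediate data
  // is never staged under weaker encryption than any table in the query.
  static Optional<Zone> strongest(List<Zone> zones) {
    return zones.stream().max(Comparator.comparingInt(z -> z.keyBits));
  }
}
```

For a join between an aes128 table and an aes256 table, such a rule would stage 
results next to the aes256 table.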



 Support HDFS encryption functionality on Hive
 -

 Key: HIVE-8065
 URL: https://issues.apache.org/jira/browse/HIVE-8065
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.13.1
Reporter: Sergio Peña
Assignee: Sergio Peña
  Labels: Hive-Scrum

 The new encryption support on HDFS makes Hive incompatible and unusable when 
 this feature is used.
 HDFS encryption is designed so that a user can configure different 
 encryption zones (or directories) for multi-tenant environments. An 
 encryption zone has an exclusive encryption key, such as AES-128 or AES-256. 
 For security compliance, HDFS does not allow files to be moved/renamed 
 between encryption zones. Renames are allowed only inside the same encryption 
 zone. A copy is allowed between encryption zones.
 See HDFS-6134 for more details about HDFS encryption design.
 Hive currently uses a scratch directory (like /tmp/$user/$random). This 
 scratch directory is used for the output of intermediate data (between MR 
 jobs) and for the final output of the Hive query, which is later moved to the 
 table directory location.
 If Hive tables are in different encryption zones than the scratch directory, 
 then Hive won't be able to rename those files/directories, and it will make 
 Hive unusable.
 To handle this problem, we can change the scratch directory of the 
 query/statement to be inside the same encryption zone as the table directory 
 location. This way, the renaming process will be successful. 
 Also, for statements that move files between encryption zones (e.g. LOAD 
 DATA), a copy may be executed instead of a rename. This will cause an 
 overhead when copying large data files, but it won't break the encryption on 
 Hive.
 Another security concern arises with selects that use joins. If Hive joins 
 different tables with different encryption key strengths, then the results of 
 the select might break the security compliance of the tables. Say two 
 tables with 128-bit and 256-bit encryption are joined; the temporary 
 results might then be stored in the 128-bit encryption zone. This conflicts, 
 temporarily, with the compliance of the table encrypted with 256 bits.
 To fix this, Hive should be able to select the scratch directory that is more 
 secured/encrypted in order to store the intermediate data temporarily with no 
 compliance issues.
 For instance:
 {noformat}
 SELECT * FROM table-aes128 t1 JOIN table-aes256 t2 WHERE t1.id == t2.id;
 {noformat}
 - This should use a scratch directory (or staging directory) inside the 
 table-aes256 table location.
 {noformat}
 INSERT OVERWRITE TABLE table-unencrypted SELECT * FROM table-aes1;
 {noformat}
 - This should use a scratch directory inside the table-aes1 location.
 {noformat}
 FROM table-unencrypted
 INSERT OVERWRITE TABLE table-aes128 SELECT id, name
 INSERT OVERWRITE TABLE table-aes256 SELECT id, name
 {noformat}
 - This should use a scratch directory on each of the tables locations.
 - The first SELECT will have its scratch directory on table-aes128 directory.
 - The second SELECT will have its scratch directory on table-aes256 directory.
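
 The rename-versus-copy rule from the description can be sketched as follows. 
 The zone labels are hypothetical strings (null meaning the path is outside any 
 encryption zone); this illustrates the rule stated above and is not Hive or 
 HDFS code.

```java
import java.util.Objects;

// Hypothetical decision helper for illustration only.
final class MoveStrategy {
  enum Action { RENAME, COPY }

  // A rename is only possible when source and target sit in the same
  // encryption zone (or both sit outside any zone); otherwise the data has
  // to be copied so it is re-encrypted with the target zone's key.
  static Action choose(String sourceZone, String targetZone) {
    return Objects.equals(sourceZone, targetZone) ? Action.RENAME : Action.COPY;
  }
}
```

 Placing the scratch directory inside the target table's zone is what turns the 
 final move into the cheap RENAME case.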





[jira] [Commented] (HIVE-6679) HiveServer2 should support configurable the server side socket timeout and keepalive for various transports types where applicable

2015-05-06 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531451#comment-14531451
 ] 

Vaibhav Gumashta commented on HIVE-6679:


Deferring this to 1.3 since it'll need some testing.

 HiveServer2 should support configurable the server side socket timeout and 
 keepalive for various transports types where applicable
 --

 Key: HIVE-6679
 URL: https://issues.apache.org/jira/browse/HIVE-6679
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0
Reporter: Prasad Mujumdar
Assignee: Navis
  Labels: TODOC1.0, TODOC15
 Fix For: 1.3.0

 Attachments: HIVE-6679.1.patch.txt, HIVE-6679.2.patch.txt, 
 HIVE-6679.3.patch, HIVE-6679.4.patch, HIVE-6679.5.patch, HIVE-6679.6.patch


  HiveServer2 should support a configurable server-side socket read timeout 
 and TCP keep-alive option. The Metastore server already supports this (and so 
 does the old Hive server). 
 We now have multiple client connectivity options like Kerberos, Delegation 
 Token (Digest-MD5), Plain SASL, Plain SASL with SSL and raw sockets. The 
 configuration should be applicable to all types (if possible).
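
 For reference, the two options discussed here correspond to standard 
 java.net.Socket settings. The sketch below only shows those JDK calls on a 
 plain, unconnected socket; SocketOptionsSketch, apply, demo and the 
 5000 ms value are hypothetical, and how the HiveServer2 transports would 
 expose such settings is not covered.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.Socket;

// Hypothetical illustration of the two JDK socket options; not HiveServer2 code.
final class SocketOptionsSketch {
  // Applies a read timeout (in milliseconds, 0 = wait forever) and the TCP
  // keep-alive flag to the given socket.
  static void apply(Socket socket, int readTimeoutMs, boolean keepAlive) {
    try {
      socket.setSoTimeout(readTimeoutMs);
      socket.setKeepAlive(keepAlive);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Configures an unconnected socket and reports the values read back from it.
  static String demo() {
    try (Socket socket = new Socket()) {
      apply(socket, 5000, true);
      return "timeout=" + socket.getSoTimeout() + " keepAlive=" + socket.getKeepAlive();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```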





[jira] [Updated] (HIVE-10564) webhcat should use webhcat-site.xml properties for controller job submission

2015-05-06 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-10564:
-
Attachment: HIVE-10564.2.patch

 webhcat should use webhcat-site.xml properties for controller job submission
 

 Key: HIVE-10564
 URL: https://issues.apache.org/jira/browse/HIVE-10564
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair
  Labels: TODOC1.2
 Fix For: 1.2.0

 Attachments: HIVE-10564.1.patch, HIVE-10564.2.patch


 webhcat should use webhcat-site.xml in configuration for the 
 TempletonController map-only job that it launches. This will allow users to 
 set any MR/hdfs properties that they want to see used for the controller job.
 NO PRECOMMIT TESTS





[jira] [Resolved] (HIVE-10635) Redo HIVE-7018 in a schematool compatible manner

2015-05-06 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair resolved HIVE-10635.
--
Resolution: Duplicate

Closing this, will track in HIVE-7018 itself.


 Redo HIVE-7018 in a schematool compatible manner
 

 Key: HIVE-10635
 URL: https://issues.apache.org/jira/browse/HIVE-10635
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Hari Sankar Sivarama Subramaniyan

 In HIVE-10614, we had to revert HIVE-7018 because it was not schematool 
 compatible and it would prevent upgrade from 0.14.0 to 1.3.0 when run via 
 schematool. We need to redo HIVE-7018 work once the script introduced for 
 HIVE-7018 is schematool compliant.





[jira] [Commented] (HIVE-7018) Table and Partition tables have column LINK_TARGET_ID in Mysql scripts but not others

2015-05-06 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531497#comment-14531497
 ] 

Thejas M Nair commented on HIVE-7018:
-

As mentioned in HIVE-10635 -
In HIVE-10614, we had to revert HIVE-7018 because it was not schematool 
compatible and it would prevent upgrade from 0.14.0 to 1.3.0 when run via 
schematool. We need to redo HIVE-7018 work once the script introduced for 
HIVE-7018 is schematool compliant.

 Table and Partition tables have column LINK_TARGET_ID in Mysql scripts but 
 not others
 -

 Key: HIVE-7018
 URL: https://issues.apache.org/jira/browse/HIVE-7018
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Yongzhi Chen
 Fix For: 1.2.0

 Attachments: HIVE-7018.1.patch, HIVE-7018.2.patch


 It appears that at least postgres and oracle do not have the LINK_TARGET_ID 
 column while mysql does.





[jira] [Reopened] (HIVE-7018) Table and Partition tables have column LINK_TARGET_ID in Mysql scripts but not others

2015-05-06 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair reopened HIVE-7018:
-

As this patch has not gone into a release, it is easier to track the issue by 
reopening this jira.
Closing HIVE-10635

 Table and Partition tables have column LINK_TARGET_ID in Mysql scripts but 
 not others
 -

 Key: HIVE-7018
 URL: https://issues.apache.org/jira/browse/HIVE-7018
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Yongzhi Chen
 Fix For: 1.2.0

 Attachments: HIVE-7018.1.patch, HIVE-7018.2.patch


 It appears that at least postgres and oracle do not have the LINK_TARGET_ID 
 column while mysql does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10530) Aggregate stats cache: bug fixes for RDBMS path

2015-05-06 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531348#comment-14531348
 ] 

Thejas M Nair commented on HIVE-10530:
--

+1 pending tests


 Aggregate stats cache: bug fixes for RDBMS path
 ---

 Key: HIVE-10530
 URL: https://issues.apache.org/jira/browse/HIVE-10530
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 1.2.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 1.2.0

 Attachments: HIVE-10530.1.patch, HIVE-10530.2.patch








[jira] [Commented] (HIVE-10614) schemaTool upgrade from 0.14.0 to 1.3.0 causes failure

2015-05-06 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531384#comment-14531384
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-10614:
--

[~sushanth] Thanks, HIVE-10635 is the follow-up jira.

Thanks
Hari

 schemaTool upgrade from 0.14.0 to 1.3.0 causes failure
 --

 Key: HIVE-10614
 URL: https://issues.apache.org/jira/browse/HIVE-10614
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
Priority: Critical
 Fix For: 1.2.0

 Attachments: HIVE-10614.1-branch-0.12.patch, HIVE-10614.1.patch


 ./schematool -dbType mysql -upgradeSchemaFrom 0.14.0 -verbose
 {code}
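 +--------------------------------------------------------------------------------------------------------------+
 |                                                                                                              |
 +--------------------------------------------------------------------------------------------------------------+
 |  HIVE-7018 Remove Table and Partition tables column LINK_TARGET_ID from Mysql for other DBs do not have it   |
 +--------------------------------------------------------------------------------------------------------------+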
 1 row selected (0.004 seconds)
 0: jdbc:mysql://node-1.example.com/hive> DROP PROCEDURE IF EXISTS RM_TLBS_LINKID
 No rows affected (0.005 seconds)
 0: jdbc:mysql://node-1.example.com/hive> DROP PROCEDURE IF EXISTS RM_PARTITIONS_LINKID
 No rows affected (0.006 seconds)
 0: jdbc:mysql://node-1.example.com/hive> DROP PROCEDURE IF EXISTS RM_LINKID
 No rows affected (0.002 seconds)
 0: jdbc:mysql://node-1.example.com/hive> CREATE PROCEDURE RM_TLBS_LINKID() 
 BEGIN IF EXISTS (SELECT * FROM `INFORMATION_SCHEMA`.`COLUMNS` WHERE 
 `TABLE_NAME` = 'TBLS' AND `COLUMN_NAME` = 'LINK_TARGET_ID') THEN ALTER TABLE 
 `TBLS` DROP FOREIGN KEY `TBLS_FK3` ; ALTER TABLE `TBLS` DROP KEY `TBLS_N51` ; 
 ALTER TABLE `TBLS` DROP COLUMN `LINK_TARGET_ID` ; END IF; END
 Error: You have an error in your SQL syntax; check the manual that 
 corresponds to your MySQL server version for the right syntax to use near '' 
 at line 1 (state=42000,code=1064)
 Closing: 0: jdbc:mysql://node-1.example.com/hive?createDatabaseIfNotExist=true
 org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore 
 state would be inconsistent !!
 org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore 
 state would be inconsistent !!
   at 
 org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:229)
   at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:468)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
 Caused by: java.io.IOException: Schema script failed, errorcode 2
   at 
 org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:355)
   at 
 org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:326)
   at 
 org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:224)
 {code}
 Looks like HIVE-7018 has introduced a stored procedure as part of the MySQL 
 upgrade script, and it is causing issues with the schematool upgrade.





[jira] [Commented] (HIVE-8065) Support HDFS encryption functionality on Hive

2015-05-06 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531401#comment-14531401
 ] 

Brock Noland commented on HIVE-8065:


bq. Write permissions are now required to read from these tables.

Sergio can comment on how read-only tables are handled. We did think of this case.

bq. Sensitive data from one zone will be stored in another.

Note that file permissions are still enforced and zones are not meant to be an 
access control mechanism. For example, a user with appropriate permissions 
could copy data from one ez into another, such as ez1. Nothing in this change
alters that fact.

 Support HDFS encryption functionality on Hive
 -

 Key: HIVE-8065
 URL: https://issues.apache.org/jira/browse/HIVE-8065
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.13.1
Reporter: Sergio Peña
Assignee: Sergio Peña
  Labels: Hive-Scrum

 The new encryption support in HDFS makes Hive incompatible and unusable when
 this feature is used.
 HDFS encryption is designed so that a user can configure different
 encryption zones (or directories) for multi-tenant environments. An
 encryption zone has an exclusive encryption key, such as AES-128 or AES-256.
 For security compliance, HDFS does not allow files to be moved or renamed
 between encryption zones. Renames are allowed only inside the same encryption
 zone. A copy is allowed between encryption zones.
 See HDFS-6134 for more details about the HDFS encryption design.
 Hive currently uses a scratch directory (like /tmp/$user/$random). This
 scratch directory is used for the output of intermediate data (between MR
 jobs) and for the final output of the Hive query, which is later moved to the
 table directory location.
 If Hive tables are in different encryption zones than the scratch directory,
 then Hive won't be able to rename those files/directories, which makes
 Hive unusable.
 To handle this problem, we can change the scratch directory of the
 query/statement to be inside the same encryption zone as the table directory
 location. This way, the rename will succeed.
 Also, for statements that move files between encryption zones (e.g. LOAD
 DATA), a copy may be executed instead of a rename. This adds
 overhead when copying large data files, but it won't break encryption on
 Hive.
 Another security consideration arises with SELECT statements that use joins.
 If Hive joins tables with different encryption key strengths, then the results
 of the select might break the security compliance of the tables. Say two
 tables with 128-bit and 256-bit encryption are joined; the temporary
 results might then be stored in the 128-bit encryption zone. This would
 conflict with the compliance requirements of the table encrypted with 256 bits.
 To fix this, Hive should be able to select the scratch directory that is more
 strongly encrypted in order to store the intermediate data temporarily with no
 compliance issues.
 For instance:
 {noformat}
 SELECT * FROM table-aes128 t1 JOIN table-aes256 t2 WHERE t1.id == t2.id;
 {noformat}
 - This should use a scratch directory (or staging directory) inside the
 table-aes256 table location.
 {noformat}
 INSERT OVERWRITE TABLE table-unencrypted SELECT * FROM table-aes1;
 {noformat}
 - This should use a scratch directory inside the table-aes1 location.
 {noformat}
 FROM table-unencrypted
 INSERT OVERWRITE TABLE table-aes128 SELECT id, name
 INSERT OVERWRITE TABLE table-aes256 SELECT id, name
 {noformat}
 - This should use a scratch directory in each of the tables' locations.
 - The first SELECT will have its scratch directory in the table-aes128 directory.
 - The second SELECT will have its scratch directory in the table-aes256 directory.
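
The selection rule described above can be sketched roughly as follows. This is only an illustration in Python, not Hive code: the Table class, the key_bits field, the ".staging" suffix and the /warehouse paths are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Table:
    """Hypothetical stand-in for a table and the encryption zone it lives in."""
    name: str
    location: str
    key_bits: int  # 0 means the table is not inside an encryption zone


def pick_scratch_dir(tables: Sequence[Table], default: str = "/tmp/scratch") -> str:
    """Return a staging directory under the most strongly encrypted table.

    Falls back to the default scratch directory when no table is encrypted.
    """
    strongest = max(tables, key=lambda t: t.key_bits, default=None)
    if strongest is None or strongest.key_bits == 0:
        return default
    return strongest.location + "/.staging"


# Assumed example locations, mirroring the table names used in the description.
t128 = Table("table-aes128", "/warehouse/table-aes128", 128)
t256 = Table("table-aes256", "/warehouse/table-aes256", 256)
plain = Table("table-unencrypted", "/warehouse/table-unencrypted", 0)

print(pick_scratch_dir([t128, t256]))  # staging dir under the AES-256 table
print(pick_scratch_dir([plain]))       # default scratch directory
```

Keeping the staging directory under the table location means the final move stays within one encryption zone, so it can remain a rename.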





[jira] [Comment Edited] (HIVE-8065) Support HDFS encryption functionality on Hive

2015-05-06 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531401#comment-14531401
 ] 

Brock Noland edited comment on HIVE-8065 at 5/6/15 9:00 PM:


bq. Write permissions are now required to read from these tables.

Sergio can comment on how read-only tables are handled. We did think of this case.

bq. Sensitive data from one zone will be stored in another.

Note that permissions are still enforced and zones are not meant to be an 
access control mechanism. For example, a user with appropriate permissions 
could copy data from one ez into another, such as ez1. Nothing in this change
alters that fact.


was (Author: brocknoland):
bq. Write permissions are now required to read from these tables.

Sergio can comment on how read-only tables are handled. We did think of this case.

bq. Sensitive data from one zone will be stored in another.

Note that file permissions are still enforced and zones are not meant to be an 
access control mechanism. For example, a user with appropriate permissions 
could copy data from one ez into another, such as ez1. Nothing in this change
alters that fact.

 Support HDFS encryption functionality on Hive
 -

 Key: HIVE-8065
 URL: https://issues.apache.org/jira/browse/HIVE-8065
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.13.1
Reporter: Sergio Peña
Assignee: Sergio Peña
  Labels: Hive-Scrum






[jira] [Comment Edited] (HIVE-10611) Mini tez tests wait for 5 minutes before shutting down

2015-05-06 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531462#comment-14531462
 ] 

Vikram Dixit K edited comment on HIVE-10611 at 5/6/15 9:32 PM:
---

Committed to trunk to alleviate some of the HiveQA pressure. Thanks Ashutosh 
for the review.


was (Author: vikram.dixit):
Committed to trunk to alleviate some of the pressure. Thanks Ashutosh for the 
review.

 Mini tez tests wait for 5 minutes before shutting down
 --

 Key: HIVE-10611
 URL: https://issues.apache.org/jira/browse/HIVE-10611
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.3.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-10611.1.patch


 Currently, at shutdown, the tez mini cluster waits for the session to close 
 before shutting down the cluster. This ends up being 5 minutes - the default 
 value. We can shut down the session to alleviate this situation.





[jira] [Commented] (HIVE-10506) CBO (Calcite Return Path): Disallow return path to be enable if CBO is off

2015-05-06 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531487#comment-14531487
 ] 

Sushanth Sowmyan commented on HIVE-10506:
-

Hi, given that precommit tests have run, and this has been +1ed, could we 
please get this patch committed in?

 CBO (Calcite Return Path): Disallow return path to be enable if CBO is off
 --

 Key: HIVE-10506
 URL: https://issues.apache.org/jira/browse/HIVE-10506
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: 1.2.0

 Attachments: HIVE-10506.01.patch, HIVE-10506.patch


 If hive.cbo.enable=false and hive.cbo.returnpath=true, then some optimizations
 would still kick in. It is quite possible that customers end up with this
 combination of settings in their environments; we should prevent it.
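
A minimal sketch of such a guard, written in Python for illustration only. It is not the attached patch: the function name, the dict-based configuration and the assumed defaults are hypothetical, and only the two property names come from the description above.

```python
def validate_cbo_settings(conf: dict) -> None:
    """Reject the return path when CBO itself is switched off.

    The defaults used here ("true" for hive.cbo.enable, "false" for
    hive.cbo.returnpath) are assumptions made for this sketch.
    """
    cbo_enabled = conf.get("hive.cbo.enable", "true").lower() == "true"
    return_path = conf.get("hive.cbo.returnpath", "false").lower() == "true"
    if return_path and not cbo_enabled:
        raise ValueError(
            "hive.cbo.returnpath=true requires hive.cbo.enable=true"
        )


# The combination from the description is rejected; the others pass.
validate_cbo_settings({"hive.cbo.enable": "true", "hive.cbo.returnpath": "true"})
validate_cbo_settings({"hive.cbo.enable": "false", "hive.cbo.returnpath": "false"})
try:
    validate_cbo_settings({"hive.cbo.enable": "false", "hive.cbo.returnpath": "true"})
except ValueError as err:
    print("rejected:", err)
```

An alternative design would be to silently ignore the return-path flag when CBO is off; raising an error simply makes the misconfiguration visible.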





[jira] [Commented] (HIVE-10484) Vectorization : RuntimeException Big Table Retained Mapping duplicate column

2015-05-06 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531494#comment-14531494
 ] 

Matt McCline commented on HIVE-10484:
-

[~vikram.dixit] Thank You!

 Vectorization : RuntimeException Big Table Retained Mapping duplicate column
 --

 Key: HIVE-10484
 URL: https://issues.apache.org/jira/browse/HIVE-10484
 Project: Hive
  Issue Type: Bug
  Components: Tez, Vectorization
Affects Versions: 1.2.0
Reporter: Mostafa Mokhtar
Assignee: Matt McCline
 Fix For: 1.2.0

 Attachments: HIVE-10484.01.patch, HIVE-10484.02.patch


 With vectorization and Tez enabled, TPC-DS Q70 fails with:
 {code}
 Caused by: java.lang.RuntimeException: Big Table Retained Mapping duplicate column 6 in ordered column map {6=(value column: 6, type name: int), 21=(value column: 21, type name: float), 22=(value column: 22, type name: int)} when adding value column 6, type int
   at org.apache.hadoop.hive.ql.exec.vector.VectorColumnOrderedMap.add(VectorColumnOrderedMap.java:97)
   at org.apache.hadoop.hive.ql.exec.vector.VectorColumnOutputMapping.add(VectorColumnOutputMapping.java:40)
   at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.determineCommonInfo(VectorMapJoinCommonOperator.java:320)
   at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.init(VectorMapJoinCommonOperator.java:254)
   at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.init(VectorMapJoinGenerateResultOperator.java:89)
   at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.init(VectorMapJoinInnerGenerateResultOperator.java:97)
   at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.init(VectorMapJoinInnerLongOperator.java:79)
   ... 49 more
 {code}
 Query 
 {code:sql}
 select s_state
 from (select s_state as s_state, sum(ss_net_profit),
              rank() over (partition by s_state order by sum(ss_net_profit) desc) as ranking
       from store_sales, store, date_dim
       where d_month_seq between 1193 and 1193+11
         and date_dim.d_date_sk = store_sales.ss_sold_date_sk
         and store.s_store_sk = store_sales.ss_store_sk
       group by s_state
      ) tmp1
 where ranking = 5
 {code}
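
The exception text suggests a map keyed by column number that refuses a key it already holds. The toy model below, in Python, only illustrates that failure mode; the class and method names are invented for the example and it makes no claim about how Hive's VectorColumnOrderedMap is actually implemented.

```python
class OrderedColumnMap:
    """Toy ordered column map that rejects a column added twice."""

    def __init__(self):
        self._columns = {}

    def add(self, ordered_column: int, value_column: int, type_name: str) -> None:
        if ordered_column in self._columns:
            raise RuntimeError(
                f"duplicate column {ordered_column} in ordered column map "
                f"when adding value column {value_column}, type {type_name}"
            )
        self._columns[ordered_column] = (value_column, type_name)

    def ordered(self):
        """Return the entries sorted by column number."""
        return sorted(self._columns.items())


mapping = OrderedColumnMap()
mapping.add(6, 6, "int")
mapping.add(21, 21, "float")
mapping.add(22, 22, "int")
try:
    # Same column as the first entry, as in the reported message.
    mapping.add(6, 6, "int")
except RuntimeError as err:
    print(err)
```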





[jira] [Commented] (HIVE-10626) Spark paln need to be updated [Spark Branch]

2015-05-06 Thread Chinna Rao Lalam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531310#comment-14531310
 ] 

Chinna Rao Lalam commented on HIVE-10626:
-

This is one expectation from [HIVE-8858]:
{quote}
3. It would be better if we log this graph in one line. The easiest way is to 
have a toString() method in SparkPlan and then we can just log the string 
representation of SparkPlan.
{quote}

 Spark paln need to be updated [Spark Branch]
 

 Key: HIVE-10626
 URL: https://issues.apache.org/jira/browse/HIVE-10626
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: spark-branch
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-10626-spark.patch, HIVE-10626.1-spark.patch


 The basic patch for [HIVE-8858] was committed; the latest patch still needs to be committed.





[jira] [Resolved] (HIVE-8524) When table is renamed stats are lost as changes are not propagated to metastore tables TAB_COL_STATS and PART_COL_STATS

2015-05-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-8524.

   Resolution: Fixed
Fix Version/s: 1.2.0

Fixed via HIVE-9720

 When table is renamed stats are lost as changes are not propagated to 
 metastore tables TAB_COL_STATS and PART_COL_STATS 
 

 Key: HIVE-8524
 URL: https://issues.apache.org/jira/browse/HIVE-8524
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0, 1.0.0, 1.1.0
Reporter: Mostafa Mokhtar
Assignee: Ashutosh Chauhan
  Labels: hive
 Fix For: 1.2.0


 When a Hive table is renamed, the name is not updated in TAB_COL_STATS
 and PART_COL_STATS.
 Repro:
 1) Create table
 2) Insert rows
 3) Analyze table t1 compute statistics for columns;
 4) set hive.stats.fetch.column.stats=true;
 5) Explain select * from t1 where c1  x
 6) ALTER TABLE t1 RENAME TO t2;
 7) Explain select * from t2 where c1  x ; /* stats will be missing */
 8) Query the Metastore tables to validate
 According to the documentation, the Metastore should be updated:
 {code}
 This statement lets you change the name of a table to a different name.
 As of version 0.6, a rename on a managed table moves its HDFS location as
 well. (Older Hive versions just renamed the table in the metastore without
 moving the HDFS location.)
 {code}
 Another related issue is that the schema of the stats tables is not
 consistent with TBLS and DBS, as these two tables are normalized while
 TAB_COL_STATS and PART_COL_STATS have TABLE_NAME and DB_NAME denormalized in
 them.
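
The effect of the denormalized TABLE_NAME column can be reproduced with a toy model. The sketch below uses Python with an in-memory SQLite database; the two-table schema is heavily simplified and the NUM_DISTINCTS value is made up, so treat it as an illustration of the stale-name problem rather than the real metastore schema or the fix that was committed.

```python
import sqlite3


def make_metastore() -> sqlite3.Connection:
    """Build a tiny stand-in for TBLS and TAB_COL_STATS (simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE TBLS (TBL_ID INTEGER PRIMARY KEY, TBL_NAME TEXT);
        CREATE TABLE TAB_COL_STATS (TABLE_NAME TEXT, COLUMN_NAME TEXT,
                                    NUM_DISTINCTS INTEGER);
        INSERT INTO TBLS VALUES (1, 't1');
        INSERT INTO TAB_COL_STATS VALUES ('t1', 'c1', 42);
    """)
    return conn


def rename_table(conn, old, new, propagate_stats):
    """Rename in TBLS; optionally carry the new name into the stats table."""
    conn.execute("UPDATE TBLS SET TBL_NAME = ? WHERE TBL_NAME = ?", (new, old))
    if propagate_stats:
        conn.execute(
            "UPDATE TAB_COL_STATS SET TABLE_NAME = ? WHERE TABLE_NAME = ?",
            (new, old),
        )


def column_stats(conn, table):
    """Look up column statistics by table name, as a planner would."""
    return conn.execute(
        "SELECT COLUMN_NAME, NUM_DISTINCTS FROM TAB_COL_STATS "
        "WHERE TABLE_NAME = ?",
        (table,),
    ).fetchall()


buggy = make_metastore()
rename_table(buggy, "t1", "t2", propagate_stats=False)
print(column_stats(buggy, "t2"))  # no rows: stats are still filed under t1

fixed = make_metastore()
rename_table(fixed, "t1", "t2", propagate_stats=True)
print(column_stats(fixed, "t2"))  # the stats row follows the rename
```

If the stats tables referenced TBL_ID instead of the table name, the rename would need no second update at all, which is the normalization point raised above.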





[jira] [Updated] (HIVE-8524) When table is renamed stats are lost as changes are not propagated to metastore tables TAB_COL_STATS and PART_COL_STATS

2015-05-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8524:
---
Affects Version/s: 1.1.0
   1.0.0

 When table is renamed stats are lost as changes are not propagated to 
 metastore tables TAB_COL_STATS and PART_COL_STATS 
 

 Key: HIVE-8524
 URL: https://issues.apache.org/jira/browse/HIVE-8524
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0, 1.0.0, 1.1.0
Reporter: Mostafa Mokhtar
Assignee: Ashutosh Chauhan
  Labels: hive
 Fix For: 1.2.0







[jira] [Updated] (HIVE-10635) Redo HIVE-7018 in a schematool compatible manner

2015-05-06 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10635:
-
Component/s: Metastore

 Redo HIVE-7018 in a schematool compatible manner
 

 Key: HIVE-10635
 URL: https://issues.apache.org/jira/browse/HIVE-10635
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Hari Sankar Sivarama Subramaniyan

 In HIVE-10614, we had to revert HIVE-7018 because it was not schematool 
 compatible and it would prevent upgrade from 0.14.0 to 1.3.0 when run via 
 schematool.





[jira] [Commented] (HIVE-9451) Add max size of column dictionaries to ORC metadata

2015-05-06 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531432#comment-14531432
 ] 

Sushanth Sowmyan commented on HIVE-9451:


After discussion with Owen, marking as tentative for 1.2 - i.e. this will not 
hold up the RC process for 1.2.0, but if it makes it before we release, it'll 
be part of 1.2.0.

It will still be eligible for inclusion in 1.2.1 when we do that release.

 Add max size of column dictionaries to ORC metadata
 ---

 Key: HIVE-9451
 URL: https://issues.apache.org/jira/browse/HIVE-9451
 Project: Hive
  Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Owen O'Malley
  Labels: ORC
 Fix For: 1.2.0

 Attachments: HIVE-9451.patch, HIVE-9451.patch


 To predict the amount of memory required to read an ORC file we need to know 
 the size of the dictionaries for the columns that we are reading. I propose 
 adding the number of bytes for each column's dictionary to the stripe's 
 column statistics. The file's column statistics would have the maximum 
 dictionary size for each column.
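
The proposed aggregation can be sketched as follows. This is a Python illustration with made-up byte counts; the function names and the dict-per-stripe layout are assumptions, and the memory estimate assumes a reader holds the dictionaries of one stripe at a time.

```python
from typing import Dict, Iterable, List


def file_dictionary_stats(stripes: List[Dict[str, int]]) -> Dict[str, int]:
    """Per column, keep the largest dictionary size seen in any stripe."""
    result: Dict[str, int] = {}
    for stripe in stripes:
        for column, size_bytes in stripe.items():
            result[column] = max(result.get(column, 0), size_bytes)
    return result


def estimate_dictionary_memory(file_stats: Dict[str, int],
                               columns: Iterable[str]) -> int:
    """Upper bound on dictionary bytes needed to read the given columns."""
    return sum(file_stats.get(column, 0) for column in columns)


# Hypothetical per-stripe dictionary sizes in bytes.
stripes = [
    {"name": 1000, "city": 200},
    {"name": 400, "city": 900},
]
stats = file_dictionary_stats(stripes)
print(stats)
print(estimate_dictionary_memory(stats, ["name", "city"]))
```

Only the columns actually being read contribute to the estimate, which is why the sizes are tracked per column rather than per stripe.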





[jira] [Updated] (HIVE-10595) Dropping a table can cause NPEs in the compactor

2015-05-06 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-10595:

Attachment: HIVE-10595.1.patch

Duplicating HIVE-10595.patch as HIVE-10595.1.patch to submit through precommit.

 Dropping a table can cause NPEs in the compactor
 

 Key: HIVE-10595
 URL: https://issues.apache.org/jira/browse/HIVE-10595
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0, 1.0.0, 1.1.0
Reporter: Alan Gates
Assignee: Alan Gates
 Attachments: HIVE-10595.1.patch, HIVE-10595.patch


 Reproduction:
 # start metastore with compactor off
 # insert enough entries in a table to trigger a compaction
 # drop the table
 # stop metastore
 # restart metastore with compactor on
 Result:  NPE in the compactor threads.  I suspect this would also happen if 
 the inserts and drops were done in between a run of the compactor, but I 
 haven't proven it.
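
The failure mode in the reproduction is a queued compaction that outlives its table. A rough Python analogue of a guard against that is shown below; the queue and table structures are hypothetical and this is not a description of the attached patch.

```python
from typing import Dict, List, Tuple


def run_compactions(queue: List[str],
                    tables: Dict[str, dict]) -> Tuple[List[str], List[str]]:
    """Process queued compaction requests, skipping tables that no longer exist.

    Without the None check, looking up a dropped table and then using the
    result is the Python equivalent of the NPE described above.
    """
    compacted, skipped = [], []
    for table_name in queue:
        table = tables.get(table_name)
        if table is None:
            # The table was dropped after the request was queued.
            skipped.append(table_name)
            continue
        compacted.append(table["location"])
    return compacted, skipped


# Hypothetical state: a request for "dropped_tbl" is queued, but only
# "live_tbl" still exists in the metastore.
queue = ["live_tbl", "dropped_tbl"]
tables = {"live_tbl": {"location": "/warehouse/live_tbl"}}
print(run_compactions(queue, tables))
```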





[jira] [Updated] (HIVE-10614) schemaTool upgrade from 0.14.0 to 1.3.0 causes failure

2015-05-06 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-10614:
-
Attachment: (was: HIVE-10614.1-master.patch)

 schemaTool upgrade from 0.14.0 to 1.3.0 causes failure
 --

 Key: HIVE-10614
 URL: https://issues.apache.org/jira/browse/HIVE-10614
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
Priority: Critical

 ./schematool -dbType mysql -upgradeSchemaFrom 0.14.0 -verbose
 {code}
 +--------------------------------------------------------------------------------------------------------------+
 |                                                                                                              |
 +--------------------------------------------------------------------------------------------------------------+
 |  HIVE-7018 Remove Table and Partition tables column LINK_TARGET_ID from Mysql for other DBs do not have it   |
 +--------------------------------------------------------------------------------------------------------------+
 1 row selected (0.004 seconds)
 0: jdbc:mysql://node-1.example.com/hive DROP PROCEDURE IF EXISTS RM_TLBS_LINKID
 No rows affected (0.005 seconds)
 0: jdbc:mysql://node-1.example.com/hive DROP PROCEDURE IF EXISTS RM_PARTITIONS_LINKID
 No rows affected (0.006 seconds)
 0: jdbc:mysql://node-1.example.com/hive DROP PROCEDURE IF EXISTS RM_LINKID
 No rows affected (0.002 seconds)
 0: jdbc:mysql://node-1.example.com/hive CREATE PROCEDURE RM_TLBS_LINKID() BEGIN IF EXISTS (SELECT * FROM `INFORMATION_SCHEMA`.`COLUMNS` WHERE `TABLE_NAME` = 'TBLS' AND `COLUMN_NAME` = 'LINK_TARGET_ID') THEN ALTER TABLE `TBLS` DROP FOREIGN KEY `TBLS_FK3` ; ALTER TABLE `TBLS` DROP KEY `TBLS_N51` ; ALTER TABLE `TBLS` DROP COLUMN `LINK_TARGET_ID` ; END IF; END
 Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1 (state=42000,code=1064)
 Closing: 0: jdbc:mysql://node-1.example.com/hive?createDatabaseIfNotExist=true
 org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore state would be inconsistent !!
 org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore state would be inconsistent !!
   at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:229)
   at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:468)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
 Caused by: java.io.IOException: Schema script failed, errorcode 2
   at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:355)
   at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:326)
   at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:224)
 {code}
 It looks like HIVE-7018 introduced a stored procedure as part of the MySQL
 upgrade script, and it is causing issues with the schematool upgrade.





[jira] [Updated] (HIVE-10614) schemaTool upgrade from 0.14.0 to 1.3.0 causes failure

2015-05-06 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-10614:
-
Attachment: (was: HIVE-10614.1.patch)

 schemaTool upgrade from 0.14.0 to 1.3.0 causes failure
 --

 Key: HIVE-10614
 URL: https://issues.apache.org/jira/browse/HIVE-10614
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
Priority: Critical






[jira] [Commented] (HIVE-10453) HS2 leaking open file descriptors when using UDFs

2015-05-06 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531112#comment-14531112
 ] 

Szehon Ho commented on HIVE-10453:
--

Good catch, +1. One minor question: is the 'CUDFLoader' method name a typo, or
intentional?

 HS2 leaking open file descriptors when using UDFs
 -

 Key: HIVE-10453
 URL: https://issues.apache.org/jira/browse/HIVE-10453
 Project: Hive
  Issue Type: Bug
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen
 Attachments: HIVE-10453.1.patch, HIVE-10453.2.patch


 1. Create a custom function:
 CREATE FUNCTION myfunc AS 'someudfclass' using jar 'hdfs:///tmp/myudf.jar';
 2. Create a simple JDBC client that connects and runs a simple query
 using the function, such as:
 select myfunc(col1) from sometable
 3. Disconnect.
 Check the open files for HiveServer2 with:
 lsof -p HSProcID | grep myudf.jar
 You will see the leak as:
 {noformat}
 java  28718 ychen  txt  REG1,4741 212977666 /private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar
 java  28718 ychen  330r REG1,4741 212977666 /private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar
 {noformat}
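
The manual lsof check above could be automated along these lines. This is a small Python sketch; the function name is made up, and the sample text is the two lsof lines quoted in this report. In practice the input would be the captured output of lsof -p for the HiveServer2 process.

```python
def count_open_handles(lsof_output: str, needle: str) -> int:
    """Count lsof lines that mention the given file name."""
    return sum(1 for line in lsof_output.splitlines() if needle in line)


# The two lines quoted in the report, rejoined onto single lines.
SAMPLE = (
    "java  28718 ychen  txt  REG1,4741 212977666 "
    "/private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/"
    "1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar\n"
    "java  28718 ychen  330r REG1,4741 212977666 "
    "/private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/"
    "1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar\n"
)

leaked = count_open_handles(SAMPLE, "myudf.jar")
print(f"{leaked} open handle(s) on myudf.jar after disconnect")
```

A regression check built on this would, under the assumption that no session is still connected, expect the count to drop to zero after the client disconnects.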





[jira] [Updated] (HIVE-10614) schemaTool upgrade from 0.14.0 to 1.3.0 causes failure

2015-05-06 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-10614:
-
Attachment: HIVE-10614.1.patch

 schemaTool upgrade from 0.14.0 to 1.3.0 causes failure
 --

 Key: HIVE-10614
 URL: https://issues.apache.org/jira/browse/HIVE-10614
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
Priority: Critical
 Attachments: HIVE-10614.1-branch-0.12.patch, HIVE-10614.1.patch





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10595) Dropping a table can cause NPEs in the compactor

2015-05-06 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531154#comment-14531154
 ] 

Eugene Koifman commented on HIVE-10595:
---

I filed a ticket to investigate the more general issue.
+1 this patch

 Dropping a table can cause NPEs in the compactor
 

 Key: HIVE-10595
 URL: https://issues.apache.org/jira/browse/HIVE-10595
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0, 1.0.0, 1.1.0
Reporter: Alan Gates
Assignee: Alan Gates
 Attachments: HIVE-10595.patch


 Reproduction:
 # start metastore with compactor off
 # insert enough entries in a table to trigger a compaction
 # drop the table
 # stop metastore
 # restart metastore with compactor on
 Result:  NPE in the compactor threads.  I suspect this would also happen if 
 the inserts and drops were done in between runs of the compactor, but I 
 haven't proven it.
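
 The reproduction above can be modelled with a small Python sketch. This is not Hive code, and the issue does not state where the NPE originates; the sketch only assumes that a queued compaction request can outlive the table it refers to. All names here (`Table`, `CompactionRequest`, `resolve_table`, `run_compactions`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Table:
    name: str
    location: str


@dataclass
class CompactionRequest:
    # Hypothetical queue entry: refers to a table by name only.
    table_name: str


def resolve_table(catalog: Dict[str, Table], name: str) -> Optional[Table]:
    # Returns None when the table was dropped after the request was queued.
    return catalog.get(name)


def run_compactions(catalog: Dict[str, Table],
                    queue: List[CompactionRequest]) -> List[str]:
    compacted = []
    for request in queue:
        table = resolve_table(catalog, request.table_name)
        if table is None:
            # Skip the stale request instead of dereferencing a missing table.
            continue
        compacted.append(table.location)
    return compacted
```

 In this model, a request for a dropped table is skipped rather than raising an error when the worker touches `table.location`.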





[jira] [Assigned] (HIVE-10633) LLAP: remove GC setting from runLlapDaemon

2015-05-06 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-10633:
---

Assignee: Sergey Shelukhin

 LLAP: remove GC setting from runLlapDaemon
 --

 Key: HIVE-10633
 URL: https://issues.apache.org/jira/browse/HIVE-10633
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: llap

 Attachments: HIVE-10633.patch








[jira] [Commented] (HIVE-10634) The HMS upgrade test script on LXC is exiting with error even if the tests were run successfully

2015-05-06 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531251#comment-14531251
 ] 

Szehon Ho commented on HIVE-10634:
--

+1

 The HMS upgrade test script on LXC is exiting with error even if the tests 
 were run successfully
 --

 Key: HIVE-10634
 URL: https://issues.apache.org/jira/browse/HIVE-10634
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-10634.1.patch


 The execute-test-on-lxc.sh script exits with error code '1' after the tests 
 are executed, even if no test failed.
 This causes PreCommit-HIVE-METASTORE-Test to publish invalid results to 
 Jira.
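
 The expected behaviour, where the wrapper's exit status mirrors the outcome of the tests it ran, can be sketched in Python. This is not the contents of execute-test-on-lxc.sh or of the attached patch; `run_and_report` is a hypothetical helper used only for illustration.

```python
import subprocess
from typing import List


def run_and_report(command: List[str]) -> int:
    """Run the test command and return its exit status unchanged."""
    result = subprocess.run(command)
    # Propagate the child's status so that callers see 0 only on success
    # and a non-zero value only when the tests actually failed.
    return result.returncode
```

 A caller would pass the returned value to `sys.exit`, so that a reporting job sees a failure only when the wrapped command itself failed.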





[jira] [Commented] (HIVE-10626) Spark plan needs to be updated [Spark Branch]

2015-05-06 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531275#comment-14531275
 ] 

Jimmy Xiang commented on HIVE-10626:


Why do we need SparkPlan.toString()?

 Spark plan needs to be updated [Spark Branch]
 

 Key: HIVE-10626
 URL: https://issues.apache.org/jira/browse/HIVE-10626
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: spark-branch
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-10626-spark.patch, HIVE-10626.1-spark.patch


 The basic patch for [HIVE-8858] was committed; the latest patch still needs to be committed.




