[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-03-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383465#comment-14383465
 ] 

Hive QA commented on HIVE-9937:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707584/HIVE-9937.07.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8682 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_auto_partitioned
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3177/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3177/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3177/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707584 - PreCommit-HIVE-TRUNK-Build

 LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new 
 Vectorized Map Join
 --

 Key: HIVE-9937
 URL: https://issues.apache.org/jira/browse/HIVE-9937
 Project: Hive
  Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline
 Attachments: HIVE-9937.01.patch, HIVE-9937.02.patch, 
 HIVE-9937.03.patch, HIVE-9937.04.patch, HIVE-9937.05.patch, 
 HIVE-9937.06.patch, HIVE-9937.07.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10073) Runtime exception when querying HBase with Spark [Spark Branch]

2015-03-27 Thread Chengxiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383406#comment-14383406
 ] 

Chengxiang Li commented on HIVE-10073:
--

Committed to the Spark branch. Thanks, Jimmy, for this contribution.

 Runtime exception when querying HBase with Spark [Spark Branch]
 ---

 Key: HIVE-10073
 URL: https://issues.apache.org/jira/browse/HIVE-10073
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: spark-branch
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: spark-branch

 Attachments: HIVE-10073.1-spark.patch, HIVE-10073.2-spark.patch, 
 HIVE-10073.3-spark.patch


 When querying HBase with Spark, we got 
 {noformat}
  Caused by: java.lang.IllegalArgumentException: Must specify table name
 at 
 org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188)
 at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
 at 
 org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:276)
 at 
 org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:266)
 at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator.initializeOp(FileSinkOperator.java:331)
 {noformat}
 But it works fine for MapReduce.





[jira] [Updated] (HIVE-10106) Regression : Dynamic partition pruning not working after HIVE-9976

2015-03-27 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-10106:
--
Attachment: HIVE-10106.1.patch

I believe this is caused by the way mapWork is set up. It's now created in the 
constructor of HiveSplitGenerator. The constructor and the initialize method 
may not be invoked in the same thread. As a result, the initialize method ends 
up seeing a different copy of mapWork from the one modified in the pruner.
Attaching a patch to fix this by setting the mapWork in the initialize method.
[~hagleitn] - please review, and validate the theory.
[~mmokhtar] - I wasn't able to reproduce this. I'm seeing pruning work as it should 
for the simple query that you'd sent me offline. I may need help reproducing the 
issue and validating the patch. Thanks
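
A rough illustration of the theory, not Hive's actual code: the sketch below assumes the plan object is looked up through some per-thread cache (a hypothetical ThreadLocal named CACHE here). Under that assumption, a change made on one thread is not visible to a lookup made on another thread, which is the kind of mismatch described above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class MapWorkThreadSketch {
    // Hypothetical stand-in for a per-thread plan cache; an assumption, not Hive's API.
    static final ThreadLocal<StringBuilder> CACHE =
        ThreadLocal.withInitial(() -> new StringBuilder("mapWork"));

    /** Modifies this thread's copy, then reports what a different thread sees. */
    public static String seenByOtherThread() throws Exception {
        // The "pruner" updates the copy that belongs to the calling thread.
        CACHE.get().append("+pruned");
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // The lookup on the worker thread gets a fresh, unmodified copy.
            return pool.submit(() -> CACHE.get().toString()).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("other thread sees: " + seenByOtherThread());
        System.out.println("this thread sees:  " + CACHE.get());
    }
}
```

Resolving the object inside the method that actually uses it (here, on the worker thread) avoids depending on which thread ran the constructor.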

 Regression : Dynamic partition pruning not working after HIVE-9976
 --

 Key: HIVE-10106
 URL: https://issues.apache.org/jira/browse/HIVE-10106
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.2.0
Reporter: Mostafa Mokhtar
Assignee: Siddharth Seth
 Fix For: 1.2.0

 Attachments: HIVE-10106.1.patch


 After HIVE-9976 got checked in, dynamic partition pruning doesn't work.
 Partitions are pruned and later show up in splits.





[jira] [Commented] (HIVE-10027) Use descriptions from Avro schema files in column comments

2015-03-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383841#comment-14383841
 ] 

Hive QA commented on HIVE-10027:




{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707607/HIVE-10027.1.patch

{color:green}SUCCESS:{color} +1 8678 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3180/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3180/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3180/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707607 - PreCommit-HIVE-TRUNK-Build

 Use descriptions from Avro schema files in column comments
 --

 Key: HIVE-10027
 URL: https://issues.apache.org/jira/browse/HIVE-10027
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Affects Versions: 0.13.1
Reporter: Jeremy Beard
Assignee: Chaoyu Tang
Priority: Minor
 Attachments: HIVE-10027.1.patch, HIVE-10027.patch


 Avro schema files can include field descriptions using the doc tag. It 
 would be helpful if the Hive metastore would use these descriptions as the 
 comments for a field when the table is backed by such a schema file, instead 
 of the default from deserializer.
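
 The requested behaviour can be sketched as a simple fallback rule. This is 
 only an illustration: the class and method names are hypothetical, and it 
 assumes the default comment text is the literal string "from deserializer".

```java
public class AvroDocCommentSketch {
    // Assumed default comment text; taken from the wording of the description.
    static final String DEFAULT_COMMENT = "from deserializer";

    /** Uses the Avro field's doc text as the column comment when one is present. */
    public static String commentFor(String avroDoc) {
        if (avroDoc == null || avroDoc.trim().isEmpty()) {
            return DEFAULT_COMMENT; // no doc in the schema file: keep the default
        }
        return avroDoc;
    }

    public static void main(String[] args) {
        System.out.println(commentFor("Customer identifier"));
        System.out.println(commentFor(null));
    }
}
```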





[jira] [Commented] (HIVE-10098) HS2 local task for map join fails in KMS encrypted cluster

2015-03-27 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383890#comment-14383890
 ] 

Yongzhi Chen commented on HIVE-10098:
-

The failures are not related:
TestPigHBaseStorageHandler:
Caused by: javax.jdo.JDOException: Couldnt obtain a new sequence (unique id) : 
Container 656 not found.
at 
org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:596)
at 
org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:732)
at 
org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:752)
at 
org.apache.hadoop.hive.metastore.ObjectStore.setMetaStoreSchemaVersion(ObjectStore.java:6729)
at 
org.apache.hadoop.hive.metastore.ObjectStore.checkSchema(ObjectStore.java:6626)
at 
org.apache.hadoop.hive.metastore.ObjectStore.verifySchema(ObjectStore.java:6601)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

testSSLFetchHttp:
testSSLFetchHttp(org.apache.hive.jdbc.TestSSL)  Time elapsed: 72.524 sec   
ERROR!
java.sql.SQLException: Could not open client transport with JDBC Uri: 
jhive.server2.transport.mode=http;hive.server2.thrift.http.path=cliservice;. 
org.apache.http.conn.HttpHostConnectException: Connection to 
https://localhost:52258 refused
at java.net.PlainSocketImpl.socketConnect(Native Method)

testSyncRpc:
testSyncRpc(org.apache.hive.spark.client.TestSparkClient)  Time elapsed: 31.91 
sec   ERROR!
java.util.concurrent.TimeoutException: null
at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:49)
at 
org.apache.hive.spark.client.TestSparkClient$4.call(TestSparkClient.java:144)
at 
org.apache.hive.spark.client.TestSparkClient.runTest(TestSparkClient.java:275)
at 
org.apache.hive.spark.client.TestSparkClient.testSyncRpc(TestSparkClient.java:140)





 HS2 local task for map join fails in KMS encrypted cluster
 --

 Key: HIVE-10098
 URL: https://issues.apache.org/jira/browse/HIVE-10098
 Project: Hive
  Issue Type: Bug
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen
 Attachments: HIVE-10098.1.patch


 Env: KMS was enabled after cluster was kerberos secured. 
 Problem: Any Hive query via beeline that performs a MapJoin fails 
 with a java.lang.reflect.UndeclaredThrowableException  from 
 KMSClientProvider.addDelegationTokens.
 {code}
 2015-03-18 08:49:17,948 INFO [main]: Configuration.deprecation 
 (Configuration.java:warnOnceIfDeprecated(1022)) - mapred.input.dir is 
 deprecated. Instead, use mapreduce.input.fileinputformat.inputdir 
 2015-03-18 08:49:19,048 WARN [main]: security.UserGroupInformation 
 (UserGroupInformation.java:doAs(1645)) - PriviledgedActionException as:hive 
 (auth:KERBEROS) 
 cause:org.apache.hadoop.security.authentication.client.AuthenticationException:
  GSSException: No valid credentials provided (Mechanism level: Failed to find 
 any Kerberos tgt) 
 2015-03-18 08:49:19,050 ERROR [main]: mr.MapredLocalTask 
 (MapredLocalTask.java:executeFromChildJVM(314)) - Hive Runtime Error: Map 
 local work failed 
 java.io.IOException: java.io.IOException: 
 java.lang.reflect.UndeclaredThrowableException 
 at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:634)
  
 at 
 org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:363)
  
 at 
 org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:337)
  
 at 
 org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:303)
 at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:735) 
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  
 at java.lang.reflect.Method.invoke(Method.java:606) 
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212) 
 Caused by: java.io.IOException: 
 java.lang.reflect.UndeclaredThrowableException 
 at 
 org.apache.hadoop.crypto.key.kms.KMSClientProvider.addDelegationTokens(KMSClientProvider.java:826)
  
 at 
 org.apache.hadoop.crypto.key.KeyProviderDelegationTokenExtension.addDelegationTokens(KeyProviderDelegationTokenExtension.java:86)
  
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem.addDelegationTokens(DistributedFileSystem.java:2017)
  
 at 
 org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:121)
  
 at 
 org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:100)
  
 at 
 

[jira] [Commented] (HIVE-10098) HS2 local task for map join fails in KMS encrypted cluster

2015-03-27 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383893#comment-14383893
 ] 

Yongzhi Chen commented on HIVE-10098:
-

[~prasadm], could you review the change? Thanks

 HS2 local task for map join fails in KMS encrypted cluster
 --

 Key: HIVE-10098
 URL: https://issues.apache.org/jira/browse/HIVE-10098
 Project: Hive
  Issue Type: Bug
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen
 Attachments: HIVE-10098.1.patch


 Env: KMS was enabled after cluster was kerberos secured. 
 Problem: Any Hive query via beeline that performs a MapJoin fails 
 with a java.lang.reflect.UndeclaredThrowableException  from 
 KMSClientProvider.addDelegationTokens.
 {code}
 2015-03-18 08:49:17,948 INFO [main]: Configuration.deprecation 
 (Configuration.java:warnOnceIfDeprecated(1022)) - mapred.input.dir is 
 deprecated. Instead, use mapreduce.input.fileinputformat.inputdir 
 2015-03-18 08:49:19,048 WARN [main]: security.UserGroupInformation 
 (UserGroupInformation.java:doAs(1645)) - PriviledgedActionException as:hive 
 (auth:KERBEROS) 
 cause:org.apache.hadoop.security.authentication.client.AuthenticationException:
  GSSException: No valid credentials provided (Mechanism level: Failed to find 
 any Kerberos tgt) 
 2015-03-18 08:49:19,050 ERROR [main]: mr.MapredLocalTask 
 (MapredLocalTask.java:executeFromChildJVM(314)) - Hive Runtime Error: Map 
 local work failed 
 java.io.IOException: java.io.IOException: 
 java.lang.reflect.UndeclaredThrowableException 
 at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:634)
  
 at 
 org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:363)
  
 at 
 org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:337)
  
 at 
 org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:303)
 at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:735) 
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  
 at java.lang.reflect.Method.invoke(Method.java:606) 
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212) 
 Caused by: java.io.IOException: 
 java.lang.reflect.UndeclaredThrowableException 
 at 
 org.apache.hadoop.crypto.key.kms.KMSClientProvider.addDelegationTokens(KMSClientProvider.java:826)
  
 at 
 org.apache.hadoop.crypto.key.KeyProviderDelegationTokenExtension.addDelegationTokens(KeyProviderDelegationTokenExtension.java:86)
  
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem.addDelegationTokens(DistributedFileSystem.java:2017)
  
 at 
 org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:121)
  
 at 
 org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:100)
  
 at 
 org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80)
  
 at 
 org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:205) 
 at 
 org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313) 
 at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:413)
  
 at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:559)
  
 ... 9 more 
 Caused by: java.lang.reflect.UndeclaredThrowableException 
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1655)
  
 at 
 org.apache.hadoop.crypto.key.kms.KMSClientProvider.addDelegationTokens(KMSClientProvider.java:808)
  
 ... 18 more 
 Caused by: 
 org.apache.hadoop.security.authentication.client.AuthenticationException: 
 GSSException: No valid credentials provided (Mechanism level: Failed to find 
 any Kerberos tgt) 
 at 
 org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:306)
  
 at 
 org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:196)
  
 at 
 org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:127)
 {code}





[jira] [Commented] (HIVE-10106) Regression : Dynamic partition pruning not working after HIVE-9976

2015-03-27 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14384249#comment-14384249
 ] 

Gunther Hagleitner commented on HIVE-10106:
---

That sounds correct to me. +1.

 Regression : Dynamic partition pruning not working after HIVE-9976
 --

 Key: HIVE-10106
 URL: https://issues.apache.org/jira/browse/HIVE-10106
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.2.0
Reporter: Mostafa Mokhtar
Assignee: Siddharth Seth
 Fix For: 1.2.0

 Attachments: HIVE-10106.1.patch


 After HIVE-9976 got checked in, dynamic partition pruning doesn't work.
 Partitions are pruned and later show up in splits.





[jira] [Updated] (HIVE-10116) CBO (Calcite Return Path): RelMdSize throws an Exception when Join is actually a Semijoin [CBO branch]

2015-03-27 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-10116:
---
Affects Version/s: cbo-branch

 CBO (Calcite Return Path): RelMdSize throws an Exception when Join is 
 actually a Semijoin [CBO branch]
 --

 Key: HIVE-10116
 URL: https://issues.apache.org/jira/browse/HIVE-10116
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Affects Versions: cbo-branch
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: cbo-branch


 {{cbo_semijoin.q}} reproduces the error.
 Stacktrace:
 {noformat}
 2015-03-26 09:55:20,652 ERROR [main]: parse.CalcitePlanner 
 (CalcitePlanner.java:genOPTree(269)) - CBO failed, skipping CBO.
 java.lang.ArrayIndexOutOfBoundsException: 3
 at 
 org.apache.calcite.rel.metadata.RelMdSize.averageColumnSizes(RelMdSize.java:193)
 at sun.reflect.GeneratedMethodAccessor134.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.calcite.rel.metadata.ReflectiveRelMetadataProvider$2$1.invoke(ReflectiveRelMetadataProvider.java:194)
 at com.sun.proxy.$Proxy30.averageColumnSizes(Unknown Source)
 at sun.reflect.GeneratedMethodAccessor64.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at 
 at java.lang.reflect.Method.invoke(Method.java:606)
 {noformat}





[jira] [Commented] (HIVE-10113) LLAP: reducers running in LLAP starve out map retries

2015-03-27 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14384247#comment-14384247
 ] 

Siddharth Seth commented on HIVE-10113:
---

Related: https://issues.apache.org/jira/browse/HIVE-10029

This is expected at the moment, until we support pre-empting tasks / removing 
tasks from queues.

 LLAP: reducers running in LLAP starve out map retries
 -

 Key: HIVE-10113
 URL: https://issues.apache.org/jira/browse/HIVE-10113
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Gunther Hagleitner

 When query 17 is run, some mappers from Map 1 currently fail (due to unwrap 
 issue, and also due to  HIVE-10112).
 This query has 1000+ reducers; if they are run in LLAP, they all queue up, 
 and the query locks up.
 If only mappers run in LLAP, query completes.





[jira] [Commented] (HIVE-10066) Hive on Tez job submission through WebHCat doesn't ship Tez artifacts

2015-03-27 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14384246#comment-14384246
 ] 

Thejas M Nair commented on HIVE-10066:
--

+1

 Hive on Tez job submission through WebHCat doesn't ship Tez artifacts
 -

 Key: HIVE-10066
 URL: https://issues.apache.org/jira/browse/HIVE-10066
 Project: Hive
  Issue Type: Bug
  Components: Tez, WebHCat
Affects Versions: 1.0.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-10066.2.patch, HIVE-10066.3.patch, HIVE-10066.patch


 From [~hitesh]:
 Tez is a client-side-only component (no daemons, etc.) and is therefore 
 meant to be installed on the gateway box (or wherever its client libraries are 
 needed by any other services’ daemons). It has no cluster dependencies, 
 either in terms of libraries/jars or configs. When it runs 
 on a worker node, everything is pre-packaged and made available to the 
 worker node via the distributed cache by the client code. Hence, its 
 client-side configs are needed only on the same (client) node where 
 it is installed. The only other install step needed is to upload the Tez 
 tarball to HDFS and add a config entry, “tez.lib.uris”, that 
 points to the HDFS path. 
 We need a way to pass client jars and tez-site.xml to the LaunchMapper.
 We should create a general purpose mechanism here which can supply additional 
 artifacts per job type.





[jira] [Updated] (HIVE-10116) CBO (Calcite Return Path): RelMdSize throws an Exception when Join is actually a Semijoin [CBO branch]

2015-03-27 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-10116:
---
Attachment: HIVE-10116.cbo.patch

[~ashutoshc], could you take a look? Thanks

 CBO (Calcite Return Path): RelMdSize throws an Exception when Join is 
 actually a Semijoin [CBO branch]
 --

 Key: HIVE-10116
 URL: https://issues.apache.org/jira/browse/HIVE-10116
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Affects Versions: cbo-branch
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: cbo-branch

 Attachments: HIVE-10116.cbo.patch


 {{cbo_semijoin.q}} reproduces the error.
 Stacktrace:
 {noformat}
 2015-03-26 09:55:20,652 ERROR [main]: parse.CalcitePlanner 
 (CalcitePlanner.java:genOPTree(269)) - CBO failed, skipping CBO.
 java.lang.ArrayIndexOutOfBoundsException: 3
 at 
 org.apache.calcite.rel.metadata.RelMdSize.averageColumnSizes(RelMdSize.java:193)
 at sun.reflect.GeneratedMethodAccessor134.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.calcite.rel.metadata.ReflectiveRelMetadataProvider$2$1.invoke(ReflectiveRelMetadataProvider.java:194)
 at com.sun.proxy.$Proxy30.averageColumnSizes(Unknown Source)
 at sun.reflect.GeneratedMethodAccessor64.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at 
 at java.lang.reflect.Method.invoke(Method.java:606)
 {noformat}





[jira] [Commented] (HIVE-10086) Hive throws error when accessing Parquet file schema using field name match

2015-03-27 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14384228#comment-14384228
 ] 

Szehon Ho commented on HIVE-10086:
--

+1 thanks

 Hive throws error when accessing Parquet file schema using field name match
 ---

 Key: HIVE-10086
 URL: https://issues.apache.org/jira/browse/HIVE-10086
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-10086.4.patch, HiveGroup.parquet


 When a Hive table schema contains only a portion of the schema of a Parquet 
 file, access to the values should still work as long as the field names match 
 the schema. This does not work when a struct data type is in the schema and 
 the Hive schema contains just a portion of the struct elements. Hive throws 
 an error instead.
 To reproduce:
 First, create a Parquet table and insert a row into it:
 {code}
 CREATE TABLE test1 (id int, name string, address 
 struct<number:int,street:string,zip:string>) STORED AS PARQUET;
 INSERT INTO TABLE test1 SELECT 1, 'Roger', 
 named_struct('number',8600,'street','Congress Ave.','zip','87366') FROM 
 srcpart LIMIT 1;
 {code}
 Note: {{srcpart}} could be any table. It is only used to give the INSERT 
 statement a source to select from.
 The above table example generates the following Parquet file schema:
 {code}
 message hive_schema {
   optional int32 id;
   optional binary name (UTF8);
   optional group address {
     optional int32 number;
     optional binary street (UTF8);
     optional binary zip (UTF8);
   }
 }
 {code} 
 Afterwards, I create a table that contains just a portion of the schema and 
 load the Parquet file generated above. A query on the struct column of that 
 table fails:
 {code}
 CREATE TABLE test1 (name string, address struct<street:string>) STORED AS 
 PARQUET;
 LOAD DATA LOCAL INPATH '/tmp/HiveGroup.parquet' OVERWRITE INTO TABLE test1;
 hive> SELECT name FROM test1;
 OK
 Roger
 Time taken: 0.071 seconds, Fetched: 1 row(s)
 hive> SELECT address FROM test1;
 OK
 Failed with exception 
 java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
 java.lang.UnsupportedOperationException: Cannot inspect 
 org.apache.hadoop.io.IntWritable
 Time taken: 0.085 seconds
 {code}
 I would expect that Parquet can access the matched names, but Hive throws an 
 error instead.
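
 To make the expectation concrete, here is a small sketch contrasting 
 name-based and position-based selection of struct fields. It is an 
 illustration only: the class and method names are hypothetical, and the 
 description does not confirm that positional matching is what Hive actually 
 does, although reading {{number}} (an int) where {{street}} (a string) is 
 expected would be consistent with the IntWritable error above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StructProjectionSketch {
    /** Selects values by matching the table's field names against the file's. */
    public static List<Object> projectByName(Map<String, Object> fileStruct,
                                             List<String> tableFields) {
        List<Object> out = new ArrayList<>();
        for (String field : tableFields) {
            out.add(fileStruct.get(field)); // null when the file has no such field
        }
        return out;
    }

    /** Selects the first n values in file order, ignoring field names. */
    public static List<Object> projectByPosition(Map<String, Object> fileStruct, int n) {
        return new ArrayList<>(fileStruct.values()).subList(0, n);
    }

    public static void main(String[] args) {
        // File-side struct "address", in the order written by the first table.
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("number", 8600);
        address.put("street", "Congress Ave.");
        address.put("zip", "87366");

        // The second table declares only struct<street:string>.
        List<String> tableFields = Arrays.asList("street");

        System.out.println(projectByName(address, tableFields)); // [Congress Ave.]
        System.out.println(projectByPosition(address, 1));       // [8600]
    }
}
```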





[jira] [Updated] (HIVE-10116) CBO (Calcite Return Path): RelMdSize throws an Exception when Join is actually a Semijoin [CBO branch]

2015-03-27 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-10116:
---
Fix Version/s: (was: 1.2.0)
   cbo-branch

 CBO (Calcite Return Path): RelMdSize throws an Exception when Join is 
 actually a Semijoin [CBO branch]
 --

 Key: HIVE-10116
 URL: https://issues.apache.org/jira/browse/HIVE-10116
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Affects Versions: cbo-branch
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: cbo-branch


 {{cbo_semijoin.q}} reproduces the error.
 Stacktrace:
 {noformat}
 2015-03-26 09:55:20,652 ERROR [main]: parse.CalcitePlanner 
 (CalcitePlanner.java:genOPTree(269)) - CBO failed, skipping CBO.
 java.lang.ArrayIndexOutOfBoundsException: 3
 at 
 org.apache.calcite.rel.metadata.RelMdSize.averageColumnSizes(RelMdSize.java:193)
 at sun.reflect.GeneratedMethodAccessor134.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.calcite.rel.metadata.ReflectiveRelMetadataProvider$2$1.invoke(ReflectiveRelMetadataProvider.java:194)
 at com.sun.proxy.$Proxy30.averageColumnSizes(Unknown Source)
 at sun.reflect.GeneratedMethodAccessor64.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at 
 at java.lang.reflect.Method.invoke(Method.java:606)
 {noformat}





[jira] [Updated] (HIVE-10086) Hive throws error when accessing Parquet file schema using field name match

2015-03-27 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-10086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-10086:
---
Attachment: HIVE-10086.4.patch

 Hive throws error when accessing Parquet file schema using field name match
 ---

 Key: HIVE-10086
 URL: https://issues.apache.org/jira/browse/HIVE-10086
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-10086.4.patch, HiveGroup.parquet


 When a Hive table schema contains only a portion of the schema of a Parquet 
 file, access to the values should still work as long as the field names match 
 the schema. This does not work when a struct data type is in the schema and 
 the Hive schema contains just a portion of the struct elements. Hive throws 
 an error instead.
 To reproduce:
 First, create a Parquet table and insert a row into it:
 {code}
 CREATE TABLE test1 (id int, name string, address 
 struct<number:int,street:string,zip:string>) STORED AS PARQUET;
 INSERT INTO TABLE test1 SELECT 1, 'Roger', 
 named_struct('number',8600,'street','Congress Ave.','zip','87366') FROM 
 srcpart LIMIT 1;
 {code}
 Note: {{srcpart}} could be any table. It is only used to give the INSERT 
 statement a source to select from.
 The above table example generates the following Parquet file schema:
 {code}
 message hive_schema {
   optional int32 id;
   optional binary name (UTF8);
   optional group address {
     optional int32 number;
     optional binary street (UTF8);
     optional binary zip (UTF8);
   }
 }
 {code} 
 Afterwards, I create a table that contains just a portion of the schema and 
 load the Parquet file generated above. A query on the struct column of that 
 table fails:
 {code}
 CREATE TABLE test1 (name string, address struct<street:string>) STORED AS 
 PARQUET;
 LOAD DATA LOCAL INPATH '/tmp/HiveGroup.parquet' OVERWRITE INTO TABLE test1;
 hive> SELECT name FROM test1;
 OK
 Roger
 Time taken: 0.071 seconds, Fetched: 1 row(s)
 hive> SELECT address FROM test1;
 OK
 Failed with exception 
 java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
 java.lang.UnsupportedOperationException: Cannot inspect 
 org.apache.hadoop.io.IntWritable
 Time taken: 0.085 seconds
 {code}
 I would expect that Parquet can access the matched names, but Hive throws an 
 error instead.





[jira] [Updated] (HIVE-10118) CBO (Calcite Return Path): Internal error: Cannot find common type for join keys

2015-03-27 Thread Mostafa Mokhtar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mostafa Mokhtar updated HIVE-10118:
---
Description: 
Query 
{code}
explain
  select  ss_items.item_id
   ,ss_item_rev
   ,ss_item_rev/(ss_item_rev+cs_item_rev+ws_item_rev)/3 * 100 ss_dev
   ,cs_item_rev
   ,cs_item_rev/(ss_item_rev+cs_item_rev+ws_item_rev)/3 * 100 cs_dev
   ,ws_item_rev
   ,ws_item_rev/(ss_item_rev+cs_item_rev+ws_item_rev)/3 * 100 ws_dev
   ,(ss_item_rev+cs_item_rev+ws_item_rev)/3 average
FROM
( select i_item_id item_id ,sum(ss_ext_sales_price) as ss_item_rev
 from store_sales
 JOIN item ON store_sales.ss_item_sk = item.i_item_sk
 JOIN date_dim ON store_sales.ss_sold_date_sk = date_dim.d_date_sk
 JOIN (select d1.d_date
 from date_dim d1 JOIN date_dim d2 ON d1.d_week_seq = 
d2.d_week_seq
 where d2.d_date = '1998-08-04') sub ON date_dim.d_date = 
sub.d_date
 group by i_item_id ) ss_items
JOIN
( select i_item_id item_id ,sum(cs_ext_sales_price) as cs_item_rev
 from catalog_sales
 JOIN item ON catalog_sales.cs_item_sk = item.i_item_sk
 JOIN date_dim ON catalog_sales.cs_sold_date_sk = date_dim.d_date_sk
 JOIN (select d1.d_date
 from date_dim d1 JOIN date_dim d2 ON d1.d_week_seq = 
d2.d_week_seq
 where d2.d_date = '1998-08-04') sub ON date_dim.d_date = 
sub.d_date
 group by i_item_id ) cs_items
ON ss_items.item_id=cs_items.item_id
JOIN
( select i_item_id item_id ,sum(ws_ext_sales_price) as ws_item_rev
 from web_sales
 JOIN item ON web_sales.ws_item_sk = item.i_item_sk
 JOIN date_dim ON web_sales.ws_sold_date_sk = date_dim.d_date_sk
 JOIN (select d1.d_date
 from date_dim d1 JOIN date_dim d2 ON d1.d_week_seq = 
d2.d_week_seq
 where d2.d_date = '1998-08-04') sub ON date_dim.d_date = 
sub.d_date
 group by i_item_id ) ws_items
ON ss_items.item_id=ws_items.item_id
 where
   ss_item_rev between 0.9 * cs_item_rev and 1.1 * cs_item_rev
   and ss_item_rev between 0.9 * ws_item_rev and 1.1 * ws_item_rev
   and cs_item_rev between 0.9 * ss_item_rev and 1.1 * ss_item_rev
   and cs_item_rev between 0.9 * ws_item_rev and 1.1 * ws_item_rev
   and ws_item_rev between 0.9 * ss_item_rev and 1.1 * ss_item_rev
   and ws_item_rev between 0.9 * cs_item_rev and 1.1 * cs_item_rev
 order by item_id ,ss_item_rev
 limit 100


{code}

Exception 
{code}
15/03/27 12:38:32 [main]: ERROR parse.CalcitePlanner: CBO failed, skipping CBO.
java.lang.RuntimeException: java.lang.AssertionError: Internal error: Cannot 
find common type for join keys $1 (type INTEGER) and $1 (type 
VARCHAR(2147483647))
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.rethrowCalciteException(CalcitePlanner.java:677)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:586)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:238)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9998)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:201)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:224)
at 
org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:224)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:425)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:309)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1114)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1162)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1051)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1041)
at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305)
at 
org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:403)
at 

[jira] [Commented] (HIVE-9272) Tests for utf-8 support

2015-03-27 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14384513#comment-14384513
 ] 

Eugene Koifman commented on HIVE-9272:
--

[~asreekumar], renaming files after checkout is a good idea, but this step 
needs to be automated just like renaming of the jar is automated.  Generally, 
we want to try to minimize manual steps as much as possible.

 Tests for utf-8 support
 ---

 Key: HIVE-9272
 URL: https://issues.apache.org/jira/browse/HIVE-9272
 Project: Hive
  Issue Type: Test
  Components: Tests, WebHCat
Affects Versions: 0.14.0
Reporter: Aswathy Chellammal Sreekumar
Assignee: Aswathy Chellammal Sreekumar
Priority: Minor
 Fix For: 1.2.0

 Attachments: HIVE-9272.1.patch, HIVE-9272.2.patch, HIVE-9272.3.patch, 
 HIVE-9272.4.patch, HIVE-9272.5.patch, HIVE-9272.patch


 Including some test cases for utf8 support in webhcat. The first four tests 
 invoke hive, pig, mapred and streaming apis for testing the utf8 support for 
 data processed, file names and job name. The last test case tests the 
 filtering of job name with utf8 character



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10120) Disallow create table with dot/colon in column name

2015-03-27 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-10120:
---
Attachment: HIVE-10120.01.patch

 Disallow create table with dot/colon in column name
 ---

 Key: HIVE-10120
 URL: https://issues.apache.org/jira/browse/HIVE-10120
 Project: Hive
  Issue Type: Improvement
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-10120.01.patch


 Since we don't allow users to query column names with dot in the middle such 
 as emp.no, don't allow users to create tables with such columns that cannot 
 be queried. Fix the documentation to reflect this fix.
 Here is an example. Consider this table:
 {code}
 CREATE TABLE a (`emp.no` string);
 select `emp.no` from a; fails with this message:
 FAILED: RuntimeException java.lang.RuntimeException: cannot find field emp 
 from [0:emp.no]
 {code}
 The hive documentation needs to be fixed:
 {code}
  (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL) seems 
 to  indicate that any Unicode character can go between the backticks in the 
 select statement, but it doesn’t like the dot/colon or even select * when 
 there is a column that has a dot/colon. 
 {code}
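The rule proposed above can be sketched as a simple name check. This is only an illustration, assuming the check is applied to each column name at CREATE TABLE time; the class and method names are hypothetical and are not taken from HIVE-10120.01.patch.

```java
// Hypothetical sketch: reject column names that contain a dot or a colon,
// since such columns cannot be queried afterwards.
final class ColumnNameCheckSketch {
    static boolean isAllowed(String columnName) {
        return columnName.indexOf('.') < 0 && columnName.indexOf(':') < 0;
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("emp.no")); // the column from the example above
        System.out.println(isAllowed("emp_no"));
    }
}
```

With a check like this, the CREATE TABLE from the example would fail up front instead of producing a table whose column cannot be selected.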



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10118) CBO (Calcite Return Path): Internal error: Cannot find common type for join keys

2015-03-27 Thread Mostafa Mokhtar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mostafa Mokhtar updated HIVE-10118:
---
Summary: CBO (Calcite Return Path): Internal error: Cannot find common type 
for join keys   (was: CBO : )

 CBO (Calcite Return Path): Internal error: Cannot find common type for join 
 keys 
 -

 Key: HIVE-10118
 URL: https://issues.apache.org/jira/browse/HIVE-10118
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Mostafa Mokhtar
 Fix For: 1.2.0






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10120) Disallow create table with dot/colon in column name

2015-03-27 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-10120:
---
Description: 
Since we don't allow users to query column names with dot in the middle such as 
emp.no, don't allow users to create tables with such columns that cannot be 
queried. Fix the documentation to reflect this fix.

Here is an example. Consider this table:
{code}
CREATE TABLE a (`emp.no` string);
select `emp.no` from a; fails with this message:
FAILED: RuntimeException java.lang.RuntimeException: cannot find field emp from 
[0:emp.no]
{code}

The hive documentation needs to be fixed:
{code}
 (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL) seems to 
 indicate that any Unicode character can go between the backticks in the select 
statement, but it doesn’t like the dot/colon or even select * when there is a 
column that has a dot/colon. 
{code}

  was:
Since we don't allow users to query column names with dot in the middle such as 
emp.no, don't allow users to create tables with such columns that cannot be 
queried. Fix the documentation to reflect this fix.

Here is an example. Consider this table:
{code}
CREATE TABLE a (`emp.no` string);
{code}
select `emp.no` from a; fails with this message:
FAILED: RuntimeException java.lang.RuntimeException: cannot find field emp from 
[0:emp.no]
{code}

The hive documentation needs to be fixed:
{code}
 (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL) seems to 
 indicate that any Unicode character can go between the backticks in the select 
statement, but it doesn’t like the dot/colon or even select * when there is a 
column that has a dot/colon. 
{code}


 Disallow create table with dot/colon in column name
 ---

 Key: HIVE-10120
 URL: https://issues.apache.org/jira/browse/HIVE-10120
 Project: Hive
  Issue Type: Improvement
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong

 Since we don't allow users to query column names with dot in the middle such 
 as emp.no, don't allow users to create tables with such columns that cannot 
 be queried. Fix the documentation to reflect this fix.
 Here is an example. Consider this table:
 {code}
 CREATE TABLE a (`emp.no` string);
 select `emp.no` from a; fails with this message:
 FAILED: RuntimeException java.lang.RuntimeException: cannot find field emp 
 from [0:emp.no]
 {code}
 The hive documentation needs to be fixed:
 {code}
  (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL) seems 
 to  indicate that any Unicode character can go between the backticks in the 
 select statement, but it doesn’t like the dot/colon or even select * when 
 there is a column that has a dot/colon. 
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10116) CBO (Calcite Return Path): RelMdSize throws an Exception when Join is actually a Semijoin [CBO branch]

2015-03-27 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-10116.
-
Resolution: Fixed

Committed to branch. Thanks, Jesus!

 CBO (Calcite Return Path): RelMdSize throws an Exception when Join is 
 actually a Semijoin [CBO branch]
 --

 Key: HIVE-10116
 URL: https://issues.apache.org/jira/browse/HIVE-10116
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Affects Versions: cbo-branch
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: cbo-branch

 Attachments: HIVE-10116.cbo.patch


 {{cbo_semijoin.q}} reproduces the error.
 Stacktrace:
 {noformat}
 2015-03-26 09:55:20,652 ERROR [main]: parse.CalcitePlanner 
 (CalcitePlanner.java:genOPTree(269)) - CBO failed, skipping CBO.
 java.lang.ArrayIndexOutOfBoundsException: 3
 at 
 org.apache.calcite.rel.metadata.RelMdSize.averageColumnSizes(RelMdSize.java:193)
 at sun.reflect.GeneratedMethodAccessor134.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.calcite.rel.metadata.ReflectiveRelMetadataProvider$2$1.invoke(ReflectiveRelMetadataProvider.java:194)
 at com.sun.proxy.$Proxy30.averageColumnSizes(Unknown Source)
 at sun.reflect.GeneratedMethodAccessor64.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10115) HS2 running on a Kerberized cluster should offer Kerberos(GSSAPI) and Delegation token(DIGEST) when alternate authentication is enabled

2015-03-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14384499#comment-14384499
 ] 

Hive QA commented on HIVE-10115:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707682/HIVE-10115.0.patch

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 8677 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_stats_counter_partitioned
org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore
org.apache.hive.spark.client.TestSparkClient.testSyncRpc
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3184/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3184/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3184/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707682 - PreCommit-HIVE-TRUNK-Build

 HS2 running on a Kerberized cluster should offer Kerberos(GSSAPI) and 
 Delegation token(DIGEST) when alternate authentication is enabled
 ---

 Key: HIVE-10115
 URL: https://issues.apache.org/jira/browse/HIVE-10115
 Project: Hive
  Issue Type: Improvement
  Components: Authentication
Affects Versions: 1.0.0
Reporter: Mubashir Kazia
  Labels: patch
 Fix For: 1.1.0

 Attachments: HIVE-10115.0.patch


 In a Kerberized cluster, when alternate authentication is enabled on HS2, it 
 should also accept Kerberos authentication. This is important because when 
 we enable LDAP authentication, HS2 stops accepting delegation token 
 authentication, so we are forced to enter usernames and passwords in the 
 Oozie configuration.
 The whole idea of SASL is that multiple authentication mechanisms can be 
 offered. Disabling Kerberos (GSSAPI) and delegation token (DIGEST) 
 authentication when LDAP authentication is enabled defeats the purpose of SASL.
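A minimal sketch of the requested behaviour, assuming the server simply builds a list of mechanisms to offer. The class and method names are hypothetical, and the mechanism strings "DIGEST-MD5" and "PLAIN" are assumptions about how delegation token and LDAP-backed password authentication would be named; this is not the code in HIVE-10115.0.patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: enabling LDAP adds a mechanism instead of replacing
// the Kerberos and delegation token mechanisms.
final class OfferedMechanismsSketch {
    static List<String> offered(boolean kerberized, boolean ldapEnabled) {
        List<String> mechanisms = new ArrayList<>();
        if (kerberized) {
            mechanisms.add("GSSAPI");     // Kerberos
            mechanisms.add("DIGEST-MD5"); // delegation token (assumed mechanism name)
        }
        if (ldapEnabled) {
            mechanisms.add("PLAIN");      // assumption: password check backed by LDAP
        }
        return mechanisms;
    }
}
```

The point of the sketch is only that the two conditions are independent, so an Oozie job holding a delegation token would still find a mechanism it can use.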



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10099) Enable constant folding for Decimal

2015-03-27 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14384533#comment-14384533
 ] 

Prasanth Jayachandran commented on HIVE-10099:
--

LGTM, +1

 Enable constant folding for Decimal
 ---

 Key: HIVE-10099
 URL: https://issues.apache.org/jira/browse/HIVE-10099
 Project: Hive
  Issue Type: New Feature
  Components: Logical Optimizer
Affects Versions: 0.14.0, 1.0.0, 1.1.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-10099.2.patch, HIVE-10099.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10116) CBO (Calcite Return Path): RelMdSize throws an Exception when Join is actually a Semijoin [CBO branch]

2015-03-27 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14384552#comment-14384552
 ] 

Ashutosh Chauhan commented on HIVE-10116:
-

+1

 CBO (Calcite Return Path): RelMdSize throws an Exception when Join is 
 actually a Semijoin [CBO branch]
 --

 Key: HIVE-10116
 URL: https://issues.apache.org/jira/browse/HIVE-10116
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Affects Versions: cbo-branch
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: cbo-branch

 Attachments: HIVE-10116.cbo.patch


 {{cbo_semijoin.q}} reproduces the error.
 Stacktrace:
 {noformat}
 2015-03-26 09:55:20,652 ERROR [main]: parse.CalcitePlanner 
 (CalcitePlanner.java:genOPTree(269)) - CBO failed, skipping CBO.
 java.lang.ArrayIndexOutOfBoundsException: 3
 at 
 org.apache.calcite.rel.metadata.RelMdSize.averageColumnSizes(RelMdSize.java:193)
 at sun.reflect.GeneratedMethodAccessor134.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.calcite.rel.metadata.ReflectiveRelMetadataProvider$2$1.invoke(ReflectiveRelMetadataProvider.java:194)
 at com.sun.proxy.$Proxy30.averageColumnSizes(Unknown Source)
 at sun.reflect.GeneratedMethodAccessor64.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9518) Implement MONTHS_BETWEEN aligned with Oracle one

2015-03-27 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9518:
--
Attachment: HIVE-9518.8.patch

2 improvements in GenericUDF.getTimestampValue() related to short date format 
support

 Implement MONTHS_BETWEEN aligned with Oracle one
 

 Key: HIVE-9518
 URL: https://issues.apache.org/jira/browse/HIVE-9518
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: Xiaobing Zhou
Assignee: Alexander Pivovarov
 Attachments: HIVE-9518.1.patch, HIVE-9518.2.patch, HIVE-9518.3.patch, 
 HIVE-9518.4.patch, HIVE-9518.5.patch, HIVE-9518.6.patch, HIVE-9518.7.patch, 
 HIVE-9518.8.patch


 This is used to track work to build an Oracle-like months_between. Here are 
 the semantics:
 MONTHS_BETWEEN returns number of months between dates date1 and date2. If 
 date1 is later than date2, then the result is positive. If date1 is earlier 
 than date2, then the result is negative. If date1 and date2 are either the 
 same days of the month or both last days of months, then the result is always 
 an integer. Otherwise Oracle Database calculates the fractional portion of 
 the result based on a 31-day month and considers the difference in time 
 components date1 and date2.
 Should accept date, timestamp and string arguments in the format 'yyyy-MM-dd' 
 or 'yyyy-MM-dd HH:mm:ss'.
 The result should be rounded to 8 decimal places.
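The semantics above can be sketched in a few lines. This is a rough illustration under stated assumptions, not the UDF from the attached patches: it uses java.time (Java 8), takes already-parsed LocalDateTime values, and the class and method names are hypothetical.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.time.LocalDateTime;

// Hypothetical sketch of the MONTHS_BETWEEN rules described above.
final class MonthsBetweenSketch {
    private static final double SECONDS_PER_31_DAY_MONTH = 31 * 86400.0;

    static boolean isLastDayOfMonth(LocalDateTime d) {
        return d.getDayOfMonth() == d.toLocalDate().lengthOfMonth();
    }

    static double monthsBetween(LocalDateTime date1, LocalDateTime date2) {
        int wholeMonths = (date1.getYear() - date2.getYear()) * 12
                + (date1.getMonthValue() - date2.getMonthValue());
        // Same day of month, or both last days of their months: integer result.
        if (date1.getDayOfMonth() == date2.getDayOfMonth()
                || (isLastDayOfMonth(date1) && isLastDayOfMonth(date2))) {
            return wholeMonths;
        }
        // Otherwise the fraction is based on a 31-day month and includes the time parts.
        long seconds1 = date1.getDayOfMonth() * 86400L + date1.toLocalTime().toSecondOfDay();
        long seconds2 = date2.getDayOfMonth() * 86400L + date2.toLocalTime().toSecondOfDay();
        double months = wholeMonths + (seconds1 - seconds2) / SECONDS_PER_31_DAY_MONTH;
        // Round to 8 decimal places, as the description requires.
        return BigDecimal.valueOf(months).setScale(8, RoundingMode.HALF_UP).doubleValue();
    }
}
```

For example, date1 = 1995-02-02 and date2 = 1995-01-01 (both at midnight) give one whole month plus 1/31 of a month, which rounds to 1.03225806; swapping the arguments gives the negative of that value.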



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10100) Warning yarn jar instead of hadoop jar in hadoop 2.7.0

2015-03-27 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14384348#comment-14384348
 ] 

Gunther Hagleitner commented on HIVE-10100:
---

Thank you [~cnauroth]! Lowering priority. Will keep open to switch from hadoop 
jar to yarn jar to get rid of the warning over time.

 Warning yarn jar instead of hadoop jar in hadoop 2.7.0
 --

 Key: HIVE-10100
 URL: https://issues.apache.org/jira/browse/HIVE-10100
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Priority: Critical

 HADOOP-11257 adds a warning to stdout
 {noformat}
 WARNING: Use yarn jar to launch YARN applications.
 {noformat}
 which, if left untreated, will cause issues for folks who programmatically 
 parse stdout for query results (e.g. CLI, silent mode, etc.).
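Until the launcher is switched over, a client that parses stdout could drop the warning line itself. The following is a minimal sketch of that workaround, assuming the warning text is exactly as quoted above; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: strip the HADOOP-11257 warning before treating
// the remaining stdout lines as query results.
final class StdoutWarningFilterSketch {
    // Assumption: the warning appears on its own line with this exact text.
    static final String YARN_JAR_WARNING =
            "WARNING: Use yarn jar to launch YARN applications.";

    static List<String> resultLines(List<String> stdoutLines) {
        List<String> rows = new ArrayList<>();
        for (String line : stdoutLines) {
            if (!line.trim().equals(YARN_JAR_WARNING)) {
                rows.add(line);
            }
        }
        return rows;
    }
}
```

Matching the full line rather than a "WARNING:" prefix avoids discarding result rows that happen to start with the same word.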



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10099) Enable constant folding for Decimal

2015-03-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14384262#comment-14384262
 ] 

Hive QA commented on HIVE-10099:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707679/HIVE-10099.2.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 8677 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3183/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3183/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3183/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707679 - PreCommit-HIVE-TRUNK-Build

 Enable constant folding for Decimal
 ---

 Key: HIVE-10099
 URL: https://issues.apache.org/jira/browse/HIVE-10099
 Project: Hive
  Issue Type: New Feature
  Components: Logical Optimizer
Affects Versions: 0.14.0, 1.0.0, 1.1.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-10099.2.patch, HIVE-10099.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10117) LLAP: Use task number, attempt number to cache plans

2015-03-27 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-10117:
--
Fix Version/s: llap

 LLAP: Use task number, attempt number to cache plans
 

 Key: HIVE-10117
 URL: https://issues.apache.org/jira/browse/HIVE-10117
 Project: Hive
  Issue Type: Sub-task
Reporter: Siddharth Seth
 Fix For: llap


 Instead of relying on thread locals only. This can be used to share the work 
 between Inputs / Processor / Outputs in Tez.
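One way to picture the idea is a shared map keyed by task number and attempt number, so that any thread working on the same attempt finds the same plan. This is a sketch under that assumption only; the class names are hypothetical and nothing here is taken from the LLAP branch.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: a plan cache keyed by (task number, attempt number)
// rather than by the current thread.
final class PlanCacheSketch<P> {
    static final class AttemptKey {
        final int taskNumber;
        final int attemptNumber;

        AttemptKey(int taskNumber, int attemptNumber) {
            this.taskNumber = taskNumber;
            this.attemptNumber = attemptNumber;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof AttemptKey)) return false;
            AttemptKey k = (AttemptKey) o;
            return taskNumber == k.taskNumber && attemptNumber == k.attemptNumber;
        }

        @Override
        public int hashCode() {
            return Objects.hash(taskNumber, attemptNumber);
        }
    }

    private final ConcurrentMap<AttemptKey, P> plans = new ConcurrentHashMap<>();

    void put(int taskNumber, int attemptNumber, P plan) {
        plans.put(new AttemptKey(taskNumber, attemptNumber), plan);
    }

    P get(int taskNumber, int attemptNumber) {
        return plans.get(new AttemptKey(taskNumber, attemptNumber));
    }

    void remove(int taskNumber, int attemptNumber) {
        plans.remove(new AttemptKey(taskNumber, attemptNumber));
    }
}
```

Unlike a thread local, an entry stored this way is visible to the Input, Processor and Output of the same attempt even if they run on different threads, and it has to be removed explicitly when the attempt finishes.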



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10086) Hive throws error when accessing Parquet file schema using field name match

2015-03-27 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-10086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-10086:
---
Attachment: (was: HIVE-10086.4.patch)

 Hive throws error when accessing Parquet file schema using field name match
 ---

 Key: HIVE-10086
 URL: https://issues.apache.org/jira/browse/HIVE-10086
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-10086.5.patch, HiveGroup.parquet


 When Hive table schema contains a portion of the schema of a Parquet file, 
 then the access to the values should work if the field names match the 
 schema. This does not work when a struct data type is in the schema, and 
 the Hive schema contains just a portion of the struct elements. Hive throws 
 an error instead.
 This is the example and how to reproduce:
 First, create a parquet table, and add some values on it:
 {code}
 CREATE TABLE test1 (id int, name string, address 
 struct<number:int,street:string,zip:string>) STORED AS PARQUET;
 INSERT INTO TABLE test1 SELECT 1, 'Roger', 
 named_struct('number',8600,'street','Congress Ave.','zip','87366') FROM 
 srcpart LIMIT 1;
 {code}
 Note: {{srcpart}} could be any table. It is just used to leverage the INSERT 
 statement.
 The above table example generates the following Parquet file schema:
 {code}
 message hive_schema {
   optional int32 id;
   optional binary name (UTF8);
   optional group address {
 optional int32 number;
 optional binary street (UTF8);
 optional binary zip (UTF8);
   }
 }
 {code} 
 If I then create a table that contains just a portion of the schema and 
 load the Parquet file generated above, a query on that table fails:
 {code}
 CREATE TABLE test1 (name string, address struct<street:string>) STORED AS 
 PARQUET;
 LOAD DATA LOCAL INPATH '/tmp/HiveGroup.parquet' OVERWRITE INTO TABLE test1;
 hive> SELECT name FROM test1;
 OK
 Roger
 Time taken: 0.071 seconds, Fetched: 1 row(s)
 hive> SELECT address FROM test1;
 OK
 Failed with exception 
 java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
 java.lang.UnsupportedOperationException: Cannot inspect 
 org.apache.hadoop.io.IntWritable
 Time taken: 0.085 seconds
 {code}
 I would expect that Parquet can access the matched names, but Hive throws an 
 error instead.
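The difference between matching struct fields by name and by position can be sketched as follows. This is only an illustration with hypothetical names: the struct is modelled as an ordered map, and reading the IntWritable error as a symptom of positional matching is an assumption, not something the report confirms.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: project a subset of struct fields from a file-level
// struct whose field order is number, street, zip.
final class StructProjectionSketch {
    // Expected behaviour: look up each requested field by its name.
    static List<Object> projectByName(Map<String, Object> fileStruct, List<String> requested) {
        List<Object> out = new ArrayList<>();
        for (String name : requested) {
            out.add(fileStruct.get(name));
        }
        return out;
    }

    // Assumed faulty behaviour: take the first N fields regardless of name.
    static List<Object> projectByPosition(Map<String, Object> fileStruct, int count) {
        return new ArrayList<>(fileStruct.values()).subList(0, count);
    }
}
```

With the address struct from the example, a by-name projection of street yields the string 'Congress Ave.', while a positional projection of one field yields the integer 8600, a type the string inspector for street cannot handle.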



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10114) Split strategies for ORC

2015-03-27 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-10114:
-
Attachment: HIVE-10114.2.patch

Fix test failures.

 Split strategies for ORC
 

 Key: HIVE-10114
 URL: https://issues.apache.org/jira/browse/HIVE-10114
 Project: Hive
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
 Attachments: HIVE-10114.1.patch, HIVE-10114.2.patch


 ORC split generation does not have clearly defined strategies for different 
 scenarios (many small ORC files, few small ORC files, many large files, etc.). 
 A few strategies, such as storing the file footer in the ORC split or making 
 an entire file a single ORC split, already exist. This JIRA aims to make 
 split generation simpler, support different strategies for various use cases 
 (BI, ETL, ACID, etc.), and lay the foundation for HIVE-7428.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9709) Hive should support replaying cookie from JDBC driver for beeline

2015-03-27 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-9709:

Attachment: HIVE-9709.2.patch

 Hive should support replaying cookie from JDBC driver for beeline
 -

 Key: HIVE-9709
 URL: https://issues.apache.org/jira/browse/HIVE-9709
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-9709.1.patch, HIVE-9709.2.patch


 Consider the following scenario:
 Beeline -> Knox -> HS2.
 Where Knox is going to LDAP for authentication. To avoid re-authentication, 
 Knox supports using a cookie to identify a request. However, the Beeline JDBC 
 client does not send back the cookie Knox sent, and this leads to Knox having 
 to re-create the LDAP authentication request on every connection.
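The client-side behaviour being asked for can be sketched as a small cookie holder that remembers what the gateway sent and replays it on later requests. This is an illustration only; the class and method names are hypothetical, cookie attributes such as Path or expiry are ignored, and it is not the code in the attached patches.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: remember Set-Cookie values and replay them
// as a Cookie request header.
final class CookieReplaySketch {
    private final Map<String, String> cookies = new LinkedHashMap<>();

    // Record the name=value pair from a Set-Cookie response header value.
    void remember(String setCookieHeaderValue) {
        String pair = setCookieHeaderValue.split(";", 2)[0].trim();
        int eq = pair.indexOf('=');
        if (eq > 0) {
            cookies.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
    }

    // Build the value for the Cookie header of the next request.
    String cookieHeader() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : cookies.entrySet()) {
            if (sb.length() > 0) {
                sb.append("; ");
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }
}
```

A driver that calls remember() on each response and sends cookieHeader() on each following request would let the gateway recognise the session instead of going back to LDAP every time.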



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9780) Add another level of explain for RDBMS audience

2015-03-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14384732#comment-14384732
 ] 

Hive QA commented on HIVE-9780:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707691/HIVE-9780.05.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 8680 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3185/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3185/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3185/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707691 - PreCommit-HIVE-TRUNK-Build

 Add another level of explain for RDBMS audience
 ---

 Key: HIVE-9780
 URL: https://issues.apache.org/jira/browse/HIVE-9780
 Project: Hive
  Issue Type: Improvement
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
Priority: Minor
 Attachments: HIVE-9780.01.patch, HIVE-9780.02.patch, 
 HIVE-9780.03.patch, HIVE-9780.04.patch, HIVE-9780.05.patch


 Current Hive Explain (default) is targeted at MR Audience. We need a new 
 level of explain plan to be targeted at RDBMS audience. The explain requires 
 these:
 1) The focus needs to be on what part of the query is being executed rather 
 than internals of the engines
 2) There needs to be a clearly readable tree of operations
 3) Examples - Table scan should mention the table being scanned, the Sarg, 
 the size of table and expected cardinality after the Sarg'ed read. The join 
 should mention the table being joined with and the join condition. The 
 aggregate should mention the columns in the group-by. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10089) RCFile: lateral view explode caused ConcurrentModificationException

2015-03-27 Thread Selina Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Selina Zhang updated HIVE-10089:

Attachment: HIVE-10089.1.patch

 RCFile: lateral view explode caused ConcurrentModificationException
 ---

 Key: HIVE-10089
 URL: https://issues.apache.org/jira/browse/HIVE-10089
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Selina Zhang
Assignee: Selina Zhang
 Attachments: HIVE-10089.1.patch


 CREATE TABLE test_table123 (a INT, b MAP<STRING, STRING>) STORED AS RCFILE;
 INSERT OVERWRITE TABLE test_table123 SELECT 1, MAP('a1', 'b1', 'c1', 'd1') 
 FROM src LIMIT 1;
 The following query will lead to ConcurrentModificationException
 SELECT * FROM (SELECT b FROM test_table123) t1 LATERAL VIEW explode(b) x AS 
 b,c LIMIT 1;
 Failed with exception 
 java.io.IOException:java.util.ConcurrentModificationException



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-03-27 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14384635#comment-14384635
 ] 

Matt McCline commented on HIVE-9937:


Thank you for the review comments.

I did have trouble with one of the old VectorSerDe classes buffering up 1024 
Object[] rows, which caused Writable overwrite problems.  But I stopped using 
that SerDe.  The singleRow trick has been used by VectorReduceSinkOperator and 
VectorFileSinkOperator for a while with no problems.

 LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new 
 Vectorized Map Join
 --

 Key: HIVE-9937
 URL: https://issues.apache.org/jira/browse/HIVE-9937
 Project: Hive
  Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline
 Attachments: HIVE-9937.01.patch, HIVE-9937.02.patch, 
 HIVE-9937.03.patch, HIVE-9937.04.patch, HIVE-9937.05.patch, 
 HIVE-9937.06.patch, HIVE-9937.07.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10074) Ability to run HCat Client Unit tests in a system test setting

2015-03-27 Thread Deepesh Khandelwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14384666#comment-14384666
 ] 

Deepesh Khandelwal commented on HIVE-10074:
---

[~sushanth] the error still seems unrelated. Can you suggest what the next 
steps here should be?

 Ability to run HCat Client Unit tests in a system test setting
 --

 Key: HIVE-10074
 URL: https://issues.apache.org/jira/browse/HIVE-10074
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
 Attachments: HIVE-10074.1.patch, HIVE-10074.patch


 Following testsuite 
 {{hcatalog/webhcat/java-client/src/test/java/org/apache/hive/hcatalog/api/TestHCatClient.java}}
  is a JUnit testsuite to test some basic HCat client APIs. During setup it 
 brings up a Hive Metastore with embedded Derby. The testsuite, however, will 
 be even more useful if it can be run against a running Hive Metastore 
 (transparent to whatever backing DB it is running against).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9272) Tests for utf-8 support

2015-03-27 Thread Aswathy Chellammal Sreekumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384674#comment-14384674
 ] 

Aswathy Chellammal Sreekumar commented on HIVE-9272:


[~ekoifman] Please find attached the patch with renaming of input files 
automated (in deploy_e2e_artifacts.sh). I was a little hesitant to add this 
initially as this looked like the only local file operation, but certainly will 
come in handy for anyone running the test suite.

 Tests for utf-8 support
 ---

 Key: HIVE-9272
 URL: https://issues.apache.org/jira/browse/HIVE-9272
 Project: Hive
  Issue Type: Test
  Components: Tests, WebHCat
Affects Versions: 0.14.0
Reporter: Aswathy Chellammal Sreekumar
Assignee: Aswathy Chellammal Sreekumar
Priority: Minor
 Fix For: 1.2.0

 Attachments: HIVE-9272.1.patch, HIVE-9272.2.patch, HIVE-9272.3.patch, 
 HIVE-9272.4.patch, HIVE-9272.5.patch, HIVE-9272.6.patch, HIVE-9272.patch


 Including some test cases for utf8 support in webhcat. The first four tests 
 invoke hive, pig, mapred and streaming apis for testing the utf8 support for 
 data processed, file names and job name. The last test case tests the 
 filtering of job name with utf8 character



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10082) LLAP: UnwrappedRowContainer throws exceptions

2015-03-27 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-10082:
--
Attachment: HIVE-10082.2.patch

 LLAP: UnwrappedRowContainer throws exceptions
 -

 Key: HIVE-10082
 URL: https://issues.apache.org/jira/browse/HIVE-10082
 Project: Hive
  Issue Type: Bug
Affects Versions: llap
Reporter: Gopal V
Assignee: Gunther Hagleitner
 Fix For: llap

 Attachments: HIVE-10082.1.patch, HIVE-10082.2.patch


 Running TPC-DS Query27 with map-joins enabled results in errors originating 
 from these lines in UnwrappedRowContainer::unwrap() 
 {code}
 for (int index : valueIndex) {
   if (index >= 0) {
     unwrapped.add(currentKey == null ? null : currentKey[index]);
   } else {
     unwrapped.add(values.get(-index - 1));
   }
 }
 {code}
 {code}
 Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
 at java.util.ArrayList.rangeCheck(ArrayList.java:653)
 at java.util.ArrayList.get(ArrayList.java:429)
 at 
 org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.unwrap(UnwrapRowContainer.java:79)
 at 
 org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.first(UnwrapRowContainer.java:62)
 at 
 org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.first(UnwrapRowContainer.java:33)
 at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
 at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
 at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:341)
 {code}
 This is intermittent and does not cause query failures as the retries 
 succeed, but slows down the query by an entire wave due to the retry.
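
 The index convention in the snippet above can be illustrated with a small, 
 self-contained sketch. This is not the Hive class: the name UnwrapSketch and 
 the explicit bounds check are assumptions made for illustration. A 
 non-negative index selects a key column, and a negative index selects value 
 column (-index - 1), so an index of -1 against an empty value row would give 
 the "Index: 0, Size: 0" failure seen in the trace.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: mirrors the index convention quoted above, not the Hive class. */
final class UnwrapSketch {
    /**
     * index >= 0 : take the column from the key at that position.
     * index <  0 : take the column from the value row at position (-index - 1).
     */
    static List<Object> unwrap(int[] valueIndex, Object[] currentKey, List<Object> values) {
        List<Object> unwrapped = new ArrayList<>(valueIndex.length);
        for (int index : valueIndex) {
            if (index >= 0) {
                unwrapped.add(currentKey == null ? null : currentKey[index]);
            } else {
                int pos = -index - 1;
                if (pos >= values.size()) {
                    // Hypothetical guard: report the mismatch with context instead of
                    // letting ArrayList.get() throw a bare IndexOutOfBoundsException.
                    throw new IllegalStateException(
                        "value row has " + values.size() + " columns, needed index " + pos);
                }
                unwrapped.add(values.get(pos));
            }
        }
        return unwrapped;
    }
}
```

 The guard only makes the failure easier to diagnose; it does not address 
 whatever leaves the value row empty in the first place.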



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9272) Tests for utf-8 support

2015-03-27 Thread Aswathy Chellammal Sreekumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aswathy Chellammal Sreekumar updated HIVE-9272:
---
Attachment: HIVE-9272.6.patch

 Tests for utf-8 support
 ---

 Key: HIVE-9272
 URL: https://issues.apache.org/jira/browse/HIVE-9272
 Project: Hive
  Issue Type: Test
  Components: Tests, WebHCat
Affects Versions: 0.14.0
Reporter: Aswathy Chellammal Sreekumar
Assignee: Aswathy Chellammal Sreekumar
Priority: Minor
 Fix For: 1.2.0

 Attachments: HIVE-9272.1.patch, HIVE-9272.2.patch, HIVE-9272.3.patch, 
 HIVE-9272.4.patch, HIVE-9272.5.patch, HIVE-9272.6.patch, HIVE-9272.patch


 Including some test cases for utf8 support in webhcat. The first four tests 
 invoke hive, pig, mapred and streaming apis for testing the utf8 support for 
 data processed, file names and job name. The last test case tests the 
 filtering of job name with utf8 character



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10001) SMB join in reduce side

2015-03-27 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384780#comment-14384780
 ] 

Vikram Dixit K commented on HIVE-10001:
---

Address review comments.

 SMB join in reduce side
 ---

 Key: HIVE-10001
 URL: https://issues.apache.org/jira/browse/HIVE-10001
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-10001.1.patch, HIVE-10001.2.patch, 
 HIVE-10001.3.patch, HIVE-10001.4.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10050) Support overriding memory configuration for AM launched for TempletonControllerJob

2015-03-27 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated HIVE-10050:
---
Attachment: HIVE-10050.2.patch

 Support overriding memory configuration for AM launched for 
 TempletonControllerJob
 --

 Key: HIVE-10050
 URL: https://issues.apache.org/jira/browse/HIVE-10050
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Reporter: Hitesh Shah
Assignee: Hitesh Shah
 Attachments: HIVE-10050.1.patch, HIVE-10050.2.patch


 The MR AM launched for the TempletonControllerJob does not do any heavy 
 lifting and therefore can be configured to use a small memory footprint (as 
 compared to potentially using the default footprint for most MR jobs on a 
 cluster). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10122) Hive metastore filter-by-expression is broken for non-partition expressions

2015-03-27 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-10122:

Component/s: Metastore

 Hive metastore filter-by-expression is broken for non-partition expressions
 ---

 Key: HIVE-10122
 URL: https://issues.apache.org/jira/browse/HIVE-10122
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Sergey Shelukhin

 See 
 https://issues.apache.org/jira/browse/HIVE-10091?focusedCommentId=14382413&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14382413
 These two lines of code
 {noformat}
 // Replace virtual columns with nulls. See javadoc for details.
 prunerExpr = removeNonPartCols(prunerExpr, extractPartColNames(tab), 
 partColsUsedInFilter);
 // Remove all parts that are not partition columns. See javadoc for 
 details.
 ExprNodeDesc compactExpr = compactExpr(prunerExpr.clone());
 {noformat}
 are supposed to take care of this; I see there were a bunch of changes to this 
 code over some time, and now it appears to be broken.
 Thanks to [~thejas] for info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10022) DFS in authorization might take too long

2015-03-27 Thread Pankit Thapar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384753#comment-14384753
 ] 

Pankit Thapar commented on HIVE-10022:
--

Hi [~thejas], can you please comment on the failures? These tests pass on my 
local machine. Only testNegativeCliDriver_authorization_uri_import fails, but 
that fails even without the patch on my local machine.


 DFS in authorization might take too long
 

 Key: HIVE-10022
 URL: https://issues.apache.org/jira/browse/HIVE-10022
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Affects Versions: 0.14.0
Reporter: Pankit Thapar
Assignee: Pankit Thapar
 Fix For: 1.0.1

 Attachments: HIVE-10022.2.patch, HIVE-10022.patch


 I am testing a query like : 
 set hive.test.authz.sstd.hs2.mode=true;
 set 
 hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactoryForTest;
 set 
 hive.security.authenticator.manager=org.apache.hadoop.hive.ql.security.SessionStateConfigUserAuthenticator;
 set hive.security.authorization.enabled=true;
 set user.name=user1;
 create table auth_noupd(i int) clustered by (i) into 2 buckets stored as orc 
 location '${OUTPUT}' TBLPROPERTIES ('transactional'='true');
 Now, in the above query, since authorization is enabled, 
 we end up calling doAuthorizationV2(), which ultimately calls 
 SQLAuthorizationUtils.getPrivilegesFromFS(), which calls a recursive method, 
 FileUtils.isActionPermittedForFileHierarchy(), with the object or, if the 
 object does not exist, the ancestor of the object we are trying to authorize. 
 The logic in FileUtils.isActionPermittedForFileHierarchy() is a depth-first 
 search (DFS).
 Now assume we have a path a/b/c/d that we are trying to authorize.
 If a/b/c/d does not exist, we would call 
 FileUtils.isActionPermittedForFileHierarchy() with, say, a/b/, assuming a/b/c 
 also does not exist.
 If the subtree under a/b contains millions of files, then 
 FileUtils.isActionPermittedForFileHierarchy() is going to check file 
 permissions on each of those objects. 
 I do not completely understand why we have to check file permissions on all 
 the objects in a branch of the tree that we are not trying to read from or 
 write to. 
 We could have checked file permissions on the ancestor that exists and, if 
 they match what we expect, then return true.
 Please confirm whether this is a bug so that I can submit a patch; otherwise, 
 let me know what I am missing.
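
 The alternative proposed above (check only the nearest existing ancestor 
 rather than walking its whole subtree) could look roughly like the sketch 
 below. It uses java.nio.file on the local filesystem as a stand-in for the 
 Hadoop FileSystem API, and the class and method names are hypothetical; it is 
 not the attached patch.

```java
import java.nio.file.Files;
import java.nio.file.Path;

/** Illustrative only: local-filesystem stand-in for the ancestor check described above. */
final class AncestorPermissionSketch {
    /** Walks up from path until an existing file or directory is found; null if none exists. */
    static Path nearestExistingAncestor(Path path) {
        Path current = path.toAbsolutePath().normalize();
        while (current != null && !Files.exists(current)) {
            current = current.getParent();
        }
        return current;
    }

    /**
     * Checks write access on the nearest existing ancestor only.
     * The cost is proportional to the depth of the path, not to the
     * number of files underneath the ancestor.
     */
    static boolean canCreateUnder(Path path) {
        Path ancestor = nearestExistingAncestor(path);
        return ancestor != null && Files.isWritable(ancestor);
    }
}
```

 Whether a single check on the ancestor is sufficient for Hive's authorization 
 semantics is exactly the question raised in this issue; the sketch only shows 
 that the lookup itself does not need a recursive traversal.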



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10001) SMB join in reduce side

2015-03-27 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-10001:
--
Attachment: HIVE-10001.4.patch

 SMB join in reduce side
 ---

 Key: HIVE-10001
 URL: https://issues.apache.org/jira/browse/HIVE-10001
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-10001.1.patch, HIVE-10001.2.patch, 
 HIVE-10001.3.patch, HIVE-10001.4.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10000) 10000 whoooohooo

2015-03-27 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384804#comment-14384804
 ] 

Lefty Leverenz commented on HIVE-10000:
---

+1

Let's resolve this so [~damien.carol] will get credit for his bodacious artwork.

 10000 whoooohooo
 

 Key: HIVE-10000
 URL: https://issues.apache.org/jira/browse/HIVE-10000
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Damien Carol

 {noformat}
 g▄µ,  
   
   ▄█▀²▀▀▀██▄▄,
   
 ▄▓▓░/;,,...`▀▀▀█▄, ,gg,   
   
  ╓,,,g▄▓▓░░░║║╓w;,.`²▀██▄,  ,▄▀▀░░░▀█,
   
▄▓▓▀▀▓▓▓█▄▄║║Ü╓w,,...`▀▀█▓▒░░░╟▄░░▀█   
   
   █▓░░░▀▀▒░░░║╗ÿw,..░▀▓█▄░▀█░µ░█╥ 
   
  █▓░░░║╗y..²▓▓▌░█░║/▀▄
   
  ▓▌░░░║║╓░░╩B▒▀≡██▄▄░║;..▀▓║▐▒░░¡▓▌   
   
  ▀▓░░║»╓▄▒▓▓▓▒░░¡.▀▌ÿ▓`   
   
   ▀█░╓╓░;,|`▀▓░░║y^██M
   
 ▀█║║y░▀░░░║░█▒░█▌ 
   
   ▀█▄░░░║░░░▒░░▓▌ 
   
 ▓█▄▄░░░▒▓█
   
  ▀█░░░▄▄██▓▓░░░▄▄▒░░▒▓█   
   
   ▀▌░░░g▄▄██▀▀░░░░░▒░░▓█  
   
`²▀²² g▓▒║░░░▒▓▓▓▄██▄░░░▓█ 
   
 ▄▓▓█░▒▓▓▒░░▒▒▒▓▓▒░░░╣▀██▓g▄   
   
╓▌░▒▓▓▓█▄░░░▒▒█▓▓▒▒▒▀▀▒▓`  
   
▒▀█░▒▌░░▒▀░▓░░▒▓   
   
▒.⌠▓▄▒▓▓▓░▀░░▒▓╛   
   
▒..░░▓█▄░░░▓▓█▒▒▒░░░▒▓▌
   
▐▓▄▀▓█▄░▀▓▓█░░░▓▓▄╣▒▓▓ 
   
 ▓▓▓█▄░▀▓█▓▓▀▀▀▓█▒▓█▄░░░▒▒█▓▀  
   
 ▀▓██▄░░░▓▓▓█,  ▀█▓▓▓█▀`▀▀██▓▀`
   
  ▒░▀███▄▓▓ ²▀▀φy  ▄▄▄╖µ▄▄▄¡▄▄▄╖ 
 ,▄▄▄╖▄▄▄.   
   ╙µ¡░░░▀▀▓▄▄g╓.  ▓▓▓▌█▓▓▓░▓▓▓▌ 
 ▐▓▓▓░▓▓▓N   
 ▀█▄▄░▀▓▓µ ▓▓▓▌█▓▓▓░▓▓▓▌╟▓▓▓▌█▓▓▓ 
 ²`²
   ▀███▓▓▓▌╛   ░▓▓▓▌ ▓▓▓Ñ 
 ▓▌ 
  ▀▀▀▀▓▓▓██▓▓▓█]▓▓▓▌ ▐▓▓  
 ▓Ñ 
  `╙╨╫░░░▄█▓▀▀²▓▓▓▌█▓▓▓░▓▓▓▌ ╘▓▌  
 ▄▄▄µ   
   ▓▓▓▌█▓▓▓░▓▓▓▌  █`  
 ▓▓▓▌   
    ```   
 ```
  ██╗ ██╗  ██╗  ██╗  ██╗ 
 ███║██╔═╗██╔═╗██╔═╗██╔═╗
 ╚██║██║██╔██║██║██╔██║██║██╔██║██║██╔██║
  ██║╔╝██║╔╝██║╔╝██║╔╝██║
  ██║╚██╔╝╚██╔╝╚██╔╝╚██╔╝
  ╚═╝ ╚═╝  ╚═╝  ╚═╝  ╚═╝ 
 
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-10123) Hybrid grace Hash join : Use estimate key count from stats to initialize BytesBytesMultiHashMap

2015-03-27 Thread Mostafa Mokhtar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mostafa Mokhtar updated HIVE-10123:
---
Attachment: HIVE-10123.01.patch

 Hybrid grace Hash join : Use estimate key count from stats to initialize 
 BytesBytesMultiHashMap
 ---

 Key: HIVE-10123
 URL: https://issues.apache.org/jira/browse/HIVE-10123
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.2.0
Reporter: Mostafa Mokhtar
Assignee: Mostafa Mokhtar
 Fix For: 1.2.0

 Attachments: HIVE-10123.01.patch


 Hybrid grace Hash join is not using the estimated number of rows from the 
 statistics to initialize BytesBytesMultiHashMap. 
 Add some logging to BytesBytesMultiHashMap to track get probes, and use msec 
 for expandAndRehash because the usec value overflows.
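
 As a rough illustration of sizing a hash table from an estimated key count, 
 the sketch below picks the smallest power-of-two capacity that keeps the 
 estimate under a load factor. The class name, the clamp to 2^30 and the 
 parameters are assumptions for illustration; BytesBytesMultiHashMap's real 
 constructor and limits may differ.

```java
/** Illustrative only: derives an initial capacity from a row-count estimate. */
final class HashCapacitySketch {
    private static final long MAX_CAPACITY = 1L << 30; // assumed upper bound

    /**
     * Returns the smallest power of two that holds estimatedKeys at the given
     * load factor, clamped to the range [minCapacity, 2^30].
     */
    static int initialCapacity(long estimatedKeys, float loadFactor, int minCapacity) {
        if (estimatedKeys < 0 || minCapacity < 1 || loadFactor <= 0f || loadFactor > 1f) {
            throw new IllegalArgumentException("invalid sizing parameters");
        }
        long needed = (long) Math.ceil(estimatedKeys / (double) loadFactor);
        long capped = Math.min(Math.max(needed, minCapacity), MAX_CAPACITY);
        // Round up to the next power of two (no change if already a power of two).
        long highest = Long.highestOneBit(capped);
        return (int) (highest == capped ? capped : highest << 1);
    }
}
```

 Starting at the right capacity avoids the repeated expand-and-rehash passes 
 that a small default capacity would trigger while the table is being loaded.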



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10123) Hybrid grace Hash join : Use estimate key count from stats to initialize BytesBytesMultiHashMap

2015-03-27 Thread Mostafa Mokhtar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mostafa Mokhtar updated HIVE-10123:
---
Attachment: (was: HIVE-10123.01.patch)

 Hybrid grace Hash join : Use estimate key count from stats to initialize 
 BytesBytesMultiHashMap
 ---

 Key: HIVE-10123
 URL: https://issues.apache.org/jira/browse/HIVE-10123
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.2.0
Reporter: Mostafa Mokhtar
Assignee: Mostafa Mokhtar
 Fix For: 1.2.0

 Attachments: HIVE-10123.01.patch


 Hybrid grace Hash join is not using the estimated number of rows from the 
 statistics to initialize BytesBytesMultiHashMap. 
 Add some logging to BytesBytesMultiHashMap to track get probes, and use msec 
 for expandAndRehash because the usec value overflows.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10123) Hybrid grace Hash join : Use estimate key count from stats to initialize BytesBytesMultiHashMap

2015-03-27 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384851#comment-14384851
 ] 

Mostafa Mokhtar commented on HIVE-10123:


[~sershe] [~wzheng]
Can you please take a look?

 Hybrid grace Hash join : Use estimate key count from stats to initialize 
 BytesBytesMultiHashMap
 ---

 Key: HIVE-10123
 URL: https://issues.apache.org/jira/browse/HIVE-10123
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.2.0
Reporter: Mostafa Mokhtar
Assignee: Mostafa Mokhtar
 Fix For: 1.2.0

 Attachments: HIVE-10123.01.patch


 Hybrid grace Hash join is not using the estimated number of rows from the 
 statistics to initialize BytesBytesMultiHashMap. 
 Add some logging to BytesBytesMultiHashMap to track get probes, and use msec 
 for expandAndRehash because the usec value overflows.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10123) Hybrid grace Hash join : Use estimate key count from stats to initialize BytesBytesMultiHashMap

2015-03-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384866#comment-14384866
 ] 

Sergey Shelukhin commented on HIVE-10123:
-

If you are changing the metric to ms, why not use System.currentTimeMillis() 
instead of a more expensive System.nanoTime() call?
Also, there should be a better way to expand and rehash directly to the target 
size, without iterations.
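
The two timing options mentioned in this comment can be sketched as follows. 
The class and method names are hypothetical. Note that System.nanoTime() is 
intended for measuring elapsed time, while System.currentTimeMillis() reads 
the wall clock and can jump if the clock is adjusted; the relative cost of the 
two calls depends on the platform.

```java
import java.util.concurrent.TimeUnit;

/** Illustrative only: two ways to report an elapsed time in milliseconds. */
final class ElapsedMillisSketch {
    /** Converts a pair of System.nanoTime() readings to whole milliseconds. */
    static long elapsedMillisFromNanos(long startNanos, long endNanos) {
        return TimeUnit.NANOSECONDS.toMillis(endNanos - startNanos);
    }

    /** Times a task with the wall clock, as suggested in the comment above. */
    static long timeWithCurrentTimeMillis(Runnable work) {
        long start = System.currentTimeMillis();
        work.run();
        return System.currentTimeMillis() - start;
    }
}
```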

 Hybrid grace Hash join : Use estimate key count from stats to initialize 
 BytesBytesMultiHashMap
 ---

 Key: HIVE-10123
 URL: https://issues.apache.org/jira/browse/HIVE-10123
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.2.0
Reporter: Mostafa Mokhtar
Assignee: Mostafa Mokhtar
 Fix For: 1.2.0

 Attachments: HIVE-10123.01.patch


 Hybrid grace Hash join is not using the estimated number of rows from the 
 statistics to initialize BytesBytesMultiHashMap. 
 Add some logging to BytesBytesMultiHashMap to track get probes, and use msec 
 for expandAndRehash because the usec value overflows.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10122) Hive metastore filter-by-expression is broken for non-partition expressions

2015-03-27 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384874#comment-14384874
 ] 

Mostafa Mokhtar commented on HIVE-10122:


[~sershe] [~thejas] [~hagleitn]

I ran explain for this query 
{code}
select 
ss_item_sk rowcount
from
store_sales
where
ss_sold_date_sk between 2450816 and 2450817
and ss_ticket_number  1
and ss_item_sk  50;
{code}

And the query that gets issued to MySQL looks correct to me as only the 
qualified partitions are queried.
What am I missing?

{code}
select 
COLUMN_NAME,
COLUMN_TYPE,
min(LONG_LOW_VALUE),
max(LONG_HIGH_VALUE),
min(DOUBLE_LOW_VALUE),
max(DOUBLE_HIGH_VALUE),
min(BIG_DECIMAL_LOW_VALUE),
max(BIG_DECIMAL_HIGH_VALUE),
sum(NUM_NULLS),
max(NUM_DISTINCTS),
max(AVG_COL_LEN),
max(MAX_COL_LEN),
sum(NUM_TRUES),
sum(NUM_FALSES)
from
PART_COL_STATS
where
DB_NAME = 'tpcds_bin_partitioned_orc_3'
and TABLE_NAME = 'store_sales'
and COLUMN_NAME in ('ss_item_sk' , 'ss_ticket_number')
and PARTITION_NAME in ('ss_sold_date_sk=2450816' , 
'ss_sold_date_sk=2450817')
group by COLUMN_NAME , COLUMN_TYPE
{code}

 Hive metastore filter-by-expression is broken for non-partition expressions
 ---

 Key: HIVE-10122
 URL: https://issues.apache.org/jira/browse/HIVE-10122
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0, 1.0.0, 1.1.0
Reporter: Sergey Shelukhin

 See 
 https://issues.apache.org/jira/browse/HIVE-10091?focusedCommentId=14382413&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14382413
 These two lines of code
 {noformat}
 // Replace virtual columns with nulls. See javadoc for details.
 prunerExpr = removeNonPartCols(prunerExpr, extractPartColNames(tab), 
 partColsUsedInFilter);
 // Remove all parts that are not partition columns. See javadoc for 
 details.
 ExprNodeDesc compactExpr = compactExpr(prunerExpr.clone());
 {noformat}
 are supposed to take care of this; I see there were a bunch of changes to this 
 code over some time, and now it appears to be broken.
 Thanks to [~thejas] for info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10122) Hive metastore filter-by-expression is broken for non-partition expressions

2015-03-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384880#comment-14384880
 ] 

Sergey Shelukhin commented on HIVE-10122:
-

That is stats; do you see MySQL queries to PARTITIONS table?

 Hive metastore filter-by-expression is broken for non-partition expressions
 ---

 Key: HIVE-10122
 URL: https://issues.apache.org/jira/browse/HIVE-10122
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0, 1.0.0, 1.1.0
Reporter: Sergey Shelukhin

 See 
 https://issues.apache.org/jira/browse/HIVE-10091?focusedCommentId=14382413&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14382413
 These two lines of code
 {noformat}
 // Replace virtual columns with nulls. See javadoc for details.
 prunerExpr = removeNonPartCols(prunerExpr, extractPartColNames(tab), 
 partColsUsedInFilter);
 // Remove all parts that are not partition columns. See javadoc for 
 details.
 ExprNodeDesc compactExpr = compactExpr(prunerExpr.clone());
 {noformat}
 are supposed to take care of this; I see there were a bunch of changes to this 
 code over some time, and now it appears to be broken.
 Thanks to [~thejas] for info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10106) Regression : Dynamic partition pruning not working after HIVE-9976

2015-03-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384886#comment-14384886
 ] 

Hive QA commented on HIVE-10106:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707704/HIVE-10106.1.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 8677 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_remote_script
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_scriptfile1
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3186/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3186/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3186/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707704 - PreCommit-HIVE-TRUNK-Build

 Regression : Dynamic partition pruning not working after HIVE-9976
 --

 Key: HIVE-10106
 URL: https://issues.apache.org/jira/browse/HIVE-10106
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.2.0
Reporter: Mostafa Mokhtar
Assignee: Siddharth Seth
 Fix For: 1.2.0

 Attachments: HIVE-10106.1.patch


 After HIVE-9976 got checked in, dynamic partition pruning doesn't work.
 Partitions are pruned and later show up in splits.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10125) LLAP: Print execution modes in tez in-place UI

2015-03-27 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran resolved HIVE-10125.
--
Resolution: Fixed

Committed to llap branch

 LLAP: Print execution modes in tez in-place UI
 --

 Key: HIVE-10125
 URL: https://issues.apache.org/jira/browse/HIVE-10125
 Project: Hive
  Issue Type: Sub-task
Affects Versions: llap
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
 Attachments: HIVE-10125.1.patch


 There are different execution modes: container, llap and uber. Print the 
 execution mode of the work in the in-place UI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-10028) LLAP: Create a fixed size execution queue for daemons

2015-03-27 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-10028:


Assignee: Prasanth Jayachandran

 LLAP: Create a fixed size execution queue for daemons
 -

 Key: HIVE-10028
 URL: https://issues.apache.org/jira/browse/HIVE-10028
 Project: Hive
  Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Prasanth Jayachandran
 Fix For: llap


 Currently, this is unbounded. This should be a configurable size.
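
 A bounded, configurable execution queue can be sketched with the standard 
 java.util.concurrent classes, as below. The class name and parameters are 
 hypothetical, and the choice to reject work when the queue is full is an 
 assumption; the LLAP daemon may use a different executor or rejection 
 strategy.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Illustrative only: a fixed thread pool backed by a fixed-size work queue. */
final class BoundedExecutorSketch {
    /**
     * numThreads workers share a queue holding at most queueCapacity pending
     * tasks. Once the queue is full, execute() throws
     * RejectedExecutionException (AbortPolicy) instead of growing unbounded.
     */
    static ThreadPoolExecutor create(int numThreads, int queueCapacity) {
        return new ThreadPoolExecutor(
            numThreads, numThreads,
            0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<Runnable>(queueCapacity),
            new ThreadPoolExecutor.AbortPolicy());
    }
}
```

 With a bounded queue the caller sees back-pressure as soon as the daemon is 
 saturated, rather than having work pile up in memory.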



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-10122) Hive metastore filter-by-expression is broken for non-partition expressions

2015-03-27 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-10122:
---

Assignee: Sergey Shelukhin

 Hive metastore filter-by-expression is broken for non-partition expressions
 ---

 Key: HIVE-10122
 URL: https://issues.apache.org/jira/browse/HIVE-10122
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0, 1.0.0, 1.1.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin

 See 
 https://issues.apache.org/jira/browse/HIVE-10091?focusedCommentId=14382413&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14382413
 These two lines of code
 {noformat}
 // Replace virtual columns with nulls. See javadoc for details.
 prunerExpr = removeNonPartCols(prunerExpr, extractPartColNames(tab), 
 partColsUsedInFilter);
 // Remove all parts that are not partition columns. See javadoc for 
 details.
 ExprNodeDesc compactExpr = compactExpr(prunerExpr.clone());
 {noformat}
 are supposed to take care of this; I see there were a bunch of changes to this 
 code over some time, and now it appears to be broken.
 Thanks to [~thejas] for info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10112) LLAP: query 17 tasks fail due to mapjoin issue

2015-03-27 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-10112:
---
Assignee: Gunther Hagleitner  (was: Sergey Shelukhin)

 LLAP: query 17 tasks fail due to mapjoin issue
 --

 Key: HIVE-10112
 URL: https://issues.apache.org/jira/browse/HIVE-10112
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Gunther Hagleitner

 {noformat}
 2015-03-26 18:16:38,833 
 [TezTaskRunner_attempt_1424502260528_1696_1_07_00_0(container_1_1696_01_000220_sershe_20150326181607_188ab263-0a13-4528-b778-c803f378640d:1_Map
  1_0_0)] ERROR org.apache.hadoop.hive.ql.exec.tez.TezProcessor: 
 java.lang.RuntimeException: java.lang.AssertionError: Length is negative: -54
 at 
 org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
 at 
 org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
 at 
 org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:308)
 at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
 at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
 at 
 org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:330)
 at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
 at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
 at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
 at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
 at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
 Caused by: java.lang.AssertionError: Length is negative: -54
 at 
 org.apache.hadoop.hive.serde2.WriteBuffers$ByteSegmentRef.init(WriteBuffers.java:339)
 at 
 org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.getValueRefs(BytesBytesMultiHashMap.java:270)
 at 
 org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$ReusableRowContainer.setFromOutput(MapJoinBytesTableContainer.java:429)
 at 
 org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$GetAdaptor.setFromVector(MapJoinBytesTableContainer.java:349)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.setMapJoinKey(VectorMapJoinOperator.java:222)
 at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:310)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.process(VectorMapJoinOperator.java:252)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:114)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
 at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97)
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:163)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
 at 
 org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:83)
 {noformat}
 Tasks do appear to pass on retries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10128) LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access

2015-03-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384970#comment-14384970
 ] 

Sergey Shelukhin commented on HIVE-10128:
-

Rather, the bugs predate the sync blocks that were added earlier today.

 LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access
 ---

 Key: HIVE-10128
 URL: https://issues.apache.org/jira/browse/HIVE-10128
 Project: Hive
  Issue Type: Sub-task
Reporter: Gopal V
Assignee: Sergey Shelukhin
 Attachments: hashmap-sync-source.png, hashmap-sync.png


 The multi-threaded performance takes a serious hit when LLAP shares 
 hashtables between the probe threads running in parallel. 
 !hashmap-sync.png!
 This is an explicit synchronized block inside ReusableRowContainer which 
 triggers this particular pattern.
 !hashmap-sync-source.png!
 Looking deeper into the code, the synchronization seems to be needed because 
 WriteBuffers.setReadPoint modifies the otherwise read-only 
 hashtable.
 To generate this sort of result, run LLAP at a WARN log-level, to avoid all 
 the log synchronization that otherwise affects the thread sync.
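
 One general way to avoid this kind of lock is to keep the read position in a 
 per-reader object instead of in the shared buffer, so that reads never mutate 
 shared state. The sketch below shows the idea with hypothetical class names; 
 it is not the WriteBuffers API and not necessarily the approach taken in the 
 attached patch.

```java
/** Illustrative only: shared immutable bytes with per-reader cursors. */
final class SharedBufferSketch {
    // Never modified after construction, so it is safe to share across threads.
    private final byte[] data;

    SharedBufferSketch(byte[] data) {
        this.data = data.clone();
    }

    /** Each probe thread creates its own cursor; no synchronization is needed. */
    Cursor newCursor() {
        return new Cursor();
    }

    /** Holds the mutable read position that would otherwise live in the shared buffer. */
    final class Cursor {
        private int readPos;

        void setReadPoint(int offset) {
            if (offset < 0 || offset > data.length) {
                throw new IndexOutOfBoundsException("offset " + offset);
            }
            readPos = offset;
        }

        byte readByte() {
            return data[readPos++];
        }
    }
}
```

 Two threads can then call setReadPoint and readByte on their own cursors 
 concurrently, because the only state they share is the immutable byte array.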



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10129) LLAP: Fix ordering of execution modes

2015-03-27 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran resolved HIVE-10129.
--
Resolution: Fixed

Committed to llap branch.

 LLAP: Fix ordering of execution modes
 -

 Key: HIVE-10129
 URL: https://issues.apache.org/jira/browse/HIVE-10129
 Project: Hive
  Issue Type: Sub-task
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
 Attachments: HIVE-10129.1.patch


 uber  llap  container execution modes. Fix the ordering in in-place update 
 UI.





[jira] [Commented] (HIVE-10128) LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access

2015-03-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14385062#comment-14385062
 ] 

Sergey Shelukhin commented on HIVE-10128:
-

As for logging, I have a patch for that that someone needs to +1 ;)

 LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access
 ---

 Key: HIVE-10128
 URL: https://issues.apache.org/jira/browse/HIVE-10128
 Project: Hive
  Issue Type: Sub-task
Reporter: Gopal V
Assignee: Sergey Shelukhin
 Attachments: HIVE-10128.patch, hashmap-sync-source.png, 
 hashmap-sync.png


 The multi-threaded performance takes a serious hit when LLAP shares 
 hashtables between the probe threads running in parallel. 
 !hashmap-sync.png!
 This is an explicit synchronized block inside ReusableRowContainer which 
 triggers this particular pattern.
 !hashmap-sync-source.png!
 Looking deeper into the code, the synchronization appears to be needed
 because WriteBuffers.setReadPoint modifies the otherwise read-only
 hashtable.
 To reproduce this result, run LLAP at the WARN log level to avoid the
 log synchronization that otherwise affects the thread sync.





[jira] [Updated] (HIVE-9780) Add another level of explain for RDBMS audience

2015-03-27 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-9780:
--
Attachment: HIVE-9780.06.patch

Following [~mmokhtar]'s suggestion, added CBO info to this explain. Cc'ing
[~jpullokkaran].

 Add another level of explain for RDBMS audience
 ---

 Key: HIVE-9780
 URL: https://issues.apache.org/jira/browse/HIVE-9780
 Project: Hive
  Issue Type: Improvement
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
Priority: Minor
 Attachments: HIVE-9780.01.patch, HIVE-9780.02.patch, 
 HIVE-9780.03.patch, HIVE-9780.04.patch, HIVE-9780.05.patch, HIVE-9780.06.patch


 Current Hive Explain (default) is targeted at MR Audience. We need a new 
 level of explain plan to be targeted at RDBMS audience. The explain requires 
 these:
 1) The focus needs to be on what part of the query is being executed rather 
 than internals of the engines
 2) There needs to be a clearly readable tree of operations
 3) Examples - Table scan should mention the table being scanned, the Sarg, 
 the size of table and expected cardinality after the Sarg'ed read. The join 
 should mention the table being joined with and the join condition. The 
 aggregate should mention the columns in the group-by. 





[jira] [Updated] (HIVE-10038) Add Calcite's ProjectMergeRule.

2015-03-27 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-10038:

Attachment: HIVE-10038.5.patch

More fixes.

 Add Calcite's ProjectMergeRule.
 ---

 Key: HIVE-10038
 URL: https://issues.apache.org/jira/browse/HIVE-10038
 Project: Hive
  Issue Type: New Feature
  Components: CBO, Logical Optimizer
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-10038.2.patch, HIVE-10038.3.patch, 
 HIVE-10038.4.patch, HIVE-10038.5.patch, HIVE-10038.patch


 Helps to improve latency by shortening the operator pipeline. Folds adjacent
 projections into one.
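
To illustrate what folding adjacent projections means, here is a simplified sketch that handles only plain column references (no expressions). ProjectionMerge and its index-array representation are assumptions made for this example; Calcite's actual rule works on relational expression trees and is not reproduced here.

```java
/**
 * Illustrative only: merges two stacked column-reference projections.
 * ProjectionMerge is a hypothetical helper, not a Calcite or Hive class.
 */
final class ProjectionMerge {
    /**
     * inner[i] is the input column emitted by the lower projection at position i.
     * outer[j] is the position in the lower projection's output that the upper
     * projection emits at position j.
     * The result maps each final output position directly to an input column.
     */
    static int[] merge(int[] inner, int[] outer) {
        int[] merged = new int[outer.length];
        for (int j = 0; j < outer.length; j++) {
            merged[j] = inner[outer[j]];
        }
        return merged;
    }
}
```

After the merge, one projection operator does the work of two, which is where the shorter pipeline comes from.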





[jira] [Reopened] (HIVE-10128) LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access

2015-03-27 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V reopened HIVE-10128:


 LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access
 ---

 Key: HIVE-10128
 URL: https://issues.apache.org/jira/browse/HIVE-10128
 Project: Hive
  Issue Type: Sub-task
Reporter: Gopal V
Assignee: Sergey Shelukhin
 Attachments: hashmap-sync-source.png, hashmap-sync.png


 The multi-threaded performance takes a serious hit when LLAP shares 
 hashtables between the probe threads running in parallel. 
 !hashmap-sync.png!
 This is an explicit synchronized block inside ReusableRowContainer which 
 triggers this particular pattern.
 !hashmap-sync-source.png!
 Looking deeper into the code, the synchronization appears to be needed
 because WriteBuffers.setReadPoint modifies the otherwise read-only
 hashtable.
 To reproduce this result, run LLAP at the WARN log level to avoid the
 log synchronization that otherwise affects the thread sync.





[jira] [Updated] (HIVE-10129) LLAP: Fix ordering of execution modes

2015-03-27 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-10129:
-
Attachment: HIVE-10129.1.patch

 LLAP: Fix ordering of execution modes
 -

 Key: HIVE-10129
 URL: https://issues.apache.org/jira/browse/HIVE-10129
 Project: Hive
  Issue Type: Sub-task
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
 Attachments: HIVE-10129.1.patch


 uber  llap  container execution modes. Fix the ordering in in-place update 
 UI.





[jira] [Updated] (HIVE-10130) Merge from Spark branch to trunk 03/27/2015

2015-03-27 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-10130:
---
Attachment: HIVE-10130.1-spark.patch

 Merge from Spark branch to trunk 03/27/2015
 ---

 Key: HIVE-10130
 URL: https://issues.apache.org/jira/browse/HIVE-10130
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: spark-branch
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-10130.1-spark.patch








[jira] [Commented] (HIVE-10093) Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2

2015-03-27 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14385033#comment-14385033
 ] 

Aihua Xu commented on HIVE-10093:
-

Whoops. I included it by accident. 

 Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2
 -

 Key: HIVE-10093
 URL: https://issues.apache.org/jira/browse/HIVE-10093
 Project: Hive
  Issue Type: Bug
Reporter: Szehon Ho
Assignee: Aihua Xu
Priority: Minor
 Attachments: HIVE-10093.patch


 When the HiveAuthFactory is constructed in HS2, it initializes an HMSHandler
 unnecessarily right before the call to
 HadoopThriftAuthBridge.startDelegationTokenSecretManager().  If the
 DelegationTokenStore is configured to be a MemoryTokenStore, this step is not
 needed.
 The side effect is the creation of a useless Derby database file on HiveServer2 in
 secure clusters, which causes confusion.  This could potentially be skipped if
 MemoryTokenStore is used.
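
The idea of skipping the handler can be sketched as a guard around a lazy factory. Everything below is hypothetical (TokenStoreSetup, the class-name check, the Supplier parameter) and only illustrates the shape of such a guard; it is not the code in HiveAuthFactory or in the attached patch.

```java
import java.util.function.Supplier;

/**
 * Illustrative only: create the metastore handler just when the configured
 * token store needs one. All names here are hypothetical.
 */
final class TokenStoreSetup {
    /** Assumption: an in-memory token store is recognizable by its class name. */
    static boolean needsMetastoreHandler(String tokenStoreClassName) {
        return !tokenStoreClassName.endsWith("MemoryTokenStore");
    }

    /** Calls the factory only when a handler is needed; otherwise returns null. */
    static <T> T createHandlerIfNeeded(String tokenStoreClassName, Supplier<T> handlerFactory) {
        return needsMetastoreHandler(tokenStoreClassName) ? handlerFactory.get() : null;
    }
}
```

Passing the handler as a Supplier means the expensive initialization, and any files it creates as a side effect, never runs for the in-memory case.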





[jira] [Commented] (HIVE-10093) Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2

2015-03-27 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14385042#comment-14385042
 ] 

Aihua Xu commented on HIVE-10093:
-

Thanks Szehon.

 Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2
 -

 Key: HIVE-10093
 URL: https://issues.apache.org/jira/browse/HIVE-10093
 Project: Hive
  Issue Type: Bug
Reporter: Szehon Ho
Assignee: Aihua Xu
Priority: Minor
 Fix For: 1.2.0

 Attachments: HIVE-10093.patch


 When the HiveAuthFactory is constructed in HS2, it initializes an HMSHandler
 unnecessarily right before the call to
 HadoopThriftAuthBridge.startDelegationTokenSecretManager().  If the
 DelegationTokenStore is configured to be a MemoryTokenStore, this step is not
 needed.
 The side effect is the creation of a useless Derby database file on HiveServer2 in
 secure clusters, which causes confusion.  This could potentially be skipped if
 MemoryTokenStore is used.





[jira] [Comment Edited] (HIVE-10128) LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access

2015-03-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14385060#comment-14385060
 ] 

Sergey Shelukhin edited comment on HIVE-10128 at 3/28/15 2:21 AM:
--

Git is down, so the merge won't propagate. I will attach the patch here for now;
to commit I need to merge, and I cannot deal with SVN anymore today.


was (Author: sershe):
Git is down, so merge won't propagate. I will attach the patch here for now, to 
commit I need to merge and I cannot deal with SVN anymore

 LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access
 ---

 Key: HIVE-10128
 URL: https://issues.apache.org/jira/browse/HIVE-10128
 Project: Hive
  Issue Type: Sub-task
Reporter: Gopal V
Assignee: Sergey Shelukhin
 Attachments: HIVE-10128.patch, hashmap-sync-source.png, 
 hashmap-sync.png


 The multi-threaded performance takes a serious hit when LLAP shares 
 hashtables between the probe threads running in parallel. 
 !hashmap-sync.png!
 This is an explicit synchronized block inside ReusableRowContainer which 
 triggers this particular pattern.
 !hashmap-sync-source.png!
 Looking deeper into the code, the synchronization appears to be needed
 because WriteBuffers.setReadPoint modifies the otherwise read-only
 hashtable.
 To reproduce this result, run LLAP at the WARN log level to avoid the
 log synchronization that otherwise affects the thread sync.





[jira] [Updated] (HIVE-10126) Upgrade Tez dependency to the latest released version

2015-03-27 Thread Na Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Na Yang updated HIVE-10126:
---
Summary: Upgrade Tez dependency to the latest released version  (was: 
upgrade Tez dependency to the the latest released version)

 Upgrade Tez dependency to the latest released version
 -

 Key: HIVE-10126
 URL: https://issues.apache.org/jira/browse/HIVE-10126
 Project: Hive
  Issue Type: Bug
Reporter: Na Yang

 Tez 0.6 has been released. It would be nice to upgrade the Tez dependency to
 the latest released version.





[jira] [Updated] (HIVE-10128) LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access

2015-03-27 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-10128:
---
Attachment: hashmap-after.png

Fix looks good; the hashmap sync points have gone away from the inner loops.

!hashmap-after.png!

The orange sections are the IO elevator lagging behind the map-join operator.

Found other init-time lock sections, will file more JIRAs.

 LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access
 ---

 Key: HIVE-10128
 URL: https://issues.apache.org/jira/browse/HIVE-10128
 Project: Hive
  Issue Type: Sub-task
Reporter: Gopal V
Assignee: Sergey Shelukhin
 Attachments: HIVE-10128.patch, hashmap-after.png, 
 hashmap-sync-source.png, hashmap-sync.png


 The multi-threaded performance takes a serious hit when LLAP shares 
 hashtables between the probe threads running in parallel. 
 !hashmap-sync.png!
 This is an explicit synchronized block inside ReusableRowContainer which 
 triggers this particular pattern.
 !hashmap-sync-source.png!
 Looking deeper into the code, the synchronization appears to be needed
 because WriteBuffers.setReadPoint modifies the otherwise read-only
 hashtable.
 To reproduce this result, run LLAP at the WARN log level to avoid the
 log synchronization that otherwise affects the thread sync.





[jira] [Commented] (HIVE-9518) Implement MONTHS_BETWEEN aligned with Oracle one

2015-03-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14385130#comment-14385130
 ] 

Hive QA commented on HIVE-9518:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707843/HIVE-9518.8.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 8674 tests executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-smb_mapjoin_8.q - did not produce a TEST-*.xml file
TestSparkClient - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_map
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3189/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3189/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3189/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707843 - PreCommit-HIVE-TRUNK-Build

 Implement MONTHS_BETWEEN aligned with Oracle one
 

 Key: HIVE-9518
 URL: https://issues.apache.org/jira/browse/HIVE-9518
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: Xiaobing Zhou
Assignee: Alexander Pivovarov
 Attachments: HIVE-9518.1.patch, HIVE-9518.2.patch, HIVE-9518.3.patch, 
 HIVE-9518.4.patch, HIVE-9518.5.patch, HIVE-9518.6.patch, HIVE-9518.7.patch, 
 HIVE-9518.8.patch


 This is used to track work to build an Oracle-like months_between. Here are the
 semantics:
 MONTHS_BETWEEN returns the number of months between dates date1 and date2. If
 date1 is later than date2, then the result is positive. If date1 is earlier
 than date2, then the result is negative. If date1 and date2 are either the
 same days of the month or both last days of months, then the result is always
 an integer. Otherwise Oracle Database calculates the fractional portion of
 the result based on a 31-day month and considers the difference in time
 components date1 and date2.
 Should accept date, timestamp and string arguments in the format 'yyyy-MM-dd'
 or 'yyyy-MM-dd HH:mm:ss'.
 The result should be rounded to 8 decimal places.
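
The semantics above can be sketched with java.time as follows. MonthsBetweenSketch is a hypothetical class written for this illustration, not the UDF in the attached patches; it assumes both arguments have already been parsed into LocalDateTime values and ignores time zones.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.time.LocalDateTime;

/** Illustrative only: the MONTHS_BETWEEN rules described in this issue. */
final class MonthsBetweenSketch {
    private static final double SECONDS_PER_DAY = 86_400.0;

    static double monthsBetween(LocalDateTime date1, LocalDateTime date2) {
        // Whole-month difference from the year and month fields alone.
        int wholeMonths = (date1.getYear() - date2.getYear()) * 12
                + (date1.getMonthValue() - date2.getMonthValue());

        // Same day of month, or both last days of their months: integer result.
        boolean sameDayOfMonth = date1.getDayOfMonth() == date2.getDayOfMonth();
        if (sameDayOfMonth || (isLastDayOfMonth(date1) && isLastDayOfMonth(date2))) {
            return wholeMonths;
        }

        // Otherwise the day and time-of-day difference counts as a fraction of a 31-day month.
        double seconds1 = date1.getDayOfMonth() * SECONDS_PER_DAY + date1.toLocalTime().toSecondOfDay();
        double seconds2 = date2.getDayOfMonth() * SECONDS_PER_DAY + date2.toLocalTime().toSecondOfDay();
        double fraction = (seconds1 - seconds2) / (31 * SECONDS_PER_DAY);

        // Round to 8 decimal places, as the description requires.
        return BigDecimal.valueOf(wholeMonths + fraction)
                .setScale(8, RoundingMode.HALF_UP)
                .doubleValue();
    }

    private static boolean isLastDayOfMonth(LocalDateTime d) {
        return d.getDayOfMonth() == d.toLocalDate().lengthOfMonth();
    }
}
```

For example, 1995-02-02 against 1995-01-01 is one whole month plus 1/31 of a month, which rounds to 1.03225806, while 2015-02-28 against 2015-01-31 is exactly 1 because both are last days of their months.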





[jira] [Updated] (HIVE-10130) Merge from Spark branch to trunk 03/27/2015

2015-03-27 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-10130:
---
Attachment: HIVE-10130.2-spark.patch

 Merge from Spark branch to trunk 03/27/2015
 ---

 Key: HIVE-10130
 URL: https://issues.apache.org/jira/browse/HIVE-10130
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: spark-branch
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-10130.1-spark.patch, HIVE-10130.2-spark.patch








[jira] [Commented] (HIVE-10000) 10000 whoooohooo

2015-03-27 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14385146#comment-14385146
 ] 

Gunther Hagleitner commented on HIVE-10000:
---

+1

 10000 whoooohooo
 

 Key: HIVE-10000
 URL: https://issues.apache.org/jira/browse/HIVE-10000
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Damien Carol

 {noformat}
 [ASCII-art picture followed by a block-letter banner; the layout did not survive extraction]
 {noformat}


