[jira] [Commented] (HIVE-6981) Remove old website from SVN

2014-10-15 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172046#comment-14172046
 ] 

Lefty Leverenz commented on HIVE-6981:
--

Step 5 in the Publishing section of How to Release is also obsolete (not to 
mention most of the other steps): 

{quote}
5.  Prepare to edit the website.
{{svn co https://svn.apache.org/repos/asf/hive/site}}
{quote}

We need complete reviews of both the How to Commit and How to Release 
wikidocs.  Should this be a new jira ticket?

Quick reference:

* [How to Commit | https://cwiki.apache.org/confluence/display/Hive/HowToCommit]
** [How to Commit -- Committing Documentation | 
https://cwiki.apache.org/confluence/display/Hive/HowToCommit#HowToCommit-CommittingDocumentation]
* [How to Release | 
https://cwiki.apache.org/confluence/display/Hive/HowToRelease]
** [How to Release -- Publishing | 
https://cwiki.apache.org/confluence/display/Hive/HowToRelease#HowToRelease-Publishing]
* [How to edit the website | 
https://cwiki.apache.org/confluence/display/Hive/How+to+edit+the+website]

 Remove old website from SVN
 ---

 Key: HIVE-6981
 URL: https://issues.apache.org/jira/browse/HIVE-6981
 Project: Hive
  Issue Type: Task
Reporter: Brock Noland
Assignee: Brock Noland

 Command to do removal:
 {noformat}
 svn delete https://svn.apache.org/repos/asf/hive/site/ \
   --message "HIVE-6981 - Remove old website from SVN"
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8320) Error in MetaException(message:Got exception: org.apache.thrift.transport.TTransportException java.net.SocketTimeoutException: Read timed out)

2014-10-15 Thread gavin kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gavin kim updated HIVE-8320:

Attachment: HIVE-8320.1.patch

I recoded getMetastoreClient in HiveSessionImpl.java to use the thread-local 
method (i.e. Hive.get).

HiveServer2's metastore client used to be a per-session property, but after 
this patch it is a per-thread resource.

I haven't found any problems in my testing yet.

Is this suitable for Hive's coding patterns?
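The per-thread pattern described above can be sketched with a plain ThreadLocal. This is a minimal illustration, not the actual Hive code; FakeMetaStoreClient is a hypothetical stand-in for the real metastore client, and Hive.get's real behavior is more involved:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a metastore client; the name is illustrative,
// not an actual Hive class.
class FakeMetaStoreClient {
    private static final AtomicInteger CREATED = new AtomicInteger();
    final int id = CREATED.incrementAndGet();
    static int created() { return CREATED.get(); }
}

public class ThreadLocalClientDemo {
    // One client per thread, created lazily on first access -- analogous
    // to how a thread-local accessor like Hive.get() hands out a
    // per-thread resource instead of a per-session one.
    private static final ThreadLocal<FakeMetaStoreClient> CLIENT =
            ThreadLocal.withInitial(FakeMetaStoreClient::new);

    public static void main(String[] args) throws InterruptedException {
        FakeMetaStoreClient a = CLIENT.get();
        FakeMetaStoreClient b = CLIENT.get();
        // Same thread -> same instance.
        System.out.println(a == b);

        Thread t = new Thread(() -> {
            // Different thread -> a fresh instance.
            System.out.println(CLIENT.get() != a);
        });
        t.start();
        t.join();

        // Two threads touched the ThreadLocal, so two clients were created.
        System.out.println(FakeMetaStoreClient.created());
    }
}
```

The trade-off this sketch highlights: per-thread sharing avoids two sessions racing on one client, but the client's lifetime is now tied to the worker thread rather than the session.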

 Error in MetaException(message:Got exception: 
 org.apache.thrift.transport.TTransportException 
 java.net.SocketTimeoutException: Read timed out)
 --

 Key: HIVE-8320
 URL: https://issues.apache.org/jira/browse/HIVE-8320
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 0.13.1
Reporter: gavin kim
Assignee: gavin kim
Priority: Minor
  Labels: patch
 Fix For: 0.13.1

 Attachments: 
 0001-make-to-synchronize-hiveserver2-session-s-metastore-.patch, 
 HIVE-8320.1.patch


 I'm using Hive 0.13.1 in a CDH environment.
 When using Hue's Beeswax, HiveServer2 sometimes throws a MetaException, 
 and after that, Hive metadata requests time out.
 The error log details are below.
 2014-09-29 12:05:44,829 ERROR hive.log: Got exception: 
 org.apache.thrift.transport.TTransportException 
 java.net.SocketTimeoutException: Read timed out
 org.apache.thrift.transport.TTransportException: 
 java.net.SocketTimeoutException: Read timed out
 at 
 org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
 at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
 at 
 org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
 at 
 org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
 at 
 org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
 at 
 org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
 at 
 org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_databases(ThriftHiveMetastore.java:600)
 at 
 org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_databases(ThriftHiveMetastore.java:587)
 at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:826)
 at 
 org.apache.hive.service.cli.operation.GetSchemasOperation.run(GetSchemasOperation.java:62)
 at 
 org.apache.hive.service.cli.session.HiveSessionImpl.runOperationWithLogCapture(HiveSessionImpl.java:562)
 at 
 org.apache.hive.service.cli.session.HiveSessionImpl.getSchemas(HiveSessionImpl.java:315)
 at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:79)
 at 
 org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:37)
 at 
 org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:64)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
 at 
 org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:493)
 at 
 org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:60)
 at com.sun.proxy.$Proxy13.getSchemas(Unknown Source)
 at 
 org.apache.hive.service.cli.CLIService.getSchemas(CLIService.java:273)
 at 
 org.apache.hive.service.cli.thrift.ThriftCLIService.GetSchemas(ThriftCLIService.java:402)
 at 
 org.apache.hive.service.cli.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1429)
 at 
 org.apache.hive.service.cli.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1414)
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
 at 
 org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:55)
 at 
 org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at 

[jira] [Commented] (HIVE-8320) Error in MetaException(message:Got exception: org.apache.thrift.transport.TTransportException java.net.SocketTimeoutException: Read timed out)

2014-10-15 Thread gavin kim (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172051#comment-14172051
 ] 

gavin kim commented on HIVE-8320:
-

And I thank you again for your detailed reply.

It's very helpful to me. :)

 Error in MetaException(message:Got exception: 
 org.apache.thrift.transport.TTransportException 
 java.net.SocketTimeoutException: Read timed out)
 --

 Key: HIVE-8320
 URL: https://issues.apache.org/jira/browse/HIVE-8320
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 0.13.1
Reporter: gavin kim
Assignee: gavin kim
Priority: Minor
  Labels: patch
 Fix For: 0.13.1

 Attachments: 
 0001-make-to-synchronize-hiveserver2-session-s-metastore-.patch, 
 HIVE-8320.1.patch


 I'm using Hive 0.13.1 in a CDH environment.
 When using Hue's Beeswax, HiveServer2 sometimes throws a MetaException, 
 and after that, Hive metadata requests time out.
 The error log details are below.
 2014-09-29 12:05:44,829 ERROR hive.log: Got exception: 
 org.apache.thrift.transport.TTransportException 
 java.net.SocketTimeoutException: Read timed out
 org.apache.thrift.transport.TTransportException: 
 java.net.SocketTimeoutException: Read timed out
 at 
 org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
 at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
 at 
 org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
 at 
 org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
 at 
 org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
 at 
 org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
 at 
 org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_databases(ThriftHiveMetastore.java:600)
 at 
 org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_databases(ThriftHiveMetastore.java:587)
 at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:826)
 at 
 org.apache.hive.service.cli.operation.GetSchemasOperation.run(GetSchemasOperation.java:62)
 at 
 org.apache.hive.service.cli.session.HiveSessionImpl.runOperationWithLogCapture(HiveSessionImpl.java:562)
 at 
 org.apache.hive.service.cli.session.HiveSessionImpl.getSchemas(HiveSessionImpl.java:315)
 at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:79)
 at 
 org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:37)
 at 
 org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:64)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
 at 
 org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:493)
 at 
 org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:60)
 at com.sun.proxy.$Proxy13.getSchemas(Unknown Source)
 at 
 org.apache.hive.service.cli.CLIService.getSchemas(CLIService.java:273)
 at 
 org.apache.hive.service.cli.thrift.ThriftCLIService.GetSchemas(ThriftCLIService.java:402)
 at 
 org.apache.hive.service.cli.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1429)
 at 
 org.apache.hive.service.cli.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1414)
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
 at 
 org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:55)
 at 
 org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 Caused by: java.net.SocketTimeoutException: Read timed out
 at java.net.SocketInputStream.socketRead0(Native Method)
 at 

[jira] [Updated] (HIVE-8320) Error in MetaException(message:Got exception: org.apache.thrift.transport.TTransportException java.net.SocketTimeoutException: Read timed out)

2014-10-15 Thread gavin kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gavin kim updated HIVE-8320:

Status: Patch Available  (was: Open)

 Error in MetaException(message:Got exception: 
 org.apache.thrift.transport.TTransportException 
 java.net.SocketTimeoutException: Read timed out)
 --

 Key: HIVE-8320
 URL: https://issues.apache.org/jira/browse/HIVE-8320
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 0.13.1
Reporter: gavin kim
Assignee: gavin kim
Priority: Minor
  Labels: patch
 Fix For: 0.13.1

 Attachments: 
 0001-make-to-synchronize-hiveserver2-session-s-metastore-.patch, 
 HIVE-8320.1.patch


 I'm using Hive 0.13.1 in a CDH environment.
 When using Hue's Beeswax, HiveServer2 sometimes throws a MetaException, 
 and after that, Hive metadata requests time out.
 The error log details are below.
 2014-09-29 12:05:44,829 ERROR hive.log: Got exception: 
 org.apache.thrift.transport.TTransportException 
 java.net.SocketTimeoutException: Read timed out
 org.apache.thrift.transport.TTransportException: 
 java.net.SocketTimeoutException: Read timed out
 at 
 org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
 at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
 at 
 org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
 at 
 org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
 at 
 org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
 at 
 org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
 at 
 org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_databases(ThriftHiveMetastore.java:600)
 at 
 org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_databases(ThriftHiveMetastore.java:587)
 at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:826)
 at 
 org.apache.hive.service.cli.operation.GetSchemasOperation.run(GetSchemasOperation.java:62)
 at 
 org.apache.hive.service.cli.session.HiveSessionImpl.runOperationWithLogCapture(HiveSessionImpl.java:562)
 at 
 org.apache.hive.service.cli.session.HiveSessionImpl.getSchemas(HiveSessionImpl.java:315)
 at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:79)
 at 
 org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:37)
 at 
 org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:64)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
 at 
 org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:493)
 at 
 org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:60)
 at com.sun.proxy.$Proxy13.getSchemas(Unknown Source)
 at 
 org.apache.hive.service.cli.CLIService.getSchemas(CLIService.java:273)
 at 
 org.apache.hive.service.cli.thrift.ThriftCLIService.GetSchemas(ThriftCLIService.java:402)
 at 
 org.apache.hive.service.cli.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1429)
 at 
 org.apache.hive.service.cli.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1414)
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
 at 
 org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:55)
 at 
 org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 Caused by: java.net.SocketTimeoutException: Read timed out
 at java.net.SocketInputStream.socketRead0(Native Method)
 at java.net.SocketInputStream.read(SocketInputStream.java:152)
 at java.net.SocketInputStream.read(SocketInputStream.java:122)
   

[jira] [Updated] (HIVE-8450) Create table like does not copy over table properties

2014-10-15 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-8450:

Assignee: Navis
  Status: Patch Available  (was: Open)

 Create table like does not copy over table properties
 -

 Key: HIVE-8450
 URL: https://issues.apache.org/jira/browse/HIVE-8450
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1, 0.14.0
Reporter: Brock Noland
Assignee: Navis
Priority: Critical
 Attachments: HIVE-8450.1.patch.txt


 Assuming t2 is an Avro-backed table, the following:
 {{create table t1 like t2}}
 should create an Avro-backed table, but the schema.url.* properties are not 
 being copied correctly.
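To make the report concrete, a minimal reproduction might look like the following. The table names and the schema URL are illustrative; the {{STORED AS AVRO}} shorthand assumes Hive 0.14, while on 0.13 the Avro SerDe and input/output formats would be spelled out explicitly:

{code}
-- t2 is an Avro-backed table whose schema comes from a table property.
CREATE TABLE t2
  STORED AS AVRO
  TBLPROPERTIES ('avro.schema.url'='hdfs:///schemas/emp.avsc');

-- Per the report, t1 should also be Avro-backed with the same schema URL,
-- but the avro.schema.url property is not carried over.
CREATE TABLE t1 LIKE t2;
{code}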





[jira] [Updated] (HIVE-8450) Create table like does not copy over table properties

2014-10-15 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-8450:

Attachment: HIVE-8450.1.patch.txt

 Create table like does not copy over table properties
 -

 Key: HIVE-8450
 URL: https://issues.apache.org/jira/browse/HIVE-8450
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0, 0.13.1
Reporter: Brock Noland
Priority: Critical
 Attachments: HIVE-8450.1.patch.txt


 Assuming t2 is an Avro-backed table, the following:
 {{create table t1 like t2}}
 should create an Avro-backed table, but the schema.url.* properties are not 
 being copied correctly.





[jira] [Commented] (HIVE-7733) Ambiguous column reference error on query

2014-10-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172059#comment-14172059
 ] 

Hive QA commented on HIVE-7733:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12674705/HIVE-7733.7.patch.txt

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 6557 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel
org.apache.hive.beeline.TestSchemaTool.testSchemaInit
org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchAbort
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1271/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1271/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1271/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12674705

 Ambiguous column reference error on query
 -

 Key: HIVE-7733
 URL: https://issues.apache.org/jira/browse/HIVE-7733
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jason Dere
Assignee: Navis
 Attachments: HIVE-7733.1.patch.txt, HIVE-7733.2.patch.txt, 
 HIVE-7733.3.patch.txt, HIVE-7733.4.patch.txt, HIVE-7733.5.patch.txt, 
 HIVE-7733.6.patch.txt, HIVE-7733.7.patch.txt


 {noformat}
 CREATE TABLE agg1 
   ( 
  col0 INT, 
  col1 STRING, 
  col2 DOUBLE 
   ); 
 explain SELECT single_use_subq11.a1 AS a1, 
single_use_subq11.a2 AS a2 
 FROM   (SELECT Sum(agg1.col2) AS a1 
 FROM   agg1 
 GROUP  BY agg1.col0) single_use_subq12 
JOIN (SELECT alias.a2 AS a0, 
 alias.a1 AS a1, 
 alias.a1 AS a2 
  FROM   (SELECT agg1.col1 AS a0, 
 '42'  AS a1, 
 agg1.col0 AS a2 
  FROM   agg1 
  UNION ALL 
  SELECT agg1.col1 AS a0, 
 '41'  AS a1, 
 agg1.col0 AS a2 
  FROM   agg1) alias 
  GROUP  BY alias.a2, 
alias.a1) single_use_subq11 
  ON ( single_use_subq11.a0 = single_use_subq11.a0 );
 {noformat}
 Gets the following error:
 FAILED: SemanticException [Error 10007]: Ambiguous column reference a2
 Looks like this query had been working in 0.12 but started failing with this 
 error in 0.13.





[jira] [Updated] (HIVE-8465) Fix some minor test fails on trunk

2014-10-15 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-8465:

Summary: Fix some minor test fails on trunk  (was: Fix some minor test 
fails)

 Fix some minor test fails on trunk
 --

 Key: HIVE-8465
 URL: https://issues.apache.org/jira/browse/HIVE-8465
 Project: Hive
  Issue Type: Task
  Components: Tests
Reporter: Navis
Assignee: Navis
Priority: Minor

 org.apache.hive.beeline.TestSchemaTool.testSchemaInit
 org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel





[jira] [Created] (HIVE-8465) Fix some minor test fails

2014-10-15 Thread Navis (JIRA)
Navis created HIVE-8465:
---

 Summary: Fix some minor test fails
 Key: HIVE-8465
 URL: https://issues.apache.org/jira/browse/HIVE-8465
 Project: Hive
  Issue Type: Task
  Components: Tests
Reporter: Navis
Assignee: Navis
Priority: Minor


org.apache.hive.beeline.TestSchemaTool.testSchemaInit
org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel





[jira] [Updated] (HIVE-8083) Authorization DDLs should not enforce hive identifier syntax for user or group

2014-10-15 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-8083:
-
Labels:   (was: TODOC14)

 Authorization DDLs should not enforce hive identifier syntax for user or group
 --

 Key: HIVE-8083
 URL: https://issues.apache.org/jira/browse/HIVE-8083
 Project: Hive
  Issue Type: Bug
  Components: SQL, SQLStandardAuthorization
Affects Versions: 0.13.0, 0.13.1
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Fix For: 0.14.0

 Attachments: HIVE-8083.1.patch, HIVE-8083.2.patch, HIVE-8083.3.patch


 The compiler expects principals (user, group and role) as hive identifiers 
 for authorization DDLs. The user and group are entities that belong to 
 external namespace and we can't expect those to follow hive identifier syntax 
 rules. For example, a userid or group can contain '-' which is not allowed by 
 compiler.





[jira] [Updated] (HIVE-8465) Fix some minor test fails on trunk

2014-10-15 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-8465:

Attachment: HIVE-8465.1.patch.txt

 Fix some minor test fails on trunk
 --

 Key: HIVE-8465
 URL: https://issues.apache.org/jira/browse/HIVE-8465
 Project: Hive
  Issue Type: Task
  Components: Tests
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-8465.1.patch.txt


 org.apache.hive.beeline.TestSchemaTool.testSchemaInit
 org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel





[jira] [Updated] (HIVE-8465) Fix some minor test fails on trunk

2014-10-15 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-8465:

Status: Patch Available  (was: Open)

 Fix some minor test fails on trunk
 --

 Key: HIVE-8465
 URL: https://issues.apache.org/jira/browse/HIVE-8465
 Project: Hive
  Issue Type: Task
  Components: Tests
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-8465.1.patch.txt


 org.apache.hive.beeline.TestSchemaTool.testSchemaInit
 org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel





[jira] [Commented] (HIVE-7733) Ambiguous column reference error on query

2014-10-15 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172076#comment-14172076
 ] 

Navis commented on HIVE-7733:
-

Cannot reproduce the failure of TestStreaming.testTransactionBatchAbort; the 
other failures are tracked in https://issues.apache.org/jira/browse/HIVE-8465

 Ambiguous column reference error on query
 -

 Key: HIVE-7733
 URL: https://issues.apache.org/jira/browse/HIVE-7733
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jason Dere
Assignee: Navis
 Attachments: HIVE-7733.1.patch.txt, HIVE-7733.2.patch.txt, 
 HIVE-7733.3.patch.txt, HIVE-7733.4.patch.txt, HIVE-7733.5.patch.txt, 
 HIVE-7733.6.patch.txt, HIVE-7733.7.patch.txt


 {noformat}
 CREATE TABLE agg1 
   ( 
  col0 INT, 
  col1 STRING, 
  col2 DOUBLE 
   ); 
 explain SELECT single_use_subq11.a1 AS a1, 
single_use_subq11.a2 AS a2 
 FROM   (SELECT Sum(agg1.col2) AS a1 
 FROM   agg1 
 GROUP  BY agg1.col0) single_use_subq12 
JOIN (SELECT alias.a2 AS a0, 
 alias.a1 AS a1, 
 alias.a1 AS a2 
  FROM   (SELECT agg1.col1 AS a0, 
 '42'  AS a1, 
 agg1.col0 AS a2 
  FROM   agg1 
  UNION ALL 
  SELECT agg1.col1 AS a0, 
 '41'  AS a1, 
 agg1.col0 AS a2 
  FROM   agg1) alias 
  GROUP  BY alias.a2, 
alias.a1) single_use_subq11 
  ON ( single_use_subq11.a0 = single_use_subq11.a0 );
 {noformat}
 Gets the following error:
 FAILED: SemanticException [Error 10007]: Ambiguous column reference a2
 Looks like this query had been working in 0.12 but started failing with this 
 error in 0.13.





[jira] [Commented] (HIVE-8083) Authorization DDLs should not enforce hive identifier syntax for user or group

2014-10-15 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172077#comment-14172077
 ] 

Lefty Leverenz commented on HIVE-8083:
--

Docs look good.  I added links back to this jira and removed the jira's TODOC14 
label.  Thanks [~prasadm].

 Authorization DDLs should not enforce hive identifier syntax for user or group
 --

 Key: HIVE-8083
 URL: https://issues.apache.org/jira/browse/HIVE-8083
 Project: Hive
  Issue Type: Bug
  Components: SQL, SQLStandardAuthorization
Affects Versions: 0.13.0, 0.13.1
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Fix For: 0.14.0

 Attachments: HIVE-8083.1.patch, HIVE-8083.2.patch, HIVE-8083.3.patch


 The compiler expects principals (user, group and role) as hive identifiers 
 for authorization DDLs. The user and group are entities that belong to 
 external namespace and we can't expect those to follow hive identifier syntax 
 rules. For example, a userid or group can contain '-' which is not allowed by 
 compiler.





[jira] [Updated] (HIVE-8455) Print Spark job progress format info on the console[Spark Branch]

2014-10-15 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-8455:

Attachment: hive on spark job status.PNG

 Print Spark job progress format info on the console[Spark Branch]
 -

 Key: HIVE-8455
 URL: https://issues.apache.org/jira/browse/HIVE-8455
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M3
 Attachments: HIVE-8455.1-spark.patch, hive on spark job status.PNG


 We added support for Spark job status monitoring in HIVE-7439, but we do not 
 print the job progress format info on the console, so users may be confused 
 about what the progress info means. I would like to add the job progress 
 format info here.





[jira] [Updated] (HIVE-8455) Print Spark job progress format info on the console[Spark Branch]

2014-10-15 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-8455:

Attachment: HIVE-8455.2-spark.patch

 Print Spark job progress format info on the console[Spark Branch]
 -

 Key: HIVE-8455
 URL: https://issues.apache.org/jira/browse/HIVE-8455
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M3
 Attachments: HIVE-8455.1-spark.patch, HIVE-8455.2-spark.patch, hive 
 on spark job status.PNG


 We added support for Spark job status monitoring in HIVE-7439, but we do not 
 print the job progress format info on the console, so users may be confused 
 about what the progress info means. I would like to add the job progress 
 format info here.





[jira] [Commented] (HIVE-8455) Print Spark job progress format info on the console[Spark Branch]

2014-10-15 Thread Chengxiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172102#comment-14172102
 ] 

Chengxiang Li commented on HIVE-8455:
-

I've added a date stamp and the stage cost time after each stage finishes. I 
tried printing the key every 15 lines, but it looks very disharmonious next to 
the job progress info. Since the key info is printed at the beginning and the 
progress info is fairly self-explanatory, I think we may not actually need 
this. What do you think?
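The formatting change described above (a date stamp on each progress line, plus the stage cost printed once a stage finishes) can be sketched as below. The exact format string is an assumption for illustration, not the one in SparkJobMonitor:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

public class ProgressLineDemo {
    // Build one console progress line: timestamp, stage id, finished/total
    // task counts, and -- only once the stage is done -- its elapsed time.
    static String progressLine(Date now, int stage, int finished, int total,
                               long costMs) {
        String ts = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss,SSS").format(now);
        String cost = finished == total
                ? String.format(Locale.US, " Finished in %.2f s", costMs / 1000.0)
                : "";
        return String.format(Locale.US, "%s Stage-%d: %d/%d%s",
                ts, stage, finished, total, cost);
    }

    public static void main(String[] args) {
        Date d = new Date(0L); // fixed instant so the example is reproducible
        // Mid-flight: no cost shown yet.
        System.out.println(progressLine(d, 1, 3, 10, 0));
        // Stage complete: cost appended once.
        System.out.println(progressLine(d, 1, 10, 10, 4200));
    }
}
```

Printing the cost only on completion keeps intermediate lines short, which matches the concern above about extra per-line detail clashing with the progress output.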

 Print Spark job progress format info on the console[Spark Branch]
 -

 Key: HIVE-8455
 URL: https://issues.apache.org/jira/browse/HIVE-8455
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M3
 Attachments: HIVE-8455.1-spark.patch, HIVE-8455.2-spark.patch, hive 
 on spark job status.PNG


 We have add support of spark job status monitoring on HIVE-7439, but not 
 print job progress format info on the console, user may confuse about what 
 the  progress info means, so I would like to add job progress format info 
 here.





Review Request 26732: HIVE-8455 Print Spark job progress format info on the console[Spark Branch]

2014-10-15 Thread chengxiang li

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26732/
---

Review request for hive, Brock Noland and Szehon Ho.


Bugs: HIVE-8455
https://issues.apache.org/jira/browse/HIVE-8455


Repository: hive-git


Description
---

1. Add a date stamp to the progress info.
2. Print the stage cost time after a stage finishes.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/SparkJobMonitor.java 
b092abc 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/SparkJobStatus.java 
8717fe2 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/SparkProgress.java 
36322eb 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/SparkStageProgress.java 
PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/impl/SimpleSparkJobStatus.java
 c7ef83c 

Diff: https://reviews.apache.org/r/26732/diff/


Testing
---


Thanks,

chengxiang li



[jira] [Commented] (HIVE-2906) Support providing some table properties by user via SQL

2014-10-15 Thread cw (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172104#comment-14172104
 ] 

cw commented on HIVE-2906:
--

Hi [~navis].  We found that we cannot use the identifier 'user' as a table 
alias because FromClauseParser.g was changed in this patch.

-: tabname=tableName (ts=tableSample)? (KW_AS? alias=identifier)?
-  -> ^(TOK_TABREF $tabname $ts? $alias?)
+: tabname=tableName (props=tableProperties)? (ts=tableSample)? (KW_AS? alias=Identifier)?
+  -> ^(TOK_TABREF $tabname $props? $ts? $alias?)

The change turned the lowercase 'identifier' into an uppercase 'Identifier'. 
Is this intended, or just a mistake?
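For context, in an ANTLR grammar like Hive's, a lowercase name such as {{identifier}} is a parser rule, while an uppercase name such as {{Identifier}} is a lexer token. The parser rule can also accept non-reserved keywords, which is what lets a word like {{user}} serve as a table alias; switching to the bare token removes that allowance. A rough sketch of the two shapes (not the actual Hive grammar text):

{noformat}
// Lexer token: matches only plain identifiers.
Identifier : (Letter | Digit) (Letter | Digit | '_')* ;

// Parser rule: a plain Identifier OR a non-reserved keyword --
// this is what allows "user" as a table alias.
identifier : Identifier | nonReserved ;
nonReserved : KW_USER | KW_ROLE /* | ... */ ;
{noformat}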

 Support providing some table properties by user via SQL
 ---

 Key: HIVE-2906
 URL: https://issues.apache.org/jira/browse/HIVE-2906
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
 Fix For: 0.12.0

 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.2.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.3.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.4.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.5.patch, HIVE-2906.D2499.6.patch, 
 HIVE-2906.D2499.7.patch


 Some properties need to be provided to the StorageHandler by the user at 
 runtime. It might be an address for a remote resource, a retry count for 
 access, or a maximum version count (for HBase), etc.
 For example,
 {code}
 select emp.empno, emp.ename from hbase_emp ('max.version'='3') emp;
 {code}





[jira] [Commented] (HIVE-2906) Support providing some table properties by user via SQL

2014-10-15 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172115#comment-14172115
 ] 

Navis commented on HIVE-2906:
-

[~cwsteinbach] It appears to be a mistake. There was no 'identifier' rule when 
this patch was first created (see the first patch). It needs a fix. Would you 
like to do that, or should I?

 Support providing some table properties by user via SQL
 ---

 Key: HIVE-2906
 URL: https://issues.apache.org/jira/browse/HIVE-2906
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
 Fix For: 0.12.0

 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.2.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.3.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.4.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.5.patch, HIVE-2906.D2499.6.patch, 
 HIVE-2906.D2499.7.patch


 Some properties need to be provided to the StorageHandler by the user at 
 runtime. It might be an address for a remote resource, a retry count for 
 access, or a maximum version count (for HBase), etc.
 For example,
 {code}
 select emp.empno, emp.ename from hbase_emp ('max.version'='3') emp;
 {code}





[jira] [Commented] (HIVE-2906) Support providing some table properties by user via SQL

2014-10-15 Thread cw (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172119#comment-14172119
 ] 

cw commented on HIVE-2906:
--

[~navis] I would like to create an issue and submit a patch to fix it.

 Support providing some table properties by user via SQL
 ---

 Key: HIVE-2906
 URL: https://issues.apache.org/jira/browse/HIVE-2906
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
 Fix For: 0.12.0

 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.2.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.3.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.4.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.5.patch, HIVE-2906.D2499.6.patch, 
 HIVE-2906.D2499.7.patch


 Some properties need to be provided to the StorageHandler by the user at 
 runtime. It might be an address for a remote resource, a retry count for 
 access, or a maximum version count (for HBase), etc.
 For example,
 {code}
 select emp.empno, emp.ename from hbase_emp ('max.version'='3') emp;
 {code}





[jira] [Created] (HIVE-8466) nonReserved keywords can not be used as table alias

2014-10-15 Thread cw (JIRA)
cw created HIVE-8466:


 Summary: nonReserved keywords can not be used as table alias
 Key: HIVE-8466
 URL: https://issues.apache.org/jira/browse/HIVE-8466
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.13.1, 0.13.0, 0.12.0
Reporter: cw
Priority: Minor


There is a small mistake in the patch for HIVE-2906. See the change in 
FromClauseParser.g:

-: tabname=tableName (ts=tableSample)? (KW_AS? alias=identifier)?
--> ^(TOK_TABREF $tabname $ts? $alias?)
+: tabname=tableName (props=tableProperties)? (ts=tableSample)? (KW_AS? alias=Identifier)?
+-> ^(TOK_TABREF $tabname $props? $ts? $alias?)

With 'identifier' changed to 'Identifier', we cannot use nonReserved 
keywords as table aliases.





[jira] [Updated] (HIVE-8466) nonReserved keywords can not be used as table alias

2014-10-15 Thread cw (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

cw updated HIVE-8466:
-
Attachment: HIVE-8466.1.patch

Attached a patch.

 nonReserved keywords can not be used as table alias
 ---

 Key: HIVE-8466
 URL: https://issues.apache.org/jira/browse/HIVE-8466
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.12.0, 0.13.0, 0.13.1
Reporter: cw
Priority: Minor
 Attachments: HIVE-8466.1.patch


 There is a small mistake in the patch for HIVE-2906. See the change in 
 FromClauseParser.g:
 -: tabname=tableName (ts=tableSample)? (KW_AS? alias=identifier)?
 --> ^(TOK_TABREF $tabname $ts? $alias?)
 +: tabname=tableName (props=tableProperties)? (ts=tableSample)? (KW_AS? alias=Identifier)?
 +-> ^(TOK_TABREF $tabname $props? $ts? $alias?)
 With 'identifier' changed to 'Identifier', we cannot use nonReserved 
 keywords as table aliases.





[jira] [Commented] (HIVE-8343) Return value from BlockingQueue.offer() is not checked in DynamicPartitionPruner

2014-10-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172126#comment-14172126
 ] 

Hive QA commented on HIVE-8343:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12674723/HIVE-8343.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 6559 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.testStatsAfterCompactionPartTbl
org.apache.hive.beeline.TestSchemaTool.testSchemaInit
org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1272/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1272/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1272/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12674723

 Return value from BlockingQueue.offer() is not checked in 
 DynamicPartitionPruner
 

 Key: HIVE-8343
 URL: https://issues.apache.org/jira/browse/HIVE-8343
 Project: Hive
  Issue Type: Bug
Reporter: Ted Yu
Assignee: JongWon Park
Priority: Minor
 Attachments: HIVE-8343.patch


 In addEvent() and processVertex(), there are calls such as the following:
 {code}
   queue.offer(event);
 {code}
 The return value should be checked: if false is returned, the event has not 
 been queued.
 Take a look at line 328 in:
 http://fuseyism.com/classpath/doc/java/util/concurrent/LinkedBlockingQueue-source.html
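A minimal sketch of the suggested fix (hypothetical helper, not the actual DynamicPartitionPruner code): check offer()'s return value so a full bounded queue cannot silently drop the event.

```java
import java.util.concurrent.BlockingQueue;

public class OfferCheck {
    // Returns whether the event was actually enqueued; a bounded
    // BlockingQueue's offer() returns false when the queue is full.
    static <T> boolean addEvent(BlockingQueue<T> queue, T event) {
        boolean queued = queue.offer(event);
        if (!queued) {
            // React to the failure here (retry, block with put(), or
            // report an error) instead of silently losing the event.
        }
        return queued;
    }
}
```

Alternatively, put() blocks until capacity is available, which may be the simpler fix if dropping events is never acceptable.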





[jira] [Updated] (HIVE-8466) nonReserved keywords can not be used as table alias

2014-10-15 Thread cw (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

cw updated HIVE-8466:
-
Status: Patch Available  (was: Open)

 nonReserved keywords can not be used as table alias
 ---

 Key: HIVE-8466
 URL: https://issues.apache.org/jira/browse/HIVE-8466
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.13.1, 0.13.0, 0.12.0
Reporter: cw
Priority: Minor
 Attachments: HIVE-8466.1.patch


 There is a small mistake in the patch for HIVE-2906. See the change in 
 FromClauseParser.g:
 -: tabname=tableName (ts=tableSample)? (KW_AS? alias=identifier)?
 --> ^(TOK_TABREF $tabname $ts? $alias?)
 +: tabname=tableName (props=tableProperties)? (ts=tableSample)? (KW_AS? alias=Identifier)?
 +-> ^(TOK_TABREF $tabname $props? $ts? $alias?)
 With 'identifier' changed to 'Identifier', we cannot use nonReserved 
 keywords as table aliases.





[jira] [Commented] (HIVE-2906) Support providing some table properties by user via SQL

2014-10-15 Thread cw (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172132#comment-14172132
 ] 

cw commented on HIVE-2906:
--

[~navis] I created an issue here: https://issues.apache.org/jira/browse/HIVE-8466

 Support providing some table properties by user via SQL
 ---

 Key: HIVE-2906
 URL: https://issues.apache.org/jira/browse/HIVE-2906
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
 Fix For: 0.12.0

 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.2.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.3.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.4.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.5.patch, HIVE-2906.D2499.6.patch, 
 HIVE-2906.D2499.7.patch


 Some properties need to be provided to the StorageHandler by the user at 
 runtime. It might be an address for a remote resource, a retry count for 
 access, or a maximum version count (for HBase), etc.
 For example,
 {code}
 select emp.empno, emp.ename from hbase_emp ('max.version'='3') emp;
 {code}





[jira] [Commented] (HIVE-2906) Support providing some table properties by user via SQL

2014-10-15 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172135#comment-14172135
 ] 

Navis commented on HIVE-2906:
-

[~cwsteinbach] Good. Thanks!

 Support providing some table properties by user via SQL
 ---

 Key: HIVE-2906
 URL: https://issues.apache.org/jira/browse/HIVE-2906
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
 Fix For: 0.12.0

 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.2.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.3.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.4.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.5.patch, HIVE-2906.D2499.6.patch, 
 HIVE-2906.D2499.7.patch


 Some properties need to be provided to the StorageHandler by the user at 
 runtime. It might be an address for a remote resource, a retry count for 
 access, or a maximum version count (for HBase), etc.
 For example,
 {code}
 select emp.empno, emp.ename from hbase_emp ('max.version'='3') emp;
 {code}





[jira] [Updated] (HIVE-8456) Support Hive Counter to collect spark job metric[Spark Branch]

2014-10-15 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-8456:

Attachment: HIVE-8456.2-spark.patch

 Support Hive Counter to collect spark job metric[Spark Branch]
 --

 Key: HIVE-8456
 URL: https://issues.apache.org/jira/browse/HIVE-8456
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M3
 Attachments: HIVE-8456.1-spark.patch, HIVE-8456.2-spark.patch


 Several Hive query metrics in Hive operators are collected by Hive Counters, 
 such as CREATEDFILES and DESERIALIZE_ERRORS. Besides, Hive uses Counters as 
 an option to collect table stats info. Spark supports Accumulators, which are 
 pretty similar to Hive Counters, so we could try to enable Hive Counters 
 based on them.





Review Request 26733: HIVE-8456 Support Hive Counter to collect spark job metric[Spark Branch]

2014-10-15 Thread chengxiang li

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26733/
---

Review request for hive, Brock Noland and Szehon Ho.


Bugs: HIVE-8456
https://issues.apache.org/jira/browse/HIVE-8456


Repository: hive-git


Description
---

Several Hive query metrics in Hive operators are collected by Hive Counters, 
such as CREATEDFILES and DESERIALIZE_ERRORS. Besides, Hive uses Counters as an 
option to collect table stats info. Spark supports Accumulators, which are 
pretty similar to Hive Counters, so we could try to enable Hive Counters based 
on them.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/counter/SparkCounter.java 
PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/counter/SparkCounterGroup.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/counter/SparkCounters.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/26733/diff/


Testing
---


Thanks,

chengxiang li



[jira] [Commented] (HIVE-8455) Print Spark job progress format info on the console[Spark Branch]

2014-10-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172155#comment-14172155
 ] 

Hive QA commented on HIVE-8455:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12674942/HIVE-8455.2-spark.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6769 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_tez_smb_1
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/219/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/219/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-219/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12674942

 Print Spark job progress format info on the console[Spark Branch]
 -

 Key: HIVE-8455
 URL: https://issues.apache.org/jira/browse/HIVE-8455
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M3
 Attachments: HIVE-8455.1-spark.patch, HIVE-8455.2-spark.patch, hive 
 on spark job status.PNG


 We added support for Spark job status monitoring in HIVE-7439, but did not 
 print the job progress format info on the console, so users may be confused 
 about what the progress info means. I would like to add the job progress 
 format info here.





[jira] [Commented] (HIVE-8456) Support Hive Counter to collect spark job metric[Spark Branch]

2014-10-15 Thread Chengxiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172158#comment-14172158
 ] 

Chengxiang Li commented on HIVE-8456:
-

Since Spark does not use Counters to collect framework/job metrics the way 
MR/Tez do, Hive on Spark only uses Counters in several simple cases. I wrote a 
simple implementation based on Spark Accumulators that fits those 
requirements.
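As a plain-Java illustration of the named-counter registry idea (hypothetical class and method names; the real patch delegates to Spark's Accumulator API, which is not shown here):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class SimpleCounters {
    // Hypothetical group of named counters; the real SparkCounter would
    // back each entry with a Spark Accumulator instead of an AtomicLong.
    private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

    // Increments the named counter, creating it on first use,
    // and returns the updated value.
    public long increment(String name, long delta) {
        return counters.computeIfAbsent(name, k -> new AtomicLong()).addAndGet(delta);
    }

    // Returns the current value, or 0 for a counter never incremented.
    public long getValue(String name) {
        AtomicLong c = counters.get(name);
        return c == null ? 0L : c.get();
    }
}
```

This only sketches the simple cases mentioned above (e.g. CREATEDFILES, DESERIALIZE_ERRORS); it says nothing about how values are aggregated across Spark executors.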

 Support Hive Counter to collect spark job metric[Spark Branch]
 --

 Key: HIVE-8456
 URL: https://issues.apache.org/jira/browse/HIVE-8456
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M3
 Attachments: HIVE-8456.1-spark.patch, HIVE-8456.2-spark.patch


 Several Hive query metrics in Hive operators are collected by Hive Counters, 
 such as CREATEDFILES and DESERIALIZE_ERRORS. Besides, Hive uses Counters as 
 an option to collect table stats info. Spark supports Accumulators, which are 
 pretty similar to Hive Counters, so we could try to enable Hive Counters 
 based on them.





[jira] [Updated] (HIVE-2573) Create per-session function registry

2014-10-15 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-2573:
-
Attachment: HIVE-2573.7.patch

Fix test failures. TestHiveServerSessions seems to pass when I move it to 
itests/

 Create per-session function registry 
 -

 Key: HIVE-2573
 URL: https://issues.apache.org/jira/browse/HIVE-2573
 Project: Hive
  Issue Type: Improvement
  Components: Server Infrastructure
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch, 
 HIVE-2573.1.patch.txt, HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, 
 HIVE-2573.4.patch.txt, HIVE-2573.5.patch, HIVE-2573.6.patch, HIVE-2573.7.patch


 Currently the function registry is a shared resource and could be overridden 
 by other users when using HiveServer. Providing a per-session function 
 registry would prevent this situation.
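The idea can be sketched as a session-level registry that shadows the shared one (hypothetical classes and a simplified name-to-implementation map, not Hive's actual FunctionRegistry API):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SessionFunctionRegistry {
    // Shared system-wide registry (simplified to a name -> impl-name map).
    private static final Map<String, String> SYSTEM = new ConcurrentHashMap<>();
    // Per-session overrides; lookups consult these before the shared map.
    private final Map<String, String> session = new ConcurrentHashMap<>();

    public static void registerSystemFunction(String name, String impl) {
        SYSTEM.put(name, impl);
    }

    // A session-level registration shadows the shared entry for this
    // session only, so other users' sessions are unaffected.
    public void registerSessionFunction(String name, String impl) {
        session.put(name, impl);
    }

    public String lookup(String name) {
        String impl = session.get(name);
        return impl != null ? impl : SYSTEM.get(name);
    }
}
```

Each HiveServer session would hold its own instance, so one user's temporary function cannot override another user's view of the registry.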





[jira] [Updated] (HIVE-8456) Support Hive Counter to collect spark job metric[Spark Branch]

2014-10-15 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-8456:

Attachment: HIVE-8456.2-spark.patch

 Support Hive Counter to collect spark job metric[Spark Branch]
 --

 Key: HIVE-8456
 URL: https://issues.apache.org/jira/browse/HIVE-8456
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M3
 Attachments: HIVE-8456.1-spark.patch, HIVE-8456.2-spark.patch


 Several Hive query metrics in Hive operators are collected by Hive Counters, 
 such as CREATEDFILES and DESERIALIZE_ERRORS. Besides, Hive uses Counters as 
 an option to collect table stats info. Spark supports Accumulators, which are 
 pretty similar to Hive Counters, so we could try to enable Hive Counters 
 based on them.





[jira] [Updated] (HIVE-8456) Support Hive Counter to collect spark job metric[Spark Branch]

2014-10-15 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-8456:

Attachment: (was: HIVE-8456.2-spark.patch)

 Support Hive Counter to collect spark job metric[Spark Branch]
 --

 Key: HIVE-8456
 URL: https://issues.apache.org/jira/browse/HIVE-8456
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M3
 Attachments: HIVE-8456.1-spark.patch, HIVE-8456.2-spark.patch


 Several Hive query metrics in Hive operators are collected by Hive Counters, 
 such as CREATEDFILES and DESERIALIZE_ERRORS. Besides, Hive uses Counters as 
 an option to collect table stats info. Spark supports Accumulators, which are 
 pretty similar to Hive Counters, so we could try to enable Hive Counters 
 based on them.





[jira] [Commented] (HIVE-8284) Equality comparison is done between two floating point variables in HiveRelMdUniqueKeys#getUniqueKeys()

2014-10-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172210#comment-14172210
 ] 

Hive QA commented on HIVE-8284:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12674726/HIVE-8284.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 6559 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel
org.apache.hive.beeline.TestSchemaTool.testSchemaInit
org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1273/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1273/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1273/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12674726

 Equality comparison is done between two floating point variables in 
 HiveRelMdUniqueKeys#getUniqueKeys()
 ---

 Key: HIVE-8284
 URL: https://issues.apache.org/jira/browse/HIVE-8284
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Ted Yu
Assignee: JongWon Park
Priority: Minor
 Fix For: 0.14.0

 Attachments: HIVE-8284.patch


 {code}
 double numRows = tScan.getRows();
 ...
 double r = cStat.getRange().maxValue.doubleValue() -
 cStat.getRange().minValue.doubleValue() + 1;
 isKey = (numRows == r);
 {code}
 The equality check should use a small epsilon as tolerance.
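A minimal sketch of the tolerant comparison suggested above (the epsilon value and helper name are assumptions, not taken from the patch):

```java
public class EpsilonCompare {
    private static final double EPSILON = 1e-9;

    // Tolerant equality for doubles: |a - b| < eps instead of a == b,
    // so rounding error in the range arithmetic does not flip the result.
    static boolean nearlyEqual(double a, double b) {
        return Math.abs(a - b) < EPSILON;
    }
}
```

In getUniqueKeys() the check would then read `isKey = nearlyEqual(numRows, r);` rather than the exact `numRows == r`.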





[jira] [Commented] (HIVE-8456) Support Hive Counter to collect spark job metric[Spark Branch]

2014-10-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172213#comment-14172213
 ] 

Hive QA commented on HIVE-8456:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12674958/HIVE-8456.2-spark.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6769 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_tez_smb_1
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/220/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/220/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-220/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12674958

 Support Hive Counter to collect spark job metric[Spark Branch]
 --

 Key: HIVE-8456
 URL: https://issues.apache.org/jira/browse/HIVE-8456
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M3
 Attachments: HIVE-8456.1-spark.patch, HIVE-8456.2-spark.patch


 Several Hive query metrics in Hive operators are collected by Hive Counters, 
 such as CREATEDFILES and DESERIALIZE_ERRORS. Besides, Hive uses Counters as 
 an option to collect table stats info. Spark supports Accumulators, which are 
 pretty similar to Hive Counters, so we could try to enable Hive Counters 
 based on them.





[jira] [Commented] (HIVE-8456) Support Hive Counter to collect spark job metric[Spark Branch]

2014-10-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172248#comment-14172248
 ] 

Hive QA commented on HIVE-8456:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12674960/HIVE-8456.2-spark.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6769 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_tez_smb_1
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/221/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/221/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-221/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12674960

 Support Hive Counter to collect spark job metric[Spark Branch]
 --

 Key: HIVE-8456
 URL: https://issues.apache.org/jira/browse/HIVE-8456
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M3
 Attachments: HIVE-8456.1-spark.patch, HIVE-8456.2-spark.patch


 Several Hive query metrics in Hive operators are collected by Hive Counters, 
 such as CREATEDFILES and DESERIALIZE_ERRORS. Besides, Hive uses Counters as 
 an option to collect table stats info. Spark supports Accumulators, which are 
 pretty similar to Hive Counters, so we could try to enable Hive Counters 
 based on them.





[jira] [Commented] (HIVE-8341) Transaction information in config file can grow excessively large

2014-10-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172265#comment-14172265
 ] 

Hive QA commented on HIVE-8341:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12674786/HIVE-8341.2.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 6564 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel
org.apache.hadoop.hive.ql.exec.TestOperators.testScriptOperatorEnvVarsProcessing
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.testStatsAfterCompactionPartTbl
org.apache.hive.beeline.TestSchemaTool.testSchemaInit
org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1274/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1274/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1274/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12674786

 Transaction information in config file can grow excessively large
 -

 Key: HIVE-8341
 URL: https://issues.apache.org/jira/browse/HIVE-8341
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 0.14.0
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Critical
 Attachments: HIVE-8341.2.patch, HIVE-8341.patch


 In our testing we have seen cases where the transaction list grows very 
 large.  We need a more efficient way of communicating the list.





[jira] [Resolved] (HIVE-7467) When querying HBase table, task fails with exception: java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString

2014-10-15 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang resolved HIVE-7467.
---
Resolution: Later

I see. Thanks for bringing up this issue. There is not much we can do on the 
Hive side, as far as I know, for now. Thanks.

 When querying HBase table, task fails with exception: 
 java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString
 ---

 Key: HIVE-7467
 URL: https://issues.apache.org/jira/browse/HIVE-7467
 Project: Hive
  Issue Type: Bug
  Components: Spark
 Environment: Spark-1.0.0, HBase-0.98.2
Reporter: Rui Li
Assignee: Jimmy Xiang

 When I run select count(*) on an HBase table, the Spark task fails with:
 {quote}
 java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString
 at 
 org.apache.hadoop.hbase.protobuf.RequestConverter.buildRegionSpecifier(RequestConverter.java:910)
 at 
 org.apache.hadoop.hbase.protobuf.RequestConverter.buildGetRowOrBeforeRequest(RequestConverter.java:131)
 at 
 org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRowOrBefore(ProtobufUtil.java:1403)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1181)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1059)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1016)
 at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:326)
 at org.apache.hadoop.hbase.client.HTable.&lt;init&gt;(HTable.java:192)
 at org.apache.hadoop.hbase.client.HTable.&lt;init&gt;(HTable.java:165)
 at 
 org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat.getRecordReader(HiveHBaseTableInputFormat.java:93)
 at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:241)
 at org.apache.spark.rdd.HadoopRDD$$anon$1.&lt;init&gt;(HadoopRDD.scala:193)
 at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:184)
 at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:93)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
 at 
 org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
 at 
 org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:158)
 at 
 org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
 at org.apache.spark.scheduler.Task.run(Task.scala:51)
 {quote}
 NO PRECOMMIT TESTS. This is for spark branch only.





[jira] [Commented] (HIVE-8362) Investigate flaky test parallel.q

2014-10-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172330#comment-14172330
 ] 

Hive QA commented on HIVE-8362:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12674791/HIVE-8362.3.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6559 tests executed
*Failed tests:*
{noformat}
org.apache.hive.beeline.TestSchemaTool.testSchemaInit
org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1275/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1275/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1275/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12674791

 Investigate flaky test parallel.q
 -

 Key: HIVE-8362
 URL: https://issues.apache.org/jira/browse/HIVE-8362
 Project: Hive
  Issue Type: Sub-task
  Components: Testing Infrastructure
Affects Versions: 0.14.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: HIVE-8362.1-spark.patch, HIVE-8362.2.patch, 
 HIVE-8362.3.patch, HIVE-8362.patch


 Test parallel.q is flaky. It fails sometimes with error like:
 {noformat}
 Failed tests: 
   TestSparkCliDriver.testCliDriver_parallel:120-runTest:146 Unexpected 
 exception junit.framework.AssertionFailedError: Client Execution results 
 failed with error code = 1
 See ./ql/target/tmp/log/hive.log or ./itests/qtest/target/tmp/log/hive.log, 
 or check ./ql/target/surefire-reports or 
 ./itests/qtest/target/surefire-reports/ for specific test cases logs.
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 26661: HIVE-7873 Re-enable lazy HiveBaseFunctionResultList

2014-10-15 Thread Jimmy Xiang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26661/
---

(Updated Oct. 15, 2014, 12:55 p.m.)


Review request for hive and Xuefu Zhang.


Changes
---

The new patch made HiveKVResultCache fully thread safe as Xuefu suggested.


Bugs: HIVE-7873
https://issues.apache.org/jira/browse/HIVE-7873


Repository: hive-git


Description
---

Re-enabled the lazy HiveBaseFunctionResultList. A separate RowContainer is used 
to work around the no-write-after-read limitation of RowContainer. The patch 
also fixes a concurrency issue in HiveKVResultCache. Synchronized methods are 
used instead of a reentrant lock since I assume there won't be many threads 
accessing the cache.
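As a rough illustration of the design choice just described (plain synchronized methods guarding a result cache rather than an explicit ReentrantLock, on the assumption of low thread contention), here is a minimal self-contained sketch. The class and method names are hypothetical stand-ins, not Hive's actual HiveKVResultCache, and the disk-spill behavior of the real cache is omitted:

```java
import java.util.ArrayDeque;
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Minimal sketch of a thread-safe KV result cache guarded by synchronized
// methods. Hypothetical class for illustration; the real HiveKVResultCache
// also spills rows to disk, which is omitted here.
public class SimpleKVResultCache {
    private final ArrayDeque<Map.Entry<String, String>> buffer = new ArrayDeque<>();

    // Producer side: a writer thread enqueues results as they are computed.
    public synchronized void add(String key, String value) {
        buffer.addLast(new SimpleEntry<>(key, value));
    }

    public synchronized boolean hasNext() {
        return !buffer.isEmpty();
    }

    // Consumer side: another thread drains results lazily, one at a time.
    public synchronized Map.Entry<String, String> next() {
        return buffer.pollFirst();
    }

    public static void main(String[] args) {
        SimpleKVResultCache cache = new SimpleKVResultCache();
        cache.add("k1", "v1");
        cache.add("k2", "v2");
        System.out.println(cache.next().getKey()); // k1
        System.out.println(cache.hasNext());       // true
    }
}
```

With only two threads touching the cache, uncontended monitor locks are cheap, so the simpler synchronized form is a reasonable trade-off against a ReentrantLock.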


Diffs (updated)
-

  
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveBaseFunctionResultList.java
 0df2580 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java 
a6b9037 
  ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestHiveKVResultCache.java 
496a11f 

Diff: https://reviews.apache.org/r/26661/diff/


Testing
---

Unit test, some simple perf test.


Thanks,

Jimmy Xiang



[jira] [Updated] (HIVE-7873) Re-enable lazy HiveBaseFunctionResultList

2014-10-15 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HIVE-7873:
--
Attachment: HIVE-7873.3-spark.patch

Thanks, Xuefu, for the review. Attached patch 3, which addresses the review comments.

 Re-enable lazy HiveBaseFunctionResultList
 -

 Key: HIVE-7873
 URL: https://issues.apache.org/jira/browse/HIVE-7873
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Jimmy Xiang
  Labels: Spark-M4, spark
 Attachments: HIVE-7873.1-spark.patch, HIVE-7873.2-spark.patch, 
 HIVE-7873.2-spark.patch, HIVE-7873.3-spark.patch


 We removed this optimization in HIVE-7799.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 26661: HIVE-7873 Re-enable lazy HiveBaseFunctionResultList

2014-10-15 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26661/#review56698
---

Ship it!


Ship It!

- Xuefu Zhang


On Oct. 15, 2014, 12:55 p.m., Jimmy Xiang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/26661/
 ---
 
 (Updated Oct. 15, 2014, 12:55 p.m.)
 
 
 Review request for hive and Xuefu Zhang.
 
 
 Bugs: HIVE-7873
 https://issues.apache.org/jira/browse/HIVE-7873
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Re-enabled lazy HiveBaseFunctionResultList. A separate RowContainer is used 
 to work around the no-write-after-read limitation of RowContainer. The patch 
 also fixed a concurrency issue in HiveKVResultCache. Synchronized is used 
 instead of reentrant lock since I assume there won't be many threads to 
 access the cache.
 
 
 Diffs
 -
 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveBaseFunctionResultList.java
  0df2580 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java 
 a6b9037 
   ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestHiveKVResultCache.java 
 496a11f 
 
 Diff: https://reviews.apache.org/r/26661/diff/
 
 
 Testing
 ---
 
 Unit test, some simple perf test.
 
 
 Thanks,
 
 Jimmy Xiang
 




[jira] [Commented] (HIVE-7873) Re-enable lazy HiveBaseFunctionResultList

2014-10-15 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172356#comment-14172356
 ] 

Xuefu Zhang commented on HIVE-7873:
---

+1 pending tests.

 Re-enable lazy HiveBaseFunctionResultList
 -

 Key: HIVE-7873
 URL: https://issues.apache.org/jira/browse/HIVE-7873
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Jimmy Xiang
  Labels: Spark-M4, spark
 Attachments: HIVE-7873.1-spark.patch, HIVE-7873.2-spark.patch, 
 HIVE-7873.2-spark.patch, HIVE-7873.3-spark.patch


 We removed this optimization in HIVE-7799.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8448) Union All might not work due to the type conversion issue

2014-10-15 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-8448:
---
Attachment: HIVE-8448.1.patch

Needs code review. 

 Union All might not work due to the type conversion issue
 -

 Key: HIVE-8448
 URL: https://issues.apache.org/jira/browse/HIVE-8448
 Project: Hive
  Issue Type: Bug
Reporter: Chaoyu Tang
Assignee: Yongzhi Chen
Priority: Minor
 Attachments: HIVE-8448.1.patch


 create table t1 (val date);
 insert overwrite table t1 select '2014-10-10' from src limit 1;
 create table t2 (val varchar(10));
 insert overwrite table t2 select '2014-10-10' from src limit 1; 
 ==
 Query:
 select t.val from
 (select val from t1
 union all
 select val from t1
 union all
 select val from t2
 union all
 select val from t1) t;
 ==
 Will throw exception: 
 {code}
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible 
 types for union operator
   at 
 org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
   at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:65)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:193)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:443)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:133)
   ... 22 more
 {code}
 This is because at the query parse step getCommonClassForUnionAll is used, 
 but at execution getCommonClass is used. They are not used consistently in 
 union. The latter does not support the implicit conversion from date to 
 string, which is the cause of the problem.
 The fix for this particular union issue might be simple, but I noticed that 
 there are three versions of getCommonClass (getCommonClass, 
 getCommonClassForComparison, getCommonClassForUnionAll) and wonder whether 
 they need to be cleaned up and refactored.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8448) Union All might not work due to the type conversion issue

2014-10-15 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-8448:
---
Affects Version/s: 0.13.1
   Status: Patch Available  (was: Open)

When the SemanticAnalyzer checks a union plan, it uses 
getCommonClassForUnionAll, but when evaluating the union operator it uses 
getCommonClass. This inconsistency causes some queries with multiple UNION ALLs 
on a date-type column to pass the analyzer but fail at execution time with 
HiveException: Incompatible types.
Fixed by using a new updateForUnionAll method, which uses 
getCommonClassForUnionAll to update the columns for the union operator.
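The analyzer/executor mismatch described here can be illustrated with a self-contained sketch. The two resolver methods below are simplified, hypothetical stand-ins for Hive's getCommonClassForUnionAll and getCommonClass, reduced to just the date/string case at issue:

```java
// Sketch of the analyzer/executor type-resolution mismatch: one resolver
// accepts an implicit date -> string conversion for UNION ALL, the other
// does not, so a plan that passes analysis can still fail at execution.
public class UnionTypeMismatch {
    // Analysis-time resolver: allows implicit date -> string conversion.
    static String commonClassForUnionAll(String a, String b) {
        if (a.equals(b)) return a;
        if ((a.equals("date") && b.equals("string"))
                || (a.equals("string") && b.equals("date"))) {
            return "string";
        }
        return null; // incompatible
    }

    // Execution-time resolver: no implicit date -> string conversion.
    static String commonClass(String a, String b) {
        return a.equals(b) ? a : null;
    }

    public static void main(String[] args) {
        // Passes the analyzer...
        System.out.println(commonClassForUnionAll("date", "string")); // string
        // ...but at execution no common class is found, which surfaces as
        // "HiveException: Incompatible types for union operator".
        System.out.println(commonClass("date", "string")); // null
    }
}
```

The fix described above routes the union operator's column update through the UNION ALL resolver so both phases agree.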

 Union All might not work due to the type conversion issue
 -

 Key: HIVE-8448
 URL: https://issues.apache.org/jira/browse/HIVE-8448
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Chaoyu Tang
Assignee: Yongzhi Chen
Priority: Minor
 Attachments: HIVE-8448.1.patch


 create table t1 (val date);
 insert overwrite table t1 select '2014-10-10' from src limit 1;
 create table t2 (val varchar(10));
 insert overwrite table t2 select '2014-10-10' from src limit 1; 
 ==
 Query:
 select t.val from
 (select val from t1
 union all
 select val from t1
 union all
 select val from t2
 union all
 select val from t1) t;
 ==
 Will throw exception: 
 {code}
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible 
 types for union operator
   at 
 org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
   at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:65)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:193)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:443)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:133)
   ... 22 more
 {code}
 This is because at the query parse step getCommonClassForUnionAll is used, 
 but at execution getCommonClass is used. They are not used consistently in 
 union. The latter does not support the implicit conversion from date to 
 string, which is the cause of the problem.
 The fix for this particular union issue might be simple, but I noticed that 
 there are three versions of getCommonClass (getCommonClass, 
 getCommonClassForComparison, getCommonClassForUnionAll) and wonder whether 
 they need to be cleaned up and refactored.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7873) Re-enable lazy HiveBaseFunctionResultList

2014-10-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172391#comment-14172391
 ] 

Hive QA commented on HIVE-7873:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12674987/HIVE-7873.3-spark.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6770 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_tez_smb_1
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/222/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/222/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-222/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12674987

 Re-enable lazy HiveBaseFunctionResultList
 -

 Key: HIVE-7873
 URL: https://issues.apache.org/jira/browse/HIVE-7873
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Jimmy Xiang
  Labels: Spark-M4, spark
 Attachments: HIVE-7873.1-spark.patch, HIVE-7873.2-spark.patch, 
 HIVE-7873.2-spark.patch, HIVE-7873.3-spark.patch


 We removed this optimization in HIVE-7799.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8429) Add records in/out counters

2014-10-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172408#comment-14172408
 ] 

Hive QA commented on HIVE-8429:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12674794/HIVE-8429.4.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 6558 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel
org.apache.hive.beeline.TestSchemaTool.testSchemaInit
org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchAbort
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1276/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1276/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1276/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12674794

 Add records in/out counters
 ---

 Key: HIVE-8429
 URL: https://issues.apache.org/jira/browse/HIVE-8429
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-8429.1.patch, HIVE-8429.2.patch, HIVE-8429.3.patch, 
 HIVE-8429.4.patch


 We don't do counters for input/output records right now. That would help for 
 debugging though (if it can be done with minimal overhead).
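The idea above, counting records entering and leaving an operator with minimal overhead, could be sketched as follows. The class and method names are illustrative and do not reflect Hive's actual Operator API or counter plumbing:

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Predicate;

// Illustrative sketch: track records in/out of a filter-like operator with
// cheap atomic counters. Not Hive's actual Operator API.
public class CountingFilter {
    private final AtomicLong recordsIn = new AtomicLong();
    private final AtomicLong recordsOut = new AtomicLong();
    private final Predicate<String> predicate;

    public CountingFilter(Predicate<String> predicate) {
        this.predicate = predicate;
    }

    // Returns true if the row is passed downstream; counters record
    // how many rows were seen and how many were emitted.
    public boolean process(String row) {
        recordsIn.incrementAndGet();
        boolean pass = predicate.test(row);
        if (pass) {
            recordsOut.incrementAndGet();
        }
        return pass;
    }

    public long getRecordsIn()  { return recordsIn.get(); }
    public long getRecordsOut() { return recordsOut.get(); }

    public static void main(String[] args) {
        CountingFilter f = new CountingFilter(r -> r.startsWith("a"));
        for (String r : new String[] {"apple", "banana", "avocado"}) {
            f.process(r);
        }
        System.out.println(f.getRecordsIn());  // 3
        System.out.println(f.getRecordsOut()); // 2
    }
}
```

An unsynchronized long incremented per row would be even cheaper when each operator instance is single-threaded; the atomic form is shown only to keep the sketch safe under any threading model.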



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-8455) Print Spark job progress format info on the console[Spark Branch]

2014-10-15 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172409#comment-14172409
 ] 

Xuefu Zhang edited comment on HIVE-8455 at 10/15/14 2:32 PM:
-

{quote}
 I think we may not actually need this, what do you think?
{quote}

IMO, it's okay to just print it at the beginning. Once the user understands 
this, it might actually become annoying to see more of it.

And at this stage, we don't have to make everything perfect. We can always come 
back to improve it.


was (Author: xuefuz):
{quote}
 I think we may not actually need this, what do you think?
{quote}

IMO, it's okay to just print it at the beginning. Once the user understands 
this, it might actually become annoying to see more of it.

 Print Spark job progress format info on the console[Spark Branch]
 -

 Key: HIVE-8455
 URL: https://issues.apache.org/jira/browse/HIVE-8455
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M3
 Attachments: HIVE-8455.1-spark.patch, HIVE-8455.2-spark.patch, hive 
 on spark job status.PNG


 We added support for Spark job status monitoring in HIVE-7439, but we do not 
 print the job progress format info on the console, so users may be confused 
 about what the progress info means. I would like to add the job progress 
 format info here.
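The one-time legend being discussed, printed once at the beginning so users can interpret subsequent progress lines, could look roughly like this. The format string and wording are hypothetical, not the patch's actual output:

```java
// Hypothetical sketch: print a one-time legend explaining the progress
// format, then per-interval progress lines for each Spark stage.
public class ProgressFormatDemo {
    static void printLegendOnce() {
        System.out.println(
            "Query progress format: STAGE: finished(+running)/total tasks");
    }

    static String formatStage(String stage, int finished, int running, int total) {
        return String.format("%s: %d(+%d)/%d", stage, finished, running, total);
    }

    public static void main(String[] args) {
        printLegendOnce();
        System.out.println(formatStage("Stage-1", 3, 2, 10));
        System.out.println(formatStage("Stage-2", 0, 0, 5));
    }
}
```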



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8395) CBO: enable by default

2014-10-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172411#comment-14172411
 ] 

Hive QA commented on HIVE-8395:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12674845/HIVE-8395.05.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1277/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1277/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1277/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-1277/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ svn = \s\v\n ]]
+ [[ -n '' ]]
+ [[ -d apache-svn-trunk-source ]]
+ [[ ! -d apache-svn-trunk-source/.svn ]]
+ [[ ! -d apache-svn-trunk-source ]]
+ cd apache-svn-trunk-source
+ svn revert -R .
Reverted 'ql/src/test/org/apache/hadoop/hive/ql/exec/TestOperators.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/JoinOperator.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/FilterOperator.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapper.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java'
Reverted 
'ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapOperator.java'
Reverted 
'ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorFilterOperator.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java'
Reverted 
'ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MergeFileRecordProcessor.java'
Reverted 
'ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ReduceRecordProcessor.java'
Reverted 
'ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MapRecordProcessor.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/tez/RecordProcessor.java'
++ egrep -v '^X|^Performing status on external'
++ awk '{print $2}'
++ svn status --no-ignore
+ rm -rf target datanucleus.log ant/target shims/target shims/0.20/target 
shims/0.20S/target shims/0.23/target shims/aggregator/target 
shims/common/target shims/common-secure/target packaging/target 
hbase-handler/target testutils/target jdbc/target metastore/target 
itests/target itests/hcatalog-unit/target itests/test-serde/target 
itests/qtest/target itests/hive-unit-hadoop2/target itests/hive-minikdc/target 
itests/hive-unit/target itests/custom-serde/target itests/util/target 
hcatalog/target hcatalog/core/target hcatalog/streaming/target 
hcatalog/server-extensions/target hcatalog/webhcat/svr/target 
hcatalog/webhcat/java-client/target hcatalog/hcatalog-pig-adapter/target 
accumulo-handler/target hwi/target common/target common/src/gen service/target 
contrib/target serde/target beeline/target odbc/target cli/target 
ql/dependency-reduced-pom.xml ql/target
+ svn update

Fetching external item into 'hcatalog/src/test/e2e/harness'
External at revision 1632054.

At revision 1632054.
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ 

[jira] [Updated] (HIVE-7873) Re-enable lazy HiveBaseFunctionResultList [Spark Branch]

2014-10-15 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-7873:
--
Summary: Re-enable lazy HiveBaseFunctionResultList [Spark Branch]  (was: 
Re-enable lazy HiveBaseFunctionResultList)

 Re-enable lazy HiveBaseFunctionResultList [Spark Branch]
 

 Key: HIVE-7873
 URL: https://issues.apache.org/jira/browse/HIVE-7873
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Jimmy Xiang
  Labels: Spark-M4, spark
 Attachments: HIVE-7873.1-spark.patch, HIVE-7873.2-spark.patch, 
 HIVE-7873.2-spark.patch, HIVE-7873.3-spark.patch


 We removed this optimization in HIVE-7799.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6715) Hive JDBC should include username into open session request for non-sasl connection

2014-10-15 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6715:

Attachment: (was: HIVE-6715.2.patch)

 Hive JDBC should include username into open session request for non-sasl 
 connection
 ---

 Key: HIVE-6715
 URL: https://issues.apache.org/jira/browse/HIVE-6715
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Reporter: Srinath
Assignee: Prasad Mujumdar
 Attachments: HIVE-6715.1.patch


 The only parameter from sessVars that's being set in 
 HiveConnection.openSession() is HS2_PROXY_USER. 
 HIVE_AUTH_USER must also be set.
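A sketch of the fix's intent: include the username alongside the proxy-user entry in the configuration map sent with the open-session request. The key strings and helper below are illustrative assumptions, not the exact constants or code from Hive's JDBC driver:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: build the configuration overlay sent with an
// open-session request. Key names here are assumptions for illustration.
public class OpenSessionConf {
    static Map<String, String> buildSessionConf(String user, String proxyUser) {
        Map<String, String> conf = new HashMap<>();
        if (proxyUser != null) {
            // Previously the only session variable forwarded (HS2_PROXY_USER).
            conf.put("hive.server2.proxy.user", proxyUser);
        }
        if (user != null) {
            // The missing piece: also forward the username (HIVE_AUTH_USER)
            // so non-SASL connections identify the caller.
            conf.put("user", user);
        }
        return conf;
    }

    public static void main(String[] args) {
        Map<String, String> conf = buildSessionConf("alice", null);
        System.out.println(conf.containsKey("user")); // true
    }
}
```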



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6715) Hive JDBC should include username into open session request for non-sasl connection

2014-10-15 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6715:

Attachment: HIVE-6715.2.patch

[~prasadm] Thanks for the patch. Looks good to me.  +1

I have rebased it for the latest trunk.


 Hive JDBC should include username into open session request for non-sasl 
 connection
 ---

 Key: HIVE-6715
 URL: https://issues.apache.org/jira/browse/HIVE-6715
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Reporter: Srinath
Assignee: Prasad Mujumdar
 Attachments: HIVE-6715.1.patch


 The only parameter from sessVars that's being set in 
 HiveConnection.openSession() is HS2_PROXY_USER. 
 HIVE_AUTH_USER must also be set.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6715) Hive JDBC should include username into open session request for non-sasl connection

2014-10-15 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6715:

Attachment: HIVE-6715.2.patch

 Hive JDBC should include username into open session request for non-sasl 
 connection
 ---

 Key: HIVE-6715
 URL: https://issues.apache.org/jira/browse/HIVE-6715
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Reporter: Srinath
Assignee: Prasad Mujumdar
 Attachments: HIVE-6715.1.patch, HIVE-6715.2.patch


 The only parameter from sessVars that's being set in 
 HiveConnection.openSession() is HS2_PROXY_USER. 
 HIVE_AUTH_USER must also be set.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6715) Hive JDBC should include username into open session request for non-sasl connection

2014-10-15 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6715:

Status: Patch Available  (was: Open)

 Hive JDBC should include username into open session request for non-sasl 
 connection
 ---

 Key: HIVE-6715
 URL: https://issues.apache.org/jira/browse/HIVE-6715
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Reporter: Srinath
Assignee: Prasad Mujumdar
 Attachments: HIVE-6715.1.patch, HIVE-6715.2.patch


 The only parameter from sessVars that's being set in 
 HiveConnection.openSession() is HS2_PROXY_USER. 
 HIVE_AUTH_USER must also be set.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7914) Simplify join predicates for CBO to avoid cross products

2014-10-15 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7914:
---
   Resolution: Fixed
Fix Version/s: (was: 0.14.0)
   0.15.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, John!
[~vikram.dixit] it will be good to have this in 0.14


 Simplify join predicates for CBO to avoid cross products
 

 Key: HIVE-7914
 URL: https://issues.apache.org/jira/browse/HIVE-7914
 Project: Hive
  Issue Type: Bug
  Components: CBO
Affects Versions: 0.13.1
Reporter: Mostafa Mokhtar
Assignee: Laljo John Pullokkaran
 Fix For: 0.15.0

 Attachments: HIVE-7914.patch


 Simplify join predicates for disjunctive predicates to avoid cross products.
 For TPC-DS query 13 we generate a cross products.
 The join predicate on (store_sales x customer_demographics) ,  (store_sales x 
 household_demographics) and (store_sales x customer_address) can be pull up 
 to avoid the cross products
 {code}
 select avg(ss_quantity)
,avg(ss_ext_sales_price)
,avg(ss_ext_wholesale_cost)
,sum(ss_ext_wholesale_cost)
  from store_sales
  ,store
  ,customer_demographics
  ,household_demographics
  ,customer_address
  ,date_dim
  where store.s_store_sk = store_sales.ss_store_sk
  and  store_sales.ss_sold_date_sk = date_dim.d_date_sk and date_dim.d_year = 
 2001
  and((store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk
   and customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk
   and customer_demographics.cd_marital_status = 'M'
   and customer_demographics.cd_education_status = '4 yr Degree'
   and store_sales.ss_sales_price between 100.00 and 150.00
   and household_demographics.hd_dep_count = 3   
  )or
  (store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk
   and customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk
   and customer_demographics.cd_marital_status = 'D'
   and customer_demographics.cd_education_status = 'Primary'
   and store_sales.ss_sales_price between 50.00 and 100.00   
   and household_demographics.hd_dep_count = 1
  ) or 
  (store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk
   and customer_demographics.cd_demo_sk = ss_cdemo_sk
   and customer_demographics.cd_marital_status = 'U'
   and customer_demographics.cd_education_status = 'Advanced Degree'
   and store_sales.ss_sales_price between 150.00 and 200.00 
   and household_demographics.hd_dep_count = 1  
  ))
  and((store_sales.ss_addr_sk = customer_address.ca_address_sk
   and customer_address.ca_country = 'United States'
   and customer_address.ca_state in ('KY', 'GA', 'NM')
   and store_sales.ss_net_profit between 100 and 200  
  ) or
  (store_sales.ss_addr_sk = customer_address.ca_address_sk
   and customer_address.ca_country = 'United States'
   and customer_address.ca_state in ('MT', 'OR', 'IN')
   and store_sales.ss_net_profit between 150 and 300  
  ) or
  (store_sales.ss_addr_sk = customer_address.ca_address_sk
   and customer_address.ca_country = 'United States'
   and customer_address.ca_state in ('WI', 'MO', 'WV')
   and store_sales.ss_net_profit between 50 and 250  
  ))
 ;
 {code}
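The pull-up described above amounts to factoring conjuncts shared by every disjunct out of the OR: (A AND B) OR (A AND C) is equivalent to A AND (B OR C), and the factored A can then act as an equi-join predicate instead of forcing a cross product. A self-contained sketch over predicate strings (illustrative only, not Hive's optimizer code):

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch: pull conjuncts shared by all disjuncts out of an OR,
// so join conditions like ss_addr_sk = ca_address_sk stop hiding inside
// disjunctions and can drive an equi-join.
public class PredicatePullUp {
    // Each inner list is one disjunct's set of AND-ed conjuncts.
    static Set<String> commonConjuncts(List<List<String>> disjuncts) {
        Set<String> common = new LinkedHashSet<>(disjuncts.get(0));
        for (List<String> d : disjuncts.subList(1, disjuncts.size())) {
            common.retainAll(d); // keep only conjuncts present in every disjunct
        }
        return common;
    }

    public static void main(String[] args) {
        List<List<String>> disjuncts = new ArrayList<>();
        disjuncts.add(List.of("ss_addr_sk = ca_address_sk",
                              "ca_state in ('KY','GA','NM')"));
        disjuncts.add(List.of("ss_addr_sk = ca_address_sk",
                              "ca_state in ('MT','OR','IN')"));
        // The join condition is common to every disjunct, so it can be
        // pulled above the OR and used as a join key.
        System.out.println(commonConjuncts(disjuncts));
    }
}
```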
 This is the plan currently generated without any predicate simplification 
 {code}
 Warning: Map Join MAPJOIN[59][bigTable=?] in task 'Map 8' is a cross product
 Warning: Map Join MAPJOIN[58][bigTable=?] in task 'Map 8' is a cross product
 Warning: Shuffle Join JOIN[29][tables = [$hdt$_5, $hdt$_6]] in Stage 'Reducer 
 2' is a cross product
 OK
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 depends on stages: Stage-1
 STAGE PLANS:
   Stage: Stage-1
 Tez
   Edges:
 Map 7 - Map 8 (BROADCAST_EDGE)
 Map 8 - Map 5 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE)
 Reducer 2 - Map 1 (SIMPLE_EDGE), Map 4 (BROADCAST_EDGE), Map 7 
 (SIMPLE_EDGE)
 Reducer 3 - Reducer 2 (SIMPLE_EDGE)
   DagName: mmokhtar_20140828155050_7059c24b-501b-4683-86c0-4f3c023f0b0e:1
   Vertices:
 Map 1 
 Map Operator Tree:
 TableScan
   alias: customer_address
   Statistics: Num rows: 4000 Data size: 40595195284 Basic 
 stats: COMPLETE Column stats: NONE
   Select Operator
 expressions: ca_address_sk (type: int), ca_state (type: 
 string), ca_country (type: string)
 outputColumnNames: _col0, _col1, _col2
 Statistics: Num rows: 4000 Data size: 40595195284 
 Basic stats: COMPLETE Column stats: NONE
 Reduce Output Operator
   sort order: 
   Statistics: Num rows: 4000 Data size: 40595195284 
 Basic stats: 

[jira] [Resolved] (HIVE-7913) Simplify filter predicates for CBO

2014-10-15 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-7913.

Resolution: Duplicate

Fixed via HIVE-7914

 Simplify filter predicates for CBO
 --

 Key: HIVE-7913
 URL: https://issues.apache.org/jira/browse/HIVE-7913
 Project: Hive
  Issue Type: Bug
  Components: CBO
Affects Versions: 0.13.1
Reporter: Mostafa Mokhtar
Assignee: Laljo John Pullokkaran
 Fix For: 0.14.0


 Simplify disjunctive predicates so that they can get pushed down to the scan.
 For TPC-DS query 13 we push down predicates of the following form: 
 where cd_marital_status in ('M','D','U') etc.
 {code}
 select avg(ss_quantity)
,avg(ss_ext_sales_price)
,avg(ss_ext_wholesale_cost)
,sum(ss_ext_wholesale_cost)
  from store_sales
  ,store
  ,customer_demographics
  ,household_demographics
  ,customer_address
  ,date_dim
  where store.s_store_sk = store_sales.ss_store_sk
  and  store_sales.ss_sold_date_sk = date_dim.d_date_sk and date_dim.d_year = 
 2001
  and((store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk
   and customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk
   and customer_demographics.cd_marital_status = 'M'
   and customer_demographics.cd_education_status = '4 yr Degree'
   and store_sales.ss_sales_price between 100.00 and 150.00
   and household_demographics.hd_dep_count = 3   
  )or
  (store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk
   and customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk
   and customer_demographics.cd_marital_status = 'D'
   and customer_demographics.cd_education_status = 'Primary'
   and store_sales.ss_sales_price between 50.00 and 100.00   
   and household_demographics.hd_dep_count = 1
  ) or 
  (store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk
   and customer_demographics.cd_demo_sk = ss_cdemo_sk
   and customer_demographics.cd_marital_status = 'U'
   and customer_demographics.cd_education_status = 'Advanced Degree'
   and store_sales.ss_sales_price between 150.00 and 200.00 
   and household_demographics.hd_dep_count = 1  
  ))
  and((store_sales.ss_addr_sk = customer_address.ca_address_sk
   and customer_address.ca_country = 'United States'
   and customer_address.ca_state in ('KY', 'GA', 'NM')
   and store_sales.ss_net_profit between 100 and 200  
  ) or
  (store_sales.ss_addr_sk = customer_address.ca_address_sk
   and customer_address.ca_country = 'United States'
   and customer_address.ca_state in ('MT', 'OR', 'IN')
   and store_sales.ss_net_profit between 150 and 300  
  ) or
  (store_sales.ss_addr_sk = customer_address.ca_address_sk
   and customer_address.ca_country = 'United States'
   and customer_address.ca_state in ('WI', 'MO', 'WV')
   and store_sales.ss_net_profit between 50 and 250  
  ))
 ;
 {code}
 This is the plan currently generated without any predicate simplification 
 {code}
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 depends on stages: Stage-1
 STAGE PLANS:
   Stage: Stage-1
 Tez
   Edges:
 Map 7 - Map 8 (BROADCAST_EDGE)
 Map 8 - Map 5 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE)
 Reducer 2 - Map 1 (SIMPLE_EDGE), Map 4 (BROADCAST_EDGE), Map 7 
 (SIMPLE_EDGE)
 Reducer 3 - Reducer 2 (SIMPLE_EDGE)
   DagName: mmokhtar_20140828155050_7059c24b-501b-4683-86c0-4f3c023f0b0e:1
   Vertices:
 Map 1 
 Map Operator Tree:
 TableScan
   alias: customer_address
   Statistics: Num rows: 4000 Data size: 40595195284 Basic 
 stats: COMPLETE Column stats: NONE
   Select Operator
 expressions: ca_address_sk (type: int), ca_state (type: 
 string), ca_country (type: string)
 outputColumnNames: _col0, _col1, _col2
 Statistics: Num rows: 4000 Data size: 40595195284 
 Basic stats: COMPLETE Column stats: NONE
 Reduce Output Operator
   sort order: 
   Statistics: Num rows: 4000 Data size: 40595195284 
 Basic stats: COMPLETE Column stats: NONE
   value expressions: _col0 (type: int), _col1 (type: 
 string), _col2 (type: string)
 Execution mode: vectorized
 Map 4 
 Map Operator Tree:
 TableScan
   alias: date_dim
   filterExpr: ((d_year = 2001) and d_date_sk is not null) 
 (type: boolean)
   Statistics: Num rows: 73049 Data size: 81741831 Basic 
 stats: COMPLETE Column stats: NONE
   Filter Operator
 predicate: ((d_year = 2001) and d_date_sk is not null) 
 (type: boolean)

[jira] [Updated] (HIVE-6715) Hive JDBC should include username into open session request for non-sasl connection

2014-10-15 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6715:

Attachment: HIVE-6715.3.patch

HIVE-6715.3.patch - In current trunk, setting the auth property via conf for 
MiniHS2 is what works  (HiveAuthFactory no longer creates a new hiveconf). 
Updating test case.


 Hive JDBC should include username into open session request for non-sasl 
 connection
 ---

 Key: HIVE-6715
 URL: https://issues.apache.org/jira/browse/HIVE-6715
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Reporter: Srinath
Assignee: Prasad Mujumdar
 Attachments: HIVE-6715.1.patch, HIVE-6715.2.patch, HIVE-6715.3.patch


 The only parameter from sessVars that's being set in 
 HiveConnection.openSession() is HS2_PROXY_USER. 
 HIVE_AUTH_USER must also be set.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8467) Table Copy - Background, incremental data load

2014-10-15 Thread Rajat Venkatesh (JIRA)
Rajat Venkatesh created HIVE-8467:
-

 Summary: Table Copy - Background, incremental data load
 Key: HIVE-8467
 URL: https://issues.apache.org/jira/browse/HIVE-8467
 Project: Hive
  Issue Type: New Feature
Reporter: Rajat Venkatesh


Traditionally, Hive and other tools in the Hadoop eco-system haven't required a 
load stage. However, with recent developments, Hive is much more performant 
when data is stored in specific formats like ORC, Parquet, and Avro. 
Technologies like Presto also work much better with certain data formats. At 
the same time, data is generated or obtained from 3rd parties in non-optimal 
formats such as CSV, tab-delimited, or JSON. Often it's not an option to 
change the data format at the source. We've found that users either use 
sub-optimal formats or spend a large amount of effort creating and maintaining 
copies. We want to propose a new construct - Table Copy - to help “load” data 
into an optimal storage format.
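For context, the manual conversion users do today is typically a one-shot CREATE TABLE AS SELECT into the target format; the proposal would keep such a copy loaded incrementally in the background. Table names below are illustrative, not from the proposal:

```sql
-- What users do by hand today: a one-shot copy into an efficient format.
-- A Table Copy would keep a copy like this up to date incrementally.
create table events_orc stored as orc as
select * from events_csv;
```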

I am going to attach a PDF document with a lot more details, especially 
addressing how this is different from bulk loads in relational DBs or 
materialized views.

Looking forward to hearing whether others see a similar need to formalize 
conversion of data to different storage formats. If yes, are the details in 
the PDF document a good start?





[jira] [Updated] (HIVE-8467) Table Copy - Background, incremental data load

2014-10-15 Thread Rajat Venkatesh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajat Venkatesh updated HIVE-8467:
--
Attachment: Table Copies.pdf

 Table Copy - Background, incremental data load
 --

 Key: HIVE-8467
 URL: https://issues.apache.org/jira/browse/HIVE-8467
 Project: Hive
  Issue Type: New Feature
Reporter: Rajat Venkatesh
 Attachments: Table Copies.pdf


 Traditionally, Hive and other tools in the Hadoop eco-system haven't required 
 a load stage. However, with recent developments, Hive is much more performant 
 when data is stored in specific formats like ORC, Parquet, and Avro. 
 Technologies like Presto also work much better with certain data formats. At 
 the same time, data is generated or obtained from 3rd parties in non-optimal 
 formats such as CSV, tab-delimited, or JSON. Often it's not an option to 
 change the data format at the source. We've found that users either use 
 sub-optimal formats or spend a large amount of effort creating and 
 maintaining copies. We want to propose a new construct - Table Copy - to help 
 “load” data into an optimal storage format.
 I am going to attach a PDF document with a lot more details, especially 
 addressing how this is different from bulk loads in relational DBs or 
 materialized views.
 Looking forward to hearing whether others see a similar need to formalize 
 conversion of data to different storage formats. If yes, are the details in 
 the PDF document a good start?





[jira] [Commented] (HIVE-8433) CBO loses a column during AST conversion

2014-10-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172508#comment-14172508
 ] 

Hive QA commented on HIVE-8433:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12674858/HIVE-8433.patch

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 6560 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_correctness
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hive.beeline.TestSchemaTool.testSchemaInit
org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1278/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1278/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1278/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12674858

 CBO loses a column during AST conversion
 

 Key: HIVE-8433
 URL: https://issues.apache.org/jira/browse/HIVE-8433
 Project: Hive
  Issue Type: Bug
  Components: CBO
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Critical
 Attachments: HIVE-8433.patch


 {noformat}
 SELECT
   CAST(value AS BINARY),
   value
 FROM src
 ORDER BY value
 LIMIT 100
 {noformat}
 returns only one column.
 Final CBO plan is
 {noformat}
   HiveSortRel(sort0=[$1], dir0=[ASC]): rowcount = 500.0, cumulative cost = 
 {24858.432393688767 rows, 500.0 cpu, 0.0 io}, id = 44
 HiveProjectRel(value=[CAST($0):BINARY(2147483647) NOT NULL], 
 value1=[$0]): rowcount = 500.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 
 io}, id = 42
   HiveProjectRel(value=[$1]): rowcount = 500.0, cumulative cost = {0.0 
 rows, 0.0 cpu, 0.0 io}, id = 40
 HiveTableScanRel(table=[[default.src]]): rowcount = 500.0, cumulative 
 cost = {0}, id = 0
 {noformat}
 but the resulting AST has only one column. Must be some bug in conversion, 
 probably related to the name collision in the schema, judging by the alias of 
 the column for the binary-cast value in the AST
 {noformat} 
 TOK_QUERY
TOK_FROM
   TOK_SUBQUERY
  TOK_QUERY
 TOK_FROM
TOK_TABREF
   TOK_TABNAME
  default
  src
   src
 TOK_INSERT
TOK_DESTINATION
   TOK_DIR
  TOK_TMP_FILE
TOK_SELECT
   TOK_SELEXPR
  .
 TOK_TABLE_OR_COL
src
 value
  value
  $hdt$_0
TOK_INSERT
   TOK_DESTINATION
  TOK_DIR
 TOK_TMP_FILE
   TOK_SELECT
  TOK_SELEXPR
 TOK_FUNCTION
TOK_BINARY
.
   TOK_TABLE_OR_COL
  $hdt$_0
   value
 value
   TOK_ORDERBY
  TOK_TABSORTCOLNAMEASC
 TOK_TABLE_OR_COL
value
   TOK_LIMIT
  100
 {noformat}





[jira] [Commented] (HIVE-8450) Create table like does not copy over table properties

2014-10-15 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172558#comment-14172558
 ] 

Brock Noland commented on HIVE-8450:


Nice work!! +1 pending tests

 Create table like does not copy over table properties
 -

 Key: HIVE-8450
 URL: https://issues.apache.org/jira/browse/HIVE-8450
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0, 0.13.1
Reporter: Brock Noland
Assignee: Navis
Priority: Critical
 Attachments: HIVE-8450.1.patch.txt


 Assuming t2 is an Avro-backed table, the following:
 {{create table t1 like t2}}
 should create an Avro-backed table, but the schema.url.* is not being copied 
 correctly.





[jira] [Created] (HIVE-8468) TestSchemaTool is failing after committing the version number change

2014-10-15 Thread Brock Noland (JIRA)
Brock Noland created HIVE-8468:
--

 Summary: TestSchemaTool is failing after committing the version 
number change
 Key: HIVE-8468
 URL: https://issues.apache.org/jira/browse/HIVE-8468
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland








[jira] [Commented] (HIVE-8468) TestSchemaTool is failing after committing the version number change

2014-10-15 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172563#comment-14172563
 ] 

Brock Noland commented on HIVE-8468:


I think this is related to HIVE-8381. FYI [~vikram.dixit]

 TestSchemaTool is failing after committing the version number change
 

 Key: HIVE-8468
 URL: https://issues.apache.org/jira/browse/HIVE-8468
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland







[jira] [Updated] (HIVE-8468) TestSchemaTool is failing after committing the version number change

2014-10-15 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8468:
---
Description: Logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1278/failed/TestSchemaTool/TEST-TestSchemaTool-TEST-org.apache.hive.beeline.TestSchemaTool.xml

 TestSchemaTool is failing after committing the version number change
 

 Key: HIVE-8468
 URL: https://issues.apache.org/jira/browse/HIVE-8468
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland

 Logs: 
 http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1278/failed/TestSchemaTool/TEST-TestSchemaTool-TEST-org.apache.hive.beeline.TestSchemaTool.xml





[jira] [Commented] (HIVE-8455) Print Spark job progress format info on the console[Spark Branch]

2014-10-15 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172619#comment-14172619
 ] 

Brock Noland commented on HIVE-8455:


Thanks guys! This was my wishlist from yesterday after watching a 10+ minute 
job run. It's certainly possible that not all of the items are needed. Let's go 
ahead with this patch!

+1

 Print Spark job progress format info on the console[Spark Branch]
 -

 Key: HIVE-8455
 URL: https://issues.apache.org/jira/browse/HIVE-8455
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M3
 Attachments: HIVE-8455.1-spark.patch, HIVE-8455.2-spark.patch, hive 
 on spark job status.PNG


 We added support for Spark job status monitoring in HIVE-7439, but did not 
 print the job progress format info on the console; users may be confused 
 about what the progress info means, so I would like to add the job progress 
 format info here.





New Feature Request

2014-10-15 Thread Rajat Venkatesh
Hello Hive Dev,
I filed a JIRA for a new feature in Hive - Table Copy. At Qubole, we've
noticed that many users want to convert data to one of the more efficient
data formats (e.g. ORC) supported by Hive. Similarly, one of the
prereqs for having a good experience on Presto is to convert the data to
ORC.
So we've tried to formalize the process of converting data to a more
efficient format.
We have a prototype that some of our users are trying out.

Please take a look at https://issues.apache.org/jira/browse/HIVE-8467

We would love to get your feedback if such a feature is useful to the
larger Hive community.


-- 
Rajat Venkatesh | Engg Lead
Qubole Inc | www.qubole.com


[jira] [Commented] (HIVE-8455) Print Spark job progress format info on the console[Spark Branch]

2014-10-15 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172623#comment-14172623
 ] 

Brock Noland commented on HIVE-8455:


Thank you Chengxiang and Xuefu! I have committed this to spark.

 Print Spark job progress format info on the console[Spark Branch]
 -

 Key: HIVE-8455
 URL: https://issues.apache.org/jira/browse/HIVE-8455
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M3
 Fix For: 0.15.0

 Attachments: HIVE-8455.1-spark.patch, HIVE-8455.2-spark.patch, hive 
 on spark job status.PNG


 We added support for Spark job status monitoring in HIVE-7439, but did not 
 print the job progress format info on the console; users may be confused 
 about what the progress info means, so I would like to add the job progress 
 format info here.





[jira] [Updated] (HIVE-8455) Print Spark job progress format info on the console[Spark Branch]

2014-10-15 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8455:
---
   Resolution: Fixed
Fix Version/s: 0.15.0
   Status: Resolved  (was: Patch Available)

 Print Spark job progress format info on the console[Spark Branch]
 -

 Key: HIVE-8455
 URL: https://issues.apache.org/jira/browse/HIVE-8455
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M3
 Fix For: 0.15.0

 Attachments: HIVE-8455.1-spark.patch, HIVE-8455.2-spark.patch, hive 
 on spark job status.PNG


 We added support for Spark job status monitoring in HIVE-7439, but did not 
 print the job progress format info on the console; users may be confused 
 about what the progress info means, so I would like to add the job progress 
 format info here.





[jira] [Updated] (HIVE-8387) add retry logic to ZooKeeperStorage in WebHCat

2014-10-15 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-8387:
-
Status: Patch Available  (was: Open)

 add retry logic to ZooKeeperStorage in WebHCat
 --

 Key: HIVE-8387
 URL: https://issues.apache.org/jira/browse/HIVE-8387
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.13.1
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-8387.patch


 ZK interactions may run into transient errors that should be retried.  
 Currently there is no retry logic in WebHCat for this.
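The kind of wrapper involved can be sketched generically. This is a hedged illustration only: the names (RetryUtil, withRetries) and the exponential-backoff scheme are invented here, not taken from HIVE-8387.patch, which may instead build on the ZooKeeper client's own retry policies (e.g. Curator's ExponentialBackoffRetry).

```java
import java.util.concurrent.Callable;

/**
 * Minimal sketch of a retry-with-backoff wrapper of the kind this issue
 * proposes for WebHCat's ZooKeeper interactions. All names here are
 * illustrative assumptions, not from the actual patch.
 */
public class RetryUtil {
    public static <T> T withRetries(Callable<T> op, int maxAttempts,
                                    long baseSleepMs) throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {  // real code would catch only transient ZK errors
                last = e;
                Thread.sleep(baseSleepMs << attempt);  // exponential backoff
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Fails twice, then succeeds; the wrapper absorbs the failures.
        String result = withRetries(() -> {
            if (++calls[0] < 3) {
                throw new RuntimeException("transient");
            }
            return "ok";
        }, 5, 1);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```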





[jira] [Updated] (HIVE-8387) add retry logic to ZooKeeperStorage in WebHCat

2014-10-15 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-8387:
-
Attachment: HIVE-8387.patch

[~thejas], [~sushanth] Could one of you review this please

 add retry logic to ZooKeeperStorage in WebHCat
 --

 Key: HIVE-8387
 URL: https://issues.apache.org/jira/browse/HIVE-8387
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.13.1
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-8387.patch


 ZK interactions may run into transient errors that should be retried.  
 Currently there is no retry logic in WebHCat for this.





[jira] [Updated] (HIVE-8362) Investigate flaky test parallel.q

2014-10-15 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8362:
---
   Resolution: Fixed
Fix Version/s: 0.15.0
   Status: Resolved  (was: Patch Available)

Thank you Jimmy for fixing this long-standing issue! I have committed it to 
trunk!

 Investigate flaky test parallel.q
 -

 Key: HIVE-8362
 URL: https://issues.apache.org/jira/browse/HIVE-8362
 Project: Hive
  Issue Type: Sub-task
  Components: Testing Infrastructure
Affects Versions: 0.14.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.15.0

 Attachments: HIVE-8362.1-spark.patch, HIVE-8362.2.patch, 
 HIVE-8362.3.patch, HIVE-8362.patch


 Test parallel.q is flaky. It sometimes fails with an error like:
 {noformat}
 Failed tests: 
   TestSparkCliDriver.testCliDriver_parallel:120-runTest:146 Unexpected 
 exception junit.framework.AssertionFailedError: Client Execution results 
 failed with error code = 1
 See ./ql/target/tmp/log/hive.log or ./itests/qtest/target/tmp/log/hive.log, 
 or check ./ql/target/surefire-reports or 
 ./itests/qtest/target/surefire-reports/ for specific test cases logs.
 {noformat}





[jira] [Commented] (HIVE-8448) Union All might not work due to the type conversion issue

2014-10-15 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172650#comment-14172650
 ] 

Szehon Ho commented on HIVE-8448:
-

This looks fine to me.  Only style feedback: please follow the common style 
convention in GenericUDFUtils, like putting braces for if {...} else {...}, 
and also putting a space after if. And the method name capitalization should 
be updatePriv, although I would suggest renaming it to avoid confusion (when I 
first read it, I thought it was updating a privilege).

To be honest I'm not an expert on union; I wonder if [~jdere] or [~navis] 
would have any further comments? If not, +1 after these changes, pending the tests.

 Union All might not work due to the type conversion issue
 -

 Key: HIVE-8448
 URL: https://issues.apache.org/jira/browse/HIVE-8448
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Chaoyu Tang
Assignee: Yongzhi Chen
Priority: Minor
 Attachments: HIVE-8448.1.patch


 create table t1 (val date);
 insert overwrite table t1 select '2014-10-10' from src limit 1;
 create table t2 (val varchar(10));
 insert overwrite table t2 select '2014-10-10' from src limit 1; 
 ==
 Query:
 select t.val from
 (select val from t1
 union all
 select val from t1
 union all
 select val from t2
 union all
 select val from t1) t;
 ==
 Will throw exception: 
 {code}
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible 
 types for union operator
   at 
 org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
   at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:65)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:193)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:443)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:133)
   ... 22 more
 {code}
 It was because at query parse time getCommonClassForUnionAll is used, but at 
 execution time getCommonClass is used; they are not used consistently in 
 union. The latter does not support the implicit conversion from date to 
 string, which is the cause of the problem.
 The change to fix this particular union issue might be simple, but I noticed 
 that there are three versions of getCommonClass (getCommonClass, 
 getCommonClassForComparison, and getCommonClassForUnionAll) and wonder if 
 they need to be cleaned up and refactored.
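Until the two resolution functions agree, one hedged workaround sketch is to make the conversion explicit, so both union branches already share a type before execution-time resolution runs:

```sql
-- Workaround sketch: cast both branches to a common type explicitly, so the
-- union operator never needs the implicit date-to-string conversion.
select t.val from
(select cast(val as string) as val from t1
 union all
 select cast(val as string) as val from t2) t;
```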





[jira] [Updated] (HIVE-7858) Parquet compression should be configurable via table property

2014-10-15 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7858:
---
   Resolution: Fixed
Fix Version/s: 0.15.0
   Status: Resolved  (was: Patch Available)

Thank you so much! I have committed this to trunk!

 Parquet compression should be configurable via table property
 -

 Key: HIVE-7858
 URL: https://issues.apache.org/jira/browse/HIVE-7858
 Project: Hive
  Issue Type: Improvement
Reporter: Brock Noland
Assignee: Ferdinand Xu
 Fix For: 0.15.0

 Attachments: HIVE-7858.1.patch, HIVE-7858.patch, HIVE-7858.patch


 ORC supports the orc.compress table property:
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC
 {noformat}
 create table Addresses (
   name string,
   street string,
   city string,
   state string,
   zip int
 ) stored as orc tblproperties (orc.compress=NONE);
 {noformat}
 I think it'd be great to support the same for Parquet.
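By analogy with the ORC example above, the Parquet counterpart would presumably look like the following. This is a sketch: the property name parquet.compression matches this issue's release note, but the exact spelling and accepted values should be confirmed against the committed patch.

```sql
-- Hypothetical Parquet analog of the orc.compress example above.
create table addresses_parquet (
  name string,
  street string,
  city string,
  state string,
  zip int
) stored as parquet tblproperties ("parquet.compression"="SNAPPY");
```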





[jira] [Created] (HIVE-8469) Add parquet.compression as a Serde Property

2014-10-15 Thread Brock Noland (JIRA)
Brock Noland created HIVE-8469:
--

 Summary: Add parquet.compression as a Serde Property
 Key: HIVE-8469
 URL: https://issues.apache.org/jira/browse/HIVE-8469
 Project: Hive
  Issue Type: Improvement
Reporter: Brock Noland
Priority: Minor


In HIVE-8450 we are annotating the serdes with their properties. We should add 
compression for Parquet.





[jira] [Commented] (HIVE-7858) Parquet compression should be configurable via table property

2014-10-15 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172685#comment-14172685
 ] 

Brock Noland commented on HIVE-7858:


FYI that I created HIVE-8469 to add this as an annotated serde property post 
HIVE-8450

 Parquet compression should be configurable via table property
 -

 Key: HIVE-7858
 URL: https://issues.apache.org/jira/browse/HIVE-7858
 Project: Hive
  Issue Type: Improvement
Reporter: Brock Noland
Assignee: Ferdinand Xu
 Fix For: 0.15.0

 Attachments: HIVE-7858.1.patch, HIVE-7858.patch, HIVE-7858.patch


 ORC supports the orc.compress table property:
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC
 {noformat}
 create table Addresses (
   name string,
   street string,
   city string,
   state string,
   zip int
 ) stored as orc tblproperties (orc.compress=NONE);
 {noformat}
 I think it'd be great to support the same for Parquet.





[jira] [Updated] (HIVE-7858) Parquet compression should be configurable via table property

2014-10-15 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7858:
---
Release Note: The property parquet.compression can now be configured as a 
table property.

 Parquet compression should be configurable via table property
 -

 Key: HIVE-7858
 URL: https://issues.apache.org/jira/browse/HIVE-7858
 Project: Hive
  Issue Type: Improvement
Reporter: Brock Noland
Assignee: Ferdinand Xu
 Fix For: 0.15.0

 Attachments: HIVE-7858.1.patch, HIVE-7858.patch, HIVE-7858.patch


 ORC supports the orc.compress table property:
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC
 {noformat}
 create table Addresses (
   name string,
   street string,
   city string,
   state string,
   zip int
 ) stored as orc tblproperties (orc.compress=NONE);
 {noformat}
 I think it'd be great to support the same for Parquet.





[jira] [Created] (HIVE-8470) Orc writer cant handle column of type void

2014-10-15 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-8470:
--

 Summary: Orc writer cant handle column of type void
 Key: HIVE-8470
 URL: https://issues.apache.org/jira/browse/HIVE-8470
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.13.0, 0.12.0, 0.11.0, 0.14.0
Reporter: Ashutosh Chauhan


e.g.,
insert into table t1 select null from src;
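A common workaround, until the writer handles the void type, is to give the NULL constant a concrete type explicitly (assuming, for illustration, that t1's column is a string):

```sql
-- Cast the constant NULL so the inserted column has a real primitive
-- category instead of VOID, which the ORC writer rejects.
insert into table t1 select cast(null as string) from src;
```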





[jira] [Commented] (HIVE-8470) Orc writer cant handle column of type void

2014-10-15 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172700#comment-14172700
 ] 

Ashutosh Chauhan commented on HIVE-8470:


Stack trace:
{code}
Caused by: java.lang.IllegalArgumentException: Bad primitive category VOID
at 
org.apache.hadoop.hive.ql.io.orc.WriterImpl.createTreeWriter(WriterImpl.java:1842)
at 
org.apache.hadoop.hive.ql.io.orc.WriterImpl.access$1500(WriterImpl.java:106)
at 
org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.<init>(WriterImpl.java:1592)
at 
org.apache.hadoop.hive.ql.io.orc.WriterImpl.createTreeWriter(WriterImpl.java:1846)
at 
org.apache.hadoop.hive.ql.io.orc.WriterImpl.<init>(WriterImpl.java:203)
at 
org.apache.hadoop.hive.ql.io.orc.OrcFile.createWriter(OrcFile.java:415)
at 
org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.write(OrcOutputFormat.java:84)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:671)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:799)
at 
org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:799)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:799)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:536)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:546)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:85)
{code}

 Orc writer cant handle column of type void
 --

 Key: HIVE-8470
 URL: https://issues.apache.org/jira/browse/HIVE-8470
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0
Reporter: Ashutosh Chauhan

 e.g.,
 insert into table t1 select null from src;





[jira] [Commented] (HIVE-8456) Support Hive Counter to collect spark job metric[Spark Branch]

2014-10-15 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172706#comment-14172706
 ] 

Xuefu Zhang commented on HIVE-8456:
---

[~lirui] would you like to review the patch? Thanks.

 Support Hive Counter to collect spark job metric[Spark Branch]
 --

 Key: HIVE-8456
 URL: https://issues.apache.org/jira/browse/HIVE-8456
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M3
 Attachments: HIVE-8456.1-spark.patch, HIVE-8456.2-spark.patch


 Several Hive query metrics in Hive operators are collected via Hive Counters, 
 such as CREATEDFILES and DESERIALIZE_ERRORS. Besides, Hive uses Counters as an 
 option to collect table stats info. Spark supports Accumulators, which are 
 pretty similar to Hive Counters, so we could try to enable Hive Counters based 
 on them.
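For illustration: the semantics needed here are named long-valued counters that tasks increment and the driver aggregates, which is roughly the role a Spark Accumulator would play. A minimal sketch of such a counter registry (hypothetical class names, not the actual Hive Counter or Spark Accumulator API):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical counter registry: tasks increment named counters and the
// driver reads aggregated totals, the role an Accumulator-backed Hive
// Counter (e.g. CREATEDFILES, DESERIALIZE_ERRORS) would play.
public class CounterRegistry {
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    // Add delta to the named counter, creating it on first use.
    public void increment(String name, long delta) {
        counters.computeIfAbsent(name, k -> new LongAdder()).add(delta);
    }

    // Current aggregated value; 0 for a counter that was never incremented.
    public long value(String name) {
        LongAdder adder = counters.get(name);
        return adder == null ? 0L : adder.sum();
    }
}
```

The concurrent map plus LongAdder keeps increments cheap under contention, which matches how counters are updated from many tasks at once.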





[jira] [Commented] (HIVE-8387) add retry logic to ZooKeeperStorage in WebHCat

2014-10-15 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172714#comment-14172714
 ] 

Thejas M Nair commented on HIVE-8387:
-

[~ekoifman] Can you please add a review board link?


 add retry logic to ZooKeeperStorage in WebHCat
 --

 Key: HIVE-8387
 URL: https://issues.apache.org/jira/browse/HIVE-8387
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.13.1
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-8387.patch


 ZK interactions may run into transient errors that should be retried.  
 Currently there is no retry logic in WebHCat for this.
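The fix amounts to wrapping each ZooKeeper call in a bounded retry loop with backoff. A minimal sketch of such a wrapper (hypothetical helper; real code would retry only on ZooKeeper's transient exception types, e.g. connection loss, rather than on every exception):

```java
import java.util.concurrent.Callable;

// Hypothetical bounded-retry wrapper with linear backoff. Real WebHCat code
// would catch only transient ZooKeeper exceptions, not every Exception as
// this sketch does.
public class Retry {
    public static <T> T withRetries(Callable<T> op, int maxAttempts, long baseSleepMs)
            throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e;                                 // remember the failure
                if (attempt < maxAttempts) {
                    Thread.sleep(baseSleepMs * attempt);  // back off before retrying
                }
            }
        }
        throw last;                                       // all attempts exhausted
    }
}
```

Curator-style retry policies do essentially this; the point is that the retry decision lives in one place instead of being repeated at every ZK call site.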





[jira] [Updated] (HIVE-8448) Union All might not work due to the type conversion issue

2014-10-15 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-8448:
---
Attachment: HIVE-8448.2.patch

fixes after review

 Union All might not work due to the type conversion issue
 -

 Key: HIVE-8448
 URL: https://issues.apache.org/jira/browse/HIVE-8448
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Chaoyu Tang
Assignee: Yongzhi Chen
Priority: Minor
 Attachments: HIVE-8448.1.patch, HIVE-8448.2.patch


 create table t1 (val date);
 insert overwrite table t1 select '2014-10-10' from src limit 1;
 create table t2 (val varchar(10));
 insert overwrite table t2 select '2014-10-10' from src limit 1; 
 ==
 Query:
 select t.val from
 (select val from t1
 union all
 select val from t1
 union all
 select val from t2
 union all
 select val from t1) t;
 ==
 Will throw exception: 
 {code}
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible types for union operator
   at org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
   at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
   at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:65)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
   at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
   at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:193)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
   at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:443)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:133)
   ... 22 more
 {code}
 This is because getCommonClassForUnionAll is used at the query parse step, but 
 getCommonClass is used at execution; they are not used consistently for 
 unions. The latter does not support the implicit conversion from date to 
 string, which is the cause of the problem.
 The change to fix this particular union issue might be simple, but I noticed 
 that there are three versions of getCommonClass (getCommonClass, 
 getCommonClassForComparison, getCommonClassForUnionAll) and wonder if they 
 need to be cleaned up and refactored.
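The inconsistency can be sketched with two simplified resolvers: a relaxed parse-time rule that admits the date-to-string fallback, and a strict execution-time rule that rejects the same pair, so a query accepted at parse time fails at run time. The type names and rules below are deliberate simplifications, not Hive's actual TypeInfoUtils logic:

```java
// Deliberately simplified common-type resolution for UNION ALL branches.
// The relaxed rule mirrors the parse-time behavior (date falls back to
// string); the strict rule mirrors the execution-time behavior that
// rejects the same pair, which is the reported inconsistency.
public class UnionTypes {
    // Strict resolver: only identical types unify (execution-time analogue).
    public static String commonClassStrict(String a, String b) {
        return a.equals(b) ? a : null;
    }

    // Relaxed resolver: mixed string-like/date pairs unify to string
    // (parse-time analogue).
    public static String commonClassForUnionAll(String a, String b) {
        if (a.equals(b)) {
            return a;
        }
        return isStringLike(a) && isStringLike(b) ? "string" : null;
    }

    private static boolean isStringLike(String type) {
        return type.equals("string") || type.startsWith("varchar") || type.equals("date");
    }
}
```

With date and varchar(10) branches, the relaxed resolver yields string while the strict one yields nothing, which is exactly the parse-time/run-time mismatch described above.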





Update: Hive user group meeting tonight

2014-10-15 Thread Xuefu Zhang
Hi all,

Quick update, you should be able to attend as long as you have an ID with
you. (Sorry about the confusion.)

For those willing to dial in, the info is on the meetup page.
http://www.meetup.com/Hive-User-Group-Meeting/events/202007872/

Regards,
Xuefu


[jira] [Commented] (HIVE-8448) Union All might not work due to the type conversion issue

2014-10-15 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172726#comment-14172726
 ] 

Yongzhi Chen commented on HIVE-8448:


Thanks [~szehon], I attached the new patch following your review advice. I also 
submitted a review request for the jira:
https://reviews.apache.org/r/26763/

 Union All might not work due to the type conversion issue
 -

 Key: HIVE-8448
 URL: https://issues.apache.org/jira/browse/HIVE-8448
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Chaoyu Tang
Assignee: Yongzhi Chen
Priority: Minor
 Attachments: HIVE-8448.1.patch, HIVE-8448.2.patch







[jira] [Commented] (HIVE-8122) Make use of SearchArgument classes

2014-10-15 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172733#comment-14172733
 ] 

Brock Noland commented on HIVE-8122:


You are right. I created this JIRA to use something better than the string we 
are currently using. However, if we used SearchArgument, then I think the 
Parquet project would need that dependency. We should create a FilterPredicate 
instead and give that to Parquet.

Thank you for your detailed analysis!
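The point of moving off strings is that a predicate tree can be walked, validated, and translated per target format. A toy sketch of such a tree (class and method names are invented for illustration, not Parquet's actual FilterPredicate API):

```java
// Toy predicate tree: a structured alternative to passing filter strings.
// Class and method names are invented; real code would build Parquet's own
// predicate objects instead.
public abstract class Pred {
    public abstract String describe();

    // Leaf node: column = value.
    public static Pred eq(String column, Object value) {
        return new Pred() {
            public String describe() {
                return column + " = " + value;
            }
        };
    }

    // Interior node: conjunction of two predicates.
    public static Pred and(Pred left, Pred right) {
        return new Pred() {
            public String describe() {
                return "(" + left.describe() + " AND " + right.describe() + ")";
            }
        };
    }
}
```

A real translation layer would visit such a tree and emit the target engine's predicate objects rather than render text, which is what makes the structured form safer than a string.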

 Make use of SearchArgument classes
 --

 Key: HIVE-8122
 URL: https://issues.apache.org/jira/browse/HIVE-8122
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Ferdinand Xu

 ParquetSerde could be much cleaner if we used SearchArgument and associated 
 classes like ORC does:
 https://github.com/apache/hive/blob/trunk/serde/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java





[jira] [Commented] (HIVE-7914) Simplify join predicates for CBO to avoid cross products

2014-10-15 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172736#comment-14172736
 ] 

Mostafa Mokhtar commented on HIVE-7914:
---

Issue resolved 
{code}

STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
Tez
  Edges:
Map 4 <- Map 1 (BROADCAST_EDGE), Map 2 (BROADCAST_EDGE), Map 3 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE), Map 7 (BROADCAST_EDGE)
Reducer 5 <- Map 4 (SIMPLE_EDGE)
  DagName: mmokhtar_2014101512_452c339a-3fa1-4ae4-99ed-0fb052342532:1
  Vertices:
Map 1 
Map Operator Tree:
TableScan
  alias: household_demographics
  filterExpr: hd_demo_sk is not null (type: boolean)
  Statistics: Num rows: 7200 Data size: 770400 Basic stats: 
COMPLETE Column stats: COMPLETE
  Filter Operator
predicate: hd_demo_sk is not null (type: boolean)
Statistics: Num rows: 7200 Data size: 57600 Basic stats: 
COMPLETE Column stats: COMPLETE
Select Operator
  expressions: hd_demo_sk (type: int), hd_dep_count (type: 
int)
  outputColumnNames: _col0, _col1
  Statistics: Num rows: 7200 Data size: 57600 Basic stats: 
COMPLETE Column stats: COMPLETE
  Reduce Output Operator
key expressions: _col0 (type: int)
sort order: +
Map-reduce partition columns: _col0 (type: int)
Statistics: Num rows: 7200 Data size: 57600 Basic 
stats: COMPLETE Column stats: COMPLETE
value expressions: _col1 (type: int)
Execution mode: vectorized
Map 2 
Map Operator Tree:
TableScan
  alias: customer_address
  filterExpr: ((ca_country = 'United States') and ca_address_sk 
is not null) (type: boolean)
  Statistics: Num rows: 80 Data size: 811903688 Basic 
stats: COMPLETE Column stats: COMPLETE
  Filter Operator
predicate: ((ca_country = 'United States') and 
ca_address_sk is not null) (type: boolean)
Statistics: Num rows: 40 Data size: 7480 Basic 
stats: COMPLETE Column stats: COMPLETE
Select Operator
  expressions: ca_address_sk (type: int), ca_state (type: 
string)
  outputColumnNames: _col0, _col1
  Statistics: Num rows: 40 Data size: 3600 Basic 
stats: COMPLETE Column stats: COMPLETE
  Reduce Output Operator
key expressions: _col0 (type: int)
sort order: +
Map-reduce partition columns: _col0 (type: int)
Statistics: Num rows: 40 Data size: 3600 Basic 
stats: COMPLETE Column stats: COMPLETE
value expressions: _col1 (type: string)
Execution mode: vectorized
Map 3 
Map Operator Tree:
TableScan
  alias: date_dim
  filterExpr: ((d_year = 2001) and d_date_sk is not null) 
(type: boolean)
  Statistics: Num rows: 73049 Data size: 81741831 Basic stats: 
COMPLETE Column stats: COMPLETE
  Filter Operator
predicate: ((d_year = 2001) and d_date_sk is not null) 
(type: boolean)
Statistics: Num rows: 652 Data size: 5216 Basic stats: 
COMPLETE Column stats: COMPLETE
Select Operator
  expressions: d_date_sk (type: int)
  outputColumnNames: _col0
  Statistics: Num rows: 652 Data size: 2608 Basic stats: 
COMPLETE Column stats: COMPLETE
  Reduce Output Operator
key expressions: _col0 (type: int)
sort order: +
Map-reduce partition columns: _col0 (type: int)
Statistics: Num rows: 652 Data size: 2608 Basic stats: 
COMPLETE Column stats: COMPLETE
  Select Operator
expressions: _col0 (type: int)
outputColumnNames: _col0
Statistics: Num rows: 652 Data size: 0 Basic stats: 
PARTIAL Column stats: COMPLETE
Group By Operator
  keys: _col0 (type: int)
  mode: hash
  outputColumnNames: _col0
  Statistics: Num rows: 652 Data size: 0 Basic stats: 
PARTIAL Column stats: COMPLETE
  Dynamic Partitioning Event Operator
Target 

[jira] [Updated] (HIVE-8448) Union All might not work due to the type conversion issue

2014-10-15 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-8448:
---
Status: Open  (was: Patch Available)

 Union All might not work due to the type conversion issue
 -

 Key: HIVE-8448
 URL: https://issues.apache.org/jira/browse/HIVE-8448
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Chaoyu Tang
Assignee: Yongzhi Chen
Priority: Minor
 Attachments: HIVE-8448.1.patch, HIVE-8448.2.patch







[jira] [Updated] (HIVE-8448) Union All might not work due to the type conversion issue

2014-10-15 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-8448:
---
Attachment: HIVE-8448.3.patch

need review

 Union All might not work due to the type conversion issue
 -

 Key: HIVE-8448
 URL: https://issues.apache.org/jira/browse/HIVE-8448
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Chaoyu Tang
Assignee: Yongzhi Chen
Priority: Minor
 Attachments: HIVE-8448.1.patch, HIVE-8448.2.patch, HIVE-8448.3.patch







[jira] [Updated] (HIVE-8448) Union All might not work due to the type conversion issue

2014-10-15 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-8448:
---
Status: Patch Available  (was: Open)

Fixed code style after review

 Union All might not work due to the type conversion issue
 -

 Key: HIVE-8448
 URL: https://issues.apache.org/jira/browse/HIVE-8448
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Chaoyu Tang
Assignee: Yongzhi Chen
Priority: Minor
 Attachments: HIVE-8448.1.patch, HIVE-8448.2.patch, HIVE-8448.3.patch







[jira] [Commented] (HIVE-8467) Table Copy - Background, incremental data load

2014-10-15 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172767#comment-14172767
 ] 

Julian Hyde commented on HIVE-8467:
---

I see this as a particular kind of materialized view. In general, a 
materialized view is a table whose contents are guaranteed to be the same as 
executing a particular query. In this case, that query is simply 'select * from 
t'.

We don't have materialized view support yet, but I have been working on 
lattices in Calcite (formerly known as Optiq; see OPTIQ-344), and there is a 
lot of interest in adding them to Hive. Each materialized tile in a lattice 
is a materialized view of the form 'select d1, d2, sum(m1), count(m2) from t 
group by d1, d2'.

So, let's talk about whether we could change the syntax to 'create materialized 
view'  and still deliver the functionality you need. Of course if the user 
enters anything other than 'select * from t order by k1, k2' they would get an 
error.

In terms of query planning, I strongly recommend that you build on the CBO work 
powered by Calcite. Let's suppose there is a table T and a copy C. After 
translating the query to a Calcite RelNode tree, there will be a 
TableAccessRel(T). After reading the metadata, we should create a 
TableAccessRel(C) and tell Calcite that it is equivalent.

That's all you need to do. Calcite will take it from there. Assuming the stats 
indicate that C is better (and they should, right, because the ORC 
representation will be smaller?) then the query will end up using C. But if, 
say, T has a partitioning scheme which is more suitable for a particular query, 
then Calcite will choose T.
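The "register C as equivalent and let the planner decide" step ultimately reduces to a cost comparison over equivalent alternatives. A minimal illustration of that final selection, with hypothetical stats and a size-only cost model (not Calcite's API):

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical final step of the plan choice: among semantically equivalent
// tables (T and its copy C), pick the one with the lowest estimated scan
// cost, here modeled only by bytes on disk.
public class EquivalentTableChooser {
    public static class TableStats {
        final String name;
        final long bytesOnDisk;   // an ORC copy is typically smaller

        public TableStats(String name, long bytesOnDisk) {
            this.name = name;
            this.bytesOnDisk = bytesOnDisk;
        }
    }

    // Return the name of the cheapest equivalent table.
    public static String cheapest(List<TableStats> equivalents) {
        return equivalents.stream()
                .min(Comparator.comparingLong(t -> t.bytesOnDisk))
                .map(t -> t.name)
                .orElseThrow(IllegalArgumentException::new);
    }
}
```

A real cost model would also weigh partitioning and sort order, which is why T can still win for some queries even when C is smaller.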

 Table Copy - Background, incremental data load
 --

 Key: HIVE-8467
 URL: https://issues.apache.org/jira/browse/HIVE-8467
 Project: Hive
  Issue Type: New Feature
Reporter: Rajat Venkatesh
 Attachments: Table Copies.pdf


 Traditionally, Hive and other tools in the Hadoop eco-system haven't required 
 a load stage. However, with recent developments, Hive is much more performant 
 when data is stored in specific formats like ORC, Parquet, Avro, etc. 
 Technologies like Presto also work much better with certain data formats. At 
 the same time, data is generated or obtained from 3rd parties in non-optimal 
 formats such as CSV, tab-delimited, or JSON. Many times, it's not an option to 
 change the data format at the source. We've found that users either use 
 sub-optimal formats or spend a large amount of effort creating and 
 maintaining copies. We want to propose a new construct - Table Copy - to help 
 “load” data into an optimal storage format.
 I am going to attach a PDF document with a lot more details, especially 
 addressing how this is different from bulk loads in relational DBs or 
 materialized views.
 Looking forward to hearing whether others see a similar need to formalize 
 conversion of data to different storage formats. If yes, are the details in 
 the PDF document a good start?





[jira] [Updated] (HIVE-8428) PCR doesn't remove filters involving casts

2014-10-15 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8428:
---
Status: Open  (was: Patch Available)

 PCR doesn't remove filters involving casts
 -

 Key: HIVE-8428
 URL: https://issues.apache.org/jira/browse/HIVE-8428
 Project: Hive
  Issue Type: Improvement
  Components: Logical Optimizer
Affects Versions: 0.13.0, 0.12.0, 0.11.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-8428.1.patch, HIVE-8428.2.patch, HIVE-8428.patch


 e.g.,
 select key,value from srcpart where hr = cast(11 as double);
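For reference, the reason this filter is removable: cast(11 as double) is a constant expression, so it can be folded and compared against each partition's hr value, letting the optimizer prune partitions and drop the filter. A simplified sketch of that fold-then-compare step (hypothetical helper names, not Hive's actual PcrExprProcFactory logic):

```java
// Simplified version of the partition-condition-remover decision for
//   where hr = cast(11 as double)
// Fold the constant cast, then test it against each partition's hr value;
// helper names here are hypothetical.
public class PcrSketch {
    // cast(intLiteral as double) folds to a plain double constant.
    public static double foldCastToDouble(int literal) {
        return (double) literal;
    }

    // True if a partition (hr stored as a string) satisfies hr = constant.
    public static boolean partitionMatches(String hrPartitionValue, double constant) {
        return Double.parseDouble(hrPartitionValue) == constant;
    }
}
```

Once every surviving partition is known to satisfy the folded predicate, the runtime filter is redundant and PCR can remove it.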





[jira] [Updated] (HIVE-8428) PCR doesn't remove filters involving casts

2014-10-15 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8428:
---
Status: Patch Available  (was: Open)

 PCR doesn't remove filters involving casts
 -

 Key: HIVE-8428
 URL: https://issues.apache.org/jira/browse/HIVE-8428
 Project: Hive
  Issue Type: Improvement
  Components: Logical Optimizer
Affects Versions: 0.13.0, 0.12.0, 0.11.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-8428.1.patch, HIVE-8428.2.patch, HIVE-8428.3.patch, 
 HIVE-8428.patch


 e.g.,
 select key,value from srcpart where hr = cast(11 as double);





[jira] [Updated] (HIVE-8428) PCR doesn't remove filters involving casts

2014-10-15 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8428:
---
Attachment: HIVE-8428.3.patch

 PCR doesn't remove filters involving casts
 -

 Key: HIVE-8428
 URL: https://issues.apache.org/jira/browse/HIVE-8428
 Project: Hive
  Issue Type: Improvement
  Components: Logical Optimizer
Affects Versions: 0.11.0, 0.12.0, 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-8428.1.patch, HIVE-8428.2.patch, HIVE-8428.3.patch, 
 HIVE-8428.patch


 e.g.,
 select key,value from srcpart where hr = cast(11 as double);





[jira] [Commented] (HIVE-8428) PCR doesn't remove filters involving casts

2014-10-15 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172781#comment-14172781
 ] 

Ashutosh Chauhan commented on HIVE-8428:


Failures are because of other bugs:
* orc_ppd_decimal : HIVE-8460
* orc_vectorization_ppd : HIVE-8470
* parallel : HIVE-8362


 PCR doesn't remove filters involving casts
 -

 Key: HIVE-8428
 URL: https://issues.apache.org/jira/browse/HIVE-8428
 Project: Hive
  Issue Type: Improvement
  Components: Logical Optimizer
Affects Versions: 0.11.0, 0.12.0, 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-8428.1.patch, HIVE-8428.2.patch, HIVE-8428.3.patch, 
 HIVE-8428.patch


 e.g.,
 select key,value from srcpart where hr = cast(11 as double);





[jira] [Commented] (HIVE-8387) add retry logic to ZooKeeperStorage in WebHCat

2014-10-15 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172785#comment-14172785
 ] 

Eugene Koifman commented on HIVE-8387:
--

https://reviews.apache.org/r/26771/

 add retry logic to ZooKeeperStorage in WebHCat
 --

 Key: HIVE-8387
 URL: https://issues.apache.org/jira/browse/HIVE-8387
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.13.1
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-8387.patch


 ZK interactions may run into transient errors that should be retried.  
 Currently there is no retry logic in WebHCat for this.





[jira] [Updated] (HIVE-8448) Union All might not work due to the type conversion issue

2014-10-15 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-8448:
---
Attachment: (was: HIVE-8448.3.patch)

 Union All might not work due to the type conversion issue
 -

 Key: HIVE-8448
 URL: https://issues.apache.org/jira/browse/HIVE-8448
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Chaoyu Tang
Assignee: Yongzhi Chen
Priority: Minor






[jira] [Updated] (HIVE-8448) Union All might not work due to the type conversion issue

2014-10-15 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-8448:
---
Attachment: (was: HIVE-8448.2.patch)

 Union All might not work due to the type conversion issue
 -

 Key: HIVE-8448
 URL: https://issues.apache.org/jira/browse/HIVE-8448
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Chaoyu Tang
Assignee: Yongzhi Chen
Priority: Minor

 create table t1 (val date);
 insert overwrite table t1 select '2014-10-10' from src limit 1;
 create table t2 (val varchar(10));
 insert overwrite table t2 select '2014-10-10' from src limit 1; 
 ==
 Query:
 select t.val from
 (select val from t1
 union all
 select val from t1
 union all
 select val from t2
 union all
 select val from t1) t;
 ==
 Will throw exception: 
 {code}
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible 
 types for union operator
   at 
 org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
   at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:65)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:193)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:443)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:133)
   ... 22 more
 {code}
 This is because the query parsing step uses getCommonClassForUnionAll while 
 execution uses getCommonClass; the two are not applied consistently for 
 unions. The latter does not support the implicit conversion from date to 
 string, which is what causes the failure.
 The fix for this particular union issue might be simple, but I noticed that 
 there are three versions of getCommonClass (getCommonClass, 
 getCommonClassForComparison, and getCommonClassForUnionAll) and wonder 
 whether they need to be cleaned up and refactored.
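 The parse-time/execution-time mismatch above can be sketched with a minimal, 
 language-neutral illustration. This is not Hive's actual code; the function 
 names only mirror getCommonClassForUnionAll and getCommonClass, and the 
 conversion rules are simplified assumptions for the example:
 {code}
# Illustrative sketch (not Hive source): two common-type resolvers that
# disagree, mirroring getCommonClassForUnionAll (used at parse time) vs.
# getCommonClass (used at execution time).

# Simplified assumption: these types can implicitly widen to string.
IMPLICIT_TO_STRING = {"date", "varchar", "char", "string"}

def common_class_for_union_all(a, b):
    """Parse-time resolver: allows date/varchar to widen to string."""
    if a == b:
        return a
    if a in IMPLICIT_TO_STRING and b in IMPLICIT_TO_STRING:
        return "string"
    return None

def common_class(a, b):
    """Execution-time resolver: no date -> string implicit conversion."""
    if a == b:
        return a
    if {a, b} <= {"varchar", "char", "string"}:
        return "string"
    return None  # incompatible: the real UnionOperator raises HiveException

# The repro unions a date column (t1.val) with a varchar(10) column (t2.val):
print(common_class_for_union_all("date", "varchar"))  # string -> plan accepted
print(common_class("date", "varchar"))  # None -> "Incompatible types for union operator"
 {code}
 Because the parse-time resolver accepts the query but the execution-time 
 resolver rejects the same type pair, the failure only surfaces when the 
 operators are initialized, as in the stack trace above.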



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8448) Union All might not work due to the type conversion issue

2014-10-15 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-8448:
---
Attachment: (was: HIVE-8448.1.patch)

 Union All might not work due to the type conversion issue
 -

 Key: HIVE-8448
 URL: https://issues.apache.org/jira/browse/HIVE-8448
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Chaoyu Tang
Assignee: Yongzhi Chen
Priority: Minor



[jira] [Updated] (HIVE-8448) Union All might not work due to the type conversion issue

2014-10-15 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-8448:
---
Attachment: HIVE-8448.4.patch

need code review

 Union All might not work due to the type conversion issue
 -

 Key: HIVE-8448
 URL: https://issues.apache.org/jira/browse/HIVE-8448
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Chaoyu Tang
Assignee: Yongzhi Chen
Priority: Minor
 Attachments: HIVE-8448.4.patch




[jira] [Updated] (HIVE-8448) Union All might not work due to the type conversion issue

2014-10-15 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-8448:
---
Status: Open  (was: Patch Available)

 Union All might not work due to the type conversion issue
 -

 Key: HIVE-8448
 URL: https://issues.apache.org/jira/browse/HIVE-8448
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Chaoyu Tang
Assignee: Yongzhi Chen
Priority: Minor


