[jira] [Assigned] (HIVE-5618) Hive local task fails to run when run from oozie in a secure cluster

2013-10-23 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar reassigned HIVE-5618:
-

Assignee: Prasad Mujumdar

 Hive local task fails to run when run from oozie in a secure cluster
 

 Key: HIVE-5618
 URL: https://issues.apache.org/jira/browse/HIVE-5618
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
 Environment: Hadoop 2.2.0
Reporter: Venkat Ranganathan
Assignee: Prasad Mujumdar

 When a hive query like the one below
 ==
 INSERT OVERWRITE DIRECTORY 'outdir' SELECT table1.*, table2.* FROM table1 
 JOIN table2 ON (table1.col = table2.col);
 ==
 is run from a hive action in Oozie in a secure cluster, the hive action fails 
 with the following stack trace
 ===
 org.apache.hadoop.ipc.RemoteException(java.io.IOException): Delegation Token 
 can be issued only with kerberos or web authentication
   at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getDelegationToken(FSNamesystem.java:5886)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getDelegationToken(NameNodeRpcServer.java:447)
   at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getDelegationToken(ClientNamenodeProtocolServerSideTranslatorPB.java:833)
   at 
 org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59648)
   at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2053)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)
   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
   at org.apache.hadoop.ipc.Client.call(Client.java:1300)
   at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
   at $Proxy10.getDelegationToken(Unknown Source)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
   at $Proxy10.getDelegationToken(Unknown Source)
   at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDelegationToken(ClientNamenodeProtocolTranslatorPB.java:805)
   at 
 org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:847)
   at 
 org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:1318)
   at 
 org.apache.hadoop.hive.shims.HadoopShimsSecure.createDelegationTokenFile(HadoopShimsSecure.java:535)
   at 
 org.apache.hadoop.hive.ql.exec.SecureCmdDoAs.init(SecureCmdDoAs.java:38)
   at 
 org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.execute(MapredLocalTask.java:238)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
   at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1437)
   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1215)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1043)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:446)
   at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:456)
   at 
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:737)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
   at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:312)
   at 

[jira] [Commented] (HIVE-5514) webhcat_server.sh foreground option does not work as expected

2013-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802630#comment-13802630
 ] 

Hudson commented on HIVE-5514:
--

FAILURE: Integrated in Hive-trunk-hadoop2 #517 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/517/])
HIVE-5514 - webhcat_server.sh foreground option does not work as expected 
(brock: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1534662)
* /hive/trunk/hcatalog/webhcat/svr/src/main/bin/webhcat_server.sh


 webhcat_server.sh foreground option does not work as expected
 -

 Key: HIVE-5514
 URL: https://issues.apache.org/jira/browse/HIVE-5514
 Project: Hive
  Issue Type: Improvement
Reporter: Brock Noland
Assignee: Brock Noland
Priority: Minor
 Fix For: 0.13.0

 Attachments: HIVE-5514.patch


 When the webhcat script webhcat_server.sh is executed with the foreground 
 option, it calls hadoop without using exec. When you kill the 
 webhcat_server.sh process, it does not kill the real webhcat server.
 Just need to add the word exec below in webhcat_server.sh:
 {noformat}
 function foreground_webhcat() {
 exec $start_cmd
 }
 {noformat}
 NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5600) Fix PTest2 Maven support

2013-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802631#comment-13802631
 ] 

Hudson commented on HIVE-5600:
--

FAILURE: Integrated in Hive-trunk-hadoop2 #517 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/517/])
HIVE-5600 - Fix PTest2 Maven support (brock: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1534648)
* /hive/trunk/testutils/ptest2/src/main/resources/batch-exec.vm
* /hive/trunk/testutils/ptest2/src/main/resources/smart-apply-patch.sh
* /hive/trunk/testutils/ptest2/src/main/resources/source-prep.vm


 Fix PTest2 Maven support
 

 Key: HIVE-5600
 URL: https://issues.apache.org/jira/browse/HIVE-5600
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Brock Noland
 Fix For: 0.13.0

 Attachments: HIVE-5600.patch


 At present we don't download all the dependencies required in the source prep 
 phase; therefore tests fail when the Maven repo has been cleared.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5441) Async query execution doesn't return resultset status

2013-10-23 Thread Prasad Mujumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802645#comment-13802645
 ] 

Prasad Mujumdar commented on HIVE-5441:
---

That's correct. The existing logic for checking fetch task is not changed as 
part of this patch.

 Async query execution doesn't return resultset status
 -

 Key: HIVE-5441
 URL: https://issues.apache.org/jira/browse/HIVE-5441
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Attachments: HIVE-5441.1.patch, HIVE-5441.3.patch


 For synchronous statement execution (SQL as well as metadata and other 
 operations), the operation handle includes a boolean flag indicating whether 
 the statement returns a resultset. In case of async execution, that flag is 
 always set to false.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Moved] (HIVE-5621) Target tar does not exist in the project hcatalog.

2013-10-23 Thread Andreas Veithen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Veithen moved ABDERA-353 to HIVE-5621:
--

Fix Version/s: (was: 0.4.0)
Affects Version/s: (was: 1.1.2)
 Workflow: no-reopen-closed, patch-avail  (was: classic default 
workflow)
  Key: HIVE-5621  (was: ABDERA-353)
  Project: Hive  (was: Abdera)

 Target tar does not exist in the project hcatalog.
 --

 Key: HIVE-5621
 URL: https://issues.apache.org/jira/browse/HIVE-5621
 Project: Hive
  Issue Type: Bug
Reporter: tony

 Buildfile: /home/murkuser/hcatalog-src-0.5.0-incubating/build.xml
 BUILD FAILED
 Target tar does not exist in the project hcatalog. 
 Total time: 0 seconds



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5403) Move loading of filesystem, ugi, metastore client to hive session

2013-10-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802656#comment-13802656
 ] 

Hive QA commented on HIVE-5403:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12609737/HIVE-5403.4.patch

{color:green}SUCCESS:{color} +1 4430 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1201/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1201/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

 Move loading of filesystem, ugi, metastore client to hive session
 -

 Key: HIVE-5403
 URL: https://issues.apache.org/jira/browse/HIVE-5403
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-5403.1.patch, HIVE-5403.2.patch, HIVE-5403.3.patch, 
 HIVE-5403.4.patch


 As part of HIVE-5184, the metastore connection and filesystem loading were 
 done as part of the Tez session so as to speed up query times while paying a 
 cost at startup. We can do this more generally in Hive to apply to both the 
 MapReduce and Tez sides of things.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (HIVE-5622) Add minHS2 for HiveServer2 testing

2013-10-23 Thread Prasad Mujumdar (JIRA)
Prasad Mujumdar created HIVE-5622:
-

 Summary: Add minHS2 for HiveServer2 testing
 Key: HIVE-5622
 URL: https://issues.apache.org/jira/browse/HIVE-5622
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2, Testing Infrastructure, Tests
Affects Versions: 0.12.0, 0.11.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar






--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HIVE-5351) Secure-Socket-Layer (SSL) support for HiveServer2

2013-10-23 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-5351:
--

Attachment: HIVE-5351.1.patch

 Secure-Socket-Layer (SSL) support for HiveServer2
 -

 Key: HIVE-5351
 URL: https://issues.apache.org/jira/browse/HIVE-5351
 Project: Hive
  Issue Type: Improvement
  Components: Authorization, HiveServer2, JDBC
Affects Versions: 0.11.0, 0.12.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Attachments: HIVE-5351.1.patch


 HiveServer2 and JDBC driver should support encrypted communication using SSL



--
This message was sent by Atlassian JIRA
(v6.1#6144)


Review Request 14870: HIVE-5351: Secure-Socket-Layer (SSL) support for HiveServer2

2013-10-23 Thread Prasad Mujumdar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14870/
---

Review request for hive, Brock Noland and Thejas Nair.


Bugs: HIVE-5351
https://issues.apache.org/jira/browse/HIVE-5351


Repository: hive-git


Description
---

Add support for encrypted communication with Plain SASL over the binary thrift 
transport.
 - Optional thrift SSL transport on the server side if configured.
 - Optional thrift SSL transport for the JDBC client with a configurable trust store.
 - Added a miniHS2 class for running a HiveServer2 instance for testing.
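For context, a rough sketch of how a JDBC client might request the SSL transport described above. The URL parameter names here (ssl, sslTrustStore, trustStorePassword) are assumptions based on this description, not confirmed details of the attached patch:

```java
public class SslUrlSketch {
    // Hypothetical: builds a HiveServer2 JDBC URL requesting the SSL
    // transport with a client-side trust store. The parameter names are
    // assumptions; verify them against the committed patch.
    static String sslUrl(String host, int port, String trustStore, String password) {
        return "jdbc:hive2://" + host + ":" + port + "/default"
                + ";ssl=true"
                + ";sslTrustStore=" + trustStore
                + ";trustStorePassword=" + password;
    }

    public static void main(String[] args) {
        // The resulting URL would then be passed to DriverManager.getConnection(...).
        System.out.println(sslUrl("localhost", 10000, "/path/to/truststore.jks", "changeit"));
    }
}
```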


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java d0895e1 
  data/files/keystore.jks PRE-CREATION 
  data/files/truststore.jks PRE-CREATION 
  eclipse-templates/TestJdbcMiniHS2.launchtemplate PRE-CREATION 
  jdbc/ivy.xml b9d0cea 
  jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java f155686 
  jdbc/src/test/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java PRE-CREATION 
  jdbc/src/test/org/apache/hive/jdbc/TestSSL.java PRE-CREATION 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
24b1832 
  service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java 5a66a6c 
  
service/src/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java 
9c8f5c1 
  service/src/test/org/apache/hive/service/miniHS2/AbstarctHiveService.java 
PRE-CREATION 
  service/src/test/org/apache/hive/service/miniHS2/MiniHS2.java PRE-CREATION 
  service/src/test/org/apache/hive/service/miniHS2/TestHiveServer2.java 
PRE-CREATION 
  shims/src/0.20/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java f57f09e 

Diff: https://reviews.apache.org/r/14870/diff/


Testing
---

- Basic HiveServer2 test cases with miniHS2
- Added multiple test cases for SSL transport


Thanks,

Prasad Mujumdar



[jira] [Updated] (HIVE-5351) Secure-Socket-Layer (SSL) support for HiveServer2

2013-10-23 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-5351:
--

Status: Patch Available  (was: Open)

Patch attached

 Secure-Socket-Layer (SSL) support for HiveServer2
 -

 Key: HIVE-5351
 URL: https://issues.apache.org/jira/browse/HIVE-5351
 Project: Hive
  Issue Type: Improvement
  Components: Authorization, HiveServer2, JDBC
Affects Versions: 0.12.0, 0.11.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Attachments: HIVE-5351.1.patch


 HiveServer2 and JDBC driver should support encrypted communication using SSL



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5351) Secure-Socket-Layer (SSL) support for HiveServer2

2013-10-23 Thread Prasad Mujumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802663#comment-13802663
 ] 

Prasad Mujumdar commented on HIVE-5351:
---

The patch HIVE-5351.1.patch includes the miniHS2 test framework as well.

 Secure-Socket-Layer (SSL) support for HiveServer2
 -

 Key: HIVE-5351
 URL: https://issues.apache.org/jira/browse/HIVE-5351
 Project: Hive
  Issue Type: Improvement
  Components: Authorization, HiveServer2, JDBC
Affects Versions: 0.11.0, 0.12.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Attachments: HIVE-5351.1.patch


 HiveServer2 and JDBC driver should support encrypted communication using SSL



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (HIVE-5623) ORC accessing array column that's empty will fail with java out of bound exception

2013-10-23 Thread Eric Chu (JIRA)
Eric Chu created HIVE-5623:
--

 Summary: ORC accessing array column that's empty will fail with 
java out of bound exception
 Key: HIVE-5623
 URL: https://issues.apache.org/jira/browse/HIVE-5623
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.11.0
Reporter: Eric Chu
Priority: Critical


 In our ORC tests we saw that queries that work on RCFile failed on the 
 corresponding ORC version with a Java IndexOutOfBoundsException in 
 OrcStruct.java. The queries failed because the table has an array-type column 
 and there are rows with an empty array. We noticed that the getList(Object 
 list, int i) method in OrcStruct.java simply returns the i-th element from 
 the list without checking whether the list is null or whether i is within the 
 valid range. After fixing that, the queries run fine. The fix is really 
 simple, but maybe there are other similar cases that need to be handled.
 The fix is to check if listObj is null and if i falls within range:

public Object getListElement(Object listObj, int i) {
  if (listObj == null) {
    return null;
  }
  List list = ((List) listObj);
  if (i < 0 || i >= list.size()) {
    return null;
  }
  return list.get(i);
}





--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5577) Remove TestNegativeCliDriver script_broken_pipe1

2013-10-23 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802715#comment-13802715
 ] 

Navis commented on HIVE-5577:
-

+1

 Remove TestNegativeCliDriver script_broken_pipe1
 

 Key: HIVE-5577
 URL: https://issues.apache.org/jira/browse/HIVE-5577
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Brock Noland

 TestNegativeCliDriver script_broken_pipe1 is extremely flaky and not a 
 terribly important test. Let's remove it.
 Failures
 https://builds.apache.org/user/brock/my-views/view/hive/job/Hive-trunk-hadoop1-ptest/206/testReport/org.apache.hadoop.hive.cli/TestNegativeCliDriver/testNegativeCliDriver_script_broken_pipe1/
 https://builds.apache.org/user/brock/my-views/view/hive/job/Hive-trunk-hadoop1-ptest/206/testReport/junit/org.apache.hadoop.hive.cli/TestNegativeCliDriver/testNegativeCliDriver_script_broken_pipe1/
 https://builds.apache.org/user/brock/my-views/view/hive/job/Hive-trunk-hadoop1-ptest/204/testReport/org.apache.hadoop.hive.cli/TestNegativeCliDriver/testNegativeCliDriver_script_broken_pipe1/



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-2747) UNION ALL with subquery which selects NULL and performs group by fails

2013-10-23 Thread jeff little (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802729#comment-13802729
 ] 

jeff little commented on HIVE-2747:
---

Hi, Kevin Wilfong.
You can try the HQL: from (select key, value, cast(count(1) as int) count 
from src group by key, value union all select NULL as key, value, cast(count(1) 
as int) count from src group by value) a select count;. You should modify the 
data type of 'count'; otherwise the data type of 'count' in the intermediate 
result is the void type, which will cause a java.lang.NullPointerException. In 
addition, if the HQL statement contains a UNION ALL operator, you should use 
'AS' for the column alias.

 UNION ALL with subquery which selects NULL and performs group by fails
 --

 Key: HIVE-2747
 URL: https://issues.apache.org/jira/browse/HIVE-2747
 Project: Hive
  Issue Type: Bug
Reporter: Kevin Wilfong

 Queries like the following
 from (select key, value, count(1) as count from src group by key, value union 
 all select NULL as key, value, count(1) as count from src group by value) a 
 select count(*);
 fail with the exception
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector.toString(StructObjectInspector.java:60)
   at java.lang.String.valueOf(String.java:2826)
   at java.lang.StringBuilder.append(StringBuilder.java:115)
   at 
 org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:110)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:427)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
   at org.apache.hadoop.hive.ql.exec.ExecMapper.configure(ExecMapper.java:98)
   ... 18 more
 This should at least provide a more informative error message if not work.
 It works without the group by.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-4994) Add WebHCat (Templeton) documentation to Hive wiki

2013-10-23 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802757#comment-13802757
 ] 

Lefty Leverenz commented on HIVE-4994:
--

All done now.  Dynamic partitions, error logs, and storage formats are linked 
to the Hive docs, and various Hive docs are linked to each other and to the 
HCatalog/WebHCat docs.

Any further changes can be considered improvements.  The doc conversion is 
finished.  Whew.

 Add WebHCat (Templeton) documentation to Hive wiki
 --

 Key: HIVE-4994
 URL: https://issues.apache.org/jira/browse/HIVE-4994
 Project: Hive
  Issue Type: Bug
  Components: Documentation
Affects Versions: 0.11.0
Reporter: Lefty Leverenz
Assignee: Lefty Leverenz

 WebHCat (Templeton) documentation in the Apache incubator had XML source 
 files which generated HTML and PDF output files.  Now that HCatalog and 
 WebHCat are part of the Hive project, all the WebHCat documents need to be 
 added to the Hive wiki.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5547) webhcat pig job submission should ship hive tar if -usehcatalog is specified

2013-10-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802763#comment-13802763
 ] 

Hive QA commented on HIVE-5547:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12609763/HIVE-5547.2.patch

{color:green}SUCCESS:{color} +1 4430 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1204/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1204/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

 webhcat pig job submission should ship hive tar if -usehcatalog is specified
 

 Key: HIVE-5547
 URL: https://issues.apache.org/jira/browse/HIVE-5547
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.12.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-5547.2.patch, HIVE-5547.patch


 Currently, when a Pig job is submitted through WebHCat and the Pig script 
 uses HCatalog, Hive must be installed on the node in the cluster which ends 
 up executing the job.  For large clusters this is a manageability issue, so 
 we should use the DistributedCache to ship the Hive tar file to the target 
 node as part of job submission.
 TestPig_11 in hcatalog/src/test/e2e/templeton/tests/jobsubmission.conf has 
 the test case for this



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5506) Hive SPLIT function does not return array correctly

2013-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802826#comment-13802826
 ] 

Hudson commented on HIVE-5506:
--

SUCCESS: Integrated in Hive-trunk-hadoop1-ptest #214 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/214/])
HIVE-5506 : Hive SPLIT function does not return array correctly (Vikram Dixit 
via Ashutosh Chauhan) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1534775)
* /hive/trunk/data/files/input.txt
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSplit.java
* /hive/trunk/ql/src/test/queries/clientpositive/split.q
* /hive/trunk/ql/src/test/results/clientpositive/split.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udf_split.q.out


 Hive SPLIT function does not return array correctly
 ---

 Key: HIVE-5506
 URL: https://issues.apache.org/jira/browse/HIVE-5506
 Project: Hive
  Issue Type: Bug
  Components: SQL, UDF
Affects Versions: 0.9.0, 0.10.0, 0.11.0
 Environment: Hive
Reporter: John Omernik
Assignee: Vikram Dixit K
 Fix For: 0.13.0

 Attachments: HIVE-5506.1.patch, HIVE-5506.2.patch


 Hello all, I think I have outlined a bug in the Hive split function:
 Summary: When calling split on a string of data, it will only return all 
 array items if the last array item has a value. For example, if I have a 
 string of text delimited by tab with 7 columns, and the first four are 
 filled but the last three are blank, split will only return a 4-position 
 array. If any number of middle columns are empty, but the last item still 
 has a value, then it will return the proper number of columns.  This was 
 tested in Hive 0.9 and Hive 0.11. 
 Data:
 (Note: \t represents a tab char (\x09); the line endings should be \n (UNIX 
 style), not sure what email will do to them).  Basically my data is 7 lines 
 of data with the first 7 letters separated by tab.  On some lines I've left 
 out certain letters, but kept the number of tabs exactly the same.  
 input.txt
 a\tb\tc\td\te\tf\tg
 a\tb\tc\td\te\t\tg
 a\tb\t\td\t\tf\tg
 \t\t\td\te\tf\tg
 a\tb\tc\td\t\t\t
 a\t\t\t\te\tf\tg
 a\t\t\td\t\t\tg
 I then created a table with one column from that data:
 DROP TABLE tmp_jo_tab_test;
 CREATE table tmp_jo_tab_test (message_line STRING)
 STORED AS TEXTFILE;
  
 LOAD DATA LOCAL INPATH '/tmp/input.txt'
 OVERWRITE INTO TABLE tmp_jo_tab_test;
 Ok just to validate I created a python counting script:
 #!/usr/bin/python
  
 import sys
  
 for line in sys.stdin:
     line = line[0:-1]
     out = line.split("\t")
     print len(out)
 The output there is : 
 $ cat input.txt |./cnt_tabs.py
 7
 7
 7
 7
 7
 7
 7
 Based on that information, split on tab should return me 7 for each line as 
 well:
 hive -e "select size(split(message_line, '\\t')) from tmp_jo_tab_test;"
  
 7
 7
 7
 7
 4
 7
 7
 However it does not.  It would appear that the line where only the first four 
 letters are filled in (and blank is passed in for the last three) only 
 returns 4 splits, where there should technically be 7: 4 for the letters 
 included, and three blanks.  
 a\tb\tc\td\t\t\t 
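If the reporter's reading is right, this matches Java's default String.split behavior, which discards trailing empty strings. A minimal sketch (not from the original report) showing the contrast with an explicit negative limit:

```java
public class SplitTrailingDemo {
    public static void main(String[] args) {
        // First four fields filled, last three blank, as in the failing row.
        String line = "a\tb\tc\td\t\t\t";

        // Default split(regex) drops trailing empty strings: 4 elements.
        System.out.println(line.split("\t").length);

        // A negative limit keeps them: all 7 positions are returned.
        System.out.println(line.split("\t", -1).length);
    }
}
```

If that is indeed the cause, passing a negative limit to the split call inside GenericUDFSplit would preserve the empty trailing fields.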



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HIVE-3952) merge map-job followed by map-reduce job

2013-10-23 Thread Tianyuan Fu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tianyuan Fu updated HIVE-3952:
--

Description: 
Consider the query like:

select count(*)FROM
( select idOne, idTwo, value FROM
  bigTable   
  JOIN  
  
  smallTableOne on (bigTable.idOne = smallTableOne.idOne)   

  ) firstjoin   
  
JOIN
  
smallTableTwo on (firstjoin.idTwo = smallTableTwo.idTwo);


where smallTableOne and smallTableTwo are smaller than 
hive.auto.convert.join.noconditionaltask.size and
hive.auto.convert.join.noconditionaltask is set to true.

The joins are collapsed into mapjoins, and it leads to a map-only job
(for the map-joins) followed by a map-reduce job (for the group by).
Ideally, the map-only job should be merged with the following map-reduce job.

  was:
Consider the query like:

select count(*) FROM
( select idOne, idTwo, value FROM
  bigTable   
  JOIN  
  
  smallTableOne on (bigTable.idOne = smallTableOne.idOne)   

  ) firstjoin   
  
JOIN
  
smallTableTwo on (firstjoin.idTwo = smallTableTwo.idTwo);


where smallTableOne and smallTableTwo are smaller than 
hive.auto.convert.join.noconditionaltask.size and
hive.auto.convert.join.noconditionaltask is set to true.

The joins are collapsed into mapjoins, and it leads to a map-only job
(for the map-joins) followed by a map-reduce job (for the group by).
Ideally, the map-only job should be merged with the following map-reduce job.


 merge map-job followed by map-reduce job
 

 Key: HIVE-3952
 URL: https://issues.apache.org/jira/browse/HIVE-3952
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Vinod Kumar Vavilapalli
 Fix For: 0.11.0

 Attachments: hive.3952.1.patch, HIVE-3952-20130226.txt, 
 HIVE-3952-20130227.1.txt, HIVE-3952-20130301.txt, HIVE-3952-20130421.txt, 
 HIVE-3952-20130424.txt, HIVE-3952-20130428-branch-0.11-bugfix.txt, 
 HIVE-3952-20130428-branch-0.11.txt, HIVE-3952-20130428-branch-0.11-v2.txt


 Consider the query like:
 select count(*)FROM
 ( select idOne, idTwo, value FROM
   bigTable   
   JOIN
 
   smallTableOne on (bigTable.idOne = smallTableOne.idOne) 
   
   ) firstjoin 
 
 JOIN  
 
 smallTableTwo on (firstjoin.idTwo = smallTableTwo.idTwo);
 where smallTableOne and smallTableTwo are smaller than 
 hive.auto.convert.join.noconditionaltask.size and
 hive.auto.convert.join.noconditionaltask is set to true.
 The joins are collapsed into mapjoins, and it leads to a map-only job
 (for the map-joins) followed by a map-reduce job (for the group by).
 Ideally, the map-only job should be merged with the following map-reduce job.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5276) Skip useless string encoding stage for hiveserver2

2013-10-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802871#comment-13802871
 ] 

Hive QA commented on HIVE-5276:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12609770/HIVE-5276.4.patch.txt

{color:green}SUCCESS:{color} +1 4430 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1205/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1205/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

 Skip useless string encoding stage for hiveserver2
 --

 Key: HIVE-5276
 URL: https://issues.apache.org/jira/browse/HIVE-5276
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-5276.3.patch.txt, HIVE-5276.4.patch.txt


 Currently hiveserver2 acquires rows in the string format used for CLI 
 output, then converts them into rows again and finally converts them to the 
 target format. This is inefficient and memory-consuming. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HIVE-5403) Move loading of filesystem, ugi, metastore client to hive session

2013-10-23 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-5403:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Vikram!

 Move loading of filesystem, ugi, metastore client to hive session
 -

 Key: HIVE-5403
 URL: https://issues.apache.org/jira/browse/HIVE-5403
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Fix For: 0.13.0

 Attachments: HIVE-5403.1.patch, HIVE-5403.2.patch, HIVE-5403.3.patch, 
 HIVE-5403.4.patch


 As part of HIVE-5184, the metastore connection and filesystem loading were 
 done as part of the Tez session so as to speed up query times while paying a 
 cost at startup. We can do this more generally in Hive to apply to both the 
 MapReduce and Tez sides of things.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HIVE-784) Support uncorrelated subqueries in the WHERE clause

2013-10-23 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-784:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Harish!

 Support uncorrelated subqueries in the WHERE clause
 ---

 Key: HIVE-784
 URL: https://issues.apache.org/jira/browse/HIVE-784
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Ning Zhang
Assignee: Harish Butani
 Fix For: 0.13.0

 Attachments: D13443.1.patch, D13443.2.patch, HIVE-784.1.patch.txt, 
 HIVE-784.2.patch, SubQuerySpec.pdf, tpchQueriesUsingSubQueryClauses.sql


 Hive currently only supports views in the FROM clause; some Facebook use cases 
 suggest that Hive should support subqueries such as those connected by 
 IN/EXISTS in the WHERE clause. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HIVE-5605) AddResourceOperation, DeleteResourceOperation, DfsOperation, SetOperation should be removed from org.apache.hive.service.cli.operation

2013-10-23 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-5605:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thank you for the contribution Vaibhav! I have committed this to trunk.

 AddResourceOperation, DeleteResourceOperation, DfsOperation, SetOperation 
 should be removed from org.apache.hive.service.cli.operation 
 ---

 Key: HIVE-5605
 URL: https://issues.apache.org/jira/browse/HIVE-5605
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
Priority: Minor
 Fix For: 0.13.0

 Attachments: HIVE-5605.1.patch


 These classes are not used as the processing for Add, Delete, DFS and Set 
 commands is done by HiveCommandOperation



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HIVE-5350) Cleanup exception handling around parallel orderby

2013-10-23 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-5350:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Thank you very much for the contribution Navis! I have committed this to trunk!

 Cleanup exception handling around parallel orderby
 --

 Key: HIVE-5350
 URL: https://issues.apache.org/jira/browse/HIVE-5350
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Navis
Priority: Minor
 Fix For: 0.13.0

 Attachments: D13617.1.patch


 I think we should log the message to the console and the full exception to 
 the log:
 ExecDriver:
 {noformat}
 try {
   handleSampling(driverContext, mWork, job, conf);
   job.setPartitionerClass(HiveTotalOrderPartitioner.class);
 } catch (Exception e) {
   console.printInfo("Not enough sampling data.. Rolling back to 
 single reducer task");
   rWork.setNumReduceTasks(1);
   job.setNumReduceTasks(1);
 }
 {noformat}
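To make the suggested split concrete, here is a minimal stand-alone sketch (plain Java, not the actual ExecDriver code; SamplingFallback and its StringBuilder "console"/"log" stand-ins are hypothetical): only a short user-facing message goes to the console, while the full exception and stack trace go to the log before falling back to a single reducer.

```java
// Hypothetical sketch of the logging pattern discussed above.
// "console" and "log" are stand-ins for the session console and task log.
public class SamplingFallback {
    static final StringBuilder console = new StringBuilder();
    static final StringBuilder log = new StringBuilder();

    static void handleSampling() throws Exception {
        // Stand-in for the real sampling step; here it always fails.
        throw new IllegalStateException("not enough sample rows");
    }

    static int configureReducers() {
        try {
            handleSampling();
            return 32; // parallel order-by would keep many reducers
        } catch (Exception e) {
            // Short, user-facing message only:
            console.append("Not enough sampling data.. Rolling back to single reducer task\n");
            // Full exception (class, message, stack) goes to the log:
            log.append(e.toString()).append('\n');
            for (StackTraceElement el : e.getStackTrace()) {
                log.append("\tat ").append(el).append('\n');
            }
            return 1; // fall back to a single reducer
        }
    }

    public static void main(String[] args) {
        System.out.println("reducers=" + configureReducers());
    }
}
```

The console output stays readable while nothing about the failure is lost, since the log keeps the complete trace.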



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HIVE-5599) Change default logging level to INFO

2013-10-23 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-5599:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Thank you for the review Thejas! I have committed this to trunk.

 Change default logging level to INFO
 

 Key: HIVE-5599
 URL: https://issues.apache.org/jira/browse/HIVE-5599
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Brock Noland
Assignee: Brock Noland
 Fix For: 0.13.0

 Attachments: HIVE-5599.patch


 The default logging level is warn:
 https://github.com/apache/hive/blob/trunk/common/src/java/conf/hive-log4j.properties#L19
 but hive logs lots of good information at INFO level. Additionally most 
 hadoop projects log at INFO by default. Let's change the logging level to 
 INFO by default.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HIVE-5616) fix saveVersion.sh to work on mac

2013-10-23 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-5616:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thank you for the contribution Owen! I have committed this to branch.

 fix saveVersion.sh to work on mac
 -

 Key: HIVE-5616
 URL: https://issues.apache.org/jira/browse/HIVE-5616
 Project: Hive
  Issue Type: Sub-task
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: h-5616.patch


 There is no reason to not support builds on macs.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (HIVE-5624) Remove ant artifacts from project

2013-10-23 Thread Brock Noland (JIRA)
Brock Noland created HIVE-5624:
--

 Summary: Remove ant artifacts from project
 Key: HIVE-5624
 URL: https://issues.apache.org/jira/browse/HIVE-5624
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland


Before marking HIVE-5107 resolved we should remove the build.xml files and 
other ant artifacts.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HIVE-5430) Refactor VectorizationContext and handle NOT expression with nulls.

2013-10-23 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-5430:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Jitendra!

 Refactor VectorizationContext and handle NOT expression with nulls.
 ---

 Key: HIVE-5430
 URL: https://issues.apache.org/jira/browse/HIVE-5430
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Fix For: 0.13.0

 Attachments: HIVE-5430.1.patch, HIVE-5430.2.patch, HIVE-5430.3.patch, 
 HIVE-5430.4.patch, HIVE-5430.5.patch, HIVE-5430.6.patch


 NOT expression doesn't handle nulls correctly.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


Please add Harsh J as a contributor

2013-10-23 Thread Brock Noland
So I can attribute a patch to him.

Thanks!
Brock


[jira] [Updated] (HIVE-5454) HCatalog runs a partition listing with an empty filter

2013-10-23 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-5454:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
 Assignee: Brock Noland
   Status: Resolved  (was: Patch Available)

Thank you for the contribution Harsh! I have committed this to trunk and will 
attribute it to you when you are added as a contributor.

Note: I am assigning it to myself in the interim so I don't forget.

 HCatalog runs a partition listing with an empty filter
 --

 Key: HIVE-5454
 URL: https://issues.apache.org/jira/browse/HIVE-5454
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.12.0
Reporter: Harsh J
Assignee: Brock Noland
 Fix For: 0.13.0

 Attachments: D13317.1.patch, D13317.2.patch, D13317.3.patch


 This is a regression caused by HCATALOG-527, wherein the HCatLoader's way of 
 calling HCatInputFormat causes it to do 2x partition lookups - once without 
 the filter, and then again with the filter.
 For tables with a large number of partitions (10, say), the non-filter lookup 
 proves fatal both to the client (Read timed out errors from 
 ThriftMetaStoreClient, because the server doesn't respond) and to the server 
 (too much data loaded into the cache, OOME, or slowdown).
 The fix would be to use a single call that also passes the partition filter 
 information, as was the case in the HCatalog 0.4 sources before HCATALOG-527.
 (HCatalog-release-wise, this affects all 0.5.x users)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HIVE-5560) Hive produces incorrect results on multi-distinct query

2013-10-23 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-5560:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Navis!

 Hive produces incorrect results on multi-distinct query
 ---

 Key: HIVE-5560
 URL: https://issues.apache.org/jira/browse/HIVE-5560
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0, 0.12.0
Reporter: Vikram Dixit K
Assignee: Navis
 Fix For: 0.13.0

 Attachments: D13599.1.patch, D13599.2.patch


 {noformat}
 select key, count(distinct key) + count(distinct value) from src tablesample 
 (10 ROWS) group by key
 POSTHOOK: type: QUERY
 POSTHOOK: Input: default@src
  A masked pattern was here 
 165 1
 val_165 1
 238 1
 val_238 1
 255 1
 val_255 1
 27  1
 val_27  1
 278 1
 val_278 1
 311 1
 val_311 1
 409 1
 val_409 1
 484 1
 val_484 1
 86  1
 val_86  1
 98  1
 val_98  1
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


Re: Please add Harsh J as a contributor

2013-10-23 Thread Ashutosh Chauhan
Done.


On Wed, Oct 23, 2013 at 8:15 AM, Brock Noland br...@cloudera.com wrote:

 So I can attribute a patch to him.

 Thanks!
 Brock



[jira] [Commented] (HIVE-5403) Move loading of filesystem, ugi, metastore client to hive session

2013-10-23 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13802959#comment-13802959
 ] 

Brock Noland commented on HIVE-5403:


Hey guys, thank you very much for your work on this!  I know this is already 
committed, but the following is incorrect:
{noformat}
+// session creation should fail since the schema didn't get created
+try {
+  SessionState.start(new CliSessionState(hiveConf));
+} catch (RuntimeException re) {
+  assertTrue(re.getCause().getCause() instanceof MetaException);
+}
{noformat}
It should be

{noformat}
+// session creation should fail since the schema didn't get created
+try {
+  SessionState.start(new CliSessionState(hiveConf));
fail("Expected exception");
+} catch (RuntimeException re) {
+  assertTrue(re.getCause().getCause() instanceof MetaException);
+}
{noformat}

Can you do a follow up jira?
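To make the point concrete, here is a minimal stand-alone illustration (plain Java, no JUnit; ExpectedExceptionDemo and its method names are hypothetical): without a fail() immediately after the statement that is expected to throw, the check passes vacuously when no exception is thrown at all.

```java
// Minimal demo of the expected-exception idiom discussed above.
// checkWithoutFail "passes" even when nothing throws; checkWithFail does not.
public class ExpectedExceptionDemo {
    static void mightThrow(boolean doThrow) {
        if (doThrow) throw new RuntimeException("boom");
    }

    static boolean checkWithoutFail(boolean doThrow) {
        try {
            mightThrow(doThrow);
            // BUG: no fail() here -- if mightThrow() returns normally,
            // the "test" still reports success.
        } catch (RuntimeException re) {
            // would assert on re.getCause() here
        }
        return true; // vacuously "passes" either way
    }

    static boolean checkWithFail(boolean doThrow) {
        try {
            mightThrow(doThrow);
            return false; // stands in for JUnit's fail("Expected exception")
        } catch (RuntimeException re) {
            return true; // exception observed, as expected
        }
    }

    public static void main(String[] args) {
        System.out.println(checkWithoutFail(false)); // silently wrong: reports success
        System.out.println(checkWithFail(false));    // correctly flags the missing exception
    }
}
```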

 Move loading of filesystem, ugi, metastore client to hive session
 -

 Key: HIVE-5403
 URL: https://issues.apache.org/jira/browse/HIVE-5403
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Fix For: 0.13.0

 Attachments: HIVE-5403.1.patch, HIVE-5403.2.patch, HIVE-5403.3.patch, 
 HIVE-5403.4.patch


 As part of HIVE-5184, the metastore connection and filesystem loading were done 
 as part of the tez session so as to speed up query times while paying a cost 
 at startup. We can do this more generally in hive so that it applies to both the 
 mapreduce and tez sides of things.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HIVE-5454) HCatalog runs a partition listing with an empty filter

2013-10-23 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-5454:
---

Assignee: Harsh J  (was: Brock Noland)

 HCatalog runs a partition listing with an empty filter
 --

 Key: HIVE-5454
 URL: https://issues.apache.org/jira/browse/HIVE-5454
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.12.0
Reporter: Harsh J
Assignee: Harsh J
 Fix For: 0.13.0

 Attachments: D13317.1.patch, D13317.2.patch, D13317.3.patch


 This is a regression caused by HCATALOG-527, wherein the HCatLoader's way of 
 calling HCatInputFormat causes it to do 2x partition lookups - once without 
 the filter, and then again with the filter.
 For tables with a large number of partitions (10, say), the non-filter lookup 
 proves fatal both to the client (Read timed out errors from 
 ThriftMetaStoreClient, because the server doesn't respond) and to the server 
 (too much data loaded into the cache, OOME, or slowdown).
 The fix would be to use a single call that also passes the partition filter 
 information, as was the case in the HCatalog 0.4 sources before HCATALOG-527.
 (HCatalog-release-wise, this affects all 0.5.x users)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-4523) round() function with specified decimal places not consistent with mysql

2013-10-23 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13802978#comment-13802978
 ] 

Xuefu Zhang commented on HIVE-4523:
---

The problem (most of it) stated here will be addressed by the decimal 
precision/scale initiative.

 round() function with specified decimal places not consistent with mysql 
 -

 Key: HIVE-4523
 URL: https://issues.apache.org/jira/browse/HIVE-4523
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Affects Versions: 0.7.1
Reporter: Fred Desing
Assignee: Xuefu Zhang
Priority: Minor
 Attachments: HIVE-4523.patch


 // hive
 hive> select round(150.000, 2) from temp limit 1;
 150.0
 hive> select round(150, 2) from temp limit 1;
 150.0
 // mysql
 mysql> select round(150.000, 2) from DUAL limit 1;
 round(150.000, 2)
 150.00
 mysql> select round(150, 2) from DUAL limit 1;
 round(150, 2)
 150
 http://dev.mysql.com/doc/refman/5.1/en/mathematical-functions.html#function_round



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5483) use metastore statistics to optimize max/min/etc. queries

2013-10-23 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13802986#comment-13802986
 ] 

Ashutosh Chauhan commented on HIVE-5483:


Fair points, Prashanth. I think option 2) is better for two reasons. 
First, not all file formats have this capability, so tying this kind of 
optimization to a particular format should be avoided whenever possible. 
Second, we would anyway want stats in the metastore to be as fresh as possible 
for query planning purposes, so we are already down the path of keeping stats 
fresh. By the way, there is already a way to collect stats fast without a full 
scan for RC (via HIVE-3958). We can do the same for ORC via HIVE-4177.

I also agree we need to streamline our stats collection, stats storage and 
stats access api.

 use metastore statistics to optimize max/min/etc. queries
 -

 Key: HIVE-5483
 URL: https://issues.apache.org/jira/browse/HIVE-5483
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Ashutosh Chauhan
 Attachments: HIVE-5483.patch


 We have discussed this a little bit.
 Hive can answer queries such as select max(c1) from t purely from metastore 
 using partition statistics, provided that we know the statistics are up to 
 date.
 All data changes (e.g. adding new partitions) currently go thru metastore so 
 we can track up-to-date-ness. If they are not up-to-date, the queries will 
 have to read data (at least for outdated partitions) until someone runs 
 analyze table. We can also analyze new partitions after add, if that is 
 configured/specified in the command.
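As a sketch of the idea above (hypothetical names, not Hive's implementation): a query like select max(c1) can be answered purely from per-partition column statistics only when every partition's stats are known to be up to date; if any partition is outdated, the planner must fall back to reading data for it.

```java
// Hypothetical sketch: answer max(c1) from per-partition statistics,
// falling back (empty result) when any partition's stats are stale.
import java.util.*;

public class StatsMaxSketch {
    static class PartitionStats {
        final long maxC1;       // max(c1) recorded for this partition
        final boolean upToDate; // tracked via metastore change notifications
        PartitionStats(long maxC1, boolean upToDate) {
            this.maxC1 = maxC1;
            this.upToDate = upToDate;
        }
    }

    /** Returns the max from metadata, or empty if a data scan is required. */
    static OptionalLong maxFromMetadata(List<PartitionStats> parts) {
        long max = Long.MIN_VALUE;
        for (PartitionStats p : parts) {
            if (!p.upToDate) return OptionalLong.empty(); // outdated: must read data
            max = Math.max(max, p.maxC1);
        }
        return parts.isEmpty() ? OptionalLong.empty() : OptionalLong.of(max);
    }

    public static void main(String[] args) {
        List<PartitionStats> fresh = Arrays.asList(
            new PartitionStats(10, true), new PartitionStats(42, true));
        List<PartitionStats> stale = Arrays.asList(
            new PartitionStats(10, true), new PartitionStats(42, false));
        System.out.println(maxFromMetadata(fresh)); // answerable from metadata
        System.out.println(maxFromMetadata(stale)); // needs a (partial) scan
    }
}
```

In the real optimization the fallback would only scan the outdated partitions, combining their scanned maxima with the fresh per-partition stats.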



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (HIVE-5625) Fix issue with metastore version revision test.

2013-10-23 Thread Vikram Dixit K (JIRA)
Vikram Dixit K created HIVE-5625:


 Summary: Fix issue with metastore version revision test.
 Key: HIVE-5625
 URL: https://issues.apache.org/jira/browse/HIVE-5625
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 0.13.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K


Based on Brock's comments, the change made in HIVE-5403 changed the nature of 
the test.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5481) WebHCat e2e test: TestStreaming -ve tests should also check for job completion success

2013-10-23 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13803002#comment-13803002
 ] 

Eugene Koifman commented on HIVE-5481:
--

currently all webhcat e2e tests pass on trunk with Hadoop1 even w/o this patch. 
 How do you explain this?

 WebHCat e2e test: TestStreaming -ve tests should also check for job 
 completion success
 --

 Key: HIVE-5481
 URL: https://issues.apache.org/jira/browse/HIVE-5481
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
Priority: Minor
 Fix For: 0.13.0

 Attachments: HIVE-5481.1.patch


 TempletonController will anyway succeed for the -ve tests as well; however, 
 the exit value should be non-zero.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HIVE-5625) Fix issue with metastore version restriction test.

2013-10-23 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-5625:
-

Summary: Fix issue with metastore version restriction test.  (was: Fix 
issue with metastore version revision test.)

 Fix issue with metastore version restriction test.
 --

 Key: HIVE-5625
 URL: https://issues.apache.org/jira/browse/HIVE-5625
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 0.13.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K

 Based on Brock's comments, the change made in HIVE-5403 changed the nature of 
 the test.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5220) Add option for removing intermediate directory for partition, which is empty

2013-10-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13803013#comment-13803013
 ] 

Hive QA commented on HIVE-5220:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12609774/D12729.2.patch

{color:green}SUCCESS:{color} +1 4470 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1207/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1207/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

 Add option for removing intermediate directory for partition, which is empty
 

 Key: HIVE-5220
 URL: https://issues.apache.org/jira/browse/HIVE-5220
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: D12729.2.patch, HIVE-5220.D12729.1.patch


 For a deeply nested partitioned table, intermediate directories are not removed 
 even when the partitions under them have been removed.
 {noformat}
 /deep_part/c=09/d=01
 /deep_part/c=09/d=01/e=01
 /deep_part/c=09/d=01/e=02
 /deep_part/c=09/d=02
 /deep_part/c=09/d=02/e=01
 /deep_part/c=09/d=02/e=02
 {noformat}
 After removing partition (c='09'), the directories remain like this: 
 {noformat}
 /deep_part/c=09/d=01
 /deep_part/c=09/d=02
 {noformat}
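A hedged sketch of the cleanup such an option could perform (hypothetical helper using java.nio.file, not the attached patch): after the partition directories are dropped, walk upward and delete ancestor directories that are now empty, stopping at the table root.

```java
// Hypothetical sketch: after dropping /deep_part/c=09/d=01/e=01 and e=02,
// prune the now-empty intermediate directories (d=01, then c=09, ...),
// stopping at -- but never deleting -- the table root.
import java.io.IOException;
import java.nio.file.*;

public class EmptyDirCleanup {
    /** Deletes ancestors of 'start' while they are empty, stopping at 'root'. */
    static void removeEmptyParents(Path root, Path start) throws IOException {
        Path dir = start;
        while (dir != null && !dir.equals(root) && dir.startsWith(root)) {
            try (DirectoryStream<Path> ds = Files.newDirectoryStream(dir)) {
                if (ds.iterator().hasNext()) return; // not empty: stop pruning
            }
            Files.delete(dir);
            dir = dir.getParent(); // continue upward
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("deep_part");
        Path leaf = root.resolve("c=09/d=01/e=01");
        Files.createDirectories(leaf);
        Files.delete(leaf);                         // drop the partition itself
        removeEmptyParents(root, leaf.getParent()); // prune d=01, then c=09
        System.out.println(Files.exists(root.resolve("c=09"))); // pruned away
        System.out.println(Files.exists(root));                 // root survives
    }
}
```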



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HIVE-5220) Add option for removing intermediate directory for partition, which is empty

2013-10-23 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-5220:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Navis!

 Add option for removing intermediate directory for partition, which is empty
 

 Key: HIVE-5220
 URL: https://issues.apache.org/jira/browse/HIVE-5220
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Fix For: 0.13.0

 Attachments: D12729.2.patch, HIVE-5220.D12729.1.patch


 For a deeply nested partitioned table, intermediate directories are not removed 
 even when the partitions under them have been removed.
 {noformat}
 /deep_part/c=09/d=01
 /deep_part/c=09/d=01/e=01
 /deep_part/c=09/d=01/e=02
 /deep_part/c=09/d=02
 /deep_part/c=09/d=02/e=01
 /deep_part/c=09/d=02/e=02
 {noformat}
 After removing partition (c='09'), the directories remain like this: 
 {noformat}
 /deep_part/c=09/d=01
 /deep_part/c=09/d=02
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5218) datanucleus does not work with MS SQLServer in Hive metastore

2013-10-23 Thread Andy Jefferson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13803019#comment-13803019
 ] 

Andy Jefferson commented on HIVE-5218:
--

FYI 3.2.7 of datanucleus-rdbms is released

 datanucleus does not work with MS SQLServer in Hive metastore
 -

 Key: HIVE-5218
 URL: https://issues.apache.org/jira/browse/HIVE-5218
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.12.0
Reporter: shanyu zhao
 Attachments: 
 0001-HIVE-5218-datanucleus-does-not-work-with-SQLServer-i.patch, 
 HIVE-5218.patch


 HIVE-3632 upgraded the datanucleus version to 3.2.x; however, this version of 
 datanucleus doesn't work with SQLServer as the metastore. The problem is that 
 datanucleus tries to use the fully qualified object name to find a table in the 
 database but couldn't find it.
 If I downgrade to the datanucleus version from HIVE-2084, SQLServer works fine.
 It could be a bug in datanucleus.
 This is the detailed exception I'm getting when using datanucleus 3.2.x with 
 SQL Server:
 FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:javax.jdo.JDOException: Exception thrown calling table.exists() for a2ee36af45e9f46c19e995bfd2d9b5fd1hivemetastore..SEQUENCE_TABLE
 at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:596)
 at org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:732)
 …
 at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
 at $Proxy0.createTable(Unknown Source)
 at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1071)
 at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1104)
 …
 at $Proxy11.create_table_with_environment_context(Unknown Source)
 at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:6417)
 at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:6401)
 NestedThrowablesStackTrace:
 com.microsoft.sqlserver.jdbc.SQLServerException: There is already an object named 'SEQUENCE_TABLE' in the database.
 at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:197)
 at com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1493)
 at com.microsoft.sqlserver.jdbc.SQLServerStatement.doExecuteStatement(SQLServerStatement.java:775)
 at com.microsoft.sqlserver.jdbc.SQLServerStatement$StmtExecCmd.doExecute(SQLServerStatement.java:676)
 at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:4615)
 at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:1400)
 at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:179)
 at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:154)
 at com.microsoft.sqlserver.jdbc.SQLServerStatement.execute(SQLServerStatement.java:649)
 at com.jolbox.bonecp.StatementHandle.execute(StatementHandle.java:300)
 at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatement(AbstractTable.java:760)
 at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatementList(AbstractTable.java:711)
 at org.datanucleus.store.rdbms.table.AbstractTable.create(AbstractTable.java:425)
 at org.datanucleus.store.rdbms.table.AbstractTable.exists(AbstractTable.java:488)
 at org.datanucleus.store.rdbms.valuegenerator.TableGenerator.repositoryExists(TableGenerator.java:242)
 at org.datanucleus.store.rdbms.valuegenerator.AbstractRDBMSGenerator.obtainGenerationBlock(AbstractRDBMSGenerator.java:86)
 at org.datanucleus.store.valuegenerator.AbstractGenerator.obtainGenerationBlock(AbstractGenerator.java:197)
 at org.datanucleus.store.valuegenerator.AbstractGenerator.next(AbstractGenerator.java:105)
 at org.datanucleus.store.rdbms.RDBMSStoreManager.getStrategyValueForGenerator(RDBMSStoreManager.java:2019)
 at org.datanucleus.store.AbstractStoreManager.getStrategyValue(AbstractStoreManager.java:1385)
 at org.datanucleus.ExecutionContextImpl.newObjectId(ExecutionContextImpl.java:3727)
 at 
 

[jira] [Commented] (HIVE-5605) AddResourceOperation, DeleteResourceOperation, DfsOperation, SetOperation should be removed from org.apache.hive.service.cli.operation

2013-10-23 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13803020#comment-13803020
 ] 

Vaibhav Gumashta commented on HIVE-5605:


Thanks Brock!

 AddResourceOperation, DeleteResourceOperation, DfsOperation, SetOperation 
 should be removed from org.apache.hive.service.cli.operation 
 ---

 Key: HIVE-5605
 URL: https://issues.apache.org/jira/browse/HIVE-5605
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
Priority: Minor
 Fix For: 0.13.0

 Attachments: HIVE-5605.1.patch


 These classes are not used as the processing for Add, Delete, DFS and Set 
 commands is done by HiveCommandOperation



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HIVE-5625) Fix issue with metastore version restriction test.

2013-10-23 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-5625:
-

Status: Patch Available  (was: Open)

 Fix issue with metastore version restriction test.
 --

 Key: HIVE-5625
 URL: https://issues.apache.org/jira/browse/HIVE-5625
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 0.13.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-5625.1.patch


 Based on Brock's comments, the change made in HIVE-5403 changed the nature of 
 the test.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HIVE-5625) Fix issue with metastore version restriction test.

2013-10-23 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-5625:
-

Attachment: HIVE-5625.1.patch

 Fix issue with metastore version restriction test.
 --

 Key: HIVE-5625
 URL: https://issues.apache.org/jira/browse/HIVE-5625
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 0.13.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-5625.1.patch


 Based on Brock's comments, the change made in HIVE-5403 changed the nature of 
 the test.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


Review Request 14877: HIVE-5625: Fix issue with metastore version restriction test.

2013-10-23 Thread Vikram Dixit Kumaraswamy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14877/
---

Review request for hive, Ashutosh Chauhan and Brock Noland.


Bugs: HIVE-5625
https://issues.apache.org/jira/browse/HIVE-5625


Repository: hive-git


Description
---

Fix issue with metastore version restriction test.


Diffs
-

  metastore/src/test/org/apache/hadoop/hive/metastore/TestMetastoreVersion.java 
d7761f4 

Diff: https://reviews.apache.org/r/14877/diff/


Testing
---

Ran all metastore tests.


Thanks,

Vikram Dixit Kumaraswamy



[jira] [Commented] (HIVE-5625) Fix issue with metastore version restriction test.

2013-10-23 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13803026#comment-13803026
 ] 

Brock Noland commented on HIVE-5625:


+1

 Fix issue with metastore version restriction test.
 --

 Key: HIVE-5625
 URL: https://issues.apache.org/jira/browse/HIVE-5625
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 0.13.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-5625.1.patch


 Based on Brock's comments, the change made in HIVE-5403 changed the nature of 
 the test.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HIVE-5619) Allow concat() to accept mixed string/binary args

2013-10-23 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-5619:
-

Status: Patch Available  (was: Open)

my test run got botched, submitting patch to allow pre-commit build to run

 Allow concat() to accept mixed string/binary args
 -

 Key: HIVE-5619
 URL: https://issues.apache.org/jira/browse/HIVE-5619
 Project: Hive
  Issue Type: Improvement
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-5619.1.patch


 concat() is currently strict about allowing either all binary or all 
 non-binary arguments. Loosen this to permit mixed params.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5218) datanucleus does not work with MS SQLServer in Hive metastore

2013-10-23 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13803036#comment-13803036
 ] 

Brock Noland commented on HIVE-5218:


Great! @shanyu, I'd be happy to review a patch upgrading to 3.2.7.

 datanucleus does not work with MS SQLServer in Hive metastore
 -

 Key: HIVE-5218
 URL: https://issues.apache.org/jira/browse/HIVE-5218
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.12.0
Reporter: shanyu zhao
 Attachments: 
 0001-HIVE-5218-datanucleus-does-not-work-with-SQLServer-i.patch, 
 HIVE-5218.patch


 HIVE-3632 upgraded the datanucleus version to 3.2.x; however, this version of 
 datanucleus doesn't work with SQLServer as the metastore. The problem is that 
 datanucleus tries to use the fully qualified object name to find a table in the 
 database but couldn't find it.
 If I downgrade to the datanucleus version from HIVE-2084, SQLServer works fine.
 It could be a bug in datanucleus.
 This is the detailed exception I'm getting when using datanucleus 3.2.x with 
 SQL Server:
 FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:javax.jdo.JDOException: Exception thrown calling table.exists() for a2ee36af45e9f46c19e995bfd2d9b5fd1hivemetastore..SEQUENCE_TABLE
 at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:596)
 at org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:732)
 …
 at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
 at $Proxy0.createTable(Unknown Source)
 at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1071)
 at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1104)
 …
 at $Proxy11.create_table_with_environment_context(Unknown Source)
 at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:6417)
 at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:6401)
 NestedThrowablesStackTrace:
 com.microsoft.sqlserver.jdbc.SQLServerException: There is already an object named 'SEQUENCE_TABLE' in the database.
 at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:197)
 at com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1493)
 at com.microsoft.sqlserver.jdbc.SQLServerStatement.doExecuteStatement(SQLServerStatement.java:775)
 at com.microsoft.sqlserver.jdbc.SQLServerStatement$StmtExecCmd.doExecute(SQLServerStatement.java:676)
 at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:4615)
 at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:1400)
 at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:179)
 at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:154)
 at com.microsoft.sqlserver.jdbc.SQLServerStatement.execute(SQLServerStatement.java:649)
 at com.jolbox.bonecp.StatementHandle.execute(StatementHandle.java:300)
 at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatement(AbstractTable.java:760)
 at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatementList(AbstractTable.java:711)
 at org.datanucleus.store.rdbms.table.AbstractTable.create(AbstractTable.java:425)
 at org.datanucleus.store.rdbms.table.AbstractTable.exists(AbstractTable.java:488)
 at org.datanucleus.store.rdbms.valuegenerator.TableGenerator.repositoryExists(TableGenerator.java:242)
 at org.datanucleus.store.rdbms.valuegenerator.AbstractRDBMSGenerator.obtainGenerationBlock(AbstractRDBMSGenerator.java:86)
 at org.datanucleus.store.valuegenerator.AbstractGenerator.obtainGenerationBlock(AbstractGenerator.java:197)
 at org.datanucleus.store.valuegenerator.AbstractGenerator.next(AbstractGenerator.java:105)
 at org.datanucleus.store.rdbms.RDBMSStoreManager.getStrategyValueForGenerator(RDBMSStoreManager.java:2019)
 at org.datanucleus.store.AbstractStoreManager.getStrategyValue(AbstractStoreManager.java:1385)
 at org.datanucleus.ExecutionContextImpl.newObjectId(ExecutionContextImpl.java:3727)
 at …
 

[jira] [Commented] (HIVE-5511) percentComplete returned by job status from WebHCat is null

2013-10-23 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803078#comment-13803078
 ] 

Eugene Koifman commented on HIVE-5511:
--

previous comment should read:
blocks HIVE-5547 since this patch needs to be applied first

 percentComplete returned by job status from WebHCat is null
 ---

 Key: HIVE-5511
 URL: https://issues.apache.org/jira/browse/HIVE-5511
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.12.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-5511.3.patch


 In Hadoop 1 the logging from MR is sent to stderr.  In Hadoop 2 it goes to 
 syslog by default.  templeton.tool.LaunchMapper expects to see the output on 
 stderr in order to produce 'percentComplete' in the job status.
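The parsing that breaks here can be sketched as follows. This is a hypothetical Python illustration, not the actual `templeton.tool.LaunchMapper` code (which is Java), and the sample progress-line format is an assumption; the point is only that the percentage is scraped from stderr text, so it disappears when MR logging moves to syslog.

```python
import re

# Hadoop MR progress lines on stderr look roughly like:
#   "2013-10-23 12:00:01,123 ... map 50% reduce 10%"
# (hypothetical sample; the exact prefix varies by Hadoop version)
PROGRESS_RE = re.compile(r"map (\d+)%\s+reduce (\d+)%")

def percent_complete(line):
    """Return a 'map X% reduce Y%' status string, or None if the line
    carries no progress information."""
    m = PROGRESS_RE.search(line)
    if not m:
        return None
    return "map %s%% reduce %s%%" % (m.group(1), m.group(2))
```

If the MR output lands in syslog instead of stderr, no line ever matches and the scraped value stays null, which matches the symptom in this issue.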



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5481) WebHCat e2e test: TestStreaming -ve tests should also check for job completion success

2013-10-23 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803093#comment-13803093
 ] 

Vaibhav Gumashta commented on HIVE-5481:


[~ekoifman] It could be because of 
[HIVE-5510|https://issues.apache.org/jira/browse/HIVE-5510], due to which the 
values returned were a mix of the TempletonController job and the launched job 
(and possibly the job completion was for the launched job). I believe this will 
now be changed to return the values only for the TempletonController job. Thus, 
the TempletonController job should always succeed unless it's a -ve test for 
the TempletonController job itself.

 WebHCat e2e test: TestStreaming -ve tests should also check for job 
 completion success
 --

 Key: HIVE-5481
 URL: https://issues.apache.org/jira/browse/HIVE-5481
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
Priority: Minor
 Fix For: 0.13.0

 Attachments: HIVE-5481.1.patch


 The TempletonController job will succeed even for the -ve tests; however, the 
 exit value should be non-zero.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Assigned] (HIVE-5581) Implement vectorized year/month/day... etc. for string arguments

2013-10-23 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi reassigned HIVE-5581:


Assignee: Teddy Choi

 Implement vectorized year/month/day... etc. for string arguments
 

 Key: HIVE-5581
 URL: https://issues.apache.org/jira/browse/HIVE-5581
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Eric Hanson
Assignee: Teddy Choi

 Functions year(), month(), day(), weekofyear(), hour(), minute(), second() 
 need to be implemented for string arguments in vectorized mode. 
 They already work for timestamp arguments.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5626) enable metastore direct SQL for drop/similar queries

2013-10-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803119#comment-13803119
 ] 

Sergey Shelukhin commented on HIVE-5626:


[~ashutoshc] fyi

 enable metastore direct SQL for drop/similar queries
 

 Key: HIVE-5626
 URL: https://issues.apache.org/jira/browse/HIVE-5626
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Priority: Minor

 Metastore direct SQL is currently disabled for any queries running inside an 
 external transaction (i.e. all modification queries, like dropping objects).
 This was done to keep direct SQL strictly a performance optimization when 
 using Postgres, which, unlike other RDBMSes, fails the transaction on any 
 syntax error; if direct SQL breaks there, there is no way to fall back, so it 
 is disabled for these cases.
 This is not as important because drop commands are rare, but we might want to 
 address it, either via a config setting or by making it work on non-Postgres 
 DBs.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (HIVE-5626) enable metastore direct SQL for drop/similar queries

2013-10-23 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-5626:
--

 Summary: enable metastore direct SQL for drop/similar queries
 Key: HIVE-5626
 URL: https://issues.apache.org/jira/browse/HIVE-5626
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Priority: Minor


Metastore direct SQL is currently disabled for any queries running inside an 
external transaction (i.e. all modification queries, like dropping objects).
This was done to keep direct SQL strictly a performance optimization when using 
Postgres, which, unlike other RDBMSes, fails the transaction on any syntax 
error; if direct SQL breaks there, there is no way to fall back, so it is 
disabled for these cases.

This is not as important because drop commands are rare, but we might want to 
address it, either via a config setting or by making it work on non-Postgres 
DBs.
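The fall-back concern above can be sketched with savepoints. This is a hypothetical illustration (Hive's actual code path goes through DataNucleus/JDO, and `slow_orm_drop` is an invented stand-in): run the fast direct-SQL path inside a savepoint so that, if it fails, the outer transaction survives and a slower path can take over. Without savepoint-style isolation, a single syntax error on Postgres poisons the whole transaction, which is exactly why direct SQL is disabled there.

```python
import sqlite3

def slow_orm_drop(conn, name):
    # Stand-in for the ORM-based fallback path (hypothetical helper).
    conn.execute("DELETE FROM partitions WHERE name = ?", (name,))

def drop_partition(conn, name):
    # Attempt the fast direct-SQL path inside a savepoint so a failure
    # can be rolled back without aborting the outer transaction.
    conn.execute("SAVEPOINT direct_sql")
    try:
        conn.execute("DELETE FROM partitions WHERE name = ?", (name,))
        conn.execute("RELEASE SAVEPOINT direct_sql")
        return "direct"
    except sqlite3.Error:
        # On Postgres without a savepoint, the whole transaction would
        # already be unusable at this point.
        conn.execute("ROLLBACK TO SAVEPOINT direct_sql")
        slow_orm_drop(conn, name)
        return "fallback"
```
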



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Resolved] (HIVE-4994) Add WebHCat (Templeton) documentation to Hive wiki

2013-10-23 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair resolved HIVE-4994.
-

Resolution: Fixed

Marking it as resolved. Thanks for the contribution Lefty!


 Add WebHCat (Templeton) documentation to Hive wiki
 --

 Key: HIVE-4994
 URL: https://issues.apache.org/jira/browse/HIVE-4994
 Project: Hive
  Issue Type: Bug
  Components: Documentation
Affects Versions: 0.11.0
Reporter: Lefty Leverenz
Assignee: Lefty Leverenz

 WebHCat (Templeton) documentation in the Apache incubator had xml source 
 files which generated html & pdf output files.  Now that HCatalog and WebHCat 
 are part of the Hive project, all the WebHCat documents need to be added to 
 the Hive wiki.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-4446) [HCatalog] Documentation for HIVE-4442, HIVE-4443, HIVE-4444

2013-10-23 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803152#comment-13803152
 ] 

Daniel Dai commented on HIVE-4446:
--

Thanks Lefty, the documentation for this Jira looks good to me. There are 
additional documentation changes not yet ported to the cwiki, such as 
HIVE-5031, HIVE-4531, etc.

 [HCatalog] Documentation for HIVE-4442, HIVE-4443, HIVE-4444
 

 Key: HIVE-4446
 URL: https://issues.apache.org/jira/browse/HIVE-4446
 Project: Hive
  Issue Type: Improvement
  Components: HCatalog
Reporter: Daniel Dai
Assignee: Lefty Leverenz
 Attachments: HIVE-4446-1.patch






--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (HIVE-5627) Document 'usehcatalog' parameter on WebHCat calls

2013-10-23 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-5627:


 Summary: Document 'usehcatalog' parameter on WebHCat calls
 Key: HIVE-5627
 URL: https://issues.apache.org/jira/browse/HIVE-5627
 Project: Hive
  Issue Type: Sub-task
  Components: WebHCat
Affects Versions: 0.12.0
Reporter: Eugene Koifman
Assignee: Lefty Leverenz


The following REST calls in WebHCat:
1. mapreduce/jar
2. pig
3. hive
now support an additional parameter 'usehcatalog'.

The JavaDoc on the corresponding methods in 
org.apache.hive.hcatalog.templeton.Server describes this parameter.

Additionally, templeton.hive.archive, etc

This should be added to the sections in 
https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference that 
correspond to these methods.
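As a hypothetical illustration of how a caller might pass this parameter (the host, port, endpoint path, and the exact parameter set are assumptions for the sketch; the WebHCat reference linked above documents the real API):

```python
from urllib.parse import urlencode

def pig_submit_url(base, user, script, usehcatalog=False):
    """Build a WebHCat Pig job-submission URL (illustrative sketch)."""
    params = {"user.name": user, "execute": script}
    if usehcatalog:
        # Tells WebHCat the submitted job uses HCat and therefore
        # needs access to the metastore.
        params["usehcatalog"] = "true"
    return "%s/templeton/v1/pig?%s" % (base, urlencode(params))
```
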



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (HIVE-5628) ListBucketingPrunnerTest and DynamicMultiDimeCollectionTest should start with Test not end with it

2013-10-23 Thread Brock Noland (JIRA)
Brock Noland created HIVE-5628:
--

 Summary: ListBucketingPrunnerTest and 
DynamicMultiDimeCollectionTest should start with Test not end with it
 Key: HIVE-5628
 URL: https://issues.apache.org/jira/browse/HIVE-5628
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Brock Noland


ListBucketingPrunnerTest and DynamicMultiDimeCollectionTest will not be run by 
PTest because they end with Test, and PTest requires that test class names 
start with Test.
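A minimal sketch of the naming rule (the exact pattern PTest uses is an assumption; this only illustrates prefix-style matching versus suffix-named classes):

```python
import re

# Hypothetical PTest-style filter: test classes must *start* with "Test".
TEST_CLASS_RE = re.compile(r"^Test\w+$")

def is_picked_up(class_name):
    """Return True if a class name matches the prefix convention."""
    return TEST_CLASS_RE.match(class_name) is not None
```

Under such a filter, `TestListBucketingPrunner` would run while `ListBucketingPrunnerTest` would be silently skipped, which is the bug described here.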



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (HIVE-5629) Fix two javadoc failures in HCatalog

2013-10-23 Thread Brock Noland (JIRA)
Brock Noland created HIVE-5629:
--

 Summary: Fix two javadoc failures in HCatalog
 Key: HIVE-5629
 URL: https://issues.apache.org/jira/browse/HIVE-5629
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland


I am seeing two javadoc failures in HCatalog. These are not caught by PTest; 
indeed, I cannot reproduce them on my Mac but can on Linux. Regardless, they 
should be fixed.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HIVE-5629) Fix two javadoc failures in HCatalog

2013-10-23 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-5629:
---

Attachment: HIVE-5629.patch

 Fix two javadoc failures in HCatalog
 

 Key: HIVE-5629
 URL: https://issues.apache.org/jira/browse/HIVE-5629
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
 Attachments: HIVE-5629.patch


 I am seeing two javadoc failures on HCatalog. These are not being seen by 
 PTest and indeed I cannot reproduce on my Mac but can on Linux. Regardless 
 they should be fixed.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5269) Use thrift binary type for conveying binary values in hiveserver2

2013-10-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803182#comment-13803182
 ] 

Hive QA commented on HIVE-5269:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12609771/HIVE-5269.2.patch.txt

{color:green}SUCCESS:{color} +1 4470 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1208/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1208/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

 Use thrift binary type for conveying binary values in hiveserver2
 -

 Key: HIVE-5269
 URL: https://issues.apache.org/jira/browse/HIVE-5269
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-5269.2.patch.txt, HIVE-5269.D12873.1.patch


 Currently, binary values are encoded to strings in HiveServer2 and decoded in 
 the client. Just using the Thrift binary type might make this simpler.
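The round trip being removed can be sketched as follows. Base64 is an assumption standing in for whatever string encoding is used; the point is only that a binary-typed field skips the encode/decode step and avoids the size inflation of a string representation.

```python
import base64

raw = bytes([0x00, 0xFF, 0x10, 0x80])

# String-typed column: the server encodes, the client must decode.
wire_str = base64.b64encode(raw).decode("ascii")
decoded = base64.b64decode(wire_str)

# Binary-typed column: the bytes travel as-is, no conversion needed.
wire_bin = raw
```
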



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HIVE-5629) Fix two javadoc failures in HCatalog

2013-10-23 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-5629:
---

Assignee: Brock Noland
  Status: Patch Available  (was: Open)

 Fix two javadoc failures in HCatalog
 

 Key: HIVE-5629
 URL: https://issues.apache.org/jira/browse/HIVE-5629
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-5629.patch


 I am seeing two javadoc failures on HCatalog. These are not being seen by 
 PTest and indeed I cannot reproduce on my Mac but can on Linux. Regardless 
 they should be fixed.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HIVE-5627) Document 'usehcatalog' parameter on WebHCat calls

2013-10-23 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-5627:
-

Description: 
The following REST calls in WebHCat:
1. mapreduce/jar
2. pig
now support an additional parameter 'usehcatalog'.  This is a mechanism for the 
caller to tell WebHCat that the submitted job uses HCat, and thus needs to 
access the metastore, which requires additional steps for WebHCat to perform.  

The JavaDoc on the corresponding methods in 
org.apache.hive.hcatalog.templeton.Server describes this parameter.

Additionally, if templeton.hive.archive, templeton.hive.home and 
templeton.hcat.home are defined in webhcat-site.xml (documented in 
webhcat-default.xml) then WebHCat will ship the Hive tar to the target node 
where the job actually runs.  This means that Hive doesn't need to be installed 
on every node in the Hadoop cluster.  (This part was added in HIVE-5547)

This should be added to the sections in 
https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference that 
correspond to these methods.

  was:
The following REST calls in WebHCat:
1. mapreduce/jar
2. pig
3. hive
now support an additional parameter 'usehcatalog'.

The JavaDoc on corresponding methods in  
org.apache.hive.hcatalog.templeton.Server describe this parameter.  

Additionally, templeton.hive.archive, etc

This should be added to the sections in 
https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference that 
correspond to these methods.


 Document 'usehcatalog' parameter on WebHCat calls
 -

 Key: HIVE-5627
 URL: https://issues.apache.org/jira/browse/HIVE-5627
 Project: Hive
  Issue Type: Sub-task
  Components: WebHCat
Affects Versions: 0.12.0
Reporter: Eugene Koifman
Assignee: Lefty Leverenz
 Fix For: 0.13.0


 The following REST calls in WebHCat:
 1. mapreduce/jar
 2. pig
 now support an additional parameter 'usehcatalog'.  This is a mechanism for 
 the caller to tell WebHCat that the submitted job uses HCat, and thus needs 
 to access the metastore, which requires additional steps for WebHCat to 
 perform.  
 The JavaDoc on the corresponding methods in 
 org.apache.hive.hcatalog.templeton.Server describes this parameter.
 Additionally, if templeton.hive.archive, templeton.hive.home and 
 templeton.hcat.home are defined in webhcat-site.xml (documented in 
 webhcat-default.xml) then WebHCat will ship the Hive tar to the target node 
 where the job actually runs.  This means that Hive doesn't need to be 
 installed on every node in the Hadoop cluster.  (This part was added in 
 HIVE-5547)
 This should be added to the sections in 
 https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference that 
 correspond to these methods.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5629) Fix two javadoc failures in HCatalog

2013-10-23 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803229#comment-13803229
 ] 

Ashutosh Chauhan commented on HIVE-5629:


+1

 Fix two javadoc failures in HCatalog
 

 Key: HIVE-5629
 URL: https://issues.apache.org/jira/browse/HIVE-5629
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-5629.patch


 I am seeing two javadoc failures on HCatalog. These are not being seen by 
 PTest and indeed I cannot reproduce on my Mac but can on Linux. Regardless 
 they should be fixed.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HIVE-5627) Document 'usehcatalog' parameter on WebHCat calls

2013-10-23 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-5627:
-

Description: 
The following REST calls in WebHCat:
1. mapreduce/jar
2. pig
now support an additional parameter 'usehcatalog'.  This is a mechanism for the 
caller to tell WebHCat that the submitted job uses HCat, and thus needs to 
access the metastore, which requires additional steps for WebHCat to perform in 
a secure cluster.  

The JavaDoc on the corresponding methods in 
org.apache.hive.hcatalog.templeton.Server describes this parameter.

Additionally, if templeton.hive.archive, templeton.hive.home and 
templeton.hcat.home are defined in webhcat-site.xml (documented in 
webhcat-default.xml) then WebHCat will ship the Hive tar to the target node 
where the job actually runs.  This means that Hive doesn't need to be installed 
on every node in the Hadoop cluster.  (This part was added in HIVE-5547).  This 
is independent of security, but improves manageability.

This should be added to the sections in 
https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference that 
correspond to these methods.

  was:
The following REST calls in WebHCat:
1. mapreduce/jar
2. pig
now support an additional parameter 'usehcatalog'.  This is a mechanism for the 
caller to tell WebHCat that the submitted job uses HCat, and thus needs to 
access the metastore, which requires additional steps for WebHCat to perform.  

The JavaDoc on corresponding methods in  
org.apache.hive.hcatalog.templeton.Server describe this parameter.  

Additionally, if templeton.hive.archive, templeton.hive.home and 
templeton.hcat.home are defined in webhcat-site.xml (documented in 
webhcat-default.xml) then WebHCat will ship the Hive tar to the target node 
where the job actually runs.  This means that Hive doesn't need to be installed 
on every node in the Hadoop cluster.  (This part was added in HIVE-5547)

This should be added to the sections in 
https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference that 
correspond to these methods.


 Document 'usehcatalog' parameter on WebHCat calls
 -

 Key: HIVE-5627
 URL: https://issues.apache.org/jira/browse/HIVE-5627
 Project: Hive
  Issue Type: Sub-task
  Components: WebHCat
Affects Versions: 0.12.0
Reporter: Eugene Koifman
Assignee: Lefty Leverenz
 Fix For: 0.13.0


 The following REST calls in WebHCat:
 1. mapreduce/jar
 2. pig
 now support an additional parameter 'usehcatalog'.  This is a mechanism for 
 the caller to tell WebHCat that the submitted job uses HCat, and thus needs 
 to access the metastore, which requires additional steps for WebHCat to 
 perform in a secure cluster.  
 The JavaDoc on the corresponding methods in 
 org.apache.hive.hcatalog.templeton.Server describes this parameter.
 Additionally, if templeton.hive.archive, templeton.hive.home and 
 templeton.hcat.home are defined in webhcat-site.xml (documented in 
 webhcat-default.xml) then WebHCat will ship the Hive tar to the target node 
 where the job actually runs.  This means that Hive doesn't need to be 
 installed on every node in the Hadoop cluster.  (This part was added in 
 HIVE-5547).  This is independent of security, but improves manageability.
 This should be added to the sections in 
 https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference that 
 correspond to these methods.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5629) Fix two javadoc failures in HCatalog

2013-10-23 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803240#comment-13803240
 ] 

Eugene Koifman commented on HIVE-5629:
--

Why is {@link HCatInputFormat#setInput(org.apache.hadoop.mapreduce.Job, 
InputJobInfo)} causing an issue?

This is standard JavaDoc: 
http://www.oracle.com/technetwork/java/javase/documentation/index-137868.html#examples


 Fix two javadoc failures in HCatalog
 

 Key: HIVE-5629
 URL: https://issues.apache.org/jira/browse/HIVE-5629
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-5629.patch


 I am seeing two javadoc failures on HCatalog. These are not being seen by 
 PTest and indeed I cannot reproduce on my Mac but can on Linux. Regardless 
 they should be fixed.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5629) Fix two javadoc failures in HCatalog

2013-10-23 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803242#comment-13803242
 ] 

Brock Noland commented on HIVE-5629:


That method was removed in HIVE-5454, so I removed the javadoc {@link} markup; 
the comment is still useful for legacy purposes without the link.

 Fix two javadoc failures in HCatalog
 

 Key: HIVE-5629
 URL: https://issues.apache.org/jira/browse/HIVE-5629
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-5629.patch


 I am seeing two javadoc failures on HCatalog. These are not being seen by 
 PTest and indeed I cannot reproduce on my Mac but can on Linux. Regardless 
 they should be fixed.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5547) webhcat pig job submission should ship hive tar if -usehcatalog is specified

2013-10-23 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803248#comment-13803248
 ] 

Eugene Koifman commented on HIVE-5547:
--

HIVE-5627 covers the doc for this bug

 webhcat pig job submission should ship hive tar if -usehcatalog is specified
 

 Key: HIVE-5547
 URL: https://issues.apache.org/jira/browse/HIVE-5547
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.12.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-5547.2.patch, HIVE-5547.patch


 Currently, when a Pig job is submitted through WebHCat and the Pig script 
 uses HCatalog, Hive must be installed on the node in the cluster which ends 
 up executing the job.  For large clusters this is a manageability issue, so 
 we should use the DistributedCache to ship the Hive tar file to the target 
 node as part of job submission.
 TestPig_11 in hcatalog/src/test/e2e/templeton/tests/jobsubmission.conf has 
 the test case for this.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (HIVE-5630) http://hive.apache.org/docs/r0.12.0/api/ does not include any HCat classes

2013-10-23 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-5630:


 Summary: http://hive.apache.org/docs/r0.12.0/api/ does not include 
any HCat classes
 Key: HIVE-5630
 URL: https://issues.apache.org/jira/browse/HIVE-5630
 Project: Hive
  Issue Type: Bug
  Components: Documentation
Affects Versions: 0.12.0, 0.11.0, 0.10.0
Reporter: Eugene Koifman


same is true for 0.10 and 0.11



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HIVE-5630) http://hive.apache.org/docs/r0.12.0/api/ does not include any HCat classes

2013-10-23 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-5630:
-

Component/s: HCatalog

 http://hive.apache.org/docs/r0.12.0/api/ does not include any HCat classes
 --

 Key: HIVE-5630
 URL: https://issues.apache.org/jira/browse/HIVE-5630
 Project: Hive
  Issue Type: Bug
  Components: Documentation, HCatalog
Affects Versions: 0.10.0, 0.11.0, 0.12.0
Reporter: Eugene Koifman

 same is true for 0.10 and 0.11



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HIVE-4388) HBase tests fail against Hadoop 2

2013-10-23 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-4388:
---

Attachment: HIVE-4388.10.patch

Attaching updated patch and re-marking patch-available so that precommit tests 
pick it up.

 HBase tests fail against Hadoop 2
 -

 Key: HIVE-4388
 URL: https://issues.apache.org/jira/browse/HIVE-4388
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Reporter: Gunther Hagleitner
Assignee: Brock Noland
 Attachments: HIVE-4388.10.patch, HIVE-4388.patch, HIVE-4388.patch, 
 HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, 
 HIVE-4388.patch, HIVE-4388.patch, HIVE-4388-wip.txt


 Currently we're building by default against 0.92. When you run against hadoop 
 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963.
 HIVE-3861 upgrades the version of hbase used. This will get you past the 
 problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HIVE-4388) HBase tests fail against Hadoop 2

2013-10-23 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-4388:
---

Status: Open  (was: Patch Available)

 HBase tests fail against Hadoop 2
 -

 Key: HIVE-4388
 URL: https://issues.apache.org/jira/browse/HIVE-4388
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Reporter: Gunther Hagleitner
Assignee: Brock Noland
 Attachments: HIVE-4388.10.patch, HIVE-4388.patch, HIVE-4388.patch, 
 HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, 
 HIVE-4388.patch, HIVE-4388.patch, HIVE-4388-wip.txt


 Currently we're building by default against 0.92. When you run against hadoop 
 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963.
 HIVE-3861 upgrades the version of hbase used. This will get you past the 
 problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HIVE-4388) HBase tests fail against Hadoop 2

2013-10-23 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-4388:
---

Status: Patch Available  (was: Open)

 HBase tests fail against Hadoop 2
 -

 Key: HIVE-4388
 URL: https://issues.apache.org/jira/browse/HIVE-4388
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Reporter: Gunther Hagleitner
Assignee: Brock Noland
 Attachments: HIVE-4388.10.patch, HIVE-4388.patch, HIVE-4388.patch, 
 HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, 
 HIVE-4388.patch, HIVE-4388.patch, HIVE-4388-wip.txt


 Currently we're building by default against 0.92. When you run against hadoop 
 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963.
 HIVE-3861 upgrades the version of hbase used. This will get you past the 
 problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-4388) HBase tests fail against Hadoop 2

2013-10-23 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803276#comment-13803276
 ] 

Brock Noland commented on HIVE-4388:


For my part it looks good! The only things I noted were:

* I don't see the version upgrade? I think the protobufs stuff will be invalid 
without hbase 0.96.
* 0.96 has been released, so I think we can remove the SNAPSHOT stuff in 
addition to adding apache SNAPSHOTs to the hcat pom.

 HBase tests fail against Hadoop 2
 -

 Key: HIVE-4388
 URL: https://issues.apache.org/jira/browse/HIVE-4388
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Reporter: Gunther Hagleitner
Assignee: Brock Noland
 Attachments: HIVE-4388.10.patch, HIVE-4388.patch, HIVE-4388.patch, 
 HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, 
 HIVE-4388.patch, HIVE-4388.patch, HIVE-4388-wip.txt


 Currently we're building by default against 0.92. When you run against hadoop 
 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963.
 HIVE-3861 upgrades the version of hbase used. This will get you past the 
 problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396.





[jira] [Updated] (HIVE-5519) Use paging mechanism for templeton get requests.

2013-10-23 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-5519:


Description: 
Issuing a command to retrieve the jobs field using

https://mwinkledemo.azurehdinsight.net:563/templeton/v1/queue/job_id?user.name=admin&fields=*
 --user u:p
will result in a timeout on a Windows machine. The issue happens because of the 
amount of data that needs to be fetched. The proposal is to use a paging-based 
encoding scheme so that we flush the contents regularly and the client does not 
time out.
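The paging-based scheme described above can be sketched as a generator that flushes fixed-size pages of job records instead of buffering one monolithic response. This is an illustrative Python sketch with made-up record names, not WebHCat's actual implementation:

```python
def paged_response(records, page_size=2):
    """Yield records in fixed-size pages so the client receives bytes
    regularly instead of waiting for one monolithic response."""
    page = []
    for rec in records:
        page.append(rec)
        if len(page) == page_size:
            yield page      # flush this chunk to the client
            page = []
    if page:
        yield page          # final partial page

jobs = ["job_1", "job_2", "job_3", "job_4", "job_5"]
pages = list(paged_response(jobs, page_size=2))
assert pages == [["job_1", "job_2"], ["job_3", "job_4"], ["job_5"]]
```

Because each page is written out as soon as it fills, the client sees steady progress and its read timeout never fires, regardless of how many jobs the full listing contains.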

  was:
Issuing a command to retrieve the jobs field using

https://mwinkledemo.azurehdinsight.net:563/templeton/v1/queue/job_id?user.name=admin&fields=*
 --user u:p
will result in a timeout on a Windows machine. The issue happens because of the 
amount of data that needs to be fetched. The proposal is to introduce a new API 
to retrieve a list of job details rather than retrieve all the information 
using a single command.

Summary: Use paging mechanism for templeton get requests.  (was: 
Support ranges of job ids for templeton)

 Use paging mechanism for templeton get requests.
 

 Key: HIVE-5519
 URL: https://issues.apache.org/jira/browse/HIVE-5519
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan

 Issuing a command to retrieve the jobs field using
 https://mwinkledemo.azurehdinsight.net:563/templeton/v1/queue/job_id?user.name=admin&fields=*
  --user u:p
 will result in a timeout on a Windows machine. The issue happens because of 
 the amount of data that needs to be fetched. The proposal is to use a 
 paging-based encoding scheme so that we flush the contents regularly and the 
 client does not time out.





[jira] [Commented] (HIVE-4969) HCatalog HBaseHCatStorageHandler is not returning all the data

2013-10-23 Thread Venki Korukanti (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803287#comment-13803287
 ] 

Venki Korukanti commented on HIVE-4969:
---

I haven't tested this on latest trunk. I will test it and attach a unittest.

 HCatalog HBaseHCatStorageHandler is not returning all the data
 --

 Key: HIVE-4969
 URL: https://issues.apache.org/jira/browse/HIVE-4969
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.11.0
Reporter: Venki Korukanti
Priority: Critical
 Attachments: HIVE-4969-1.patch


 Repro steps:
 1) Create an HCatalog table mapped to HBase table.
 hcat -e "CREATE TABLE studentHCat(rownum int, name string, age int, gpa float)
  STORED BY 'org.apache.hcatalog.hbase.HBaseHCatStorageHandler'
  TBLPROPERTIES('hbase.table.name' = 'studentHBase',
  'hbase.columns.mapping' = ':key,onecf:name,twocf:age,threecf:gpa');"
 2) Load the following data from Pig.
 cat student_data
 1^Asarah laertes^A23^A2.40
 2^Atom allen^A72^A1.57
 3^Abob ovid^A61^A2.67
 4^Aethan nixon^A38^A2.15
 5^Acalvin robinson^A28^A2.53
 6^Airene ovid^A65^A2.56
 7^Ayuri garcia^A36^A1.65
 8^Acalvin nixon^A41^A1.04
 9^Ajessica davidson^A48^A2.11
 10^Akatie king^A39^A1.05
 grunt> A = LOAD 'student_data' AS 
 (rownum:int,name:chararray,age:int,gpa:float);
 grunt> STORE A INTO 'studentHCat' USING org.apache.hcatalog.pig.HCatStorer();
 3) Now from HBase do a scan on the studentHBase table
 hbase(main):026:0> scan 'studentHBase', {LIMIT => 5}
 4) From pig access the data in table
 grunt> A = LOAD 'studentHCat' USING org.apache.hcatalog.pig.HCatLoader();
 grunt> STORE A INTO '/user/root/studentPig';
 5) Verify the output written in StudentPig
 hadoop fs -cat /user/root/studentPig/part-r-0
 1  23
 2  72
 3  61
 4  38
 5  28
 6  65
 7  36
 8  41
 9  48
 10 39
 The data returned has only two fields (rownum and age).
 Problem:
 While reading the data from the HBase table, HbaseSnapshotRecordReader gets a 
 data row in a Result (org.apache.hadoop.hbase.client.Result) object and 
 processes the KeyValue fields in it. After processing, it creates another 
 Result object out of the processed KeyValue array. The problem is that this 
 KeyValue array is not sorted, while the Result object expects its input 
 KeyValue array to be sorted. When we call Result.getValue(), it returns no 
 value for some of the fields because it does a binary search on an unordered 
 array.
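The failure mode described here (a binary search over an unsorted array) can be reproduced in miniature. The sketch below is a hedged Python illustration with made-up column keys, not Hive or HBase code:

```python
from bisect import bisect_left

def get_value(keys, key):
    """Binary search that is only correct when `keys` is actually sorted."""
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return i
    return None  # key reported missing even though it is present

# Unsorted, like the processed KeyValue array: lookup silently fails.
unsorted = ["rowkey", "twocf:age", "onecf:name", "threecf:gpa"]
assert get_value(unsorted, "onecf:name") is None

# After sorting, the same lookup succeeds.
assert get_value(sorted(unsorted), "onecf:name") is not None
```

This is why only some fields disappear rather than all of them: whether a given key happens to lie on the search path of the unsorted array is essentially accidental.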





[jira] [Created] (HIVE-5631) Index creation on a skew table fails

2013-10-23 Thread Venki Korukanti (JIRA)
Venki Korukanti created HIVE-5631:
-

 Summary: Index creation on a skew table fails
 Key: HIVE-5631
 URL: https://issues.apache.org/jira/browse/HIVE-5631
 Project: Hive
  Issue Type: Bug
  Components: Database/Schema
Affects Versions: 0.12.0
Reporter: Venki Korukanti
Assignee: Venki Korukanti
 Fix For: 0.13.0


REPRO STEPS:

create database skewtest;
use skewtest;
create table skew (id bigint, acct string) skewed by (acct) on ('CC','CH');
create index skew_indx on table skew (id) as 
'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' WITH DEFERRED 
REBUILD;

Last DDL fails with following error.
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.DDLTask. InvalidObjectException(message:Invalid 
skew column [acct])

When creating a table, Hive has sanity checks to make sure the columns have 
proper names and the skewed columns are a subset of the table columns. Here we 
fail because the index table has skewed column info. The index table's skewed 
columns include {acct} while its columns are {id, _bucketname, _offsets}. As 
the skewed column {acct} is not among the table columns, Hive throws the 
exception.

The reason the index table got skewed column info even though its definition 
has no such info is: when creating the index table, a deep copy of the base 
table's StorageDescriptor (SD) (in this case 'skew') is made. In that copied 
SD, index-specific parameters are set and unrelated parameters are reset. The 
skewed column info is not reset (there are a few other params that are not 
reset as well). That's why the index table contains the skewed column info.

Fix: Instead of deep copying the base table's StorageDescriptor, create a new 
one from the gathered info. This way the index table avoids inheriting 
unnecessary SD properties from the base table.
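The deep-copy pitfall behind this bug is generic. The Python sketch below uses hypothetical field names (not Hive's actual StorageDescriptor thrift fields) to show why building a fresh descriptor from gathered info is safer than copy-then-reset:

```python
import copy

base_sd = {
    "cols": ["id", "acct"],
    "skewed_col_names": ["acct"],   # skew info lives in the SD
    "location": "/warehouse/skew",
}

# Buggy approach: deep copy, then reset only the fields we remember about.
index_sd = copy.deepcopy(base_sd)
index_sd["cols"] = ["id", "_bucketname", "_offsets"]
index_sd["location"] = "/warehouse/skew_indx"
# skewed_col_names was forgotten: 'acct' is no longer a table column.
assert index_sd["skewed_col_names"] == ["acct"]  # stale, triggers the error

# Fixed approach: build a fresh SD containing only the gathered info.
index_sd_fixed = {
    "cols": ["id", "_bucketname", "_offsets"],
    "skewed_col_names": [],
    "location": "/warehouse/skew_indx",
}
assert not index_sd_fixed["skewed_col_names"]
```

With copy-then-reset, every new SD field added later becomes a potential leak; building fresh inverts the default so that omitted fields are simply absent.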





[jira] [Commented] (HIVE-5403) Move loading of filesystem, ugi, metastore client to hive session

2013-10-23 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803292#comment-13803292
 ] 

Gunther Hagleitner commented on HIVE-5403:
--

Another one: PerfLogger doesn't work on the backend with this change anymore. 
The problem is that SessionState now uses MetaException in the code path that 
starts the session, and MetaException is not available on the backend. 
PerfLogger has logic to determine whether it runs on the frontend or the 
backend; it does so by checking SessionState.get() == null.

That check cannot be executed anymore because loading SessionState tries to 
resolve MetaException (a metastore API class).

The easiest fix would be to collapse the exception handlers into a single one 
that catches Exception (a superclass of the metastore exception) and wraps it 
in a RuntimeException. Logically that's no different from what's done right 
now. Can we have a follow-up to this one as well?
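Java's classloading trigger doesn't translate literally to other languages, but the proposed handler-collapsing pattern does. A minimal Python sketch with hypothetical function names, assuming the shape of the fix described above:

```python
def connect_fails():
    # Stand-in for a metastore call that raises a library-specific exception.
    raise ValueError("no metastore")

def start_session(connect):
    """Catch the broad superclass and re-wrap, so the handler itself never
    needs to name (and therefore load) the library-specific exception class."""
    try:
        return connect()
    except Exception as e:  # superclass of the metastore exception
        raise RuntimeError("failed to start session") from e

# The original cause is preserved on the wrapped exception.
try:
    start_session(connect_fails)
except RuntimeError as e:
    assert isinstance(e.__cause__, ValueError)
```

The wrapped exception carries the original as its cause, so no diagnostic information is lost by widening the catch.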

 Move loading of filesystem, ugi, metastore client to hive session
 -

 Key: HIVE-5403
 URL: https://issues.apache.org/jira/browse/HIVE-5403
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Fix For: 0.13.0

 Attachments: HIVE-5403.1.patch, HIVE-5403.2.patch, HIVE-5403.3.patch, 
 HIVE-5403.4.patch


 As part of HIVE-5184, the metastore connection, loading filesystem were done 
 as part of the tez session so as to speed up query times while paying a cost 
 at startup. We can do this more generally in hive to apply to both the 
 mapreduce and tez side of things.





Review Request 14887: Subquery support: disallow nesting of SubQueries

2013-10-23 Thread Harish Butani

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14887/
---

Review request for hive and Ashutosh Chauhan.


Bugs: HIVE-5613
https://issues.apache.org/jira/browse/HIVE-5613


Repository: hive-git


Description
---

This is Restriction 9 from the SubQuery design doc:
We will not do algebraic transformations for these kinds of queries:

{noformat}
-- query 1
select ...
from x
where
x.b in (select u
        from y
        where y.c = 10 and
              exists (select m from z where z.A = x.C)
       )
-- query 2
select ...
from x
where
x.b in (select u
        from y
        where y.c = 10 and
              exists (select m from z where z.A = y.D)
       )
{noformat}


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/QB.java 50b5a77 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 6fc3cd5 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SubQueryUtils.java 2d7775c 
  ql/src/test/queries/clientnegative/subquery_nested_subquery.q PRE-CREATION 
  ql/src/test/results/clientnegative/subquery_nested_subquery.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/14887/diff/


Testing
---

tested subquery tests
added new subquery_nested_subquery.q negative test


Thanks,

Harish Butani



[jira] [Commented] (HIVE-5629) Fix two javadoc failures in HCatalog

2013-10-23 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803314#comment-13803314
 ] 

Eugene Koifman commented on HIVE-5629:
--

I see

 Fix two javadoc failures in HCatalog
 

 Key: HIVE-5629
 URL: https://issues.apache.org/jira/browse/HIVE-5629
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-5629.patch


 I am seeing two javadoc failures in HCatalog. These are not being seen by 
 PTest, and indeed I cannot reproduce them on my Mac but can on Linux. 
 Regardless, they should be fixed.





[jira] [Created] (HIVE-5632) Eliminate splits based on SARGs using stripe statistics in ORC

2013-10-23 Thread Prasanth J (JIRA)
Prasanth J created HIVE-5632:


 Summary: Eliminate splits based on SARGs using stripe statistics 
in ORC
 Key: HIVE-5632
 URL: https://issues.apache.org/jira/browse/HIVE-5632
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J


HIVE-5562 provides stripe-level statistics in ORC. Stripe-level statistics 
combined with predicate pushdown in ORC (HIVE-4246) can be used to eliminate 
the stripes (and thereby splits) that don't satisfy the predicate condition. 
This can greatly reduce unnecessary reads.
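The elimination idea can be modeled with per-stripe min/max statistics. This is an illustrative Python model with made-up numbers, not ORC's actual reader API:

```python
# Each stripe carries min/max column statistics; a SARG like `x > 100`
# can skip any stripe whose max falls below the bound before reading it.
stripes = [
    {"offset": 0,    "min": 1,   "max": 50},
    {"offset": 1000, "min": 40,  "max": 120},
    {"offset": 2000, "min": 200, "max": 900},
]

def stripes_for_predicate(stripes, lower_bound):
    """Keep only stripes whose [min, max] range can contain x > lower_bound."""
    return [s for s in stripes if s["max"] > lower_bound]

kept = stripes_for_predicate(stripes, 100)
assert [s["offset"] for s in kept] == [1000, 2000]  # first stripe eliminated
```

Because the statistics give only a range, the check can produce false positives (a kept stripe may still contain no matching rows) but never false negatives, so eliminated stripes are always safe to skip.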





[jira] [Updated] (HIVE-5632) Eliminate splits based on SARGs using stripe statistics in ORC

2013-10-23 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-5632:
-

Attachment: HIVE-5632.1.patch.txt

 Eliminate splits based on SARGs using stripe statistics in ORC
 --

 Key: HIVE-5632
 URL: https://issues.apache.org/jira/browse/HIVE-5632
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
  Labels: orcfile
 Attachments: HIVE-5632.1.patch.txt


 HIVE-5562 provides stripe-level statistics in ORC. Stripe-level statistics 
 combined with predicate pushdown in ORC (HIVE-4246) can be used to eliminate 
 the stripes (and thereby splits) that don't satisfy the predicate condition. 
 This can greatly reduce unnecessary reads.





[jira] [Commented] (HIVE-5632) Eliminate splits based on SARGs using stripe statistics in ORC

2013-10-23 Thread Prasanth J (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803342#comment-13803342
 ] 

Prasanth J commented on HIVE-5632:
--

This patch is generated on top of HIVE-5562. Test cases need to be added.

 Eliminate splits based on SARGs using stripe statistics in ORC
 --

 Key: HIVE-5632
 URL: https://issues.apache.org/jira/browse/HIVE-5632
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
  Labels: orcfile
 Attachments: HIVE-5632.1.patch.txt


 HIVE-5562 provides stripe-level statistics in ORC. Stripe-level statistics 
 combined with predicate pushdown in ORC (HIVE-4246) can be used to eliminate 
 the stripes (and thereby splits) that don't satisfy the predicate condition. 
 This can greatly reduce unnecessary reads.





[jira] [Updated] (HIVE-4388) HBase tests fail against Hadoop 2

2013-10-23 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-4388:
---

Status: Open  (was: Patch Available)

 HBase tests fail against Hadoop 2
 -

 Key: HIVE-4388
 URL: https://issues.apache.org/jira/browse/HIVE-4388
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Reporter: Gunther Hagleitner
Assignee: Brock Noland
 Attachments: HIVE-4388.10.patch, HIVE-4388.patch, HIVE-4388.patch, 
 HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, 
 HIVE-4388.patch, HIVE-4388.patch, HIVE-4388-wip.txt


 Currently we're building by default against 0.92. When you run against hadoop 
 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963.
 HIVE-3861 upgrades the version of hbase used. This will get you past the 
 problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396.





[jira] [Commented] (HIVE-4388) HBase tests fail against Hadoop 2

2013-10-23 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803348#comment-13803348
 ] 

Sushanth Sowmyan commented on HIVE-4388:


Ack, that's because I was building with a -Dhbase.version.with.hadoop.version 
whenever I built. Sorry, updating.

And agreed, it makes sense to remove that ${use.hbase.snapshot} special casing. 
Removing it.

 HBase tests fail against Hadoop 2
 -

 Key: HIVE-4388
 URL: https://issues.apache.org/jira/browse/HIVE-4388
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Reporter: Gunther Hagleitner
Assignee: Brock Noland
 Attachments: HIVE-4388.10.patch, HIVE-4388.patch, HIVE-4388.patch, 
 HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, 
 HIVE-4388.patch, HIVE-4388.patch, HIVE-4388-wip.txt


 Currently we're building by default against 0.92. When you run against hadoop 
 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963.
 HIVE-3861 upgrades the version of hbase used. This will get you past the 
 problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396.





[jira] [Updated] (HIVE-4388) HBase tests fail against Hadoop 2

2013-10-23 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-4388:
---

Attachment: HIVE-4388.11.patch

 HBase tests fail against Hadoop 2
 -

 Key: HIVE-4388
 URL: https://issues.apache.org/jira/browse/HIVE-4388
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Reporter: Gunther Hagleitner
Assignee: Brock Noland
 Attachments: HIVE-4388.10.patch, HIVE-4388.11.patch, HIVE-4388.patch, 
 HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, 
 HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388-wip.txt


 Currently we're building by default against 0.92. When you run against hadoop 
 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963.
 HIVE-3861 upgrades the version of hbase used. This will get you past the 
 problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396.





[jira] [Updated] (HIVE-4388) HBase tests fail against Hadoop 2

2013-10-23 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-4388:
---

Status: Patch Available  (was: Open)

 HBase tests fail against Hadoop 2
 -

 Key: HIVE-4388
 URL: https://issues.apache.org/jira/browse/HIVE-4388
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Reporter: Gunther Hagleitner
Assignee: Brock Noland
 Attachments: HIVE-4388.10.patch, HIVE-4388.11.patch, HIVE-4388.patch, 
 HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, 
 HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388-wip.txt


 Currently we're building by default against 0.92. When you run against hadoop 
 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963.
 HIVE-3861 upgrades the version of hbase used. This will get you past the 
 problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396.





Review Request 14890: Index creation on a skew table fails

2013-10-23 Thread Venki Korukanti

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14890/
---

Review request for hive, Ashutosh Chauhan and Thejas Nair.


Bugs: HIVE-5631
https://issues.apache.org/jira/browse/HIVE-5631


Repository: hive-git


Description
---

Repro steps:
CREATE DATABASE skewtest;
USE skewtest;
CREATE TABLE skew (id bigint, acct string) SKEWED BY (acct) ON ('CC','CH');
CREATE INDEX skew_indx ON TABLE skew (id) as 
'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' WITH DEFERRED 
REBUILD;

Last DDL fails with following error.
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.DDLTask. InvalidObjectException(message:Invalid 
skew column [acct])

When creating a table, Hive has sanity checks to make sure the columns have 
proper names and the skewed columns are a subset of the table columns. Here we 
fail because the index table has skewed column info. The index table's skewed 
columns include {acct} while its columns are {id, _bucketname, _offsets}. As 
the skewed column {acct} is not among the table columns, Hive throws the 
exception.

The reason the index table got skewed column info even though its definition 
has no such info is: when creating the index table, a deep copy of the base 
table's StorageDescriptor (SD) (in this case 'skew') is made. In that copied 
SD, index-specific parameters are set and unrelated parameters are reset. The 
skewed column info is not reset (there are a few other params that are not 
reset as well). That's why the index table contains the skewed column info.

Fix: Instead of deep copying the base table's StorageDescriptor, create a new 
one from the gathered info. This way the index table avoids inheriting 
unnecessary SD properties from the base table.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java b0f124b 
  ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java d0cbed6 
  ql/src/test/queries/clientpositive/index_skewtable.q PRE-CREATION 
  ql/src/test/results/clientpositive/index_skewtable.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/14890/diff/


Testing
---

Added unittest and ran the index related unittest queries


Thanks,

Venki Korukanti



[jira] [Commented] (HIVE-5631) Index creation on a skew table fails

2013-10-23 Thread Venki Korukanti (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803363#comment-13803363
 ] 

Venki Korukanti commented on HIVE-5631:
---

Review: https://reviews.apache.org/r/14890/

 Index creation on a skew table fails
 

 Key: HIVE-5631
 URL: https://issues.apache.org/jira/browse/HIVE-5631
 Project: Hive
  Issue Type: Bug
  Components: Database/Schema
Affects Versions: 0.12.0
Reporter: Venki Korukanti
Assignee: Venki Korukanti
 Fix For: 0.13.0


 REPRO STEPS:
 create database skewtest;
 use skewtest;
 create table skew (id bigint, acct string) skewed by (acct) on ('CC','CH');
 create index skew_indx on table skew (id) as 
 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' WITH DEFERRED 
 REBUILD;
 Last DDL fails with following error.
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask. 
 InvalidObjectException(message:Invalid skew column [acct])
 When creating a table, Hive has sanity checks to make sure the columns have 
 proper names and the skewed columns are a subset of the table columns. Here we 
 fail because the index table has skewed column info. The index table's skewed 
 columns include {acct} while its columns are {id, _bucketname, _offsets}. As 
 the skewed column {acct} is not among the table columns, Hive throws the 
 exception.
 The reason the index table got skewed column info even though its definition 
 has no such info is: when creating the index table, a deep copy of the base 
 table's StorageDescriptor (SD) (in this case 'skew') is made. In that copied 
 SD, index-specific parameters are set and unrelated parameters are reset. The 
 skewed column info is not reset (there are a few other params that are not 
 reset as well). That's why the index table contains the skewed column info.
 Fix: Instead of deep copying the base table's StorageDescriptor, create a new 
 one from the gathered info. This way the index table avoids inheriting 
 unnecessary SD properties from the base table.





Re: Review Request 14870: HIVE-5351: Secure-Socket-Layer (SSL) support for HiveServer2

2013-10-23 Thread Prasad Mujumdar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14870/
---

(Updated Oct. 23, 2013, 9:44 p.m.)


Review request for hive, Brock Noland and Thejas Nair.


Bugs: HIVE-5351
https://issues.apache.org/jira/browse/HIVE-5351


Repository: hive-git


Description
---

Add support for encrypted communication for plain SASL over the binary thrift 
transport:
 - Optional thrift SSL transport on the server side, if configured.
 - Optional thrift SSL transport for the JDBC client, with a configurable 
   trust store.
 - Added a MiniHS2 class for running a HiveServer2 instance in tests.
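The server/client split described here (server presents a certificate, client verifies it against a configurable trust store) can be sketched with Python's ssl module. The keystore/truststore paths in the comments are hypothetical, and this is an illustration of the TLS roles, not Hive's thrift transport code:

```python
import ssl

# Server side: would present a certificate loaded from the configured keystore.
server_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
# server_ctx.load_cert_chain("keystore.pem", "keystore.key")  # hypothetical paths

# Client side: verifies the server against a configurable trust store.
client_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
# client_ctx.load_verify_locations("truststore.pem")  # hypothetical path

# A TLS client context requires certificate verification by default.
assert client_ctx.verify_mode == ssl.CERT_REQUIRED
```

Wrapping the existing plain socket in such a context upgrades the transport to encrypted communication without changing the SASL layer above it.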


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java abfde42 
  data/files/keystore.jks PRE-CREATION 
  data/files/truststore.jks PRE-CREATION 
  eclipse-templates/TestJdbcMiniHS2.launchtemplate PRE-CREATION 
  jdbc/ivy.xml b9d0cea 
  jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java f155686 
  jdbc/src/test/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java PRE-CREATION 
  jdbc/src/test/org/apache/hive/jdbc/TestSSL.java PRE-CREATION 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
24b1832 
  service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java 5a66a6c 
  
service/src/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java 
9c8f5c1 
  service/src/test/org/apache/hive/service/miniHS2/AbstarctHiveService.java 
PRE-CREATION 
  service/src/test/org/apache/hive/service/miniHS2/MiniHS2.java PRE-CREATION 
  service/src/test/org/apache/hive/service/miniHS2/TestHiveServer2.java 
PRE-CREATION 
  shims/src/0.20/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java 623ebcd 

Diff: https://reviews.apache.org/r/14870/diff/


Testing
---

- Basic HiveServer2 test cases with miniHS2
- Added multiple test cases for SSL transport


Thanks,

Prasad Mujumdar



[jira] [Updated] (HIVE-5351) Secure-Socket-Layer (SSL) support for HiveServer2

2013-10-23 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-5351:
--

Attachment: HIVE-5351.2.patch

 Secure-Socket-Layer (SSL) support for HiveServer2
 -

 Key: HIVE-5351
 URL: https://issues.apache.org/jira/browse/HIVE-5351
 Project: Hive
  Issue Type: Improvement
  Components: Authorization, HiveServer2, JDBC
Affects Versions: 0.11.0, 0.12.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Attachments: HIVE-5351.1.patch, HIVE-5351.2.patch


 HiveServer2 and JDBC driver should support encrypted communication using SSL





[jira] [Updated] (HIVE-5351) Secure-Socket-Layer (SSL) support for HiveServer2

2013-10-23 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-5351:
--

Attachment: HIVE-5351.2.patch

 Secure-Socket-Layer (SSL) support for HiveServer2
 -

 Key: HIVE-5351
 URL: https://issues.apache.org/jira/browse/HIVE-5351
 Project: Hive
  Issue Type: Improvement
  Components: Authorization, HiveServer2, JDBC
Affects Versions: 0.11.0, 0.12.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Attachments: HIVE-5351.1.patch, HIVE-5351.2.patch


 HiveServer2 and JDBC driver should support encrypted communication using SSL





[jira] [Updated] (HIVE-5351) Secure-Socket-Layer (SSL) support for HiveServer2

2013-10-23 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-5351:
--

Attachment: (was: HIVE-5351.2.patch)

 Secure-Socket-Layer (SSL) support for HiveServer2
 -

 Key: HIVE-5351
 URL: https://issues.apache.org/jira/browse/HIVE-5351
 Project: Hive
  Issue Type: Improvement
  Components: Authorization, HiveServer2, JDBC
Affects Versions: 0.11.0, 0.12.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Attachments: HIVE-5351.1.patch, HIVE-5351.2.patch


 HiveServer2 and JDBC driver should support encrypted communication using SSL





[jira] [Updated] (HIVE-5631) Index creation on a skew table fails

2013-10-23 Thread Venki Korukanti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venki Korukanti updated HIVE-5631:
--

Attachment: HIVE-5631.1.patch.txt

 Index creation on a skew table fails
 

 Key: HIVE-5631
 URL: https://issues.apache.org/jira/browse/HIVE-5631
 Project: Hive
  Issue Type: Bug
  Components: Database/Schema
Affects Versions: 0.12.0
Reporter: Venki Korukanti
Assignee: Venki Korukanti
 Fix For: 0.13.0

 Attachments: HIVE-5631.1.patch.txt


 REPRO STEPS:
 create database skewtest;
 use skewtest;
 create table skew (id bigint, acct string) skewed by (acct) on ('CC','CH');
 create index skew_indx on table skew (id) as 
 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' WITH DEFERRED 
 REBUILD;
 Last DDL fails with following error.
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask. 
 InvalidObjectException(message:Invalid skew column [acct])
 When creating a table, Hive has sanity checks to make sure the columns have 
 proper names and the skewed columns are a subset of the table columns. Here we 
 fail because the index table has skewed column info. The index table's skewed 
 columns include {acct} while its columns are {id, _bucketname, _offsets}. As 
 the skewed column {acct} is not among the table columns, Hive throws the 
 exception.
 The reason the index table got skewed column info even though its definition 
 has no such info is: when creating the index table, a deep copy of the base 
 table's StorageDescriptor (SD) (in this case 'skew') is made. In that copied 
 SD, index-specific parameters are set and unrelated parameters are reset. The 
 skewed column info is not reset (there are a few other params that are not 
 reset as well). That's why the index table contains the skewed column info.
 Fix: Instead of deep copying the base table's StorageDescriptor, create a new 
 one from the gathered info. This way the index table avoids inheriting 
 unnecessary SD properties from the base table.





[jira] [Updated] (HIVE-5631) Index creation on a skew table fails

2013-10-23 Thread Venki Korukanti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venki Korukanti updated HIVE-5631:
--

Status: Patch Available  (was: Open)

 Index creation on a skew table fails
 

 Key: HIVE-5631
 URL: https://issues.apache.org/jira/browse/HIVE-5631
 Project: Hive
  Issue Type: Bug
  Components: Database/Schema
Affects Versions: 0.12.0
Reporter: Venki Korukanti
Assignee: Venki Korukanti
 Fix For: 0.13.0

 Attachments: HIVE-5631.1.patch.txt


 REPRO STEPS:
 create database skewtest;
 use skewtest;
 create table skew (id bigint, acct string) skewed by (acct) on ('CC','CH');
 create index skew_indx on table skew (id) as 
 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' WITH DEFERRED 
 REBUILD;
 Last DDL fails with following error.
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask. 
 InvalidObjectException(message:Invalid skew column [acct])
 When creating a table, Hive has sanity checks to make sure the columns have 
 proper names and the skewed columns are a subset of the table columns. Here we 
 fail because the index table has skewed column info. The index table's skewed 
 columns include {acct} while its columns are {id, _bucketname, _offsets}. As 
 the skewed column {acct} is not among the table columns, Hive throws the 
 exception.
 The reason the index table got skewed column info even though its definition 
 has no such info is: when creating the index table, a deep copy of the base 
 table's StorageDescriptor (SD) (in this case 'skew') is made. In that copied 
 SD, index-specific parameters are set and unrelated parameters are reset. The 
 skewed column info is not reset (there are a few other params that are not 
 reset as well). That's why the index table contains the skewed column info.
 Fix: Instead of deep copying the base table's StorageDescriptor, create a new 
 one from the gathered info. This way the index table avoids inheriting 
 unnecessary SD properties from the base table.





[jira] [Commented] (HIVE-5506) Hive SPLIT function does not return array correctly

2013-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803378#comment-13803378
 ] 

Hudson commented on HIVE-5506:
--

ABORTED: Integrated in Hive-trunk-hadoop2 #518 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/518/])
HIVE-5506 : Hive SPLIT function does not return array correctly (Vikram Dixit 
via Ashutosh Chauhan) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1534775)
* /hive/trunk/data/files/input.txt
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSplit.java
* /hive/trunk/ql/src/test/queries/clientpositive/split.q
* /hive/trunk/ql/src/test/results/clientpositive/split.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udf_split.q.out


 Hive SPLIT function does not return array correctly
 ---

 Key: HIVE-5506
 URL: https://issues.apache.org/jira/browse/HIVE-5506
 Project: Hive
  Issue Type: Bug
  Components: SQL, UDF
Affects Versions: 0.9.0, 0.10.0, 0.11.0
 Environment: Hive
Reporter: John Omernik
Assignee: Vikram Dixit K
 Fix For: 0.13.0

 Attachments: HIVE-5506.1.patch, HIVE-5506.2.patch


 Hello all, I think I have outlined a bug in the Hive split function:
 Summary: When calling split on a string of data, it will only return all 
 array items if the last array item has a value. For example, if I have a 
 string of text delimited by tab with 7 columns, and the first four are 
 filled but the last three are blank, split will only return a 4-position 
 array. If any number of middle columns are empty but the last item still 
 has a value, then it returns the proper number of columns. This was 
 tested in Hive 0.9 and Hive 0.11. 
 Data:
 (Note: \t represents a tab character (\x09); the line endings should be \n 
 (UNIX style), though I'm not sure what email will do to them.) Basically my 
 data is 7 lines, each with the first 7 letters separated by tabs. On some 
 lines I've left out certain letters but kept the number of tabs exactly the 
 same.  
 input.txt
 a\tb\tc\td\te\tf\tg
 a\tb\tc\td\te\t\tg
 a\tb\t\td\t\tf\tg
 \t\t\td\te\tf\tg
 a\tb\tc\td\t\t\t
 a\t\t\t\te\tf\tg
 a\t\t\td\t\t\tg
 I then created a table with one column from that data:
 DROP TABLE tmp_jo_tab_test;
 CREATE table tmp_jo_tab_test (message_line STRING)
 STORED AS TEXTFILE;
  
 LOAD DATA LOCAL INPATH '/tmp/input.txt'
 OVERWRITE INTO TABLE tmp_jo_tab_test;
 OK, just to validate, I created a Python counting script:
 #!/usr/bin/python
 
 import sys
 
 for line in sys.stdin:
     line = line[0:-1]
     out = line.split('\t')
     print len(out)
 The output there is:
 $ cat input.txt |./cnt_tabs.py
 7
 7
 7
 7
 7
 7
 7
 Based on that information, split on tab should return me 7 for each line as 
 well:
 hive -e "select size(split(message_line, '\\t')) from tmp_jo_tab_test;"
  
 7
 7
 7
 7
 4
 7
 7
 However, it does not. The line where only the first four letters are filled 
 in (and blanks are passed in for the last three) returns only 4 splits, where 
 there should technically be 7: 4 for the letters included and three blanks.
 a\tb\tc\td\t\t\t 
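
The discrepancy between the counting script and Hive is consistent with a known difference in split semantics: Python's str.split keeps trailing empty fields, while Java's String.split(regex), with its default limit of 0, discards trailing empty strings. A sketch of both behaviors in Python (the java_like_split helper is a hypothetical emulation of the Java default, not Hive's actual code):

```python
import re

line = "a\tb\tc\td\t\t\t"  # first four fields filled, last three blank

# Python's str.split keeps trailing empty fields: 7 elements,
# matching the counting script above.
assert line.split("\t") == ["a", "b", "c", "d", "", "", ""]

def java_like_split(s, sep):
    """Emulate Java's String.split(regex) with the default limit of 0,
    which discards trailing empty strings."""
    parts = re.split(re.escape(sep), s)
    while parts and parts[-1] == "":
        parts.pop()
    return parts

# Matches the 4 that Hive returned for this line.
assert java_like_split(line, "\t") == ["a", "b", "c", "d"]
```

In Java itself, passing a negative limit (e.g. split(regex, -1)) preserves trailing empty strings, which is the usual remedy for this class of bug.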





[jira] [Assigned] (HIVE-5216) Need to annotate public API in HCatalog

2013-10-23 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-5216:


Assignee: Eugene Koifman

 Need to annotate public API in HCatalog
 ---

 Key: HIVE-5216
 URL: https://issues.apache.org/jira/browse/HIVE-5216
 Project: Hive
  Issue Type: Bug
  Components: HCatalog, WebHCat
Affects Versions: 0.12.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-5216.patch


 need to annotate which API is considered public using something like
 @InterfaceAudience.Public
 @InterfaceStability.Evolving
 Currently this is what is considered (at a minimum) public API
 HCatLoader
 HCatStorer
 HCatInputFormat
 HCatOutputFormat
 HCatReader
 HCatWriter
 HCatRecord
 HCatSchema
 This is needed so that clients/dependent projects know which API they can 
 rely on and which can change w/o notice.





[jira] [Updated] (HIVE-5216) Need to annotate public API in HCatalog

2013-10-23 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-5216:
-

Status: Patch Available  (was: Open)

 Need to annotate public API in HCatalog
 ---

 Key: HIVE-5216
 URL: https://issues.apache.org/jira/browse/HIVE-5216
 Project: Hive
  Issue Type: Bug
  Components: HCatalog, WebHCat
Affects Versions: 0.12.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-5216.patch


 need to annotate which API is considered public using something like
 @InterfaceAudience.Public
 @InterfaceStability.Evolving
 Currently this is what is considered (at a minimum) public API
 HCatLoader
 HCatStorer
 HCatInputFormat
 HCatOutputFormat
 HCatReader
 HCatWriter
 HCatRecord
 HCatSchema
 This is needed so that clients/dependent projects know which API they can 
 rely on and which can change w/o notice.





[jira] [Updated] (HIVE-5216) Need to annotate public API in HCatalog

2013-10-23 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-5216:
-

Attachment: HIVE-5216.patch

 Need to annotate public API in HCatalog
 ---

 Key: HIVE-5216
 URL: https://issues.apache.org/jira/browse/HIVE-5216
 Project: Hive
  Issue Type: Bug
  Components: HCatalog, WebHCat
Affects Versions: 0.12.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-5216.patch


 need to annotate which API is considered public using something like
 @InterfaceAudience.Public
 @InterfaceStability.Evolving
 Currently this is what is considered (at a minimum) public API
 HCatLoader
 HCatStorer
 HCatInputFormat
 HCatOutputFormat
 HCatReader
 HCatWriter
 HCatRecord
 HCatSchema
 This is needed so that clients/dependent projects know which API they can 
 rely on and which can change w/o notice.





[jira] [Commented] (HIVE-5283) Merge vectorization branch to trunk

2013-10-23 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803389#comment-13803389
 ] 

Thejas M Nair commented on HIVE-5283:
-

Added fix version of 0.13 in addition to vectorization-branch for these 106 
jiras (fixed + fix-version=vectorization-branch).


 Merge vectorization branch to trunk
 ---

 Key: HIVE-5283
 URL: https://issues.apache.org/jira/browse/HIVE-5283
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Fix For: 0.13.0

 Attachments: alltypesorc, HIVE-5283.1.patch, HIVE-5283.2.patch, 
 HIVE-5283.3.patch, HIVE-5283.4.patch


 The purpose of this jira is to upload vectorization patch, run tests etc. The 
 actual work will continue under HIVE-4160 umbrella jira.





[jira] [Commented] (HIVE-5216) Need to annotate public API in HCatalog

2013-10-23 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803392#comment-13803392
 ] 

Thejas M Nair commented on HIVE-5216:
-

+1

 Need to annotate public API in HCatalog
 ---

 Key: HIVE-5216
 URL: https://issues.apache.org/jira/browse/HIVE-5216
 Project: Hive
  Issue Type: Bug
  Components: HCatalog, WebHCat
Affects Versions: 0.12.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-5216.patch


 need to annotate which API is considered public using something like
 @InterfaceAudience.Public
 @InterfaceStability.Evolving
 Currently this is what is considered (at a minimum) public API
 HCatLoader
 HCatStorer
 HCatInputFormat
 HCatOutputFormat
 HCatReader
 HCatWriter
 HCatRecord
 HCatSchema
 This is needed so that clients/dependent projects know which API they can 
 rely on and which can change w/o notice.




