Re: Review Request: HIVE-4356 - remove duplicate impersonation parameters for hiveserver2

2013-05-09 Thread Carl Steinbach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10554/#review20369
---



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
https://reviews.apache.org/r/10554/#comment41943

Why was hive.server2.enable.doAs chosen over 
hive.server2.enable.impersonation? Also, this behavior was disabled by default 
in both CDH4 and HDP 1.2. I'm not sure that flipping the switch now is a great 
idea, and long term (i.e. with fine-grained authorization) this is definitely 
the wrong default behavior.



service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java
https://reviews.apache.org/r/10554/#comment41944

I think this is a lot less clear now that the username code has been pulled 
out into a separate method. The name getUserName implies that it's extracting 
this information from TOpenSessionReq when that's not actually the case. Please 
move this back into OpenSession() and add a comment there if you think it's 
necessary.



service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java
https://reviews.apache.org/r/10554/#comment41945

Is refactoring this code into a separate protected method really necessary? 
Every other method in this class is self-contained. I'd like to see that 
pattern maintained.



service/src/test/org/apache/hive/service/auth/TestPlainSaslHelper.java
https://reviews.apache.org/r/10554/#comment41946

Formatting.



service/src/test/org/apache/hive/service/cli/thrift/TestThriftCLIService.java
https://reviews.apache.org/r/10554/#comment41947

Formatting



service/src/test/org/apache/hive/service/cli/thrift/TestThriftCLIService.java
https://reviews.apache.org/r/10554/#comment41950

Formatting



service/src/test/org/apache/hive/service/cli/thrift/TestThriftCLIService.java
https://reviews.apache.org/r/10554/#comment41948

Formatting



service/src/test/org/apache/hive/service/cli/thrift/TestThriftCLIService.java
https://reviews.apache.org/r/10554/#comment41949

Formatting


- Carl Steinbach


On April 16, 2013, 9:46 p.m., Thejas Nair wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/10554/
 ---
 
 (Updated April 16, 2013, 9:46 p.m.)
 
 
 Review request for hive.
 
 
 Description
 ---
 
 remove duplicate impersonation parameters for hiveserver2
 
 
 This addresses bug HIVE-4356.
 https://issues.apache.org/jira/browse/HIVE-4356
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 78d9cc9 
   conf/hive-default.xml.template e266ce7 
   service/src/java/org/apache/hive/service/auth/PlainSaslHelper.java 18d4aae 
   service/src/java/org/apache/hive/service/cli/CLIService.java b53599b 
   service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 43d79aa 
   service/src/test/org/apache/hive/service/auth/TestPlainSaslHelper.java PRE-CREATION 
   service/src/test/org/apache/hive/service/cli/thrift/TestThriftCLIService.java PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/10554/diff/
 
 
 Testing
 ---
 
 Unit tests included.
 Manually tested on (kerberos) secure and unsecure cluster.
 
 
 Thanks,
 
 Thejas Nair
 




[jira] [Commented] (HIVE-4498) TestBeeLineWithArgs.testPositiveScriptFile fails

2013-05-09 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13652802#comment-13652802
 ] 

Carl Steinbach commented on HIVE-4498:
--

@Thejas: Thanks for providing a patch. I verified that it fixes the problem. I 
also took a look at HIVE-4356 and left some comments on the original 
reviewboard request here: https://reviews.apache.org/r/10554/

I would really appreciate it if you take a look. Thanks.

 TestBeeLineWithArgs.testPositiveScriptFile fails
 

 Key: HIVE-4498
 URL: https://issues.apache.org/jira/browse/HIVE-4498
 Project: Hive
  Issue Type: Bug
  Components: CLI, HiveServer2, JDBC
Reporter: Thejas M Nair
Assignee: Thejas M Nair
Priority: Blocker
 Fix For: 0.11.0

 Attachments: HIVE-4498.1.patch


 TestBeeLineWithArgs.testPositiveScriptFile fails -
 {code}
[junit] 0: jdbc:hive2://localhost:1  STARTED 
 testBreakOnErrorScriptFile
 [junit] Output: Connecting to jdbc:hive2://localhost:1
 [junit] Connected to: Hive (version 0.12.0-SNAPSHOT)
 [junit] Driver: Hive (version 0.12.0-SNAPSHOT)
 [junit] Transaction isolation: TRANSACTION_REPEATABLE_READ
 [junit] Beeline version 0.12.0-SNAPSHOT by Apache Hive
 [junit] +----------------+
 [junit] | database_name  |
 [junit] +----------------+
 [junit] +----------------+
 [junit] No rows selected (0.899 seconds)
 [junit] Closing: org.apache.hive.jdbc.HiveConnection
 [junit]
 [junit]  FAILED testPositiveScriptFile (ERROR) (2s)
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4467) HiveConnection does not handle failures correctly

2013-05-09 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-4467:
---

Attachment: HIVE-4467_1.patch

Updated the patch on phabricator (https://reviews.facebook.net/D10629) and also 
uploaded it here (HIVE-4467_1.patch).

 HiveConnection does not handle failures correctly
 -

 Key: HIVE-4467
 URL: https://issues.apache.org/jira/browse/HIVE-4467
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Affects Versions: 0.11.0, 0.12.0
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Attachments: HIVE-4467_1.patch, HIVE-4467.patch


 HiveConnection uses the Utils.verifySuccess* routines to check for errors 
 from the server side, but they do not handle this well. In 
 Utils.verifySuccess(), when withInfo is 'false', the condition evaluates to 
 'false' and no SQLException is thrown even though there could be a problem on 
 the server.
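
 The failure mode described above can be reproduced in isolation. The 
 following is a minimal sketch, not the code in Utils.java: the StatusCode 
 enum and both method names are hypothetical stand-ins, and the "as reported" 
 condition is only an assumed shape that matches the description (a 
 conjunction that can never be true when withInfo is false).

```java
import java.sql.SQLException;

class VerifySuccessSketch {
    // Hypothetical stand-in for the server status codes (assumption, not the Thrift enum).
    enum StatusCode { SUCCESS, SUCCESS_WITH_INFO, ERROR }

    // Assumed shape of the reported check: when withInfo is false the second
    // operand is false, so the whole condition is false and nothing is thrown.
    static void verifySuccessAsReported(StatusCode code, boolean withInfo) throws SQLException {
        if (code != StatusCode.SUCCESS
                && (withInfo && code != StatusCode.SUCCESS_WITH_INFO)) {
            throw new SQLException("server reported " + code);
        }
    }

    // One possible correction: SUCCESS is always accepted, SUCCESS_WITH_INFO
    // only when the caller allows it; everything else raises SQLException.
    static void verifySuccessStrict(StatusCode code, boolean withInfo) throws SQLException {
        boolean ok = code == StatusCode.SUCCESS
                || (withInfo && code == StatusCode.SUCCESS_WITH_INFO);
        if (!ok) {
            throw new SQLException("server reported " + code);
        }
    }

    // Helper that reports whether the chosen variant throws for the given input.
    static boolean throwsFor(boolean strict, StatusCode code, boolean withInfo) {
        try {
            if (strict) {
                verifySuccessStrict(code, withInfo);
            } else {
                verifySuccessAsReported(code, withInfo);
            }
            return false;
        } catch (SQLException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // An ERROR status with withInfo == false: compare the two variants.
        System.out.println("as reported throws: " + throwsFor(false, StatusCode.ERROR, false));
        System.out.println("strict throws:      " + throwsFor(true, StatusCode.ERROR, false));
    }
}
```

 Writing the check as "is the status acceptable?" and throwing on the 
 negation keeps the withInfo flag from masking a genuine error status.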



[jira] [Updated] (HIVE-4457) Queries not supported by vectorized code path should fall back to non vector path.

2013-05-09 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4457:
---

Attachment: HIVE-4457.1.patch

 Queries not supported by vectorized code path should fall back to non vector 
 path.
 --

 Key: HIVE-4457
 URL: https://issues.apache.org/jira/browse/HIVE-4457
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4457.1.patch


 Queries not supported by vectorized code path should fall back to non vector 
 path.



[jira] [Updated] (HIVE-4115) Introduce cube abstraction in hive

2013-05-09 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4115:
--

Attachment: HIVE-4115.D10689.1.patch

Amareshwari requested code review of HIVE-4115 [jira] Introduce cube 
abstraction in hive.

Reviewers: JIRA

HIVE-4115. Cube Abstraction in Hive

We would like to define a cube abstraction so that users can query at the cube 
layer without needing to know anything about storage and rollups.

Will describe the model more in following comments.

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D10689

AFFECTED FILES
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/AbstractCubeTable.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/BaseDimension.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/ColumnMeasure.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/Cube.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/CubeDimension.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/CubeDimensionTable.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/CubeFactTable.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/CubeMeasure.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/CubeMetastoreClient.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/CubeTableType.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/ExprMeasure.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/HDFSStorage.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/HierarchicalDimension.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/InlineDimension.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/MetastoreConstants.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/MetastoreUtil.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/Named.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/ReferencedDimension.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/Storage.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/StorageConstants.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/TableReference.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/UpdatePeriod.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/parse/AggregateResolver.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/parse/AliasReplacer.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/parse/CheckColumnMapping.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/parse/CheckDateRange.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/parse/CheckTableNames.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/parse/ContextRewriter.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/parse/CubeQueryContext.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/parse/CubeQueryExpr.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/parse/CubeQueryRewriter.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/parse/CubeSemanticAnalyzer.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/parse/DateUtil.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/parse/GroupbyResolver.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/parse/HQLParser.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/parse/JoinResolver.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/parse/LeastDimensionResolver.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/parse/LeastPartitionResolver.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/parse/PartitionResolver.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/parse/StorageTableResolver.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/parse/ValidationRule.java
  ql/src/java/org/apache/hadoop/hive/ql/cube/processors/CubeDriver.java
  ql/src/java/org/apache/hadoop/hive/ql/processors/CommandProcessorFactory.java
  ql/src/test/org/apache/hadoop/hive/ql/cube/metadata/TestCubeMetastoreClient.java
  ql/src/test/org/apache/hadoop/hive/ql/cube/parse/CubeTestSetup.java
  ql/src/test/org/apache/hadoop/hive/ql/cube/parse/TestCubeSemanticAnalyzer.java
  ql/src/test/org/apache/hadoop/hive/ql/cube/parse/TestDateUtil.java
  ql/src/test/org/apache/hadoop/hive/ql/cube/parse/TestMaxUpdateInterval.java
  ql/src/test/org/apache/hadoop/hive/ql/cube/processors/TestCubeDriver.java

MANAGE HERALD RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/25647/

To: JIRA, Amareshwari


 Introduce cube abstraction in hive
 --

 Key: HIVE-4115
 URL: https://issues.apache.org/jira/browse/HIVE-4115
 Project: Hive
  Issue Type: New Feature
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
 Attachments: cube-design-2.pdf, cube-design.docx, 
 HIVE-4115.D10689.1.patch


 We 

[jira] [Commented] (HIVE-4115) Introduce cube abstraction in hive

2013-05-09 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13652819#comment-13652819
 ] 

Amareshwari Sriramadasu commented on HIVE-4115:
---

The branch HIVE-4115 is ready for review. Also, created the phabricator entry.

Changes include :
* ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/ has classes for the Cube 
Metastore, and CubeMetastoreClient.java has the API to create cube, fact and 
dimension tables.
* ql/src/java/org/apache/hadoop/hive/ql/cube/parse/ has code for validating the 
cube QL and converting it to HQL involving the final storage tables.
* ql/src/java/org/apache/hadoop/hive/ql/cube/processors/CubeDriver.java is the 
entry point for cube queries. If a query starts with 'cube', it will be 
processed by CubeDriver.
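
The 'cube' prefix routing described in the last item can be pictured as a 
first-token check. The sketch below is only an illustration and is not taken 
from the patch: the Handler enum, the route method, and the case-insensitive 
match are all assumptions made for this example.

```java
import java.util.Locale;

class CubeDispatchSketch {
    // Hypothetical handler labels; in the patch the cube path is CubeDriver.
    enum Handler { CUBE_DRIVER, DEFAULT_DRIVER }

    // Route on the first whitespace-delimited token of the query.
    // Matching case-insensitively is an assumption of this sketch.
    static Handler route(String query) {
        String[] tokens = query.trim().split("\\s+");
        String first = tokens[0].toLowerCase(Locale.ROOT);
        return "cube".equals(first) ? Handler.CUBE_DRIVER : Handler.DEFAULT_DRIVER;
    }

    public static void main(String[] args) {
        System.out.println(route("cube select measure1 from sales"));
        System.out.println(route("select * from sales"));
    }
}
```

Comparing the whole first token, rather than using a plain prefix test, keeps 
a statement that merely begins with the letters "cube" (for example a table 
named cubes) on the regular path.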

Will add Cube DDL in a followup jira.

 Introduce cube abstraction in hive
 --

 Key: HIVE-4115
 URL: https://issues.apache.org/jira/browse/HIVE-4115
 Project: Hive
  Issue Type: New Feature
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
 Attachments: cube-design-2.pdf, cube-design.docx, 
 HIVE-4115.D10689.1.patch


 We would like to define a cube abstraction so that users can query at the cube 
 layer without needing to know anything about storage and rollups. 
 Will describe the model more in following comments.



[jira] [Commented] (HIVE-4498) TestBeeLineWithArgs.testPositiveScriptFile fails

2013-05-09 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13652829#comment-13652829
 ] 

Carl Steinbach commented on HIVE-4498:
--

+1. If someone else can test this that would be great.

 TestBeeLineWithArgs.testPositiveScriptFile fails
 

 Key: HIVE-4498
 URL: https://issues.apache.org/jira/browse/HIVE-4498
 Project: Hive
  Issue Type: Bug
  Components: CLI, HiveServer2, JDBC
Reporter: Thejas M Nair
Assignee: Thejas M Nair
Priority: Blocker
 Fix For: 0.11.0

 Attachments: HIVE-4498.1.patch


 TestBeeLineWithArgs.testPositiveScriptFile fails -
 {code}
[junit] 0: jdbc:hive2://localhost:1  STARTED 
 testBreakOnErrorScriptFile
 [junit] Output: Connecting to jdbc:hive2://localhost:1
 [junit] Connected to: Hive (version 0.12.0-SNAPSHOT)
 [junit] Driver: Hive (version 0.12.0-SNAPSHOT)
 [junit] Transaction isolation: TRANSACTION_REPEATABLE_READ
 [junit] Beeline version 0.12.0-SNAPSHOT by Apache Hive
 [junit] +----------------+
 [junit] | database_name  |
 [junit] +----------------+
 [junit] +----------------+
 [junit] No rows selected (0.899 seconds)
 [junit] Closing: org.apache.hive.jdbc.HiveConnection
 [junit]
 [junit]  FAILED testPositiveScriptFile (ERROR) (2s)
 {code}



[jira] [Commented] (HIVE-4494) ORC map columns get class cast exception in some context

2013-05-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13652874#comment-13652874
 ] 

Hudson commented on HIVE-4494:
--

Integrated in Hive-trunk-h0.21 #2093 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2093/])
HIVE-4494 ORC map columns get class cast exception in some contexts 
(omalley) (Revision 1480460)

 Result = FAILURE
omalley : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1480460
Files : 
* /hive/trunk
* /hive/trunk/data/files/orc_create.txt
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcStruct.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcUnion.java
* /hive/trunk/ql/src/test/queries/clientpositive/orc_create.q
* /hive/trunk/ql/src/test/results/clientpositive/orc_create.q.out


 ORC map columns get class cast exception in some context
 

 Key: HIVE-4494
 URL: https://issues.apache.org/jira/browse/HIVE-4494
 Project: Hive
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 0.11.0

 Attachments: HIVE-4494.D10653.1.patch, HIVE-4494.D10653.2.patch


 Setting up the test case like:
 {quote}
 create table map_text (
   name string,
   m map<string,string>
 ) row format delimited
 fields terminated by '|'
 collection items terminated by ','
 map keys terminated by ':';
 create table map_orc (
   name string,
   m map<string,string>
 ) stored as orc;
 cat map.txt
 name1|key11:value11,key12:value12,key13:value13
 name2|key21:value21,key22:value22,key23:value23
 name3|key31:value31,key32:value32,key33:value33
 load data local   inpath 'map.txt' into table map_text;
 insert overwrite table map_orc select * from map_text;
 {quote}



[jira] [Commented] (HIVE-4500) HS2 holding too many file handles of hive_job_log_hive_*.txt files

2013-05-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13652873#comment-13652873
 ] 

Hudson commented on HIVE-4500:
--

Integrated in Hive-trunk-h0.21 #2093 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2093/])
HIVE-4500 Ensure that HiveServer 2 closes log files. (Alan Gates via 
omalley) (Revision 1480390)

 Result = FAILURE
omalley : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1480390
Files : 
* /hive/trunk
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java
* /hive/trunk/service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java
* /hive/trunk/service/src/java/org/apache/hive/service/cli/operation/Operation.java
* /hive/trunk/service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java


 HS2 holding too many file handles of hive_job_log_hive_*.txt files
 --

 Key: HIVE-4500
 URL: https://issues.apache.org/jira/browse/HIVE-4500
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.11.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.11.0

 Attachments: HIVE-4500-2.patch, HIVE-4500-3.patch, HIVE-4500.patch


 In the hiveserver2 setup used for testing, we see that it has 2444 files open 
 and of them 2152 are /tmp/hive/hive_job_log_hive_*.txt files
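
 The general pattern behind this kind of leak, and one way to avoid it, can 
 be sketched as below. This is not the HIVE-4500 patch: HistoryLog and 
 Session are hypothetical classes invented for the example, and the only 
 point illustrated is that whoever opens a per-session log file also closes 
 it when the session ends.

```java
import java.io.Closeable;
import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.UncheckedIOException;

class SessionLogSketch {
    // Hypothetical per-session history log (assumption, not Hive's HiveHistory).
    static final class HistoryLog implements Closeable {
        private final PrintWriter out;
        private boolean closed;

        HistoryLog(File file) throws IOException {
            this.out = new PrintWriter(file, "UTF-8");
        }

        void record(String event) {
            out.println(event);
        }

        boolean isClosed() {
            return closed;
        }

        @Override
        public void close() {
            if (!closed) {
                out.close(); // releases the underlying file handle
                closed = true;
            }
        }
    }

    // Hypothetical session that owns exactly one log and closes it on close().
    static final class Session implements Closeable {
        final HistoryLog log;

        Session(File logFile) throws IOException {
            this.log = new HistoryLog(logFile);
        }

        void run(String query) {
            log.record("QUERY " + query);
        }

        @Override
        public void close() {
            log.close();
        }
    }

    // Opens a session, runs one statement, and reports whether the log was closed.
    static boolean demo() {
        try {
            File f = File.createTempFile("hive_job_log_sketch_", ".txt");
            f.deleteOnExit();
            Session s = new Session(f);
            try {
                s.run("show databases");
            } finally {
                s.close(); // without this, every session would leave a handle open
            }
            return s.log.isClosed();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("log closed after session: " + demo());
    }
}
```

 Making close() idempotent means the session and any cleanup path can both 
 call it without tracking who got there first.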



[jira] [Commented] (HIVE-4233) The TGT gotten from class 'CLIService' should be renewed on time

2013-05-09 Thread Dongyong Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13652877#comment-13652877
 ] 

Dongyong Wang commented on HIVE-4233:
-

Thanks to Thejas and Jitendra for the replies. I agree with your solution; my 
patch is too complex. I will wait for your new patch.

 The TGT gotten from class 'CLIService'  should be renewed on time
 -

 Key: HIVE-4233
 URL: https://issues.apache.org/jira/browse/HIVE-4233
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.10.0
 Environment: CentOS release 6.3 (Final)
 jdk1.6.0_31
 HiveServer2  0.10.0-cdh4.2.0
 Kerberos Security 
Reporter: Dongyong Wang
Priority: Critical
 Attachments: 0001-FIX-HIVE-4233.patch


 When HiveServer2 has been running for more than 7 days and I use the beeline 
 shell to connect to it, all operations fail.
 The HiveServer2 log shows that this is caused by a Kerberos authentication 
 failure; the exception stack trace is:
 2013-03-26 11:55:20,932 ERROR hive.ql.metadata.Hive: 
 java.lang.RuntimeException: Unable to instantiate 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient
 at 
 org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1084)
 at 
 org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.init(RetryingMetaStoreClient.java:51)
 at 
 org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:61)
 at 
 org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2140)
 at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2151)
 at 
 org.apache.hadoop.hive.ql.metadata.Hive.getDelegationToken(Hive.java:2275)
 at 
 org.apache.hive.service.cli.CLIService.getDelegationTokenFromMetaStore(CLIService.java:358)
 at 
 org.apache.hive.service.cli.thrift.ThriftCLIService.OpenSession(ThriftCLIService.java:127)
 at 
 org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1073)
 at 
 org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1058)
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
 at 
 org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge20S.java:565)
 at 
 org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: java.lang.reflect.InvocationTargetException
 at sun.reflect.GeneratedConstructorAccessor52.newInstance(Unknown 
 Source)
 at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
 at 
 org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1082)
 ... 16 more
 Caused by: java.lang.IllegalStateException: This ticket is no longer valid
 at 
 javax.security.auth.kerberos.KerberosTicket.toString(KerberosTicket.java:601)
 at java.lang.String.valueOf(String.java:2826)
 at java.lang.StringBuilder.append(StringBuilder.java:115)
 at 
 sun.security.jgss.krb5.SubjectComber.findAux(SubjectComber.java:120)
 at sun.security.jgss.krb5.SubjectComber.find(SubjectComber.java:41)
 at sun.security.jgss.krb5.Krb5Util.getTicket(Krb5Util.java:130)
 at 
 sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:328)
 at java.security.AccessController.doPrivileged(Native Method)
 at 
 sun.security.jgss.krb5.Krb5InitCredential.getTgt(Krb5InitCredential.java:325)
 at 
 sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:128)
 at 
 sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:106)
 at 
 sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:172)
 at 
 sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:209)
 at 
 sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:195)
 at 
 sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:162)
 at 
 com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:175)
 at 
 

[jira] [Updated] (HIVE-4502) NPE - subquery smb joins fails

2013-05-09 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-4502:


Assignee: Navis
  Status: Patch Available  (was: Open)

 NPE - subquery smb joins fails
 --

 Key: HIVE-4502
 URL: https://issues.apache.org/jira/browse/HIVE-4502
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Navis
 Attachments: HIVE-4502.D10695.1.patch, smb_mapjoin_25.q


 Found this issue while running some SMB joins. Attaching test case that 
 causes this error.



[jira] [Updated] (HIVE-4502) NPE - subquery smb joins fails

2013-05-09 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4502:
--

Attachment: HIVE-4502.D10695.1.patch

navis requested code review of HIVE-4502 [jira] NPE - subquery smb joins 
fails.

Reviewers: JIRA

HIVE-4502 NPE - subquery smb joins fails

Found this issue while running some SMB joins. Attaching test case that causes 
this error.

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D10695

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java
  ql/src/java/org/apache/hadoop/hive/ql/lib/DefaultGraphWalker.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMROperator.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRProcContext.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRRedSink1.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRRedSink2.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRRedSink3.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRUnion1.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenMapRedWalker.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  ql/src/test/queries/clientpositive/smb_mapjoin_25.q
  ql/src/test/results/clientpositive/auto_smb_mapjoin_14.q.out
  ql/src/test/results/clientpositive/auto_sortmerge_join_6.q.out
  ql/src/test/results/clientpositive/auto_sortmerge_join_9.q.out
  ql/src/test/results/clientpositive/bucketmapjoin1.q.out
  ql/src/test/results/clientpositive/bucketmapjoin2.q.out
  ql/src/test/results/clientpositive/bucketmapjoin3.q.out
  ql/src/test/results/clientpositive/bucketmapjoin4.q.out
  ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out
  ql/src/test/results/clientpositive/bucketmapjoin_negative2.q.out
  ql/src/test/results/clientpositive/bucketsortoptimize_insert_2.q.out
  ql/src/test/results/clientpositive/bucketsortoptimize_insert_4.q.out
  ql/src/test/results/clientpositive/bucketsortoptimize_insert_5.q.out
  ql/src/test/results/clientpositive/bucketsortoptimize_insert_6.q.out
  ql/src/test/results/clientpositive/bucketsortoptimize_insert_7.q.out
  ql/src/test/results/clientpositive/bucketsortoptimize_insert_8.q.out
  ql/src/test/results/clientpositive/mapjoin_distinct.q.out
  ql/src/test/results/clientpositive/smb_mapjoin9.q.out
  ql/src/test/results/clientpositive/smb_mapjoin_11.q.out
  ql/src/test/results/clientpositive/smb_mapjoin_12.q.out
  ql/src/test/results/clientpositive/smb_mapjoin_14.q.out
  ql/src/test/results/clientpositive/smb_mapjoin_25.q.out
  ql/src/test/results/clientpositive/smb_mapjoin_6.q.out

MANAGE HERALD RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/25653/

To: JIRA, navis


 NPE - subquery smb joins fails
 --

 Key: HIVE-4502
 URL: https://issues.apache.org/jira/browse/HIVE-4502
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Navis
 Attachments: HIVE-4502.D10695.1.patch, smb_mapjoin_25.q


 Found this issue while running some SMB joins. Attaching test case that 
 causes this error.



[jira] [Created] (HIVE-4528) auto_join14.q is failing due to changed configuration by HIVE-4146

2013-05-09 Thread Navis (JIRA)
Navis created HIVE-4528:
---

 Summary: auto_join14.q is failing due to changed configuration by 
HIVE-4146
 Key: HIVE-4528
 URL: https://issues.apache.org/jira/browse/HIVE-4528
 Project: Hive
  Issue Type: Test
  Components: Tests
Reporter: Navis
Assignee: Navis
Priority: Trivial


The default value of hive.auto.convert.join.noconditionaltask was changed to 
true by HIVE-4146, but the result of auto_join14.q does not seem to reflect 
that change.



Hive-trunk-h0.21 - Build # 2093 - Still Failing

2013-05-09 Thread Apache Jenkins Server
Changes for Build #2070
[hashutosh] HIVE-4278 : HCat needs to get current Hive jars instead of pulling 
them from maven repo (Sushanth Sowmyan via Ashutosh Chauhan)


Changes for Build #2071
[khorgath] HCATALOG-621 bin/hcat should include hbase jar and dependencies in 
the classpath (Nick Dimiduk via Sushanth Sowmyan)

[omalley] HIVE-4178 : ORC fails with files with different numbers of columns


Changes for Build #2072
[hashutosh] HIVE-4304 : Remove unused builtins and pdk submodules (Travis 
Crawford via Ashutosh Chauhan)

[namit] HIVE-4310 optimize count(distinct) with hive.map.groupby.sorted
(Namit Jain via Gang Tim Liu)

[hashutosh] HIVE-4356 :  remove duplicate impersonation parameters for 
hiveserver2 (Gunther Hagleitner via Ashutosh Chauhan)

[hashutosh] HIVE-4318 : OperatorHooks hit performance even when not used 
(Gunther Hagleitner via Ashutosh Chauhan)

[hashutosh] HIVE-4378 : Counters hit performance even when not used (Gunther 
Hagleitner via Ashutosh Chauhan)

[omalley] HIVE-4189 : ORC fails with String column that ends in lots of nulls 
(Kevin Wilfong)


Changes for Build #2073
[hashutosh] 4248 : Implement a memory manager for ORC. Missed one test file. 
(Owen Omalley via Ashutosh Chauhan)

[hashutosh] HIVE-4248 : Implement a memory manager for ORC (Owen Omalley via 
Ashutosh Chauhan)

[hashutosh] HIVE-4103 : Remove System.gc() call from the map-join local-task 
loop (Gopal V via Ashutosh Chauhan)


Changes for Build #2074
[namit] HIVE-4371 some issue with merging join trees
(Navis via namit)

[hashutosh] HIVE-4333 : most windowing tests fail on hadoop 2 (Harish Butani 
via Ashutosh Chauhan)

[namit] HIVE-4342 NPE for query involving UNION ALL with nested JOIN and UNION 
ALL
(Navis via namit)

[hashutosh] HIVE-4364 : beeline always exits with 0 status, should exit with 
non-zero status on error (Rob Weltman via Ashutosh Chauhan)

[hashutosh] HIVE-4130 : Bring the Lead/Lag UDFs interface in line with Lead/Lag 
UDAFs (Harish Butani via Ashutosh Chauhan)


Changes for Build #2075
[hashutosh] HIVE-2379 : Hive/HBase integration could be improved (Navis via 
Ashutosh Chauhan)

[hashutosh] HIVE-4295 : Lateral view makes invalid result if CP is disabled 
(Navis via Ashutosh Chauhan)

[hashutosh] HIVE-4365 : wrong result in left semi join (Navis via Ashutosh 
Chauhan)

[hashutosh] HIVE-3861 : Upgrade hbase dependency to 0.94 (Gunther Hagleitner 
via Ashutosh Chauhan)


Changes for Build #2076
[hashutosh] HIVE-3891 : physical optimizer changes for auto sort-merge join 
(Namit Jain via Ashutosh Chauhan)

[namit] HIVE-4393 Make the deleteData flag accessable from DropTable/Partition 
events
(Morgan Philips via namit)

[hashutosh] HIVE-4394 : test leadlag.q fails (Ashutosh Chauhan)

[namit] HIVE-4018 MapJoin failing with Distributed Cache error
(Amareshwari Sriramadasu via Namit Jain)


Changes for Build #2077
[namit] HIVE-4300 ant thriftif generated code that is checkedin is not 
up-to-date
(Roshan Naik via namit)


Changes for Build #2078
[namit] HIVE-4409 Prevent incompatible column type changes
(Dilip Joseph via namit)

[namit] HIVE-4095 Add exchange partition in Hive
(Dheeraj Kumar Singh via namit)

[namit] HIVE-4005 Column truncation
(Kevin Wilfong via namit)

[namit] HIVE-3952 merge map-job followed by map-reduce job
(Vinod Kumar Vavilapalli via namit)

[hashutosh] HIVE-4412 : PTFDesc tries serialize transient fields like OIs, etc. 
(Navis via Ashutosh Chauhan)

[khorgath] HIVE-4419 : webhcat - support ${WEBHCAT_PREFIX}/conf/ as config 
directory (Thejas M Nair via Sushanth Sowmyan)

[namit] HIVE-4181 Star argument without table alias for UDTF is not working
(Navis via namit)

[hashutosh] HIVE-4407 : TestHCatStorer.testStoreFuncAllSimpleTypes fails 
because of null case difference (Thejas Nair via Ashutosh Chauhan)

[hashutosh] HIVE-4369 : Many new failures on hadoop 2 (Vikram Dixit via 
Ashutosh Chauhan)


Changes for Build #2079
[namit] HIVE-4424 MetaStoreUtils.java.orig checked in mistakenly by HIVE-4409
(Namit Jain)

[hashutosh] HIVE-4358 : Check for Map side processing in PTFOp is no longer 
valid (Harish Butani via Ashutosh Chauhan)


Changes for Build #2080
[navis] HIVE-4068 Size of aggregation buffer which uses non-primitive type is 
not estimated correctly (Navis)

[khorgath] HIVE-4420 : HCatalog unit tests stop after a failure (Alan Gates via 
Sushanth Sowmyan)

[hashutosh] HIVE-3708 : Add mapreduce workflow information to job configuration 
(Billie Rinaldi via Ashutosh Chauhan)


Changes for Build #2081

Changes for Build #2082
[hashutosh] HIVE-4423 : Improve RCFile::sync(long) 10x (Gopal V via Ashutosh 
Chauhan)

[hashutosh] HIVE-4398 : HS2 Resource leak: operation handles not cleaned when 
originating session is closed (Ashish Vaidya via Ashutosh Chauhan)

[hashutosh] HIVE-4019 : Ability to create and drop temporary partition function 
(Brock Noland via Ashutosh Chauhan)


Changes for Build #2083
[navis] HIVE-4437 Missing file on HIVE-4068 (Navis)


Changes for Build #2084

Changes 

[jira] [Updated] (HIVE-4528) auto_join14.q is failing due to changed configuration by HIVE-4146

2013-05-09 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-4528:


Status: Patch Available  (was: Open)

 auto_join14.q is failing due to changed configuration by HIVE-4146
 --

 Key: HIVE-4528
 URL: https://issues.apache.org/jira/browse/HIVE-4528
 Project: Hive
  Issue Type: Test
  Components: Tests
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-4528.D10701.1.patch


 The default value of hive.auto.convert.join.noconditionaltask was changed to 
 true by HIVE-4146, but the expected output of auto_join14.q does not appear 
 to have been updated to match.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4528) auto_join14.q is failing due to changed configuration by HIVE-4146

2013-05-09 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4528:
--

Attachment: HIVE-4528.D10701.1.patch

navis requested code review of HIVE-4528 [jira] auto_join14.q is failing due 
to changed configuration by HIVE-4146.

Reviewers: JIRA

HIVE-4528 auto_join14.q is failing due to changed configuration by HIVE-4146

The default value of hive.auto.convert.join.noconditionaltask was changed to 
true by HIVE-4146, but the expected output of auto_join14.q does not appear 
to have been updated to match.

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D10701

AFFECTED FILES
  ql/src/test/queries/clientpositive/auto_join14.q
  ql/src/test/results/clientpositive/auto_join14.q.out

To: JIRA, navis


 auto_join14.q is failing due to changed configuration by HIVE-4146
 --

 Key: HIVE-4528
 URL: https://issues.apache.org/jira/browse/HIVE-4528
 Project: Hive
  Issue Type: Test
  Components: Tests
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-4528.D10701.1.patch


 The default value of hive.auto.convert.join.noconditionaltask was changed to 
 true by HIVE-4146, but the expected output of auto_join14.q does not appear 
 to have been updated to match.



[jira] [Updated] (HIVE-4209) Cache evaluation result of deterministic expression and reuse it

2013-05-09 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-4209:


Fix Version/s: 0.12.0
   Status: Patch Available  (was: Open)

Committed to trunk. Thanks for the review, Namit.

 Cache evaluation result of deterministic expression and reuse it
 

 Key: HIVE-4209
 URL: https://issues.apache.org/jira/browse/HIVE-4209
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Fix For: 0.12.0

 Attachments: HIVE-4209.6.patch.txt, HIVE-4209.D9585.1.patch, 
 HIVE-4209.D9585.2.patch, HIVE-4209.D9585.3.patch, HIVE-4209.D9585.4.patch, 
 HIVE-4209.D9585.5.patch


 For example, 
 {noformat}
 select key from src where key + 1 > 100 AND key + 1 < 200 limit 3;
 {noformat}
 key + 1 does not need to be evaluated twice.
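
 As a rough illustration of the idea (a minimal Python sketch, not Hive's 
 implementation; CachedExpr, filter_rows and the evaluation counter are names 
 made up for this example, and the two predicates are assumed to be a lower 
 and an upper bound check on key + 1), a deterministic sub-expression can be 
 evaluated once per row and its result reused by every predicate that 
 references it:

```python
class CachedExpr:
    """Wraps a deterministic expression and remembers its value for the current row."""

    def __init__(self, fn):
        self.fn = fn
        self.evaluations = 0      # how many times fn actually ran
        self._row = None          # row object the cached value belongs to
        self._has_value = False
        self._value = None

    def evaluate(self, row):
        # Re-evaluate only when we move on to a different row object.
        if not self._has_value or self._row is not row:
            self._value = self.fn(row)
            self._row = row
            self._has_value = True
            self.evaluations += 1
        return self._value


def filter_rows(rows, limit):
    """Return (matching keys, number of real evaluations of key + 1)."""
    key_plus_one = CachedExpr(lambda row: row["key"] + 1)
    out = []
    for row in rows:
        # Both predicates share the same cached sub-expression.
        if key_plus_one.evaluate(row) > 100 and key_plus_one.evaluate(row) < 200:
            out.append(row["key"])
            if len(out) == limit:
                break
    return out, key_plus_one.evaluations
```

 The cache is only safe because the expression is deterministic; something 
 like rand() would have to be evaluated at every reference.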



[jira] [Commented] (HIVE-3746) TRowSet resultset structure should be column-oriented

2013-05-09 Thread Phil Prudich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13652978#comment-13652978
 ] 

Phil Prudich commented on HIVE-3746:


bq. This limitation is a consequence of the fact that we're using a message 
oriented RPC layer (Thrift) to handle communication and data transfer between 
the client and server.

This limitation is only present for a client that is linked to the Thrift 
library. A client that is coded directly to the protocol itself is free to 
leave outstanding data on the wire while it returns the initial results.

 TRowSet resultset structure should be column-oriented
 -

 Key: HIVE-3746
 URL: https://issues.apache.org/jira/browse/HIVE-3746
 Project: Hive
  Issue Type: Sub-task
  Components: Server Infrastructure
Reporter: Carl Steinbach
Assignee: Carl Steinbach
  Labels: HiveServer2





Committer review of HIVE-4508

2013-05-09 Thread Owen O'Malley
All,
   Can a committer review the changes I made to address Carl's suggestions
about the Hive 0.11 release? It will only take a minute to review unless
you read all of the discussion about whether it needs to be reviewed.
*smile*

https://issues.apache.org/jira/browse/HIVE-4508

Thanks!
   Owen


[jira] [Updated] (HIVE-4513) disable hivehistory logs by default

2013-05-09 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-4513:


Attachment: HIVE-4513.2.patch

HIVE-4513.2.patch - 
The SessionState temp file is created in the same directory as the Hive history 
files, but it was relying on HiveHistory to create that directory. SessionState 
now creates the directory itself if it does not exist.
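
The create-if-not-exist pattern can be sketched as below. This is a Python 
illustration only, not the patch itself; ensure_dir, 
create_session_temp_file and the "session_" prefix are hypothetical names 
chosen for the example.

```python
import os
import tempfile


def ensure_dir(path):
    """Create path (and any missing parents); do nothing if it already exists."""
    os.makedirs(path, exist_ok=True)
    return path


def create_session_temp_file(history_dir, prefix="session_"):
    # Do not rely on another component having created the directory first.
    ensure_dir(history_dir)
    fd, name = tempfile.mkstemp(prefix=prefix, dir=history_dir)
    os.close(fd)
    return name
```

Because the directory check is idempotent, it does not matter whether the 
history component or the session code runs first.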


 disable hivehistory logs by default
 ---

 Key: HIVE-4513
 URL: https://issues.apache.org/jira/browse/HIVE-4513
 Project: Hive
  Issue Type: Bug
  Components: Configuration, Logging
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-4513.1.patch, HIVE-4513.2.patch


 HiveHistory log files (hive_job_log_hive_*.txt files) store information about 
 a Hive query, such as the query string, plan, counters and MR job progress 
 information.
 There is no mechanism to delete these files, and as a result they 
 accumulate over time, using up a lot of disk space. 
 I don't think this is used by most people, so I think it would be better to 
 turn this off by default. Jobtracker logs already capture most of this 
 information, though it is not as structured as the history logs.
 HIVE-4500 is introducing a new config parameter to turn this off; we should 
 use that to turn this off by default.
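
 For installations that keep the history logs enabled, a cleanup along the 
 following lines would be one way to stop them accumulating. This is a hedged 
 Python sketch, not part of Hive: the file name pattern comes from the 
 description above, while prune_history_logs, its parameters and the age 
 threshold are assumptions made for illustration.

```python
import glob
import os
import time


def prune_history_logs(log_dir, max_age_days, now=None):
    """Delete hive_job_log_hive_*.txt files older than max_age_days.

    Returns the list of deleted paths. `now` can be injected for testing.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    pattern = os.path.join(log_dir, "hive_job_log_hive_*.txt")
    for path in sorted(glob.glob(pattern)):
        # Only files whose last modification is older than the cutoff are removed.
        if os.path.getmtime(path) < cutoff:
            os.remove(path)
            removed.append(path)
    return removed
```

 Files that do not match the pattern are never touched, so the function can 
 be pointed at a shared temp directory.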



[jira] [Updated] (HIVE-4513) disable hivehistory logs by default

2013-05-09 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-4513:


Attachment: HIVE-4513.3.patch

HIVE-4513.3.patch - additional comments


 disable hivehistory logs by default
 ---

 Key: HIVE-4513
 URL: https://issues.apache.org/jira/browse/HIVE-4513
 Project: Hive
  Issue Type: Bug
  Components: Configuration, Logging
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-4513.1.patch, HIVE-4513.2.patch, HIVE-4513.3.patch


 HiveHistory log files (hive_job_log_hive_*.txt files) store information about 
 a Hive query, such as the query string, plan, counters and MR job progress 
 information.
 There is no mechanism to delete these files, and as a result they 
 accumulate over time, using up a lot of disk space. 
 I don't think this is used by most people, so I think it would be better to 
 turn this off by default. Jobtracker logs already capture most of this 
 information, though it is not as structured as the history logs.
 HIVE-4500 is introducing a new config parameter to turn this off; we should 
 use that to turn this off by default.



Review Request: HIVE-4513

2013-05-09 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11029/
---

Review request for hive.


Description
---

HIVE-4513


This addresses bug HIVE-4513.
https://issues.apache.org/jira/browse/HIVE-4513


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1672453 
  conf/hive-default.xml.template 3a7d1dc 
  data/conf/hive-site.xml 544ba35 
  ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java e1c1ae3 
  ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryProxyHandler.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryUtil.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryViewer.java fdd56db 
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 3d43451 
  ql/src/test/org/apache/hadoop/hive/ql/history/TestHiveHistory.java a783303 

Diff: https://reviews.apache.org/r/11029/diff/


Testing
---


Thanks,

Thejas Nair



[jira] [Updated] (HIVE-4513) disable hivehistory logs by default

2013-05-09 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-4513:


Status: Patch Available  (was: Open)

review board link with HIVE-4513.3.patch - https://reviews.apache.org/r/11029/

 disable hivehistory logs by default
 ---

 Key: HIVE-4513
 URL: https://issues.apache.org/jira/browse/HIVE-4513
 Project: Hive
  Issue Type: Bug
  Components: Configuration, Logging
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-4513.1.patch, HIVE-4513.2.patch, HIVE-4513.3.patch


 HiveHistory log files (hive_job_log_hive_*.txt files) store information about 
 a Hive query, such as the query string, plan, counters and MR job progress 
 information.
 There is no mechanism to delete these files, and as a result they 
 accumulate over time, using up a lot of disk space. 
 I don't think this is used by most people, so I think it would be better to 
 turn this off by default. Jobtracker logs already capture most of this 
 information, though it is not as structured as the history logs.
 HIVE-4500 is introducing a new config parameter to turn this off; we should 
 use that to turn this off by default.



Re: Review Request: HIVE-4513

2013-05-09 Thread Brock Noland

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11029/#review20380
---


Thejas,

Thanks for all the work you've been doing! I was wondering, would you be able 
to place the JIRA description in the RB summary? It helps correlate RB items 
with JIRAs in the mind. A small description of the change is often helpful as 
well. Often there are many ways to fix something, so it's useful to 
know why a particular solution was chosen.

Cheers and thanks again!


ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryViewer.java
https://reviews.apache.org/r/11029/#comment41959

This is bad... I know it's not related to your change but can we fix this?


- Brock Noland


On May 9, 2013, 4:29 p.m., Thejas Nair wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/11029/
 ---
 
 (Updated May 9, 2013, 4:29 p.m.)
 
 
 Review request for hive.
 
 
 Description
 ---
 
 HIVE-4513
 
 
 This addresses bug HIVE-4513.
 https://issues.apache.org/jira/browse/HIVE-4513
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1672453 
   conf/hive-default.xml.template 3a7d1dc 
   data/conf/hive-site.xml 544ba35 
   ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java e1c1ae3 
   ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryProxyHandler.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryUtil.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryViewer.java 
 fdd56db 
   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 3d43451 
   ql/src/test/org/apache/hadoop/hive/ql/history/TestHiveHistory.java a783303 
 
 Diff: https://reviews.apache.org/r/11029/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Thejas Nair
 




[jira] [Updated] (HIVE-4467) HiveConnection does not handle failures correctly

2013-05-09 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-4467:
---

Status: Patch Available  (was: Open)

 HiveConnection does not handle failures correctly
 -

 Key: HIVE-4467
 URL: https://issues.apache.org/jira/browse/HIVE-4467
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Affects Versions: 0.11.0, 0.12.0
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Attachments: HIVE-4467_1.patch, HIVE-4467.patch


 HiveConnection uses Utils.verifySuccess* routines to check if there is any 
 error from the server side. This is not handled well. In 
 Utils.verifySuccess() when withInfo is 'false', the condition evaluates to 
 'false' and no SQLException is thrown even though there could be a problem on 
 the server.
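
 The failure mode described above can be modelled as follows. This is an 
 illustrative Python sketch and not the Hive JDBC source: the status 
 constants, SQLError and both functions are assumptions made for the example. 
 The first function combines the two checks so that with_info=False makes the 
 whole condition false; the second shows one possible restructuring.

```python
SUCCESS = "SUCCESS_STATUS"
SUCCESS_WITH_INFO = "SUCCESS_WITH_INFO_STATUS"
ERROR = "ERROR_STATUS"


class SQLError(Exception):
    """Stand-in for java.sql.SQLException in this sketch."""


def verify_success_buggy(status, with_info):
    # When with_info is False the second operand is False, so the combined
    # condition can never be true and no error is ever raised.
    if status != SUCCESS and (with_info and status != SUCCESS_WITH_INFO):
        raise SQLError(status)


def verify_success_fixed(status, with_info):
    # Plain success is always acceptable; success-with-info only when allowed.
    if status == SUCCESS or (with_info and status == SUCCESS_WITH_INFO):
        return
    raise SQLError(status)
```

 With the first version an error status passes silently whenever with_info 
 is False, which matches the behaviour reported in this issue.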



Review Request: HIVE-4508. Fix various release issues in 0.11.0rc1

2013-05-09 Thread Carl Steinbach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11030/
---

Review request for hive and Owen O'Malley.


Description
---

Review request submitted on behalf of Owen. I removed the updates to 
RELEASE_NOTES.txt since we should do those updates on the release branch and 
forward port them after the rc is approved.


This addresses bug HIVE-4508.
https://issues.apache.org/jira/browse/HIVE-4508


Diffs
-

  NOTICE 56883ec 
  README.txt ceed160 
  build.properties fdb0f93 
  docs/xdocs/index.xml f1df3fa 
  eclipse-templates/.classpath 2400756 

Diff: https://reviews.apache.org/r/11030/diff/


Testing
---


Thanks,

Carl Steinbach



[jira] [Commented] (HIVE-4508) Fix various release issues in 0.11.0rc1

2013-05-09 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13653081#comment-13653081
 ] 

Carl Steinbach commented on HIVE-4508:
--

Review request: https://reviews.apache.org/r/11030/


 Fix various release issues in 0.11.0rc1
 ---

 Key: HIVE-4508
 URL: https://issues.apache.org/jira/browse/HIVE-4508
 Project: Hive
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley
Priority: Blocker
 Fix For: 0.11.0

 Attachments: h-4508.patch, h-4508.patch


 Carl described some non-code issues in the 0.11.0rc1 and I want to fix them.



Re: Review Request: HIVE-4508. Fix various release issues in 0.11.0rc1

2013-05-09 Thread Carl Steinbach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11030/#review20390
---



README.txt
https://reviews.apache.org/r/11030/#comment41986

Hive 0.11.0 doesn't include metastore changes, but a user upgrading from 
0.9.0 will need to do an upgrade to account for the changes that were made in 
0.10.0. I think it would be helpful to include a list of Hive versions that 
include metastore schema changes so that users can quickly determine if they 
need to do an upgrade.
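
To make the suggestion concrete, here is a small Python sketch of how such a 
list could be used. It is only an illustration: the single entry (0.10.0) is 
the one schema-changing release named in this comment, and 
SCHEMA_CHANGING_RELEASES, parse_version and upgrades_needed are hypothetical 
names, not anything that exists in Hive.

```python
# Releases that changed the metastore schema. Only 0.10.0 is taken from the
# review comment; a real list would have to be filled in from release notes.
SCHEMA_CHANGING_RELEASES = ["0.10.0"]


def parse_version(text):
    """Turn '0.10.0' into (0, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def upgrades_needed(current, target):
    """Return schema-changing releases after `current`, up to and including `target`."""
    lo, hi = parse_version(current), parse_version(target)
    return [v for v in SCHEMA_CHANGING_RELEASES if lo < parse_version(v) <= hi]
```

Under these assumptions a user going from 0.9.0 to 0.11.0 would be told to 
apply the 0.10.0 upgrade, while a user already on 0.10.0 would get an empty 
list.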



README.txt
https://reviews.apache.org/r/11030/#comment41987

0.10.0 included upgrade scripts for Derby, MySQL, Oracle and PostgreSQL 
(from 0.9.0 to 0.10.0), and I think the plan is to continue supporting all four 
platforms in future releases. It would be nice to include a table here that 
summarizes what's available in terms of upgrade scripts, or we could instead 
put the info on the wiki and provide a link here.



docs/xdocs/index.xml
https://reviews.apache.org/r/11030/#comment41983

Change to Apache Hive.



docs/xdocs/index.xml
https://reviews.apache.org/r/11030/#comment41982

dev@hive.apache.org probably makes more sense in this context, and we 
should change "Hadoop Hive Documentation Team" to "Apache Hive Documentation 
Team".



docs/xdocs/index.xml
https://reviews.apache.org/r/11030/#comment41984

Should we change the first mention of "Hive" to "Apache Hive"? I think the 
ASF branding guidelines may require that.



eclipse-templates/.classpath
https://reviews.apache.org/r/11030/#comment41985

The hcatalog entries cause an error. We should either remove this or fix it.


- Carl Steinbach


On May 9, 2013, 7:31 p.m., Carl Steinbach wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/11030/
 ---
 
 (Updated May 9, 2013, 7:31 p.m.)
 
 
 Review request for hive and Owen O'Malley.
 
 
 Description
 ---
 
 Review request submitted on behalf of Owen. I removed the updates to 
 RELEASE_NOTES.txt since we should do those updates on the release branch and 
 forward port them after the rc is approved.
 
 
 This addresses bug HIVE-4508.
 https://issues.apache.org/jira/browse/HIVE-4508
 
 
 Diffs
 -
 
   NOTICE 56883ec 
   README.txt ceed160 
   build.properties fdb0f93 
   docs/xdocs/index.xml f1df3fa 
   eclipse-templates/.classpath 2400756 
 
 Diff: https://reviews.apache.org/r/11030/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Carl Steinbach
 




Re: Committer review of HIVE-4508

2013-05-09 Thread Carl Steinbach
Hi Owen,

I added some comments to the review request here:

https://reviews.apache.org/r/11030/

Thanks for your help with this.

Carl


On Thu, May 9, 2013 at 8:42 AM, Owen O'Malley omal...@apache.org wrote:

 All,
Can a committer review the changes I made to address Carl's suggestions
 about the Hive 0.11 release? It will only take a minute to review unless
 you read all of the discussion about whether it needs to be reviewed.
 *smile*

 https://issues.apache.org/jira/browse/HIVE-4508

 Thanks!
Owen



[jira] [Created] (HIVE-4529) Add partition support for vectorized ORC Input format

2013-05-09 Thread Sarvesh Sakalanaga (JIRA)
Sarvesh Sakalanaga created HIVE-4529:


 Summary: Add partition support for vectorized ORC Input format
 Key: HIVE-4529
 URL: https://issues.apache.org/jira/browse/HIVE-4529
 Project: Hive
  Issue Type: Sub-task
Reporter: Sarvesh Sakalanaga
Assignee: Sarvesh Sakalanaga


Change VectorizedOrcInputFormat to support partition columns. 



[jira] [Created] (HIVE-4530) Enforce minmum ant version required in build script

2013-05-09 Thread Arup Malakar (JIRA)
Arup Malakar created HIVE-4530:
--

 Summary: Enforce minmum ant version required in build script 
 Key: HIVE-4530
 URL: https://issues.apache.org/jira/browse/HIVE-4530
 Project: Hive
  Issue Type: Improvement
  Components: Build Infrastructure
Reporter: Arup Malakar
Assignee: Arup Malakar
Priority: Minor


I observed that hive doesn't build with older versions of ant (I tried with 
1.6.5). It would be a good idea to have the check in our build.xml. 



[jira] [Updated] (HIVE-4529) Add partition support for vectorized ORC Input format

2013-05-09 Thread Sarvesh Sakalanaga (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sarvesh Sakalanaga updated HIVE-4529:
-

Attachment: Hive-4529.0.patch

Patch available.

 Add partition support for vectorized ORC Input format
 -

 Key: HIVE-4529
 URL: https://issues.apache.org/jira/browse/HIVE-4529
 Project: Hive
  Issue Type: Sub-task
Reporter: Sarvesh Sakalanaga
Assignee: Sarvesh Sakalanaga
 Attachments: Hive-4529.0.patch


 Change VectorizedOrcInputFormat to support partition columns. 



[jira] [Updated] (HIVE-4529) Add partition support for vectorized ORC Input format

2013-05-09 Thread Sarvesh Sakalanaga (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sarvesh Sakalanaga updated HIVE-4529:
-

Status: Patch Available  (was: Open)

 Add partition support for vectorized ORC Input format
 -

 Key: HIVE-4529
 URL: https://issues.apache.org/jira/browse/HIVE-4529
 Project: Hive
  Issue Type: Sub-task
Reporter: Sarvesh Sakalanaga
Assignee: Sarvesh Sakalanaga
 Attachments: Hive-4529.0.patch


 Change VectorizedOrcInputFormat to support partition columns. 



Review Request: Enforce minmum ant version required in build script

2013-05-09 Thread Arup Malakar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11031/
---

Review request for hive.


Description
---

Enforce minmum ant version required in build script


This addresses bug HIVE-4530.
https://issues.apache.org/jira/browse/HIVE-4530


Diffs
-

  build.xml f1a03df157e889e732f948f83f4c1dc0812146ef 

Diff: https://reviews.apache.org/r/11031/diff/


Testing
---


Thanks,

Arup Malakar



[jira] [Updated] (HIVE-4530) Enforce minmum ant version required in build script

2013-05-09 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-4530:
---

Attachment: HIVE-4530-trunk-0.patch

I have currently set the minimum Ant version to 1.8.0, but if the build is 
known to work with an older version I can lower the minimum.

Review: https://reviews.apache.org/r/11031/

 Enforce minmum ant version required in build script 
 

 Key: HIVE-4530
 URL: https://issues.apache.org/jira/browse/HIVE-4530
 Project: Hive
  Issue Type: Improvement
  Components: Build Infrastructure
Reporter: Arup Malakar
Assignee: Arup Malakar
Priority: Minor
 Attachments: HIVE-4530-trunk-0.patch


 I observed that hive doesn't build with older versions of ant (I tried with 
 1.6.5). It would be a good idea to have the check in our build.xml. 



[jira] [Updated] (HIVE-4530) Enforce minmum ant version required in build script

2013-05-09 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-4530:
---

Fix Version/s: 0.12.0
   Status: Patch Available  (was: Open)

 Enforce minmum ant version required in build script 
 

 Key: HIVE-4530
 URL: https://issues.apache.org/jira/browse/HIVE-4530
 Project: Hive
  Issue Type: Improvement
  Components: Build Infrastructure
Reporter: Arup Malakar
Assignee: Arup Malakar
Priority: Minor
 Fix For: 0.12.0

 Attachments: HIVE-4530-trunk-0.patch


 I observed that hive doesn't build with older versions of ant (I tried with 
 1.6.5). It would be a good idea to have the check in our build.xml. 



[jira] [Commented] (HIVE-4530) Enforce minmum ant version required in build script

2013-05-09 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13653160#comment-13653160
 ] 

Carl Steinbach commented on HIVE-4530:
--

+1.

In case anyone's curious, 1.8.0 is the minimum version we can use because 
HCat's build depends on the skipexisting attribute of the get task, which was 
added in Ant v1.8.0. We also have some code that depends on features added in 
1.7.1.
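
The check itself belongs in build.xml, but the comparison it has to perform 
can be sketched in Python for clarity. This is only an illustration of the 
logic; parse_version and check_min_version are made-up names, and the 
default of 1.8.0 is the minimum discussed in this issue.

```python
def parse_version(text):
    """Turn '1.8.0' into (1, 8, 0) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


def check_min_version(found, required="1.8.0"):
    """Raise if `found` is older than `required`; otherwise return True."""
    if parse_version(found) < parse_version(required):
        raise RuntimeError(
            "Ant %s found, but at least %s is required" % (found, required))
    return True
```

Comparing tuples of integers matters here: a plain string comparison would 
wrongly treat 1.10.1 as older than 1.8.0.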

 Enforce minmum ant version required in build script 
 

 Key: HIVE-4530
 URL: https://issues.apache.org/jira/browse/HIVE-4530
 Project: Hive
  Issue Type: Improvement
  Components: Build Infrastructure
Reporter: Arup Malakar
Assignee: Arup Malakar
Priority: Minor
 Fix For: 0.12.0

 Attachments: HIVE-4530-trunk-0.patch


 I observed that hive doesn't build with older versions of ant (I tried with 
 1.6.5). It would be a good idea to have the check in our build.xml. 



[jira] [Commented] (HIVE-4500) HS2 holding too many file handles of hive_job_log_hive_*.txt files

2013-05-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13653221#comment-13653221
 ] 

Hudson commented on HIVE-4500:
--

Integrated in Hive-trunk-hadoop2 #189 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/189/])
HIVE-4500 Ensure that HiveServer 2 closes log files. (Alan Gates via 
omalley) (Revision 1480390)

 Result = ABORTED
omalley : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1480390
Files : 
* /hive/trunk
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java
* 
/hive/trunk/service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java
* 
/hive/trunk/service/src/java/org/apache/hive/service/cli/operation/Operation.java
* 
/hive/trunk/service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java


 HS2 holding too many file handles of hive_job_log_hive_*.txt files
 --

 Key: HIVE-4500
 URL: https://issues.apache.org/jira/browse/HIVE-4500
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.11.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.11.0

 Attachments: HIVE-4500-2.patch, HIVE-4500-3.patch, HIVE-4500.patch


 In the hiveserver2 setup used for testing, we see that it has 2444 files open 
 and of them 2152 are /tmp/hive/hive_job_log_hive_*.txt files



[jira] [Commented] (HIVE-4494) ORC map columns get class cast exception in some context

2013-05-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13653222#comment-13653222
 ] 

Hudson commented on HIVE-4494:
--

Integrated in Hive-trunk-hadoop2 #189 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/189/])
HIVE-4494 ORC map columns get class cast exception in some contexts 
(omalley) (Revision 1480460)

 Result = ABORTED
omalley : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1480460
Files : 
* /hive/trunk
* /hive/trunk/data/files/orc_create.txt
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcStruct.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcUnion.java
* /hive/trunk/ql/src/test/queries/clientpositive/orc_create.q
* /hive/trunk/ql/src/test/results/clientpositive/orc_create.q.out


 ORC map columns get class cast exception in some context
 

 Key: HIVE-4494
 URL: https://issues.apache.org/jira/browse/HIVE-4494
 Project: Hive
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 0.11.0

 Attachments: HIVE-4494.D10653.1.patch, HIVE-4494.D10653.2.patch


 Setting up the test case like:
 {quote}
 create table map_text (
   name string,
   m map<string,string>
 ) row format delimited
 fields terminated by '|'
 collection items terminated by ','
 map keys terminated by ':';
 create table map_orc (
   name string,
   m map<string,string>
 ) stored as orc;
 cat map.txt
 name1|key11:value11,key12:value12,key13:value13
 name2|key21:value21,key22:value22,key23:value23
 name3|key31:value31,key32:value32,key33:value33
 load data local inpath 'map.txt' into table map_text;
 insert overwrite table map_orc select * from map_text;
 {quote}



[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs

2013-05-09 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-4531:
-

Attachment: HIVE-4531-1.patch

Attach initial patch.

 [WebHCat] Collecting task logs to hdfs
 --

 Key: HIVE-4531
 URL: https://issues.apache.org/jira/browse/HIVE-4531
 Project: Hive
  Issue Type: New Feature
  Components: HCatalog
Reporter: Daniel Dai
 Attachments: HIVE-4531-1.patch


 It would be nice if we collected task logs after a job finishes. This is 
 similar to what Amazon EMR does.



[jira] [Commented] (HIVE-4321) Add Compile/Execute support to Hive Server

2013-05-09 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13653318#comment-13653318
 ] 

Carl Steinbach commented on HIVE-4321:
--

bq. It would make sense to use 'prepare' if we are trying to address the 
'prepare' + 'executePrepared' use case. Unlike OLTP oriented databases, where 
prepare+executePrepared are going to be useful for doing things like large 
number of single row inserts, it is not going to be as useful in hive. 

In the ODBC/JDBC world prepare is synonymous with compile, e.g. here's a 
direct quote from Microsoft's ODBC docs:

{quote}
Prepared execution is an efficient way to execute a statement more than once. 
The statement is first compiled, or prepared, into an access plan. The access 
plan is then executed one or more times at a later time. For more information 
about access plans, see Processing an SQL Statement.
{quote}
Ref: 
http://msdn.microsoft.com/en-us/library/windows/desktop/ms716365(v=vs.85).aspx

In case I didn't make this clear I'm not suggesting that we support a 
BindParameter() call, but I also don't think we should use Compile() instead of 
Prepare() just because we don't plan on supporting parameter binding.

bq. My main worry is that this use case would add more state to be stored in 
hive server. Once we add support for high-availability, maintaining additional 
state on hive server 2 would come at additional costs (I am worried about costs 
of storing the whole plan in something like an rdbms or zookeeper).

How much additional state we plan to maintain is entirely up to us. Supporting 
the ability to execute a prepared statement multiple times does not mean that 
we have to save the query plan in between calls to execute(). I think the Hive 
JDBC driver already supports PreparedStatements, and we're basically just 
faking it. The fact that we're faking it doesn't matter because, as you 
mentioned, Hive is not an OLTP database.
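
As a rough illustration of the "faking it" approach described above, here is a minimal Java sketch. Every name in it (FakedPreparedStatementSketch, prepare, execute, getCompileCount) is hypothetical and is not part of the CLIService or Hive JDBC API. It assumes that prepare only validates the statement and keeps the SQL text, and that each execute recompiles, so no query plan is held between calls.

```java
// Illustrative sketch only: hypothetical names, not the CLIService or Hive JDBC API.
final class FakedPreparedStatementSketch {
    private final String sql;   // the only state retained between calls
    private int compileCount;   // counts compilations, for demonstration

    private FakedPreparedStatementSketch(String sql) {
        this.sql = sql;
    }

    // "Prepare": compile once so errors surface early, then discard the plan.
    static FakedPreparedStatementSketch prepare(String sql) {
        FakedPreparedStatementSketch stmt = new FakedPreparedStatementSketch(sql);
        stmt.compile();
        return stmt;
    }

    // "Execute": recompile from the stored text every time instead of reusing a saved plan.
    String execute() {
        String plan = compile();
        return "executed: " + plan;
    }

    // Stand-in for query compilation; a real compiler would build an access plan.
    private String compile() {
        if (sql == null || sql.trim().isEmpty()) {
            throw new IllegalArgumentException("empty statement");
        }
        compileCount++;
        return "plan(" + sql.trim() + ")";
    }

    int getCompileCount() {
        return compileCount;
    }

    public static void main(String[] args) {
        FakedPreparedStatementSketch stmt = prepare("select 1");
        System.out.println(stmt.execute());
        System.out.println(stmt.execute());
        System.out.println(stmt.getCompileCount()); // one compile in prepare, one per execute
    }
}
```

The trade-off in this sketch is repeated compilation in exchange for holding nothing but the statement text on the server side.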

Anyway, I'm getting a little off topic. The main reason I think we should call 
this Prepare instead of Compile is so that we maintain the close 
relationship between the CLIService API and the ODBC API. I designed it this 
way for the following reasons:

# People who are familiar with ODBC can look at the CLIService API and quickly 
understand how it works.
# We can reference the ODBC documentation instead of having to write our own. 
We already take advantage of this in the TCliService.thrift IDL file (though in 
some cases we also reference JDBC).
# Getting APIs right is tough. ODBC has gone through a bunch of revisions for 
precisely this reason. It still has bugs, but at least those bugs are well 
understood. As soon as we diverge significantly from ODBC we risk inventing new 
bugs, and then we will have the added challenge of figuring out how to 
reconcile our API bugs with the ODBC API bugs.

bq. HS2 is now a single point of failure in the system, I think we should start 
considering high-availability issues while adding features. Keeping state in 
client instead of server will help with that.

I agree that we should start thinking about HA, but I don't understand the part 
about maintaining state in the client. Wouldn't that imply that the client has 
to be HA as well? There are some security problems with this as well.



 Add Compile/Execute support to Hive Server
 --

 Key: HIVE-4321
 URL: https://issues.apache.org/jira/browse/HIVE-4321
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, Thrift API
Reporter: Sarah Parra
Assignee: Sarah Parra
 Attachments: CompileExecute.patch


 Adds support for query compilation in Hive Server 2 and adds Thrift support 
 for compile/execute APIs.
 This enables scenarios that need to compile a query before it is executed, 
 e.g. an ODBC driver that implements SQLPrepare/SQLExecute. This is commonly 
 used for a client that needs metadata for the query before it is executed.



[jira] [Updated] (HIVE-4160) Vectorized Query Execution in Hive

2013-05-09 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-4160:
--

Attachment: Hive-Vectorized-Query-Execution-Design-rev5.docx

 Vectorized Query Execution in Hive
 --

 Key: HIVE-4160
 URL: https://issues.apache.org/jira/browse/HIVE-4160
 Project: Hive
  Issue Type: New Feature
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: Hive-Vectorized-Query-Execution-Design.docx, 
 Hive-Vectorized-Query-Execution-Design-rev2.docx, 
 Hive-Vectorized-Query-Execution-Design-rev3.docx, 
 Hive-Vectorized-Query-Execution-Design-rev3.docx, 
 Hive-Vectorized-Query-Execution-Design-rev3.pdf, 
 Hive-Vectorized-Query-Execution-Design-rev4.docx, 
 Hive-Vectorized-Query-Execution-Design-rev4.pdf, 
 Hive-Vectorized-Query-Execution-Design-rev5.docx


 The Hive query execution engine currently processes one row at a time. A 
 single row of data goes through all the operators before the next row can be 
 processed. This mode of processing is very inefficient in terms of CPU usage. 
 Research has demonstrated that this yields very low instructions per cycle 
 [MonetDB X100]. Also currently Hive heavily relies on lazy deserialization 
 and data columns go through a layer of object inspectors that identify column 
 type, deserialize data and determine appropriate expression routines in the 
 inner loop. These layers of virtual method calls further slow down the 
 processing. 
 This work will add support for vectorized query execution to Hive, where, 
 instead of individual rows, batches of about a thousand rows at a time are 
 processed. Each column in the batch is represented as a vector of a primitive 
 data type. The inner loop of execution scans these vectors very fast, 
 avoiding method calls, deserialization, unnecessary if-then-else, etc. This 
 substantially reduces CPU time used, and gives excellent instructions per 
 cycle (i.e. improved processor pipeline utilization). See the attached design 
 specification for more details.
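
 A minimal Java sketch of the batch-of-primitive-vectors idea described above. The class and field names (LongColumnBatchSketch, vector, size, addScalar) are hypothetical and are not taken from the attached design document; the batch size of 1024 is an assumption standing in for "about a thousand rows".

```java
// Illustrative sketch only: hypothetical names, not the API from the design document.
final class LongColumnBatchSketch {
    static final int BATCH_SIZE = 1024; // assumed value for "about a thousand rows"

    // One column of a batch, held as a primitive array rather than per-row objects.
    final long[] vector = new long[BATCH_SIZE];
    int size; // number of valid rows in this batch

    // Adds a constant to every value in one tight loop over the primitive array,
    // with no per-row method calls or deserialization inside the loop.
    static void addScalar(LongColumnBatchSketch in, long scalar, LongColumnBatchSketch out) {
        for (int i = 0; i < in.size; i++) {
            out.vector[i] = in.vector[i] + scalar;
        }
        out.size = in.size;
    }

    // Sums the valid values of the column.
    static long sum(LongColumnBatchSketch col) {
        long total = 0;
        for (int i = 0; i < col.size; i++) {
            total += col.vector[i];
        }
        return total;
    }

    public static void main(String[] args) {
        LongColumnBatchSketch input = new LongColumnBatchSketch();
        LongColumnBatchSketch output = new LongColumnBatchSketch();
        for (int i = 0; i < 5; i++) {
            input.vector[i] = i; // values 0..4
        }
        input.size = 5;
        addScalar(input, 10, output);
        System.out.println(sum(output)); // prints 60
    }
}
```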



[jira] [Updated] (HIVE-4160) Vectorized Query Execution in Hive

2013-05-09 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-4160:
--

Attachment: Hive-Vectorized-Query-Execution-Design-rev5.pdf

 Vectorized Query Execution in Hive
 --

 Key: HIVE-4160
 URL: https://issues.apache.org/jira/browse/HIVE-4160
 Project: Hive
  Issue Type: New Feature
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: Hive-Vectorized-Query-Execution-Design.docx, 
 Hive-Vectorized-Query-Execution-Design-rev2.docx, 
 Hive-Vectorized-Query-Execution-Design-rev3.docx, 
 Hive-Vectorized-Query-Execution-Design-rev3.docx, 
 Hive-Vectorized-Query-Execution-Design-rev3.pdf, 
 Hive-Vectorized-Query-Execution-Design-rev4.docx, 
 Hive-Vectorized-Query-Execution-Design-rev4.pdf, 
 Hive-Vectorized-Query-Execution-Design-rev5.docx, 
 Hive-Vectorized-Query-Execution-Design-rev5.pdf


 The Hive query execution engine currently processes one row at a time. A 
 single row of data goes through all the operators before the next row can be 
 processed. This mode of processing is very inefficient in terms of CPU usage. 
 Research has demonstrated that this yields very low instructions per cycle 
 [MonetDB X100]. Also currently Hive heavily relies on lazy deserialization 
 and data columns go through a layer of object inspectors that identify column 
 type, deserialize data and determine appropriate expression routines in the 
 inner loop. These layers of virtual method calls further slow down the 
 processing. 
 This work will add support for vectorized query execution to Hive, where, 
 instead of individual rows, batches of about a thousand rows at a time are 
 processed. Each column in the batch is represented as a vector of a primitive 
 data type. The inner loop of execution scans these vectors very fast, 
 avoiding method calls, deserialization, unnecessary if-then-else, etc. This 
 substantially reduces CPU time used, and gives excellent instructions per 
 cycle (i.e. improved processor pipeline utilization). See the attached design 
 specification for more details.



[jira] [Commented] (HIVE-4160) Vectorized Query Execution in Hive

2013-05-09 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13653325#comment-13653325
 ] 

Eric Hanson commented on HIVE-4160:
---

Updated design document with discussion of precise handling and interpretation 
of all-non-null (noNulls) and all identical (isRepeating) column vectors. 

Also included discussion of the TIMESTAMP internal vector representation as a 
long integer number of nanoseconds since the epoch.
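
A small Java sketch of how the two flags and the timestamp encoding might be used. The flag names noNulls and isRepeating come from the comment above; everything else (the class name, the isNull array, the sum and toEpochNanos helpers) is hypothetical. The semantics assumed here are that noNulls means the null mask need not be consulted, and isRepeating means entry 0 stands for every row in the batch; the design document is the authority on the precise rules.

```java
// Illustrative sketch only: the class and helper names are hypothetical.
final class FlaggedLongVectorSketch {
    final long[] vector;
    final boolean[] isNull;      // per-row null mask, consulted only when noNulls is false
    boolean noNulls = true;      // assumed meaning: no entry in this batch is null
    boolean isRepeating = false; // assumed meaning: entry 0 stands for every row

    FlaggedLongVectorSketch(int capacity) {
        vector = new long[capacity];
        isNull = new boolean[capacity];
    }

    // Sums the first `size` rows, taking the cheapest path the flags allow.
    static long sum(FlaggedLongVectorSketch col, int size) {
        if (col.isRepeating) {
            boolean repeatedNull = !col.noNulls && col.isNull[0];
            return repeatedNull ? 0 : col.vector[0] * size;
        }
        long total = 0;
        if (col.noNulls) {
            for (int i = 0; i < size; i++) {
                total += col.vector[i]; // no null check in the inner loop
            }
        } else {
            for (int i = 0; i < size; i++) {
                if (!col.isNull[i]) {
                    total += col.vector[i];
                }
            }
        }
        return total;
    }

    // Encodes a timestamp as a single long: nanoseconds since the epoch.
    static long toEpochNanos(long epochSeconds, int nanoOfSecond) {
        return epochSeconds * 1_000_000_000L + nanoOfSecond;
    }
}
```

One consequence of a signed 64-bit nanosecond count is a representable range of roughly 292 years on either side of the epoch.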

 Vectorized Query Execution in Hive
 --

 Key: HIVE-4160
 URL: https://issues.apache.org/jira/browse/HIVE-4160
 Project: Hive
  Issue Type: New Feature
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: Hive-Vectorized-Query-Execution-Design.docx, 
 Hive-Vectorized-Query-Execution-Design-rev2.docx, 
 Hive-Vectorized-Query-Execution-Design-rev3.docx, 
 Hive-Vectorized-Query-Execution-Design-rev3.docx, 
 Hive-Vectorized-Query-Execution-Design-rev3.pdf, 
 Hive-Vectorized-Query-Execution-Design-rev4.docx, 
 Hive-Vectorized-Query-Execution-Design-rev4.pdf, 
 Hive-Vectorized-Query-Execution-Design-rev5.docx, 
 Hive-Vectorized-Query-Execution-Design-rev5.pdf


 The Hive query execution engine currently processes one row at a time. A 
 single row of data goes through all the operators before the next row can be 
 processed. This mode of processing is very inefficient in terms of CPU usage. 
 Research has demonstrated that this yields very low instructions per cycle 
 [MonetDB X100]. Also currently Hive heavily relies on lazy deserialization 
 and data columns go through a layer of object inspectors that identify column 
 type, deserialize data and determine appropriate expression routines in the 
 inner loop. These layers of virtual method calls further slow down the 
 processing. 
 This work will add support for vectorized query execution to Hive, where, 
 instead of individual rows, batches of about a thousand rows at a time are 
 processed. Each column in the batch is represented as a vector of a primitive 
 data type. The inner loop of execution scans these vectors very fast, 
 avoiding method calls, deserialization, unnecessary if-then-else, etc. This 
 substantially reduces CPU time used, and gives excellent instructions per 
 cycle (i.e. improved processor pipeline utilization). See the attached design 
 specification for more details.



[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs

2013-05-09 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-4531:
-

Attachment: (was: HIVE-4531-1.patch)

 [WebHCat] Collecting task logs to hdfs
 --

 Key: HIVE-4531
 URL: https://issues.apache.org/jira/browse/HIVE-4531
 Project: Hive
  Issue Type: New Feature
  Components: HCatalog
Reporter: Daniel Dai
 Attachments: HIVE-4531-1.patch


 It would be nice if we collected task logs after a job finishes. This is 
 similar to what Amazon EMR does.



[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs

2013-05-09 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-4531:
-

Attachment: HIVE-4531-1.patch

 [WebHCat] Collecting task logs to hdfs
 --

 Key: HIVE-4531
 URL: https://issues.apache.org/jira/browse/HIVE-4531
 Project: Hive
  Issue Type: New Feature
  Components: HCatalog
Reporter: Daniel Dai
 Attachments: HIVE-4531-1.patch


 It would be nice if we collected task logs after a job finishes. This is 
 similar to what Amazon EMR does.



[jira] [Created] (HIVE-4532) HiveMetaStore should have a shutdown() medthod

2013-05-09 Thread Carl Steinbach (JIRA)
Carl Steinbach created HIVE-4532:


 Summary: HiveMetaStore should have a shutdown() medthod
 Key: HIVE-4532
 URL: https://issues.apache.org/jira/browse/HIVE-4532
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Carl Steinbach


Right now there doesn't appear to be any way of cleanly stopping and then 
restarting a HiveMetaStore server in a JVM process.



[jira] [Updated] (HIVE-4532) HiveMetaStore should have a shutdown() medthod

2013-05-09 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-4532:
-

Issue Type: Improvement  (was: Bug)

 HiveMetaStore should have a shutdown() medthod
 --

 Key: HIVE-4532
 URL: https://issues.apache.org/jira/browse/HIVE-4532
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Carl Steinbach

 Right now there doesn't appear to be any way of cleanly stopping and then 
 restarting a HiveMetaStore server in a JVM process.



[jira] [Updated] (HIVE-4160) Vectorized Query Execution in Hive

2013-05-09 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-4160:
--

Attachment: Hive-Vectorized-Query-Execution-Design-rev6.docx

 Vectorized Query Execution in Hive
 --

 Key: HIVE-4160
 URL: https://issues.apache.org/jira/browse/HIVE-4160
 Project: Hive
  Issue Type: New Feature
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: Hive-Vectorized-Query-Execution-Design.docx, 
 Hive-Vectorized-Query-Execution-Design-rev2.docx, 
 Hive-Vectorized-Query-Execution-Design-rev3.docx, 
 Hive-Vectorized-Query-Execution-Design-rev3.docx, 
 Hive-Vectorized-Query-Execution-Design-rev3.pdf, 
 Hive-Vectorized-Query-Execution-Design-rev4.docx, 
 Hive-Vectorized-Query-Execution-Design-rev4.pdf, 
 Hive-Vectorized-Query-Execution-Design-rev5.docx, 
 Hive-Vectorized-Query-Execution-Design-rev5.pdf, 
 Hive-Vectorized-Query-Execution-Design-rev6.docx, 
 Hive-Vectorized-Query-Execution-Design-rev6.pdf


 The Hive query execution engine currently processes one row at a time. A 
 single row of data goes through all the operators before the next row can be 
 processed. This mode of processing is very inefficient in terms of CPU usage. 
 Research has demonstrated that this yields very low instructions per cycle 
 [MonetDB X100]. Also currently Hive heavily relies on lazy deserialization 
 and data columns go through a layer of object inspectors that identify column 
 type, deserialize data and determine appropriate expression routines in the 
 inner loop. These layers of virtual method calls further slow down the 
 processing. 
 This work will add support for vectorized query execution to Hive, where, 
 instead of individual rows, batches of about a thousand rows at a time are 
 processed. Each column in the batch is represented as a vector of a primitive 
 data type. The inner loop of execution scans these vectors very fast, 
 avoiding method calls, deserialization, unnecessary if-then-else, etc. This 
 substantially reduces CPU time used, and gives excellent instructions per 
 cycle (i.e. improved processor pipeline utilization). See the attached design 
 specification for more details.



[jira] [Updated] (HIVE-4160) Vectorized Query Execution in Hive

2013-05-09 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-4160:
--

Attachment: Hive-Vectorized-Query-Execution-Design-rev6.pdf

 Vectorized Query Execution in Hive
 --

 Key: HIVE-4160
 URL: https://issues.apache.org/jira/browse/HIVE-4160
 Project: Hive
  Issue Type: New Feature
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: Hive-Vectorized-Query-Execution-Design.docx, 
 Hive-Vectorized-Query-Execution-Design-rev2.docx, 
 Hive-Vectorized-Query-Execution-Design-rev3.docx, 
 Hive-Vectorized-Query-Execution-Design-rev3.docx, 
 Hive-Vectorized-Query-Execution-Design-rev3.pdf, 
 Hive-Vectorized-Query-Execution-Design-rev4.docx, 
 Hive-Vectorized-Query-Execution-Design-rev4.pdf, 
 Hive-Vectorized-Query-Execution-Design-rev5.docx, 
 Hive-Vectorized-Query-Execution-Design-rev5.pdf, 
 Hive-Vectorized-Query-Execution-Design-rev6.docx, 
 Hive-Vectorized-Query-Execution-Design-rev6.pdf


 The Hive query execution engine currently processes one row at a time. A 
 single row of data goes through all the operators before the next row can be 
 processed. This mode of processing is very inefficient in terms of CPU usage. 
 Research has demonstrated that this yields very low instructions per cycle 
 [MonetDB X100]. Also currently Hive heavily relies on lazy deserialization 
 and data columns go through a layer of object inspectors that identify column 
 type, deserialize data and determine appropriate expression routines in the 
 inner loop. These layers of virtual method calls further slow down the 
 processing. 
 This work will add support for vectorized query execution to Hive, where, 
 instead of individual rows, batches of about a thousand rows at a time are 
 processed. Each column in the batch is represented as a vector of a primitive 
 data type. The inner loop of execution scans these vectors very fast, 
 avoiding method calls, deserialization, unnecessary if-then-else, etc. This 
 substantially reduces CPU time used, and gives excellent instructions per 
 cycle (i.e. improved processor pipeline utilization). See the attached design 
 specification for more details.



[jira] [Commented] (HIVE-4532) HiveMetaStore should have a shutdown() medthod

2013-05-09 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13653344#comment-13653344
 ] 

Brock Noland commented on HIVE-4532:


It'd also be nice if there was a way to flip 
HiveMetaStore.HMSHandler.createDefaultDB without resorting to ugly reflection.

 HiveMetaStore should have a shutdown() medthod
 --

 Key: HIVE-4532
 URL: https://issues.apache.org/jira/browse/HIVE-4532
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Carl Steinbach

 Right now there doesn't appear to be any way of cleanly stopping and then 
 restarting a HiveMetaStore server in a JVM process.



[jira] [Updated] (HIVE-4532) HiveMetaStore should have a shutdown() method

2013-05-09 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-4532:
-

Summary: HiveMetaStore should have a shutdown() method  (was: HiveMetaStore 
should have a shutdown() medthod)

 HiveMetaStore should have a shutdown() method
 -

 Key: HIVE-4532
 URL: https://issues.apache.org/jira/browse/HIVE-4532
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Carl Steinbach

 Right now there doesn't appear to be any way of cleanly stopping and then 
 restarting a HiveMetaStore server in a JVM process.



Build failed in Jenkins: Hive-0.10.0-SNAPSHOT-h0.20.1 #140

2013-05-09 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/140/

--
[...truncated 41948 lines...]
[junit] Hadoop job information for null: number of mappers: 0; number of 
reducers: 0
[junit] 2013-05-09 16:09:10,413 null map = 100%,  reduce = 100%
[junit] Ended Job = job_local_0001
[junit] Execution completed successfully
[junit] Mapred Local Task Succeeded . Convert the Join into MapJoin
[junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/140/artifact/hive/build/service/localscratchdir/hive_2013-05-09_16-09-06_909_6311888755055282525/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/140/artifact/hive/build/service/tmp/hive_job_log_jenkins_201305091609_1242724629.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: DROPTABLE
[junit] Copying file: 
https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/ws/hive/data/files/kv1.txt
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] Copying data from 
https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/ws/hive/data/files/kv1.txt
[junit] Loading data to table default.testhivedrivertable
[junit] Table default.testhivedrivertable stats: [num_partitions: 0, 
num_files: 1, num_rows: 0, total_size: 5812, raw_data_size: 0]
[junit] POSTHOOK: query: load data local inpath 
'https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select * from testhivedrivertable limit 10
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/140/artifact/hive/build/service/localscratchdir/hive_2013-05-09_16-09-11_963_6614170573484285938/-mr-1
[junit] POSTHOOK: query: select * from testhivedrivertable limit 10
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/140/artifact/hive/build/service/localscratchdir/hive_2013-05-09_16-09-11_963_6614170573484285938/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/140/artifact/hive/build/service/tmp/hive_job_log_jenkins_201305091609_1737272653.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable

[jira] [Commented] (HIVE-4209) Cache evaluation result of deterministic expression and reuse it

2013-05-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13653381#comment-13653381
 ] 

Hudson commented on HIVE-4209:
--

Integrated in Hive-trunk-h0.21 #2094 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2094/])
HIVE-4209 Cache evaluation result of deterministic expression and reuse it 
(Navis via namit) (Revision 1480597)

 Result = FAILURE
navis : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1480597
Files : 
* /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
* /hive/trunk/conf/hive-default.xml.template
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeColumnEvaluator.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeConstantEvaluator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeEvaluator.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeEvaluatorFactory.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeEvaluatorHead.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeEvaluatorRef.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeFieldEvaluator.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeNullEvaluator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FilterOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/SelectOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDF.java


 Cache evaluation result of deterministic expression and reuse it
 

 Key: HIVE-4209
 URL: https://issues.apache.org/jira/browse/HIVE-4209
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Fix For: 0.12.0

 Attachments: HIVE-4209.6.patch.txt, HIVE-4209.D9585.1.patch, 
 HIVE-4209.D9585.2.patch, HIVE-4209.D9585.3.patch, HIVE-4209.D9585.4.patch, 
 HIVE-4209.D9585.5.patch


 For example, 
 {noformat}
 select key from src where key + 1 > 100 AND key + 1 < 200 limit 3;
 {noformat}
 key + 1 need not be evaluated twice.
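
 A minimal Java sketch of the caching idea, not the implementation in the attached patches. All names (CachedExprSketch, Expr, Caching, filter) are hypothetical. It assumes the two comparisons in the example are a lower and an upper bound on key + 1, that both refer to one shared evaluator node, and that each row is a distinct array object that is not mutated while it is being filtered.

```java
// Illustrative sketch only: hypothetical names, not Hive's ExprNodeEvaluator classes.
final class CachedExprSketch {
    // A deterministic expression over a row of long columns.
    interface Expr {
        long eval(long[] row);
    }

    // Wraps an expression and remembers its result for the most recently seen row.
    static final class Caching implements Expr {
        private final Expr inner;
        private long[] lastRow;  // identity of the row the cached value belongs to
        private boolean valid;   // false until the first evaluation
        private long lastValue;
        int evaluations;         // counts real evaluations, for demonstration

        Caching(Expr inner) {
            this.inner = inner;
        }

        @Override
        public long eval(long[] row) {
            if (!valid || row != lastRow) {
                lastValue = inner.eval(row);
                lastRow = row;
                valid = true;
                evaluations++;
            }
            return lastValue;
        }
    }

    // Filter in the spirit of: key + 1 > 100 AND key + 1 < 200,
    // where both comparisons share the same evaluator for key + 1.
    static boolean filter(Expr keyPlusOne, long[] row) {
        return keyPlusOne.eval(row) > 100 && keyPlusOne.eval(row) < 200;
    }
}
```

 Caching by row identity is the simplest choice for a sketch; an engine that reuses one row object for every record would need an explicit per-row invalidation step instead.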



[ANNOUNCE] The next Bay Area Hive User Group Meetup, June 4th in SF

2013-05-09 Thread Carl Steinbach
The next Bay Area Hive User Group Meetup is happening on Tuesday, June 4th
at RichRelevance's offices in San Francisco. As usual the format will be a
series of short (15 min) talks followed by un-conference style sessions and
networking. Please join me in thanking the folks at RichRelevance for
providing the meeting space and refreshments.

In order to attend you must RSVP here:

http://www.meetup.com/Hive-User-Group-Meeting/events/118637862/


*Schedule*
6pm: Doors open
6pm-6:30pm: Networking and refreshments
6:30pm-9pm: Talks

*Talks:*

TBA

*Remote Viewing:*

We plan to do a live screencast of the meetup. More details will be posted
in the following weeks.


Hive-trunk-h0.21 - Build # 2094 - Still Failing

2013-05-09 Thread Apache Jenkins Server
Changes for Build #2070
[hashutosh] HIVE-4278 : HCat needs to get current Hive jars instead of pulling 
them from maven repo (Sushanth Sowmyan via Ashutosh Chauhan)


Changes for Build #2071
[khorgath] HCATALOG-621 bin/hcat should include hbase jar and dependencies in 
the classpath (Nick Dimiduk via Sushanth Sowmyan)

[omalley] HIVE-4178 : ORC fails with files with different numbers of columns


Changes for Build #2072
[hashutosh] HIVE-4304 : Remove unused builtins and pdk submodules (Travis 
Crawford via Ashutosh Chauhan)

[namit] HIVE-4310 optimize count(distinct) with hive.map.groupby.sorted
(Namit Jain via Gang Tim Liu)

[hashutosh] HIVE-4356 :  remove duplicate impersonation parameters for 
hiveserver2 (Gunther Hagleitner via Ashutosh Chauhan)

[hashutosh] HIVE-4318 : OperatorHooks hit performance even when not used 
(Gunther Hagleitner via Ashutosh Chauhan)

[hashutosh] HIVE-4378 : Counters hit performance even when not used (Gunther 
Hagleitner via Ashutosh Chauhan)

[omalley] HIVE-4189 : ORC fails with String column that ends in lots of nulls 
(Kevin
Wilfong)


Changes for Build #2073
[hashutosh] 4248 : Implement a memory manager for ORC. Missed one test file. 
(Owen Omalley via Ashutosh Chauhan)

[hashutosh] HIVE-4248 : Implement a memory manager for ORC (Owen Omalley via 
Ashutosh Chauhan)

[hashutosh] HIVE-4103 : Remove System.gc() call from the map-join local-task 
loop (Gopal V via Ashutosh Chauhan)


Changes for Build #2074
[namit] HIVE-4371 some issue with merging join trees
(Navis via namit)

[hashutosh] HIVE-4333 : most windowing tests fail on hadoop 2 (Harish Butani 
via Ashutosh Chauhan)

[namit] HIVE-4342 NPE for query involving UNION ALL with nested JOIN and UNION 
ALL
(Navis via namit)

[hashutosh] HIVE-4364 : beeline always exits with 0 status, should exit with 
non-zero status on error (Rob Weltman via Ashutosh Chauhan)

[hashutosh] HIVE-4130 : Bring the Lead/Lag UDFs interface in line with Lead/Lag 
UDAFs (Harish Butani via Ashutosh Chauhan)


Changes for Build #2075
[hashutosh] HIVE-2379 : Hive/HBase integration could be improved (Navis via 
Ashutosh Chauhan)

[hashutosh] HIVE-4295 : Lateral view makes invalid result if CP is disabled 
(Navis via Ashutosh Chauhan)

[hashutosh] HIVE-4365 : wrong result in left semi join (Navis via Ashutosh 
Chauhan)

[hashutosh] HIVE-3861 : Upgrade hbase dependency to 0.94 (Gunther Hagleitner 
via Ashutosh Chauhan)


Changes for Build #2076
[hashutosh] HIVE-3891 : physical optimizer changes for auto sort-merge join 
(Namit Jain via Ashutosh Chauhan)

[namit] HIVE-4393 Make the deleteData flag accessable from DropTable/Partition 
events
(Morgan Philips via namit)

[hashutosh] HIVE-4394 : test leadlag.q fails (Ashutosh Chauhan)

[namit] HIVE-4018 MapJoin failing with Distributed Cache error
(Amareshwari Sriramadasu via Namit Jain)


Changes for Build #2077
[namit] HIVE-4300 ant thriftif generated code that is checkedin is not 
up-to-date
(Roshan Naik via namit)


Changes for Build #2078
[namit] HIVE-4409 Prevent incompatible column type changes
(Dilip Joseph via namit)

[namit] HIVE-4095 Add exchange partition in Hive
(Dheeraj Kumar Singh via namit)

[namit] HIVE-4005 Column truncation
(Kevin Wilfong via namit)

[namit] HIVE-3952 merge map-job followed by map-reduce job
(Vinod Kumar Vavilapalli via namit)

[hashutosh] HIVE-4412 : PTFDesc tries serialize transient fields like OIs, etc. 
(Navis via Ashutosh Chauhan)

[khorgath] HIVE-4419 : webhcat - support ${WEBHCAT_PREFIX}/conf/ as config 
directory (Thejas M Nair via Sushanth Sowmyan)

[namit] HIVE-4181 Star argument without table alias for UDTF is not working
(Navis via namit)

[hashutosh] HIVE-4407 : TestHCatStorer.testStoreFuncAllSimpleTypes fails 
because of null case difference (Thejas Nair via Ashutosh Chauhan)

[hashutosh] HIVE-4369 : Many new failures on hadoop 2 (Vikram Dixit via 
Ashutosh Chauhan)


Changes for Build #2079
[namit] HIVE-4424 MetaStoreUtils.java.orig checked in mistakenly by HIVE-4409
(Namit Jain)

[hashutosh] HIVE-4358 : Check for Map side processing in PTFOp is no longer 
valid (Harish Butani via Ashutosh Chauhan)


Changes for Build #2080
[navis] HIVE-4068 Size of aggregation buffer which uses non-primitive type is 
not estimated correctly (Navis)

[khorgath] HIVE-4420 : HCatalog unit tests stop after a failure (Alan Gates via 
Sushanth Sowmyan)

[hashutosh] HIVE-3708 : Add mapreduce workflow information to job configuration 
(Billie Rinaldi via Ashutosh Chauhan)


Changes for Build #2081

Changes for Build #2082
[hashutosh] HIVE-4423 : Improve RCFile::sync(long) 10x (Gopal V via Ashutosh 
Chauhan)

[hashutosh] HIVE-4398 : HS2 Resource leak: operation handles not cleaned when 
originating session is closed (Ashish Vaidya via Ashutosh Chauhan)

[hashutosh] HIVE-4019 : Ability to create and drop temporary partition function 
(Brock Noland via Ashutosh Chauhan)


Changes for Build #2083
[navis] HIVE-4437 Missing file on HIVE-4068 (Navis)


Changes for Build #2084

Changes 

Hive-trunk-hadoop2 - Build # 190 - Failure

2013-05-09 Thread Apache Jenkins Server
Changes for Build #166
[khorgath] HCATALOG-621 bin/hcat should include hbase jar and dependencies in 
the classpath (Nick Dimiduk via Sushanth Sowmyan)

[omalley] HIVE-4178 : ORC fails with files with different numbers of columns


Changes for Build #167
[hashutosh] HIVE-4356 :  remove duplicate impersonation parameters for 
hiveserver2 (Gunther Hagleitner via Ashutosh Chauhan)

[hashutosh] HIVE-4318 : OperatorHooks hit performance even when not used 
(Gunther Hagleitner via Ashutosh Chauhan)

[hashutosh] HIVE-4378 : Counters hit performance even when not used (Gunther 
Hagleitner via Ashutosh Chauhan)

[omalley] HIVE-4189 : ORC fails with String column that ends in lots of nulls 
(Kevin
Wilfong)


Changes for Build #168
[hashutosh] 4248 : Implement a memory manager for ORC. Missed one test file. 
(Owen Omalley via Ashutosh Chauhan)

[hashutosh] HIVE-4248 : Implement a memory manager for ORC (Owen Omalley via 
Ashutosh Chauhan)

[hashutosh] HIVE-4103 : Remove System.gc() call from the map-join local-task 
loop (Gopal V via Ashutosh Chauhan)

[hashutosh] HIVE-4304 : Remove unused builtins and pdk submodules (Travis 
Crawford via Ashutosh Chauhan)

[namit] HIVE-4310 optimize count(distinct) with hive.map.groupby.sorted
(Namit Jain via Gang Tim Liu)


Changes for Build #169
[hashutosh] HIVE-4333 : most windowing tests fail on hadoop 2 (Harish Butani 
via Ashutosh Chauhan)

[namit] HIVE-4342 NPE for query involving UNION ALL with nested JOIN and UNION 
ALL
(Navis via namit)

[hashutosh] HIVE-4364 : beeline always exits with 0 status, should exit with 
non-zero status on error (Rob Weltman via Ashutosh Chauhan)

[hashutosh] HIVE-4130 : Bring the Lead/Lag UDFs interface in line with Lead/Lag 
UDAFs (Harish Butani via Ashutosh Chauhan)


Changes for Build #170
[hashutosh] HIVE-4295 : Lateral view makes invalid result if CP is disabled 
(Navis via Ashutosh Chauhan)

[hashutosh] HIVE-4365 : wrong result in left semi join (Navis via Ashutosh 
Chauhan)

[hashutosh] HIVE-3861 : Upgrade hbase dependency to 0.94 (Gunther Hagleitner 
via Ashutosh Chauhan)

[namit] HIVE-4371 some issue with merging join trees
(Navis via namit)


Changes for Build #171
[hashutosh] HIVE-2379 : Hive/HBase integration could be improved (Navis via 
Ashutosh Chauhan)


Changes for Build #172
[hashutosh] HIVE-4394 : test leadlag.q fails (Ashutosh Chauhan)

[namit] HIVE-4018 MapJoin failing with Distributed Cache error
(Amareshwari Sriramadasu via Namit Jain)


Changes for Build #173
[namit] HIVE-4300 ant thriftif generated code that is checkedin is not 
up-to-date
(Roshan Naik via namit)

[hashutosh] HIVE-3891 : physical optimizer changes for auto sort-merge join 
(Namit Jain via Ashutosh Chauhan)

[namit] HIVE-4393 Make the deleteData flag accessable from DropTable/Partition 
events
(Morgan Philips via namit)


Changes for Build #174
[khorgath] HIVE-4419 : webhcat - support ${WEBHCAT_PREFIX}/conf/ as config 
directory (Thejas M Nair via Sushanth Sowmyan)

[namit] HIVE-4181 Star argument without table alias for UDTF is not working
(Navis via namit)

[hashutosh] HIVE-4407 : TestHCatStorer.testStoreFuncAllSimpleTypes fails 
because of null case difference (Thejas Nair via Ashutosh Chauhan)

[hashutosh] HIVE-4369 : Many new failures on hadoop 2 (Vikram Dixit via 
Ashutosh Chauhan)


Changes for Build #175
[hashutosh] HIVE-4358 : Check for Map side processing in PTFOp is no longer 
valid (Harish Butani via Ashutosh Chauhan)

[namit] HIVE-4409 Prevent incompatible column type changes
(Dilip Joseph via namit)

[namit] HIVE-4095 Add exchange partition in Hive
(Dheeraj Kumar Singh via namit)

[namit] HIVE-4005 Column truncation
(Kevin Wilfong via namit)

[namit] HIVE-3952 merge map-job followed by map-reduce job
(Vinod Kumar Vavilapalli via namit)

[hashutosh] HIVE-4412 : PTFDesc tries serialize transient fields like OIs, etc. 
(Navis via Ashutosh Chauhan)


Changes for Build #176
[hashutosh] HIVE-3708 : Add mapreduce workflow information to job configuration 
(Billie Rinaldi via Ashutosh Chauhan)

[namit] HIVE-4424 MetaStoreUtils.java.orig checked in mistakenly by HIVE-4409
(Namit Jain)


Changes for Build #177
[navis] HIVE-4068 Size of aggregation buffer which uses non-primitive type is 
not estimated correctly (Navis)

[khorgath] HIVE-4420 : HCatalog unit tests stop after a failure (Alan Gates via 
Sushanth Sowmyan)


Changes for Build #178

Changes for Build #179
[hashutosh] HIVE-4423 : Improve RCFile::sync(long) 10x (Gopal V via Ashutosh 
Chauhan)

[hashutosh] HIVE-4398 : HS2 Resource leak: operation handles not cleaned when 
originating session is closed (Ashish Vaidya via Ashutosh Chauhan)

[hashutosh] HIVE-4019 : Ability to create and drop temporary partition function 
(Brock Noland via Ashutosh Chauhan)


Changes for Build #180
[navis] HIVE-4437 Missing file on HIVE-4068 (Navis)


Changes for Build #181

Changes for Build #182

Changes for Build #183
[hashutosh] HIVE-4350 : support AS keyword for table alias (Matthew Weaver via 
Ashutosh 

[jira] [Created] (HIVE-4533) vectorized NotNull operation does not handle short-circuit evaluation for NULL propagation correctly

2013-05-09 Thread Eric Hanson (JIRA)
Eric Hanson created HIVE-4533:
-

 Summary: vectorized NotNull operation does not handle 
short-circuit evaluation for NULL propagation correctly
 Key: HIVE-4533
 URL: https://issues.apache.org/jira/browse/HIVE-4533
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson


This code does not look at the selection vector so it may waste time 
propagating nulls if the batch has already been filtered. 

A more serious problem is that it only copies over the first n entries. So if a
filter has been applied, nulls may not be copied over when they should be.

// handle NULLs
if (inputColVector.noNulls) {
  outV.noNulls = true;
} else {
  outV.noNulls = false;
  if (inputColVector.isRepeating) {
    outV.isNull[0] = inputColVector.isNull[0];
  } else {
    System.arraycopy(inputColVector.isNull, 0, outV.isNull, 0, n);
  }
}
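
One possible shape for a fix is sketched below in self-contained Java. It is
not the actual patch. ColVec is a hypothetical stand-in that models only the
isNull, noNulls and isRepeating fields used in the snippet above, and the
selectedInUse flag and sel array are assumed to describe the batch's selection
vector, with n the number of selected rows.

```java
import java.util.Arrays;

public class NullPropagationSketch {

  /** Hypothetical stand-in for a column vector; only null tracking is modeled. */
  static final class ColVec {
    final boolean[] isNull;
    boolean noNulls = true;
    boolean isRepeating = false;

    ColVec(int size) {
      isNull = new boolean[size];
    }
  }

  /** Copies null flags from in to out, honoring the selection vector if one is in use. */
  static void propagateNulls(ColVec in, ColVec out, boolean selectedInUse, int[] sel, int n) {
    if (in.noNulls) {
      out.noNulls = true;
      return;
    }
    out.noNulls = false;
    if (in.isRepeating) {
      out.isNull[0] = in.isNull[0];
    } else if (selectedInUse) {
      // Selected row indexes can be >= n, so copy entry by entry via sel[].
      for (int j = 0; j < n; j++) {
        int i = sel[j];
        out.isNull[i] = in.isNull[i];
      }
    } else {
      System.arraycopy(in.isNull, 0, out.isNull, 0, n);
    }
  }

  public static void main(String[] args) {
    ColVec in = new ColVec(8);
    in.noNulls = false;
    in.isNull[6] = true;
    ColVec out = new ColVec(8);
    int[] sel = {1, 6}; // two rows survived an earlier filter
    propagateNulls(in, out, true, sel, 2);
    System.out.println(Arrays.toString(out.isNull));
  }
}
```

With sel = {1, 6} and n = 2, a plain arraycopy of the first n entries would
never reach index 6, which is the failure described above.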





[jira] [Updated] (HIVE-4533) vectorized NotNull operation does not handle short-circuit evaluation for NULL propagation correctly

2013-05-09 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-4533:
--

Description: 
See file NotNull.java in org.apache.hadoop.hive.ql.exec.vector.expressions.

This code does not look at the selection vector so it may waste time 
propagating nulls if the batch has already been filtered. 

A more serious problem is that it only copies over the first n entries. So if a
filter has been applied, nulls may not be copied over when they should be.

// handle NULLs
if (inputColVector.noNulls) {
  outV.noNulls = true;
} else {
  outV.noNulls = false;
  if (inputColVector.isRepeating) {
    outV.isNull[0] = inputColVector.isNull[0];
  } else {
    System.arraycopy(inputColVector.isNull, 0, outV.isNull, 0, n);
  }
}



  was:
This code does not look at the selection vector so it may waste time 
propagating nulls if the batch has already been filtered. 

A more serious problem is that it only copies over the first n entries. So if a
filter has been applied, nulls may not be copied over when they should be.

// handle NULLs
if (inputColVector.noNulls) {
  outV.noNulls = true;
} else {
  outV.noNulls = false;
  if (inputColVector.isRepeating) {
    outV.isNull[0] = inputColVector.isNull[0];
  } else {
    System.arraycopy(inputColVector.isNull, 0, outV.isNull, 0, n);
  }
}




 vectorized NotNull operation does not handle short-circuit evaluation for 
 NULL propagation correctly
 

 Key: HIVE-4533
 URL: https://issues.apache.org/jira/browse/HIVE-4533
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson

 See file NotNull.java in org.apache.hadoop.hive.ql.exec.vector.expressions.
 This code does not look at the selection vector so it may waste time 
 propagating nulls if the batch has already been filtered. 
 A more serious problem is that it only copies over the first n entries. So if a
 filter has been applied, nulls may not be copied over when they should be.
 // handle NULLs
 if (inputColVector.noNulls) {
   outV.noNulls = true;
 } else {
   outV.noNulls = false;
   if (inputColVector.isRepeating) {
     outV.isNull[0] = inputColVector.isNull[0];
   } else {
     System.arraycopy(inputColVector.isNull, 0, outV.isNull, 0, n);
   }
 }



[jira] [Updated] (HIVE-4533) vectorized NotNull operation does not handle short-circuit evaluation for NULL propagation correctly

2013-05-09 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-4533:
--

Description: 
See file NotCol.java in org.apache.hadoop.hive.ql.exec.vector.expressions.

This code does not look at the selection vector so it may waste time 
propagating nulls if the batch has already been filtered. 

A more serious problem is that it only copies over the first n entries. So if a
filter has been applied, nulls may not be copied over when they should be.

// handle NULLs
if (inputColVector.noNulls) {
  outV.noNulls = true;
} else {
  outV.noNulls = false;
  if (inputColVector.isRepeating) {
    outV.isNull[0] = inputColVector.isNull[0];
  } else {
    System.arraycopy(inputColVector.isNull, 0, outV.isNull, 0, n);
  }
}



  was:
See file NotNull.java in org.apache.hadoop.hive.ql.exec.vector.expressions.

This code does not look at the selection vector so it may waste time 
propagating nulls if the batch has already been filtered. 

A more serious problem is that it only copies over the first n entries. So if a
filter has been applied, nulls may not be copied over when they should be.

// handle NULLs
if (inputColVector.noNulls) {
  outV.noNulls = true;
} else {
  outV.noNulls = false;
  if (inputColVector.isRepeating) {
    outV.isNull[0] = inputColVector.isNull[0];
  } else {
    System.arraycopy(inputColVector.isNull, 0, outV.isNull, 0, n);
  }
}




 vectorized NotNull operation does not handle short-circuit evaluation for 
 NULL propagation correctly
 

 Key: HIVE-4533
 URL: https://issues.apache.org/jira/browse/HIVE-4533
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson

 See file NotCol.java in org.apache.hadoop.hive.ql.exec.vector.expressions.
 This code does not look at the selection vector so it may waste time 
 propagating nulls if the batch has already been filtered. 
 A more serious problem is that it only copies over the first n entries. So if a
 filter has been applied, nulls may not be copied over when they should be.
 // handle NULLs
 if (inputColVector.noNulls) {
   outV.noNulls = true;
 } else {
   outV.noNulls = false;
   if (inputColVector.isRepeating) {
     outV.isNull[0] = inputColVector.isNull[0];
   } else {
     System.arraycopy(inputColVector.isNull, 0, outV.isNull, 0, n);
   }
 }



[jira] [Commented] (HIVE-4160) Vectorized Query Execution in Hive

2013-05-09 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13653430#comment-13653430
 ] 

Eric Hanson commented on HIVE-4160:
---

The code for this work is currently in the vectorization branch of the public 
Hive repo.

 Vectorized Query Execution in Hive
 --

 Key: HIVE-4160
 URL: https://issues.apache.org/jira/browse/HIVE-4160
 Project: Hive
  Issue Type: New Feature
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: Hive-Vectorized-Query-Execution-Design.docx, 
 Hive-Vectorized-Query-Execution-Design-rev2.docx, 
 Hive-Vectorized-Query-Execution-Design-rev3.docx, 
 Hive-Vectorized-Query-Execution-Design-rev3.docx, 
 Hive-Vectorized-Query-Execution-Design-rev3.pdf, 
 Hive-Vectorized-Query-Execution-Design-rev4.docx, 
 Hive-Vectorized-Query-Execution-Design-rev4.pdf, 
 Hive-Vectorized-Query-Execution-Design-rev5.docx, 
 Hive-Vectorized-Query-Execution-Design-rev5.pdf, 
 Hive-Vectorized-Query-Execution-Design-rev6.docx, 
 Hive-Vectorized-Query-Execution-Design-rev6.pdf


 The Hive query execution engine currently processes one row at a time. A 
 single row of data goes through all the operators before the next row can be 
 processed. This mode of processing is very inefficient in terms of CPU usage. 
 Research has demonstrated that this yields very low instructions per cycle 
 [MonetDB X100]. Also currently Hive heavily relies on lazy deserialization 
 and data columns go through a layer of object inspectors that identify column 
 type, deserialize data and determine appropriate expression routines in the 
 inner loop. These layers of virtual method calls further slow down the 
 processing. 
 This work will add support for vectorized query execution to Hive, where, 
 instead of individual rows, batches of about a thousand rows at a time are 
 processed. Each column in the batch is represented as a vector of a primitive 
 data type. The inner loop of execution scans these vectors very fast, 
 avoiding method calls, deserialization, unnecessary if-then-else, etc. This 
 substantially reduces CPU time used, and gives excellent instructions per 
 cycle (i.e. improved processor pipeline utilization). See the attached design 
 specification for more details.
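
 To make the batch-at-a-time idea concrete, here is a minimal Java sketch of
 evaluating key + 1 over a column held as a primitive array. It is only an
 illustration under stated assumptions: the class and method names are made up
 for this example, and the batch size of 1024 is an assumed value standing in
 for "about a thousand rows".

```java
public class VectorizedAddSketch {

  // Assumed batch size; the description only says "about a thousand rows".
  static final int BATCH_SIZE = 1024;

  /**
   * Adds a scalar to the first n entries of a column.
   * The loop body has no method calls, no object inspection and no branches.
   */
  static void addScalar(long[] in, long scalar, long[] out, int n) {
    for (int i = 0; i < n; i++) {
      out[i] = in[i] + scalar;
    }
  }

  public static void main(String[] args) {
    long[] key = new long[BATCH_SIZE];
    long[] keyPlusOne = new long[BATCH_SIZE];
    for (int i = 0; i < BATCH_SIZE; i++) {
      key[i] = i; // fill one batch of the "key" column
    }
    // One call processes the whole batch instead of one call per row.
    addScalar(key, 1L, keyPlusOne, BATCH_SIZE);
    System.out.println(keyPlusOne[0] + " .. " + keyPlusOne[BATCH_SIZE - 1]);
  }
}
```

 The per-row overhead (virtual calls, type dispatch) is paid once per batch
 rather than once per row, which is where the CPU savings come from.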



[jira] [Created] (HIVE-4534) IsNotNull vectorized expression does not look at noNulls

2013-05-09 Thread Eric Hanson (JIRA)
Eric Hanson created HIVE-4534:
-

 Summary: IsNotNull vectorized expression does not look at noNulls
 Key: HIVE-4534
 URL: https://issues.apache.org/jira/browse/HIVE-4534
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson


See file IsNotNull.java in package 
org.apache.hadoop.hive.ql.exec.vector.expressions

It never looks at the noNulls flag on the input vector, but accesses the 
isNull[] array anyway. This can yield incorrect results.

isRepeating and noNulls are not set in the output, which can also cause wrong 
results.
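
A minimal Java sketch of the expected behavior follows, assuming a simplified
column vector. LongVec, isNotNull, selectedInUse and sel are hypothetical names
for this example, not the actual classes in the vectorization branch. The
sketch checks noNulls before reading isNull[] and always sets noNulls and
isRepeating on the output.

```java
public class IsNotNullSketch {

  /** Hypothetical, simplified long column vector (not Hive's actual class). */
  static final class LongVec {
    final long[] vector;
    final boolean[] isNull;
    boolean noNulls = true;
    boolean isRepeating = false;

    LongVec(int size) {
      vector = new long[size];
      isNull = new boolean[size];
    }
  }

  /** Writes 1 where the input is non-NULL and 0 where it is NULL. */
  static void isNotNull(LongVec in, LongVec out, boolean selectedInUse, int[] sel, int n) {
    out.noNulls = true; // IS NOT NULL never yields NULL itself
    if (in.noNulls) {
      // isNull[] may hold stale values here, so it is not consulted.
      out.vector[0] = 1;
      out.isRepeating = true;
    } else if (in.isRepeating) {
      out.vector[0] = in.isNull[0] ? 0 : 1;
      out.isRepeating = true;
    } else {
      out.isRepeating = false;
      if (selectedInUse) {
        for (int j = 0; j < n; j++) {
          int i = sel[j];
          out.vector[i] = in.isNull[i] ? 0 : 1;
        }
      } else {
        for (int i = 0; i < n; i++) {
          out.vector[i] = in.isNull[i] ? 0 : 1;
        }
      }
    }
  }

  public static void main(String[] args) {
    LongVec in = new LongVec(4);
    in.isNull[0] = true; // stale entry; noNulls is still true
    LongVec out = new LongVec(4);
    isNotNull(in, out, false, null, 4);
    System.out.println("isRepeating=" + out.isRepeating + " value=" + out.vector[0]);
  }
}
```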




[jira] [Updated] (HIVE-4293) Predicates following UDTF operator are removed by PPD

2013-05-09 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-4293:


Status: Patch Available  (was: Open)

 Predicates following UDTF operator are removed by PPD
 -

 Key: HIVE-4293
 URL: https://issues.apache.org/jira/browse/HIVE-4293
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
 Attachments: HIVE-4293.D9933.1.patch, HIVE-4293.D9933.2.patch


 For example, 
 {noformat}
 explain SELECT value from (
   select explode(array(key, value)) as (value) from (
 select * FROM src WHERE key  200
   ) A
 ) B WHERE value  300
 ;
 {noformat}
 This produces a plan like the following, in which the last predicate has been removed:
 {noformat}
   TableScan
     alias: src
     Filter Operator
       predicate:
           expr: (key  200.0)
           type: boolean
       Select Operator
         expressions:
               expr: array(key,value)
               type: array<string>
         outputColumnNames: _col0
         UDTF Operator
           function name: explode
           Select Operator
             expressions:
                   expr: col
                   type: string
             outputColumnNames: _col0
             File Output Operator
               compressed: false
               GlobalTableId: 0
               table:
                   input format: org.apache.hadoop.mapred.TextInputFormat
                   output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
 {noformat}



[jira] [Updated] (HIVE-4293) Predicates following UDTF operator are removed by PPD

2013-05-09 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4293:
--

Attachment: HIVE-4293.D9933.3.patch

navis updated the revision HIVE-4293 [jira] Predicates following UDTF operator 
are removed by PPD.

  Rebased to trunk & Fixed test results

Reviewers: JIRA

REVISION DETAIL
  https://reviews.facebook.net/D9933

CHANGE SINCE LAST DIFF
  https://reviews.facebook.net/D9933?vs=31221&id=33483#toc

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/exec/LateralViewJoinOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/LateralViewJoinDesc.java
  ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerInfo.java
  ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicatePushDown.java
  ql/src/test/queries/clientpositive/lateral_view_ppd.q
  ql/src/test/queries/clientpositive/ppd_udtf.q
  ql/src/test/results/clientpositive/cluster.q.out
  ql/src/test/results/clientpositive/ctas_colname.q.out
  ql/src/test/results/clientpositive/lateral_view_ppd.q.out
  ql/src/test/results/clientpositive/ppd2.q.out
  ql/src/test/results/clientpositive/ppd_gby.q.out
  ql/src/test/results/clientpositive/ppd_gby2.q.out
  ql/src/test/results/clientpositive/ppd_udtf.q.out
  ql/src/test/results/clientpositive/udtf_json_tuple.q.out
  ql/src/test/results/clientpositive/udtf_parse_url_tuple.q.out
  ql/src/test/results/compiler/plan/join1.q.xml
  ql/src/test/results/compiler/plan/join2.q.xml
  ql/src/test/results/compiler/plan/join3.q.xml
  ql/src/test/results/compiler/plan/join4.q.xml
  ql/src/test/results/compiler/plan/join5.q.xml
  ql/src/test/results/compiler/plan/join6.q.xml
  ql/src/test/results/compiler/plan/join7.q.xml
  ql/src/test/results/compiler/plan/join8.q.xml

To: JIRA, navis


 Predicates following UDTF operator are removed by PPD
 -

 Key: HIVE-4293
 URL: https://issues.apache.org/jira/browse/HIVE-4293
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
 Attachments: HIVE-4293.D9933.1.patch, HIVE-4293.D9933.2.patch, 
 HIVE-4293.D9933.3.patch


 For example, 
 {noformat}
 explain SELECT value from (
   select explode(array(key, value)) as (value) from (
 select * FROM src WHERE key  200
   ) A
 ) B WHERE value  300
 ;
 {noformat}
 This produces a plan like the following, in which the last predicate has been removed:
 {noformat}
   TableScan
     alias: src
     Filter Operator
       predicate:
           expr: (key  200.0)
           type: boolean
       Select Operator
         expressions:
               expr: array(key,value)
               type: array<string>
         outputColumnNames: _col0
         UDTF Operator
           function name: explode
           Select Operator
             expressions:
                   expr: col
                   type: string
             outputColumnNames: _col0
             File Output Operator
               compressed: false
               GlobalTableId: 0
               table:
                   input format: org.apache.hadoop.mapred.TextInputFormat
                   output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
 {noformat}



[jira] [Updated] (HIVE-4526) auto_sortmerge_join_9.q throws NPE but test is succeeded

2013-05-09 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-4526:


Priority: Major  (was: Minor)

 auto_sortmerge_join_9.q throws NPE but test is succeeded
 

 Key: HIVE-4526
 URL: https://issues.apache.org/jira/browse/HIVE-4526
 Project: Hive
  Issue Type: Test
  Components: Tests
Reporter: Navis

 auto_sortmerge_join_9.q
 {noformat}
 [junit] Running org.apache.hadoop.hive.cli.TestCliDriver
 [junit] Begin query: auto_sortmerge_join_9.q
 [junit] Deleted 
 file:/home/navis/apache/oss-hive/build/ql/test/data/warehouse/tbl1
 [junit] Deleted 
 file:/home/navis/apache/oss-hive/build/ql/test/data/warehouse/tbl2
 [junit] org.apache.hadoop.hive.ql.metadata.HiveException: Failed with 
 exception nulljava.lang.NullPointerException
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getRowInspectorFromPartitionedTable(FetchOperator.java:252)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:605)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
 [junit]   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 [junit]   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 [junit]   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 [junit]   at java.lang.reflect.Method.invoke(Method.java:597)
 [junit]   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 [junit] 
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:631)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
 [junit]   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 [junit]   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 [junit]   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 [junit]   at java.lang.reflect.Method.invoke(Method.java:597)
 [junit]   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 [junit] org.apache.hadoop.hive.ql.metadata.HiveException: Failed with 
 exception nulljava.lang.NullPointerException
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getRowInspectorFromPartitionedTable(FetchOperator.java:252)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:605)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
 [junit]   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 [junit]   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 [junit]   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 [junit]   at java.lang.reflect.Method.invoke(Method.java:597)
 [junit]   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 [junit] 
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:631)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
 [junit]   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 [junit]   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 [junit]   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 [junit]   at java.lang.reflect.Method.invoke(Method.java:597)
 [junit]   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 [junit] Running: diff -a 
 /home/navis/apache/oss-hive/build/ql/test/logs/clientpositive/auto_sortmerge_join_9.q.out
  
 

[jira] [Commented] (HIVE-4526) auto_sortmerge_join_9.q throws NPE but test is succeeded

2013-05-09 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13653486#comment-13653486
 ] 

Navis commented on HIVE-4526:
-

Two queries failed and were reverted to reducer-join.


[jira] [Assigned] (HIVE-4526) auto_sortmerge_join_9.q throws NPE but test is succeeded

2013-05-09 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis reassigned HIVE-4526:
---

Assignee: Navis

 auto_sortmerge_join_9.q throws NPE but test is succeeded
 

 Key: HIVE-4526
 URL: https://issues.apache.org/jira/browse/HIVE-4526
 Project: Hive
  Issue Type: Test
  Components: Tests
Reporter: Navis
Assignee: Navis

 auto_sortmerge_join_9.q
 {noformat}
 [junit] Running org.apache.hadoop.hive.cli.TestCliDriver
 [junit] Begin query: auto_sortmerge_join_9.q
 [junit] Deleted 
 file:/home/navis/apache/oss-hive/build/ql/test/data/warehouse/tbl1
 [junit] Deleted 
 file:/home/navis/apache/oss-hive/build/ql/test/data/warehouse/tbl2
 [junit] org.apache.hadoop.hive.ql.metadata.HiveException: Failed with 
 exception nulljava.lang.NullPointerException
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getRowInspectorFromPartitionedTable(FetchOperator.java:252)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:605)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
 [junit]   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 [junit]   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 [junit]   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 [junit]   at java.lang.reflect.Method.invoke(Method.java:597)
 [junit]   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 [junit] 
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:631)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
 [junit]   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 [junit]   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 [junit]   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 [junit]   at java.lang.reflect.Method.invoke(Method.java:597)
 [junit]   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 [junit] org.apache.hadoop.hive.ql.metadata.HiveException: Failed with 
 exception nulljava.lang.NullPointerException
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getRowInspectorFromPartitionedTable(FetchOperator.java:252)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:605)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
 [junit]   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 [junit]   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 [junit]   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 [junit]   at java.lang.reflect.Method.invoke(Method.java:597)
 [junit]   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 [junit] 
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:631)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
 [junit]   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 [junit]   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 [junit]   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 [junit]   at java.lang.reflect.Method.invoke(Method.java:597)
 [junit]   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 [junit] Running: diff -a 
 /home/navis/apache/oss-hive/build/ql/test/logs/clientpositive/auto_sortmerge_join_9.q.out
  
 

[jira] [Updated] (HIVE-4526) auto_sortmerge_join_9.q throws NPE but test is succeeded

2013-05-09 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-4526:


Status: Patch Available  (was: Open)

 auto_sortmerge_join_9.q throws NPE but test is succeeded
 

 Key: HIVE-4526
 URL: https://issues.apache.org/jira/browse/HIVE-4526
 Project: Hive
  Issue Type: Test
  Components: Tests
Reporter: Navis
Assignee: Navis
 Attachments: HIVE-4526.D10725.1.patch


 auto_sortmerge_join_9.q
 {noformat}
 [junit] Running org.apache.hadoop.hive.cli.TestCliDriver
 [junit] Begin query: auto_sortmerge_join_9.q
 [junit] Deleted 
 file:/home/navis/apache/oss-hive/build/ql/test/data/warehouse/tbl1
 [junit] Deleted 
 file:/home/navis/apache/oss-hive/build/ql/test/data/warehouse/tbl2
 [junit] org.apache.hadoop.hive.ql.metadata.HiveException: Failed with 
 exception nulljava.lang.NullPointerException
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getRowInspectorFromPartitionedTable(FetchOperator.java:252)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:605)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
 [junit]   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 [junit]   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 [junit]   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 [junit]   at java.lang.reflect.Method.invoke(Method.java:597)
 [junit]   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 [junit] 
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:631)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
 [junit]   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 [junit]   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 [junit]   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 [junit]   at java.lang.reflect.Method.invoke(Method.java:597)
 [junit]   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 [junit] org.apache.hadoop.hive.ql.metadata.HiveException: Failed with 
 exception nulljava.lang.NullPointerException
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getRowInspectorFromPartitionedTable(FetchOperator.java:252)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:605)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
 [junit]   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 [junit]   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 [junit]   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 [junit]   at java.lang.reflect.Method.invoke(Method.java:597)
 [junit]   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 [junit] 
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:631)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
 [junit]   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
 [junit]   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 [junit]   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 [junit]   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 [junit]   at java.lang.reflect.Method.invoke(Method.java:597)
 [junit]   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 [junit] Running: diff -a 
 /home/navis/apache/oss-hive/build/ql/test/logs/clientpositive/auto_sortmerge_join_9.q.out
  
 

[jira] [Updated] (HIVE-4209) Cache evaluation result of deterministic expression and reuse it

2013-05-09 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-4209:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

 Cache evaluation result of deterministic expression and reuse it
 

 Key: HIVE-4209
 URL: https://issues.apache.org/jira/browse/HIVE-4209
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Fix For: 0.12.0

 Attachments: HIVE-4209.6.patch.txt, HIVE-4209.D9585.1.patch, 
 HIVE-4209.D9585.2.patch, HIVE-4209.D9585.3.patch, HIVE-4209.D9585.4.patch, 
 HIVE-4209.D9585.5.patch


 For example, 
 {noformat}
 select key from src where key + 1 > 100 AND key + 1 < 200 limit 3;
 {noformat}
 key + 1 need not be evaluated twice.
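
 The idea can be sketched as a small per-row memoizer. This is only an
 illustration, not the code in the attached patches: the class name
 CachedExpr, its fields, and the use of object identity to detect a new row
 are all assumptions made for the example. It also assumes every row is a
 distinct object, which does not hold for an engine that reuses row buffers.

```java
import java.util.function.Function;

/** Hypothetical per-row memoizer for a deterministic expression (not Hive's implementation). */
final class CachedExpr<R, V> {
    private final Function<R, V> expr; // must be deterministic for caching to be safe
    private R lastRow;                 // row the cached value belongs to
    private V lastValue;
    private boolean hasValue = false;
    int evaluations = 0;               // exposed only to show how often expr really ran

    CachedExpr(Function<R, V> expr) {
        this.expr = expr;
    }

    /** Returns the cached value when asked again for the same row object. */
    V evaluate(R row) {
        if (!hasValue || lastRow != row) { // identity check: assumes one object per row
            lastValue = expr.apply(row);
            lastRow = row;
            hasValue = true;
            evaluations++;
        }
        return lastValue;
    }
}
```

 With this sketch, a predicate that refers to the same subexpression twice
 shares one CachedExpr instance, so the underlying function runs once per row
 instead of once per reference.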

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4526) auto_sortmerge_join_9.q throws NPE but test is succeeded

2013-05-09 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4526:
--

Attachment: HIVE-4526.D10725.1.patch

navis requested code review of HIVE-4526 [jira] auto_sortmerge_join_9.q throws 
NPE but test is succeeded.

Reviewers: JIRA

HIVE-4526 auto_sortmerge_join_9.q throws NPE but test is succeeded

auto_sortmerge_join_9.q

[junit] Running org.apache.hadoop.hive.cli.TestCliDriver
[junit] Begin query: auto_sortmerge_join_9.q
[junit] Deleted 
file:/home/navis/apache/oss-hive/build/ql/test/data/warehouse/tbl1
[junit] Deleted 
file:/home/navis/apache/oss-hive/build/ql/test/data/warehouse/tbl2
[junit] org.apache.hadoop.hive.ql.metadata.HiveException: Failed with 
exception nulljava.lang.NullPointerException
[junit] at 
org.apache.hadoop.hive.ql.exec.FetchOperator.getRowInspectorFromPartitionedTable(FetchOperator.java:252)
[junit] at 
org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:605)
[junit] at 
org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393)
[junit] at 
org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
[junit] at 
org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
[junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit] at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
[junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit] at java.lang.reflect.Method.invoke(Method.java:597)
[junit] at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
[junit]
[junit] at 
org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:631)
[junit] at 
org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393)
[junit] at 
org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
[junit] at 
org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
[junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit] at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
[junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit] at java.lang.reflect.Method.invoke(Method.java:597)
[junit] at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
[junit] org.apache.hadoop.hive.ql.metadata.HiveException: Failed with 
exception nulljava.lang.NullPointerException
[junit] at 
org.apache.hadoop.hive.ql.exec.FetchOperator.getRowInspectorFromPartitionedTable(FetchOperator.java:252)
[junit] at 
org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:605)
[junit] at 
org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393)
[junit] at 
org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
[junit] at 
org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
[junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit] at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
[junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit] at java.lang.reflect.Method.invoke(Method.java:597)
[junit] at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
[junit]
[junit] at 
org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:631)
[junit] at 
org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393)
[junit] at 
org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
[junit] at 
org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
[junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit] at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
[junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit] at java.lang.reflect.Method.invoke(Method.java:597)
[junit] at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
[junit] Running: diff -a 
/home/navis/apache/oss-hive/build/ql/test/logs/clientpositive/auto_sortmerge_join_9.q.out
 
/home/navis/apache/oss-hive/ql/src/test/results/clientpositive/auto_sortmerge_join_9.q.out
[junit] Done query: auto_sortmerge_join_9.q elapsedTime=178s
[junit] Cleaning up TestCliDriver
[junit] Tests run: 2, 

[jira] [Assigned] (HIVE-4533) vectorized NotCol operation does not handle short-circuit evaluation for NULL propagation correctly

2013-05-09 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey reassigned HIVE-4533:
--

Assignee: Jitendra Nath Pandey

 vectorized NotCol operation does not handle short-circuit evaluation for NULL 
 propagation correctly
 ---

 Key: HIVE-4533
 URL: https://issues.apache.org/jira/browse/HIVE-4533
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey

 See file NotCol.java in org.apache.hadoop.hive.ql.exec.vector.expressions.
 This code does not look at the selection vector so it may waste time 
 propagating nulls if the batch has already been filtered. 
 A more serious problem is that it only copies over n entries. So if a filter 
 has been applied, nulls may not be copied over when they should be.
 // handle NULLs
 if (inputColVector.noNulls) {
   outV.noNulls = true;
 } else {
   outV.noNulls = false;
   if (inputColVector.isRepeating) {
     outV.isNull[0] = inputColVector.isNull[0];
   } else {
     System.arraycopy(inputColVector.isNull, 0, outV.isNull, 0, n);
   }
 }
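
 One possible shape for a fix is sketched below. It is not the committed
 patch: SimpleColVector is a simplified stand-in for Hive's column vector, and
 the parameters sel (indices of rows that survived earlier filters),
 selectedInUse, and n (number of live rows) are assumed to mirror the fields
 of a vectorized row batch. When a selection vector is in use, the null flags
 are copied only at the selected indices rather than for the first n slots.

```java
/** Simplified stand-in for a Hive column vector; NOT the real Hive class. */
final class SimpleColVector {
    final boolean[] isNull;
    boolean noNulls = true;
    boolean isRepeating = false;

    SimpleColVector(int size) {
        isNull = new boolean[size];
    }
}

final class NullPropagationSketch {
    /**
     * Copies null flags from in to out for the live rows of a batch.
     * sel is consulted only when selectedInUse is true.
     */
    static void propagateNulls(SimpleColVector in, SimpleColVector out,
                               int[] sel, boolean selectedInUse, int n) {
        if (in.noNulls) {
            out.noNulls = true;
            return;
        }
        out.noNulls = false;
        if (in.isRepeating) {
            // A repeating vector keeps its single value in slot 0.
            out.isNull[0] = in.isNull[0];
        } else if (selectedInUse) {
            // Visit exactly the rows that are still selected.
            for (int j = 0; j < n; j++) {
                int i = sel[j];
                out.isNull[i] = in.isNull[i];
            }
        } else {
            // No filter applied: the live rows are slots 0..n-1.
            System.arraycopy(in.isNull, 0, out.isNull, 0, n);
        }
    }
}
```

 For example, with sel = {2, 5} and n = 2, a null at index 5 reaches the
 output, whereas copying only the first n entries would miss it.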



[jira] [Assigned] (HIVE-4534) IsNotNull vectorized expression does not look at noNulls

2013-05-09 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey reassigned HIVE-4534:
--

Assignee: Jitendra Nath Pandey

 IsNotNull vectorized expression does not look at noNulls
 

 Key: HIVE-4534
 URL: https://issues.apache.org/jira/browse/HIVE-4534
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey

 See file IsNotNull.java in package 
 org.apache.hadoop.hive.ql.exec.vector.expressions
 It never looks at the noNulls flag on the input vector, but accesses the 
 isNull[] array anyway. This can yield incorrect results.
 isRepeating and noNulls are not set in the output, which can also cause wrong 
 results.
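
 A sketch of the behavior described above, under stated assumptions: LongCol
 is a simplified stand-in for a vectorized long column (not Hive's class), and
 the output encodes true as 1 and false as 0. The method consults noNulls
 before touching isNull[], and it sets noNulls and isRepeating on the output.

```java
final class IsNotNullSketch {
    /** Simplified stand-in for a vectorized column of long values; not Hive's class. */
    static final class LongCol {
        final long[] vector;
        final boolean[] isNull;
        boolean noNulls = true;
        boolean isRepeating = false;

        LongCol(int size) {
            vector = new long[size];
            isNull = new boolean[size];
        }
    }

    /** Writes 1 where the input is not null and 0 where it is, for the live rows. */
    static void isNotNull(LongCol in, LongCol out, int[] sel, boolean selectedInUse, int n) {
        if (n <= 0) {
            return;
        }
        out.noNulls = true; // the result of IS NOT NULL is never itself null
        if (in.noNulls) {
            // Do not read isNull[] here; the flag alone says every row is non-null.
            out.vector[0] = 1;
            out.isRepeating = true;
        } else if (in.isRepeating) {
            out.vector[0] = in.isNull[0] ? 0 : 1;
            out.isRepeating = true;
        } else {
            out.isRepeating = false;
            if (selectedInUse) {
                for (int j = 0; j < n; j++) {
                    int i = sel[j];
                    out.vector[i] = in.isNull[i] ? 0 : 1;
                }
            } else {
                for (int i = 0; i < n; i++) {
                    out.vector[i] = in.isNull[i] ? 0 : 1;
                }
            }
        }
    }
}
```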
