[jira] [Updated] (HIVE-6662) Vector Join operations with DATE columns fail

2014-03-14 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-6662:
--

Attachment: HIVE-6662.1.patch

 Vector Join operations with DATE columns fail
 -

 Key: HIVE-6662
 URL: https://issues.apache.org/jira/browse/HIVE-6662
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V
Assignee: Gopal V
 Attachments: HIVE-6662.1.patch


 Trying to generate a DATE column as part of a JOIN's output throws an 
 exception
 {code}
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible 
 Long vector column and primitive category DATE
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildObjectAssign(VectorColumnAssignFactory.java:306)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildAssigners(VectorColumnAssignFactory.java:414)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:235)
 at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
 at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
 at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:229)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:292)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6312) doAs with plain sasl auth should be session aware

2014-03-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934664#comment-13934664
 ] 

Thejas M Nair commented on HIVE-6312:
-

[~prasadm] Yes, the patch from Navis didn't apply on post HIVE-5155 trunk, that 
is why I had to rebase it. HIVE-6312.3.patch.txt is the rebased patch I 
uploaded. It is not in the reviewboard link that Navis created, because I can't 
upload it there.



 doAs with plain sasl auth should be session aware
 -

 Key: HIVE-6312
 URL: https://issues.apache.org/jira/browse/HIVE-6312
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Navis
Assignee: Navis
 Attachments: HIVE-6312.1.patch.txt, HIVE-6312.2.patch.txt, 
 HIVE-6312.3.patch.txt


 TUGIContainingProcessor creates a new Subject for each invocation, which causes 
 FileSystem leakage when the cache is enabled (true by default).





[jira] [Commented] (HIVE-6518) Add a GC canary to the VectorGroupByOperator to flush whenever a GC is triggered

2014-03-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934667#comment-13934667
 ] 

Hive QA commented on HIVE-6518:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12634401/HIVE-6518.3.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5389 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin6
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1769/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1769/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12634401

 Add a GC canary to the VectorGroupByOperator to flush whenever a GC is 
 triggered
 

 Key: HIVE-6518
 URL: https://issues.apache.org/jira/browse/HIVE-6518
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
 Attachments: HIVE-6518.1-tez.patch, HIVE-6518.2-tez.patch, 
 HIVE-6518.2.patch, HIVE-6518.3.patch


 The current VectorGroupByOperator implementation flushes the in-memory hashes 
 when the maximum number of entries or the memory fraction limit is hit.
 This works for most cases, but there are some corner cases where we hit GC 
 overhead limits or heap size limits before either of those conditions is 
 reached, due to the rest of the pipeline.
 This patch adds a SoftReference as a GC canary. If the soft reference is 
 dead, then a full GC pass happened sometime in the recent past, and the 
 aggregation hashtables should be flushed immediately before another full GC 
 is triggered.
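
 The canary idea described above can be sketched roughly as follows. This is 
 only an illustration and not the code in HIVE-6518.3.patch: the class name 
 GcCanary and the methods shouldFlush() and simulateGc() are hypothetical, and 
 the exact point at which a JVM clears soft references depends on the 
 collector and its settings.

```java
import java.lang.ref.SoftReference;

/** Hypothetical helper illustrating a SoftReference-based GC canary. */
final class GcCanary {
    // Sentinel that is only softly reachable; the JVM may clear it under memory pressure.
    private SoftReference<Object> canary = arm();

    private static SoftReference<Object> arm() {
        return new SoftReference<>(new Object());
    }

    /** Returns true if the sentinel was cleared since the last check, then re-arms it. */
    boolean shouldFlush() {
        if (canary.get() != null) {
            return false; // sentinel still alive: no sign of memory pressure
        }
        canary = arm();   // re-arm so that the next clearing is detected too
        return true;
    }

    /** Test hook (assumption): simulates the collector clearing the sentinel. */
    void simulateGc() {
        canary.clear();
    }
}
```

 An operator would call shouldFlush() at a batch boundary and flush its 
 aggregation hashtables when it returns true, in addition to the existing 
 entry-count and memory-fraction checks.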





[jira] [Commented] (HIVE-6312) doAs with plain sasl auth should be session aware

2014-03-14 Thread Prasad Mujumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934684#comment-13934684
 ] 

Prasad Mujumdar commented on HIVE-6312:
---

ah ok. I only looked at the review board and not the latest patch. sorry about 
that.
Updated changes look fine to me.

+1


 doAs with plain sasl auth should be session aware
 -

 Key: HIVE-6312
 URL: https://issues.apache.org/jira/browse/HIVE-6312
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Navis
Assignee: Navis
 Attachments: HIVE-6312.1.patch.txt, HIVE-6312.2.patch.txt, 
 HIVE-6312.3.patch.txt


 TUGIContainingProcessor creates a new Subject for each invocation, which causes 
 FileSystem leakage when the cache is enabled (true by default).





[jira] [Commented] (HIVE-6430) MapJoin hash table has large memory overhead

2014-03-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934692#comment-13934692
 ] 

Sergey Shelukhin commented on HIVE-6430:


Addressed all CR feedback, but patch still fails some Tez tests. Will address 
tomorrow.

Meanwhile, can you review the common code (I may separate it into a different 
patch), so that we could perhaps put this into Hive 13 in disabled form?

 MapJoin hash table has large memory overhead
 

 Key: HIVE-6430
 URL: https://issues.apache.org/jira/browse/HIVE-6430
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-6430.01.patch, HIVE-6430.patch


 Right now, in some queries, I see that storing e.g. 4 ints (2 for key and 2 
 for row) can take several hundred bytes, which is ridiculous. I am reducing 
 the size of MJKey and MJRowContainer in other jiras, but in general we don't 
 need to have a Java hash table there. We can either use a primitive-friendly 
 hashtable like the one from HPPC (Apache-licensed), or some variation, to map 
 primitive keys to a single row storage structure without an object per row 
 (similar to vectorization).
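
 To illustrate what a primitive-friendly hashtable means here, below is a 
 minimal sketch of an open-addressing map from long keys to int values (for 
 example, a row offset) that allocates no object per entry. It is neither the 
 HPPC implementation nor the HIVE-6430 patch; the class name 
 LongIntOpenHashMap and its fixed-capacity, no-resize design are assumptions 
 made to keep the example short.

```java
/** Hypothetical fixed-capacity long -> int map using linear probing. */
final class LongIntOpenHashMap {
    private final long[] keys;
    private final int[] values;
    private final boolean[] used;
    private final int mask;
    private int size;

    LongIntOpenHashMap(int capacity) {
        if (capacity < 2 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two >= 2");
        }
        keys = new long[capacity];
        values = new int[capacity];
        used = new boolean[capacity];
        mask = capacity - 1;
    }

    /** Finds the slot holding the key, or the first free slot on its probe path. */
    private int slot(long key) {
        int i = Long.hashCode(key) & mask;
        while (used[i] && keys[i] != key) {
            i = (i + 1) & mask; // linear probing; one slot is always kept free
        }
        return i;
    }

    void put(long key, int value) {
        int i = slot(key);
        if (!used[i]) {
            if (size == mask) {
                throw new IllegalStateException("map is full");
            }
            used[i] = true;
            keys[i] = key;
            size++;
        }
        values[i] = value;
    }

    int getOrDefault(long key, int defaultValue) {
        int i = slot(key);
        return used[i] ? values[i] : defaultValue;
    }
}
```

 Each entry costs one long, one int and one boolean in flat arrays, instead of 
 a boxed key, a boxed value and an entry object as in java.util.HashMap.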





[jira] [Commented] (HIVE-6312) doAs with plain sasl auth should be session aware

2014-03-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934693#comment-13934693
 ] 

Thejas M Nair commented on HIVE-6312:
-

[~vgumashta] Yes, I think it makes sense to remove TUGIContainingProcessor. I 
will create a followup patch for it.


 doAs with plain sasl auth should be session aware
 -

 Key: HIVE-6312
 URL: https://issues.apache.org/jira/browse/HIVE-6312
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Navis
Assignee: Navis
 Attachments: HIVE-6312.1.patch.txt, HIVE-6312.2.patch.txt, 
 HIVE-6312.3.patch.txt


 TUGIContainingProcessor creates a new Subject for each invocation, which causes 
 FileSystem leakage when the cache is enabled (true by default).





[jira] [Created] (HIVE-6663) remove TUGIContainingProcessor class as it is not used anymore

2014-03-14 Thread Thejas M Nair (JIRA)
Thejas M Nair created HIVE-6663:
---

 Summary: remove TUGIContainingProcessor class as it is not used 
anymore
 Key: HIVE-6663
 URL: https://issues.apache.org/jira/browse/HIVE-6663
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Thejas M Nair
Assignee: Thejas M Nair


After HIVE-6312 changes, TUGIContainingProcessor class is unused. It should be 
removed.






[jira] [Updated] (HIVE-6663) remove TUGIContainingProcessor class as it is not used anymore

2014-03-14 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6663:


Attachment: HIVE-6663.1.patch

 remove TUGIContainingProcessor class as it is not used anymore
 --

 Key: HIVE-6663
 URL: https://issues.apache.org/jira/browse/HIVE-6663
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-6663.1.patch


 After HIVE-6312 changes, TUGIContainingProcessor class is unused. It should 
 be removed.





Re: Review Request 19206: HIVE-6657: Add test coverage for Kerberos authentication implementation using Hadoop's miniKdc

2014-03-14 Thread Prasad Mujumdar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19206/
---

(Updated March 14, 2014, 7:06 a.m.)


Review request for hive, Brock Noland and Thejas Nair.


Bugs: HIVE-6657
https://issues.apache.org/jira/browse/HIVE-6657


Repository: hive-git


Description
---

- Build changes 
  + pull required dependencies
  + new sub-module under itest for the hive miniKdc framework, enabled for 
hadoop-2 only since hadoop-1 doesn't include miniKdc yet.
- A wrapper class MiniHiveKdc on top of Hadoop miniKdc that manages the KDC 
services and creates test principals, keytabs, etc.
- Tests
  + simple unit test for MiniHiveKdc
  + Integrate MiniHiveKdc with MiniHS2
  + New HiveServer2 test that covers basic connection, as well as the token and 
proxy auth support added by HIVE-5155

Note that Hadoop does not provide a secure mini-cluster yet. Hence the new tests 
are not executing any queries. At this point they are limited to the 
authentication part only.


Diffs
-

  itests/hive-minikdc/pom.xml PRE-CREATION 
  itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/MiniHiveKdc.java 
PRE-CREATION 
  
itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/TestJdbcWithMiniKdc.java
 PRE-CREATION 
  
itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/TestMiniHiveKdc.java 
PRE-CREATION 
  itests/hive-minikdc/src/test/resources/core-site.xml PRE-CREATION 
  itests/hive-unit/pom.xml 6aff277 
  
itests/hive-unit/src/main/java/org/apache/hive/jdbc/miniHS2/AbstractHiveService.java
 PRE-CREATION 
  itests/hive-unit/src/main/java/org/apache/hive/jdbc/miniHS2/MiniHS2.java 
PRE-CREATION 
  
itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/AbstractHiveService.java
 095b989 
  itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/MiniHS2.java 
d76750c 
  
itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/TestHiveServer2.java
 84f508c 
  itests/pom.xml 589ba25 
  pom.xml 6503c94 
  service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java 6759903 
  
service/src/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java 
6fbc847 

Diff: https://reviews.apache.org/r/19206/diff/


Testing
---

New test cases using the miniKdc.


Thanks,

Prasad Mujumdar



[jira] [Commented] (HIVE-6312) doAs with plain sasl auth should be session aware

2014-03-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934702#comment-13934702
 ] 

Thejas M Nair commented on HIVE-6312:
-

Thanks for pointing that out Vaibhav. Created HIVE-6663

 doAs with plain sasl auth should be session aware
 -

 Key: HIVE-6312
 URL: https://issues.apache.org/jira/browse/HIVE-6312
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Navis
Assignee: Navis
 Attachments: HIVE-6312.1.patch.txt, HIVE-6312.2.patch.txt, 
 HIVE-6312.3.patch.txt


 TUGIContainingProcessor creates a new Subject for each invocation, which causes 
 FileSystem leakage when the cache is enabled (true by default).





[jira] [Created] (HIVE-6664) Vectorized variance computation differs from row mode computation.

2014-03-14 Thread Jitendra Nath Pandey (JIRA)
Jitendra Nath Pandey created HIVE-6664:
--

 Summary: Vectorized variance computation differs from row mode 
computation.
 Key: HIVE-6664
 URL: https://issues.apache.org/jira/browse/HIVE-6664
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey


The following query shows the difference:
select count(ss_sales_price), sum(ss_sales_price), avg(ss_sales_price), 
var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price), 
stddev_samp(ss_sales_price) from store_sales

The reason for the difference is that row mode converts the decimal value to 
double upfront to calculate the sum of values, whereas vector mode performs the 
local aggregate sum as decimal and converts it to double only at flush.
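
The effect of the two summation orders can be reproduced outside Hive. The 
sketch below is not Hive code: the class DecimalSumOrder and its two methods 
are hypothetical names, and java.math.BigDecimal stands in for Hive's decimal 
type. It only shows that converting each value to double before adding can 
give a different result from adding exactly as decimals and converting once.

```java
import java.math.BigDecimal;

/** Hypothetical demo of the two summation orders described above. */
final class DecimalSumOrder {
    /** Row-mode style: convert every decimal to double first, then add. */
    static double sumAsDoubles(BigDecimal[] values) {
        double sum = 0.0;
        for (BigDecimal v : values) {
            sum += v.doubleValue(); // rounding error can accumulate per addition
        }
        return sum;
    }

    /** Vector-mode style: add exactly as decimals, convert once at the end. */
    static double sumAsDecimalThenConvert(BigDecimal[] values) {
        BigDecimal sum = BigDecimal.ZERO;
        for (BigDecimal v : values) {
            sum = sum.add(v);       // exact decimal addition
        }
        return sum.doubleValue();   // a single rounding step at "flush"
    }
}
```

With ten values of 0.1, for instance, the two methods return different 
doubles, and any variance computed from those sums differs accordingly.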





[jira] [Created] (HIVE-6665) Add test coverage for metastore kerberos authentication

2014-03-14 Thread Prasad Mujumdar (JIRA)
Prasad Mujumdar created HIVE-6665:
-

 Summary: Add test coverage for metastore kerberos authentication
 Key: HIVE-6665
 URL: https://issues.apache.org/jira/browse/HIVE-6665
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Tests
Affects Versions: 0.13.0
Reporter: Prasad Mujumdar


Hadoop 2.3 includes a miniKdc module. This provides a KDC that can be used by 
downstream projects to implement unit tests for Kerberos authentication code.
The HIVE-6657 patch includes a base miniKdc framework for Hive, which can be 
leveraged to add authentication tests (both Kerberos and delegation token) for 
the metastore.





[jira] [Updated] (HIVE-6664) Vectorized variance computation differs from row mode computation.

2014-03-14 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6664:
---

Attachment: HIVE-6664.1.patch

Attached patch fixes the issue.

 Vectorized variance computation differs from row mode computation.
 --

 Key: HIVE-6664
 URL: https://issues.apache.org/jira/browse/HIVE-6664
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6664.1.patch


 Following query can show the difference:
 select count(ss_sales_price), sum(ss_sales_price), avg(ss_sales_price), 
 var_samp(ss_sales_price), var_pop(ss_sales_price), 
 stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales
 The reason for the difference is that row mode converts the decimal value to 
 double upfront to calculate sum of values. But the vector mode performs local 
 aggregate sum as decimal and converts into double only at flush.





[jira] [Updated] (HIVE-6657) Add test coverage for Kerberos authentication implementation using Hadoop's miniKdc

2014-03-14 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-6657:
--

Description: 
Hadoop 2.3 includes a miniKdc module. This provides a KDC that can be used by 
downstream projects to implement unit tests for Kerberos authentication code.
Hive has a lot of code related to Kerberos and delegation tokens for 
authentication, as well as for accessing secure Hadoop resources. This has 
almost no coverage in the unit tests. We need to add unit tests using the 
miniKdc module.
Note that Hadoop 2.3 doesn't include a secure mini-cluster. Until that is 
available, we can at least test authentication for components like HiveServer2, 
Metastore and WebHCat.


  was:
Hadoop 2.3 includes miniKdc module. This provides a KDC that can be used by 
downstream projects to implement unit tests for Kerberos authentication code.



 Add test coverage for Kerberos authentication implementation using Hadoop's 
 miniKdc
 ---

 Key: HIVE-6657
 URL: https://issues.apache.org/jira/browse/HIVE-6657
 Project: Hive
  Issue Type: Improvement
  Components: Authentication, Testing Infrastructure, Tests
Affects Versions: 0.13.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Attachments: HIVE-6657.2.patch


 Hadoop 2.3 includes a miniKdc module. This provides a KDC that can be used by 
 downstream projects to implement unit tests for Kerberos authentication code.
 Hive has a lot of code related to Kerberos and delegation tokens for 
 authentication, as well as for accessing secure Hadoop resources. This has 
 almost no coverage in the unit tests. We need to add unit tests using the 
 miniKdc module.
 Note that Hadoop 2.3 doesn't include a secure mini-cluster. Until that is 
 available, we can at least test authentication for components like 
 HiveServer2, Metastore and WebHCat.





[jira] [Updated] (HIVE-6664) Vectorized variance computation differs from row mode computation.

2014-03-14 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6664:
---

Description: 
Following query can show the difference:
select  var_samp(ss_sales_price), var_pop(ss_sales_price), 
stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales.

The reason for the difference is that row mode converts the decimal value to 
double upfront to calculate sum of values, when computing variance. But the 
vector mode performs local aggregate sum as decimal and converts into double 
only at flush.

  was:
Following query can show the difference:
select count(ss_sales_price), sum(ss_sales_price), avg(ss_sales_price), 
var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price), 
stddev_samp(ss_sales_price) from store_sales

The reason for the difference is that row mode converts the decimal value to 
double upfront to calculate sum of values. But the vector mode performs local 
aggregate sum as decimal and converts into double only at flush.


 Vectorized variance computation differs from row mode computation.
 --

 Key: HIVE-6664
 URL: https://issues.apache.org/jira/browse/HIVE-6664
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6664.1.patch


 Following query can show the difference:
 select  var_samp(ss_sales_price), var_pop(ss_sales_price), 
 stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales.
 The reason for the difference is that row mode converts the decimal value to 
 double upfront to calculate sum of values, when computing variance. But the 
 vector mode performs local aggregate sum as decimal and converts into double 
 only at flush.





Review Request 19216: Vectorized variance computation differs from row mode computation.

2014-03-14 Thread Jitendra Pandey

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19216/
---

Review request for hive and Eric Hanson.


Bugs: HIVE-6664
https://issues.apache.org/jira/browse/HIVE-6664


Repository: hive-git


Description
---

Following query can show the difference:
select var_samp(ss_sales_price), var_pop(ss_sales_price), 
stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales.

The reason for the difference is that row mode converts the decimal value to 
double upfront to calculate sum of values, when computing variance. But the 
vector mode performs local aggregate sum as decimal and converts into double 
only at flush.


Diffs
-

  ql/src/gen/vectorization/UDAFTemplates/VectorUDAFVarDecimal.txt c5af930 
  ql/src/test/results/clientpositive/vector_decimal_aggregate.q.out 507f798 

Diff: https://reviews.apache.org/r/19216/diff/


Testing
---


Thanks,

Jitendra Pandey



[jira] [Updated] (HIVE-6664) Vectorized variance computation differs from row mode computation.

2014-03-14 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6664:
---

Status: Patch Available  (was: Open)

 Vectorized variance computation differs from row mode computation.
 --

 Key: HIVE-6664
 URL: https://issues.apache.org/jira/browse/HIVE-6664
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6664.1.patch


 Following query can show the difference:
 select  var_samp(ss_sales_price), var_pop(ss_sales_price), 
 stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales.
 The reason for the difference is that row mode converts the decimal value to 
 double upfront to calculate sum of values, when computing variance. But the 
 vector mode performs local aggregate sum as decimal and converts into double 
 only at flush.





[jira] [Created] (HIVE-6666) Metastore init scripts should always populate the version information at the end

2014-03-14 Thread Prasad Mujumdar (JIRA)
Prasad Mujumdar created HIVE-6666:
-

 Summary: Metastore init scripts should always populate the version 
information at the end
 Key: HIVE-6666
 URL: https://issues.apache.org/jira/browse/HIVE-6666
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.13.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar


The metastore schema create scripts for 0.13 and 0.14 (current trunk) have 
multiple other operations after setting the schema version. This is problematic, 
as any failure in those later operations would leave the metastore in an 
inconsistent state, and yet with valid version information. The schemaTool 
depends on the schema version details.

Recording the schema version should be the last step in the schema 
initialization script.






Re: Review Request 19212: HIVE-6645: to_date()/to_unix_timestamp() fail with NPE if input is null

2014-03-14 Thread Mohammad Islam

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19212/#review37177
---



ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToUnixTimeStamp.java
https://reviews.apache.org/r/19212/#comment68592

Do we need to check whether arguments[1].get() is null, as is done for arguments[0]?
Or will the converter handle it and return 'null'?



ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFToUnixTimestamp.java
https://reviews.apache.org/r/19212/#comment68594

What about test cases where the first arg is not null but the second arg is null?
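
As a rough illustration of the behaviour being asked about (null in either 
argument yields null), here is a standalone sketch. It does not use the Hive 
GenericUDF API and is not the code under review: the class 
NullSafeUnixTimestamp and its method are hypothetical, and parsing in UTC and 
returning null on a parse failure are assumptions made so the example is 
deterministic.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

/** Hypothetical stand-in for a two-argument, null-safe timestamp conversion. */
final class NullSafeUnixTimestamp {
    /** Returns seconds since the epoch, or null if either input is null or unparsable. */
    static Long toUnixTimestamp(String value, String pattern) {
        if (value == null || pattern == null) {
            return null; // check both arguments, not only the first
        }
        SimpleDateFormat format = new SimpleDateFormat(pattern);
        format.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption: fixed time zone
        try {
            return format.parse(value).getTime() / 1000L;
        } catch (ParseException e) {
            return null;
        }
    }
}
```

A test for the case raised above would call it with a non-null first argument 
and a null second argument and expect null back.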


- Mohammad Islam


On March 14, 2014, 2:40 a.m., Jason Dere wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/19212/
 ---
 
 (Updated March 14, 2014, 2:40 a.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-6645
 https://issues.apache.org/jira/browse/HIVE-6645
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 - fix null inputs
 - allow char/varchar params
 - tests
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDate.java 
 c31174a 
   
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToUnixTimeStamp.java
  dc259c6 
   ql/src/test/org/apache/hadoop/hive/ql/udf/TestGenericUDFDate.java 384ce4e 
   
 ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFToUnixTimestamp.java
  PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/19212/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jason Dere
 




Re: Timeline for the Hive 0.13 release?

2014-03-14 Thread Prasanth Jayachandran
Harish

Can you please make the following changes to my earlier request?

HIVE-4177 is not required; instead, the same work is tracked under HIVE-6578.

Can you also consider HIVE-6656? 
HIVE-6656 is a bug fix for the ORC reader when reading timestamp nanoseconds. 
This bug exists in earlier versions as well, so it would be good to have this 
fixed in 0.13.0.

Thanks
Prasanth Jayachandran

On Mar 13, 2014, at 8:52 AM, Thejas Nair the...@hortonworks.com wrote:

 Harish,
 I think we should include the following -
 HIVE-6547 - This is a cleanup of metastore api changes introduced in 0.13.
 This can't be done post release. I will get a patch out in a few hours.
 HIVE-6567 - fixes an NPE in 'show grant .. on all'
 HIVE-6629 - change in syntax for 'set role none'. Marked as a blocker bug.
 
 
 On Tue, Mar 11, 2014 at 8:39 AM, Harish Butani hbut...@hortonworks.com wrote:
 
 yes sure.
 
 
 On Mar 10, 2014, at 3:55 PM, Gopal V gop...@apache.org wrote:
 
 Can I add HIVE-6518 as well to the merge queue on
 
 
 https://cwiki.apache.org/confluence/display/Hive/Hive+0.13+release+status
 
 It is a relatively simple OOM safety patch to vectorized group-by.
 
 Tests pass locally for vec group-by, but the pre-commit tests haven't
 fired even though it's been PA for a while now.
 
 Cheers,
 Gopal
 
 
 --
 CONFIDENTIALITY NOTICE
 NOTICE: This message is intended for the use of the individual or entity to
 which it is addressed and may contain information that is confidential,
 privileged and exempt from disclosure under applicable law. If the reader
 of this message is not the intended recipient, you are hereby notified that
 any printing, copying, dissemination, distribution, disclosure or
 forwarding of this communication is strictly prohibited. If you have
 received this communication in error, please contact the sender immediately
 and delete it from your system. Thank You.
 
 




Re: create table like behaviour

2014-03-14 Thread Thejas Nair
I think it is intuitive that all metadata gets copied over, for 'create
table like'.  I would treat the case of TBLPROPERTIES that should not be
copied over as a very special case, a case in which the user should be able
to remove the property.


On Thu, Mar 13, 2014 at 3:15 PM, Sushanth Sowmyan khorg...@gmail.com wrote:

 Hi All,

 Currently, if I do the following:

 hive -e 'create table a(k1 string, k2 int) TBLPROPERTIES("pi" = "3.14159")'
 hive -e 'create table b like a;'
 hive -e 'describe extended a;'
 hive -e 'describe extended b;'


 We see that the table property is not copied over to the definition of
 b. Does anyone know if this is by design (i.e. by a principle that
 table properties are not table description and so should not be copied
 over) or is it a bug? I also notice that there's HIVE-3527, which
 added the ability to create TBLPROPERTIES on the table being created,
 so I assume it's by design, but I wanted to check if anyone knew/had
 strong feelings about it.

 I can see a good reason for not copying over table properties if
 they're used to store specific table state (say backup state, etc.), but
 I also see a good reason for copying over table properties, with
 things like ORC, which stores table metadata of sorts (like
 orc.compress, or stride size, etc.) in table properties, which makes a
 good case for copying them over if a person wants to create a table
 with similar definitions to the first.

 -Sushanth




Re: Review Request 18185: Support Kerberos HTTP authentication for HiveServer2 running in http mode

2014-03-14 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18185/#review37178
---

Ship it!


Ship It!

- Thejas Nair


On March 13, 2014, 9:57 p.m., Vaibhav Gumashta wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18185/
 ---
 
 (Updated March 13, 2014, 9:57 p.m.)
 
 
 Review request for hive and Thejas Nair.
 
 
 Bugs: HIVE-4764
 https://issues.apache.org/jira/browse/HIVE-4764
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Support Kerberos HTTP authentication for HiveServer2 running in http mode
 
 
 Diffs
 -
 
   
   itests/hive-unit/src/test/java/org/apache/hive/service/cli/thrift/TestThriftHttpCLIService.java 57fda94
   jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java c55aad2
   jdbc/src/java/org/apache/hive/jdbc/HttpBasicAuthInterceptor.java 66eba1b
   jdbc/src/java/org/apache/hive/jdbc/HttpKerberosRequestInterceptor.java PRE-CREATION
   pom.xml 6e8a735
   service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java 6759903
   service/src/java/org/apache/hive/service/auth/HttpAuthUtils.java PRE-CREATION
   service/src/java/org/apache/hive/service/auth/HttpAuthenticationException.java PRE-CREATION
   service/src/java/org/apache/hive/service/auth/HttpCLIServiceUGIProcessor.java PRE-CREATION
   service/src/java/org/apache/hive/service/cli/CLIService.java bdc943e
   service/src/java/org/apache/hive/service/cli/session/SessionManager.java 4545d2b
   service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 0c9ac37
   service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpCLIService.java a6ff6ce
   service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java e77f043
   shims/common-secure/src/main/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java e3f3e38
   shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java 1f24a94
   shims/common/src/main/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java e69373a
 
 Diff: https://reviews.apache.org/r/18185/diff/
 
 
 Testing
 ---
 
 Tested using Beeline in a Kerberos setup.
 
 
 Thanks,
 
 Vaibhav Gumashta
 




[jira] [Commented] (HIVE-4764) Support Kerberos HTTP authentication for HiveServer2 running in http mode

2014-03-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934728#comment-13934728
 ] 

Thejas M Nair commented on HIVE-4764:
-

+1

 Support Kerberos HTTP authentication for HiveServer2 running in http mode
 -

 Key: HIVE-4764
 URL: https://issues.apache.org/jira/browse/HIVE-4764
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Thejas M Nair
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-4764.1.patch, HIVE-4764.2.patch, HIVE-4764.3.patch, 
 HIVE-4764.4.patch, HIVE-4764.5.patch


 Support Kerberos authentication for HiveServer2 running in http mode.





[jira] [Commented] (HIVE-6654) SimpleFetchOptimizer ignores views

2014-03-14 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934735#comment-13934735
 ] 

Navis commented on HIVE-6654:
-

Views are regarded as sub-queries and the current simple fetch optimizer does not 
accept those, which seems harsh. I'll take a look at this later.

 SimpleFetchOptimizer ignores views
 --

 Key: HIVE-6654
 URL: https://issues.apache.org/jira/browse/HIVE-6654
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Gopal V
 Attachments: HIVE-6654-test.tgz


 SimpleFetchOptimizer optimizes "select * from session_orc limit 1;".
 But when an alias view is created, as in "create view session as select * from 
 session_orc;", the simple fetch optimizer fails to kick in and optimize away 
 "select * from session limit 1;" as a fetch-only task.





[jira] [Commented] (HIVE-6312) doAs with plain sasl auth should be session aware

2014-03-14 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934738#comment-13934738
 ] 

Navis commented on HIVE-6312:
-

[~thejas] How about just removing it in here?

 doAs with plain sasl auth should be session aware
 -

 Key: HIVE-6312
 URL: https://issues.apache.org/jira/browse/HIVE-6312
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Navis
Assignee: Navis
 Attachments: HIVE-6312.1.patch.txt, HIVE-6312.2.patch.txt, 
 HIVE-6312.3.patch.txt


 TUGIContainingProcessor creates a new Subject for each invocation, which induces 
 FileSystem leakage when the cache is enabled (true by default).
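
The leak described above can be illustrated with a minimal sketch. This is not Hadoop or Hive code: `Subject`, `FileSystemCache`, and `handle_calls` are hypothetical stand-ins, and the only behaviour assumed is that the cache key compares the subject by identity rather than by value, so a fresh subject per call never hits an existing entry.

```python
class Subject:
    """Hypothetical stand-in for a security subject; compared by identity."""

    def __init__(self, user):
        self.user = user


class FileSystemCache:
    """Toy cache keyed by (scheme, subject); not the real Hadoop cache."""

    def __init__(self):
        self._entries = {}

    def get(self, scheme, subject):
        key = (scheme, subject)  # subject hashes by object identity
        if key not in self._entries:
            self._entries[key] = object()  # placeholder for a file system handle
        return self._entries[key]

    def __len__(self):
        return len(self._entries)


def handle_calls(cache, user, calls, reuse_subject):
    """Simulate `calls` invocations, optionally reusing one session-wide subject."""
    session_subject = Subject(user)
    for _ in range(calls):
        subject = session_subject if reuse_subject else Subject(user)
        cache.get("hdfs", subject)
    return len(cache)


per_call = handle_calls(FileSystemCache(), "alice", 100, reuse_subject=False)
per_session = handle_calls(FileSystemCache(), "alice", 100, reuse_subject=True)
print(per_call, per_session)  # 100 1
```

With one subject per invocation the toy cache grows by one entry per call; with one subject per session it stays at a single entry, which is the session-aware behaviour the issue title asks for.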





[jira] [Updated] (HIVE-6312) doAs with plain sasl auth should be session aware

2014-03-14 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6312:


Attachment: HIVE-6312.4.patch.txt

HIVE-6312.4.patch.txt - removes TUGIContainingProcessor

 doAs with plain sasl auth should be session aware
 -

 Key: HIVE-6312
 URL: https://issues.apache.org/jira/browse/HIVE-6312
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Navis
Assignee: Navis
 Attachments: HIVE-6312.1.patch.txt, HIVE-6312.2.patch.txt, 
 HIVE-6312.3.patch.txt, HIVE-6312.4.patch.txt


 TUGIContainingProcessor creates a new Subject for each invocation, which induces 
 FileSystem leakage when the cache is enabled (true by default).





[jira] [Commented] (HIVE-6312) doAs with plain sasl auth should be session aware

2014-03-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934749#comment-13934749
 ] 

Thejas M Nair commented on HIVE-6312:
-

Navis, I included it in a different jira because I wasn't sure if I would get a 
+1 in time. Somehow I thought that it's late at night there (Seoul?) and you might 
not be around! :)


 doAs with plain sasl auth should be session aware
 -

 Key: HIVE-6312
 URL: https://issues.apache.org/jira/browse/HIVE-6312
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Navis
Assignee: Navis
 Attachments: HIVE-6312.1.patch.txt, HIVE-6312.2.patch.txt, 
 HIVE-6312.3.patch.txt, HIVE-6312.4.patch.txt


 TUGIContainingProcessor creates a new Subject for each invocation, which induces 
 FileSystem leakage when the cache is enabled (true by default).





[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.

2014-03-14 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6639:
---

Status: Open  (was: Patch Available)

 Vectorization: Partition column names are not picked up.
 

 Key: HIVE-6639
 URL: https://issues.apache.org/jira/browse/HIVE-6639
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch


 The vectorized plan generation finds the list of partitioning columns from 
 pruned-partition-list using table scan operator. In some cases the list is 
 coming as null. 





[jira] [Updated] (HIVE-6666) Metastore init scripts should always populate the version information at the end

2014-03-14 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-6666:
--

Attachment: HIVE-6666.1.patch

 Metastore init scripts should always populate the version information at the 
 end
 

 Key: HIVE-6666
 URL: https://issues.apache.org/jira/browse/HIVE-6666
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.13.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Attachments: HIVE-6666.1.patch


 The metastore schema creation scripts for 0.13 and 0.14 (current trunk) perform 
 multiple other operations after setting the schema version. This is 
 problematic, as any failure in those later operations would leave the metastore 
 in an inconsistent state, yet with valid version information. The schemaTool 
 depends on the schema version details.
 Recording the schema version should be the last step in the schema 
 initialization script.
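
The ordering argued for above can be sketched as follows. This is an illustration only, using an in-memory SQLite database rather than a real metastore; the table and column names, the `init_schema` and `recorded_version` helpers, and the deliberately broken statement are all hypothetical.

```python
import sqlite3


def init_schema(conn, statements, version):
    """Run every schema statement, then record the version as the final step."""
    cur = conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS VERSION (SCHEMA_VERSION TEXT)")
    for stmt in statements:
        cur.execute(stmt)  # any failure here leaves VERSION empty
    cur.execute("INSERT INTO VERSION (SCHEMA_VERSION) VALUES (?)", (version,))
    conn.commit()


def recorded_version(conn):
    """Return the recorded schema version, or None if none was written."""
    row = conn.execute("SELECT SCHEMA_VERSION FROM VERSION").fetchone()
    return row[0] if row else None


good = sqlite3.connect(":memory:")
init_schema(good, ["CREATE TABLE TBLS (TBL_ID INTEGER)"], "0.13.0")

bad = sqlite3.connect(":memory:")
try:
    # The second statement is intentionally invalid to simulate a failed step.
    init_schema(bad, ["CREATE TABLE TBLS (TBL_ID INTEGER)", "CREATE TABEL oops"], "0.13.0")
except sqlite3.OperationalError:
    pass  # a later statement failed, so no version was written

print(recorded_version(good), recorded_version(bad))  # 0.13.0 None
```

Because the version row is written last, a partially initialized schema carries no version, so a tool that checks the version can tell that initialization did not complete.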





[jira] [Updated] (HIVE-6666) Metastore init scripts should always populate the version information at the end

2014-03-14 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-6666:
--

Status: Patch Available  (was: Open)

[~ashutoshc] This is the followup patch as discussed in HIVE-6555.

 Metastore init scripts should always populate the version information at the 
 end
 

 Key: HIVE-6666
 URL: https://issues.apache.org/jira/browse/HIVE-6666
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.13.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Attachments: HIVE-6666.1.patch


 The metastore schema creation scripts for 0.13 and 0.14 (current trunk) perform 
 multiple other operations after setting the schema version. This is 
 problematic, as any failure in those later operations would leave the metastore 
 in an inconsistent state, yet with valid version information. The schemaTool 
 depends on the schema version details.
 Recording the schema version should be the last step in the schema 
 initialization script.





[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.

2014-03-14 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6639:
---

Description: The vectorized plan generation finds the list of partitioning 
columns from pruned-partition-list using table scan operator. In some cases the 
list is coming as null. TPCDS query 27 can reproduce this issue if the 
store_sales table is partitioned on ss_store_sk.  (was: The vectorized plan 
generation finds the list of partitioning columns from pruned-partition-list 
using table scan operator. In some cases the list is coming as null. )

 Vectorization: Partition column names are not picked up.
 

 Key: HIVE-6639
 URL: https://issues.apache.org/jira/browse/HIVE-6639
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch


 The vectorized plan generation finds the list of partitioning columns from 
 pruned-partition-list using table scan operator. In some cases the list is 
 coming as null. TPCDS query 27 can reproduce this issue if the store_sales 
 table is partitioned on ss_store_sk.





Re: Review Request 19216: Vectorized variance computation differs from row mode computation.

2014-03-14 Thread Jitendra Pandey

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19216/
---

(Updated March 14, 2014, 8:41 a.m.)


Review request for hive, Eric Hanson and Remus Rusanu.


Bugs: HIVE-6664
https://issues.apache.org/jira/browse/HIVE-6664


Repository: hive-git


Description
---

The following query shows the difference:
select var_samp(ss_sales_price), var_pop(ss_sales_price), 
stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales.

The reason for the difference is that row mode converts the decimal value to 
double upfront to calculate sum of values, when computing variance. But the 
vector mode performs local aggregate sum as decimal and converts into double 
only at flush.
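
A minimal sketch of why the two orders of conversion can disagree. It is not the Hive implementation: the input values are made up, and Python's `Decimal` and `float` merely stand in for Hive's decimal and double types.

```python
from decimal import Decimal

# Hypothetical input: ten decimal prices of 0.1 each.
prices = [Decimal("0.1")] * 10

# Row-mode style: convert each value to double first, then accumulate.
sum_double_first = 0.0
for p in prices:
    sum_double_first += float(p)

# Vector-mode style: accumulate exactly as decimal, convert once at the end.
sum_decimal_first = float(sum(prices, Decimal(0)))

print(sum_double_first, sum_decimal_first)  # 0.9999999999999999 1.0
print(sum_double_first == sum_decimal_first)  # False
```

The sums differ only in the last bits, but variance and standard deviation are computed from these sums, so the two modes can report slightly different results for the same data.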


Diffs
-

  ql/src/gen/vectorization/UDAFTemplates/VectorUDAFVarDecimal.txt c5af930 
  ql/src/test/results/clientpositive/vector_decimal_aggregate.q.out 507f798 

Diff: https://reviews.apache.org/r/19216/diff/


Testing
---


Thanks,

Jitendra Pandey



[jira] [Commented] (HIVE-6312) doAs with plain sasl auth should be session aware

2014-03-14 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934782#comment-13934782
 ] 

Navis commented on HIVE-6312:
-

It's about the time to go home :)
I've checked that TUGIContainingProcessor is not referenced in any other class or 
document. Thanks again for the review.

 doAs with plain sasl auth should be session aware
 -

 Key: HIVE-6312
 URL: https://issues.apache.org/jira/browse/HIVE-6312
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Navis
Assignee: Navis
 Attachments: HIVE-6312.1.patch.txt, HIVE-6312.2.patch.txt, 
 HIVE-6312.3.patch.txt, HIVE-6312.4.patch.txt


 TUGIContainingProcessor creates a new Subject for each invocation, which induces 
 FileSystem leakage when the cache is enabled (true by default).





Re: Why isn't itests/ listed as submodule of root pom.xml?

2014-03-14 Thread Navis류승우
Should we write this on the wiki? :)

2014-03-12 8:46 GMT+09:00 Brock Noland br...@cloudera.com:
 Hopefully this is the last time I have to say this :)

 The qfile tests in itests require the packaging phase. The maven test phase
 is after compile and before packaging. We could change the qfile tests to
 run during the integration-test phase using the failsafe plugin but the
 failsafe plugin is different than surefire and IMO is hard to use.

 If you'd like to give that a try, by all means, go ahead.


 On Tue, Mar 11, 2014 at 6:37 PM, Jason Dere jd...@hortonworks.com wrote:

 Noticed this since internally we set the version number to something
 different than simply 0.13.0, and mvn versions:set doesn't really work
 correctly with itests because itests isn't listed as one of the root POM's
 submodules.  Is there a particular reason for it not being listed as a
 submodule when the mavenization was done?

 Having it as a submodule also allows you to run the qfile tests from root
 directory, so we could simplify the instructions for testing.

 Jason



[jira] [Commented] (HIVE-6664) Vectorized variance computation differs from row mode computation.

2014-03-14 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934793#comment-13934793
 ] 

Jitendra Nath Pandey commented on HIVE-6664:


Review board : https://reviews.apache.org/r/19216/

 Vectorized variance computation differs from row mode computation.
 --

 Key: HIVE-6664
 URL: https://issues.apache.org/jira/browse/HIVE-6664
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6664.1.patch


 The following query shows the difference:
 select  var_samp(ss_sales_price), var_pop(ss_sales_price), 
 stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales.
 The reason for the difference is that row mode converts the decimal value to 
 double upfront to calculate sum of values, when computing variance. But the 
 vector mode performs local aggregate sum as decimal and converts into double 
 only at flush.





[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.

2014-03-14 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6649:
---

Status: Open  (was: Patch Available)

 Vectorization: some date expressions throw exception.
 -

 Key: HIVE-6649
 URL: https://issues.apache.org/jira/browse/HIVE-6649
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6649.1.patch


 Query ran with hive.vectorized.execution.enabled=true:
 {code}
 select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)),
datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)),
datediff(date_add(dt, 2), date_sub(dt, 2))
 from vectortab10korc limit 1;
 {code}
 fails with the following error:
 {noformat}
 Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
   at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
   ... 8 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating datediff(date_add(dt, 2), date_sub(dt, 2))
   at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)
   ... 9 more
 Caused by: java.lang.NullPointerException
   at java.lang.String.checkBounds(String.java:400)
   at java.lang.String.<init>(String.java:569)
   at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254)
   at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231)
   at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190)
   at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72)
   at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115)
   ... 13 more
 {noformat}





Review Request 19218: Vectorization: some date expressions throw exception.

2014-03-14 Thread Jitendra Pandey

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19218/
---

Review request for hive and Eric Hanson.


Bugs: HIVE-6649
https://issues.apache.org/jira/browse/HIVE-6649


Repository: hive-git


Description
---

Query:
select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)),
   datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)),
   datediff(date_add(dt, 2), date_sub(dt, 2))
from vectortab10korc limit 1;

throws NPE.


Diffs
-

  
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/ConstantVectorExpression.java 901005e
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/StringUnaryUDF.java 4875d0d
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateAddColCol.java 09f6e47
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateAddColScalar.java 6578907
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateAddScalarCol.java d1156b6
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateDiffColCol.java 15e995c
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateDiffColScalar.java 05b71ac
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateDiffScalarCol.java 7c76901
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateString.java dd84de3
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFTimestampFieldString.java 011a790
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/VectorUDAFAvgDecimal.java 8418587
  ql/src/test/queries/clientpositive/vectorized_date_funcs.q 6c9515c 
  ql/src/test/results/clientpositive/vectorized_date_funcs.q.out a9d7dde 

Diff: https://reviews.apache.org/r/19218/diff/


Testing
---


Thanks,

Jitendra Pandey



[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.

2014-03-14 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6649:
---

Attachment: HIVE-6649.2.patch

Review board: https://reviews.apache.org/r/19218/

 Vectorization: some date expressions throw exception.
 -

 Key: HIVE-6649
 URL: https://issues.apache.org/jira/browse/HIVE-6649
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch


 Query ran with hive.vectorized.execution.enabled=true:
 {code}
 select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)),
datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)),
datediff(date_add(dt, 2), date_sub(dt, 2))
 from vectortab10korc limit 1;
 {code}





[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.

2014-03-14 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6649:
---

Status: Patch Available  (was: Open)

 Vectorization: some date expressions throw exception.
 -

 Key: HIVE-6649
 URL: https://issues.apache.org/jira/browse/HIVE-6649
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch


 Query ran with hive.vectorized.execution.enabled=true:
 {code}
 select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)),
datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)),
datediff(date_add(dt, 2), date_sub(dt, 2))
 from vectortab10korc limit 1;
 {code}





[jira] [Commented] (HIVE-6651) broken link in WebHCat doc: Job Information — GET queue/:jobid

2014-03-14 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934799#comment-13934799
 ] 

Lefty Leverenz commented on HIVE-6651:
--

The link is broken in four docs:  GET queue/:jobid, DELETE queue/:jobid, GET 
jobs/:jobid, and DELETE jobs/:jobid.

The link tries to go to the stable version of Hadoop API docs -- 
http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/mapred/JobProfile.html
 -- but there are two stable sets of docs, stable1 for Hadoop 1 and stable2 for 
Hadoop 2, with stable currently set to stable2.  Class JobProfile is not in 
stable2, but it is in stable1:  
[http://hadoop.apache.org/docs/stable1/api/org/apache/hadoop/mapred/JobProfile.html].

It's easy enough to change all the links to stable1 (and for consistency also 
change the JobStatus links) but does the Hadoop 2 vs. Hadoop 1 distinction 
matter in the WebHCat documentation?
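
The link change itself could be sketched as below. This is only an illustration of the rewrite, not a tool used for the wiki; the helper name `pin_to_stable1` is hypothetical, and the pattern assumes the links all follow the `hadoop.apache.org/docs/stable/` form quoted above.

```python
import re


def pin_to_stable1(url):
    """Rewrite a Hadoop docs URL from the floating 'stable' alias to 'stable1'."""
    # The lookahead keeps existing stable1/ and stable2/ links untouched.
    return re.sub(r"(hadoop\.apache\.org/docs/)stable(?=/)", r"\1stable1", url)


broken = "http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/mapred/JobProfile.html"
print(pin_to_stable1(broken))
```

Links that already point at stable1 or stable2 pass through unchanged, so the rewrite can be applied to every link on a page without checking each one first.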

Hadoop doc sets are here:  [http://hadoop.apache.org/docs/].  Stable, stable1, 
and stable2 are listed at the bottom.

* stable (Hadoop 2.2.0) API docs:  [http://hadoop.apache.org/docs/stable/api/index.html]
* stable1 (Hadoop 1.2.1) API docs:  [http://hadoop.apache.org/docs/stable1/api/index.html]
* stable2 (Hadoop 2.2.0) API docs:  [http://hadoop.apache.org/docs/stable2/api/index.html]

WebHCat docs with broken links:

* [Get Queue JobID|https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference+JobInfo#WebHCatReferenceJobInfo-Results]
* [Delete Queue JobID|https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference+DeleteJob#WebHCatReferenceDeleteJob-Results]
* [Get Jobs JobID|https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference+Job#WebHCatReferenceJob-Results]
* [Delete Jobs JobID|https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference+DeleteJobID#WebHCatReferenceDeleteJobID-Results]


 broken link in WebHCat doc: Job Information — GET queue/:jobid
 --

 Key: HIVE-6651
 URL: https://issues.apache.org/jira/browse/HIVE-6651
 Project: Hive
  Issue Type: Bug
  Components: Documentation, WebHCat
Reporter: Eugene Koifman
Priority: Minor

 https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference+JobInfo#WebHCatReferenceJobInfo-Results
 the link in the table to Class JobProfile is broken





[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.

2014-03-14 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6639:
---

Status: Patch Available  (was: Open)

 Vectorization: Partition column names are not picked up.
 

 Key: HIVE-6639
 URL: https://issues.apache.org/jira/browse/HIVE-6639
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, 
 HIVE-6639.5.patch


 The vectorized plan generation finds the list of partitioning columns from 
 pruned-partition-list using table scan operator. In some cases the list is 
 coming as null. TPCDS query 27 can reproduce this issue if the store_sales 
 table is partitioned on ss_store_sk.





[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.

2014-03-14 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6639:
---

Attachment: HIVE-6639.5.patch

 Vectorization: Partition column names are not picked up.
 

 Key: HIVE-6639
 URL: https://issues.apache.org/jira/browse/HIVE-6639
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, 
 HIVE-6639.5.patch


 The vectorized plan generation finds the list of partitioning columns from 
 pruned-partition-list using table scan operator. In some cases the list is 
 coming as null. TPCDS query 27 can reproduce this issue if the store_sales 
 table is partitioned on ss_store_sk.





Review Request 19219: Vectorization: Partition column names are not picked up.

2014-03-14 Thread Jitendra Pandey

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19219/
---

Review request for hive.


Bugs: HIVE-6639
https://issues.apache.org/jira/browse/HIVE-6639


Repository: hive-git


Description
---

Vectorization: Partition column names are not picked up.


Diffs
-

  common/src/test/org/apache/hadoop/hive/common/type/TestDecimal128.java 409a13a
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToTimestamp.java 32386fe
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java c26da37

Diff: https://reviews.apache.org/r/19219/diff/


Testing
---


Thanks,

Jitendra Pandey



[jira] [Commented] (HIVE-6639) Vectorization: Partition column names are not picked up.

2014-03-14 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934805#comment-13934805
 ] 

Jitendra Nath Pandey commented on HIVE-6639:


Review board: https://reviews.apache.org/r/19219/

 Vectorization: Partition column names are not picked up.
 

 Key: HIVE-6639
 URL: https://issues.apache.org/jira/browse/HIVE-6639
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, 
 HIVE-6639.5.patch


 The vectorized plan generation finds the list of partitioning columns from 
 pruned-partition-list using table scan operator. In some cases the list is 
 coming as null. TPCDS query 27 can reproduce this issue if the store_sales 
 table is partitioned on ss_store_sk.





[jira] [Commented] (HIVE-6578) Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command

2014-03-14 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934807#comment-13934807
 ] 

Lefty Leverenz commented on HIVE-6578:
--

Editorial nit:  "... for file formats that implements ..." should be 
"implement" in the comment (HiveConf.java) and the description 
(hive-default.xml.template).

 Use ORC file footer statistics through StatsProvidingRecordReader interface 
 for analyze command
 ---

 Key: HIVE-6578
 URL: https://issues.apache.org/jira/browse/HIVE-6578
 Project: Hive
  Issue Type: New Feature
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
  Labels: orcfile
 Attachments: HIVE-6578.1.patch, HIVE-6578.2.patch, HIVE-6578.3.patch


 ORC provides file level statistics which can be used in analyze partialscan 
 and noscan cases to compute basic statistics like number of rows, number of 
 files, total file size and raw data size. On the writer side, a new interface 
 was added earlier (StatsProvidingRecordWriter) that exposed stats when 
 writing a table. Similarly, a new interface StatsProvidingRecordReader can be 
 added which when implemented should provide stats that are gathered by the 
 underlying file format.
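
A rough sketch of the reader-side idea described above, for illustration only. Only the name StatsProvidingRecordReader comes from the issue text; `BasicStats`, `FooterStatsReader`, `StatsAggregator`, and the `getStats()` method are assumptions made for this example and are not taken from the attached patches.

```java
import java.util.List;

/** Hypothetical value object holding basic statistics for one file. */
final class BasicStats {
    final long rowCount;
    final long rawDataSize;

    BasicStats(long rowCount, long rawDataSize) {
        this.rowCount = rowCount;
        this.rawDataSize = rawDataSize;
    }
}

/** Sketch of the reader-side interface; the method name is an assumption. */
interface StatsProvidingRecordReader {
    BasicStats getStats();
}

/** Hypothetical reader that answers from footer metadata without scanning rows. */
final class FooterStatsReader implements StatsProvidingRecordReader {
    private final BasicStats footer;

    FooterStatsReader(long rowCount, long rawDataSize) {
        this.footer = new BasicStats(rowCount, rawDataSize);
    }

    @Override
    public BasicStats getStats() {
        return footer;
    }
}

/** Sums per-file stats, as a noscan/partialscan analyze might do. */
final class StatsAggregator {
    static BasicStats aggregate(List<? extends StatsProvidingRecordReader> readers) {
        long rows = 0;
        long raw = 0;
        for (StatsProvidingRecordReader reader : readers) {
            BasicStats stats = reader.getStats();
            rows += stats.rowCount;
            raw += stats.rawDataSize;
        }
        return new BasicStats(rows, raw);
    }
}
```

The point of the design is that the caller only depends on the interface, so any file format that tracks its own statistics could plug in without the analyze path reading row data.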





[jira] [Updated] (HIVE-6576) sending user.name as a form parameter in POST doesn't work post HADOOP-10193

2014-03-14 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6576:


   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Patch committed to trunk and 0.13 branch (included in 0.13 wiki ).
Thanks for the contribution Eugene!


 sending user.name as a form parameter in POST doesn't work post HADOOP-10193
 

 Key: HIVE-6576
 URL: https://issues.apache.org/jira/browse/HIVE-6576
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Fix For: 0.13.0

 Attachments: HIVE-6576.patch


 WebHCat uses AuthFilter to handle authentication.  In simple mode that means 
 using PseudoAuthenticationHandler.  Prior to HADOOP-10193, the latter handled 
 user.name as form parameter in a POST request.  Now it only handles it as a 
 query parameter.  
 To maintain WebHCat backwards compatibility, we need to make WebHCat still extract 
 it from the form param.  This will be deprecated immediately and removed in 0.15.
 Also, all examples in the WebHCat reference manual should be updated to use 
 user.name in the query string instead of the current form param (curl -d user.name=foo).
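
A minimal sketch of the fallback described above: prefer user.name from the query string and fall back to the deprecated form parameter. The class `UserNameResolver` and its methods are hypothetical and are not WebHCat or Hadoop APIs; parameter parsing is simplified (no URL decoding) to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper; not part of WebHCat or Hadoop. */
final class UserNameResolver {
    static final String USER_NAME = "user.name";

    /** Parses "a=b&c=d" into a map. No URL decoding, for brevity. */
    static Map<String, String> parseParams(String encoded) {
        Map<String, String> params = new HashMap<>();
        if (encoded == null || encoded.isEmpty()) {
            return params;
        }
        for (String pair : encoded.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                params.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return params;
    }

    /** Prefers the query string; falls back to the (deprecated) form body. */
    static String resolve(String queryString, String formBody) {
        String fromQuery = parseParams(queryString).get(USER_NAME);
        if (fromQuery != null) {
            return fromQuery;
        }
        return parseParams(formBody).get(USER_NAME);
    }
}
```

With this shape, a request such as `POST ...?user.name=foo` and a legacy request sent with `curl -d user.name=foo` would both resolve to the same user, and the form-body branch could later be deleted in one place.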





[jira] [Commented] (HIVE-6649) Vectorization: some date expressions throw exception.

2014-03-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934824#comment-13934824
 ] 

Hive QA commented on HIVE-6649:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12634396/HIVE-6649.1.patch

{color:green}SUCCESS:{color} +1 5394 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1770/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1770/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12634396

 Vectorization: some date expressions throw exception.
 -

 Key: HIVE-6649
 URL: https://issues.apache.org/jira/browse/HIVE-6649
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch


 Query ran with hive.vectorized.execution.enabled=true:
 {code}
 select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)),
datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)),
datediff(date_add(dt, 2), date_sub(dt, 2))
 from vectortab10korc limit 1;
 {code}
 fails with the following error:
 {noformat}
 Error: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing row
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
   ... 8 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
 datediff(date_add(dt, 2), date_sub(dt, 2))
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)
   ... 9 more
 Caused by: java.lang.NullPointerException
   at java.lang.String.checkBounds(String.java:400)
   at java.lang.String.<init>(String.java:569)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115)
   ... 13 more
 {noformat}





Re: Review Request 19206: HIVE-6657: Add test coverage for Kerberos authentication implementation using Hadoop's miniKdc

2014-03-14 Thread Brock Noland

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19206/#review37195
---


This looks awesome! I had no idea we could enable a submodule for a specific 
profile!

Just curious, what is the felix plugin doing for us?


itests/hive-unit/pom.xml
https://reviews.apache.org/r/19206/#comment68624

This might be my pedantic side... but since we are moving test scope 
from deps, can we move those under a <!-- intra-project --> comment? I have 
tried to separate out test and non-test deps:

https://github.com/apache/hive/blob/trunk/ql/pom.xml#L36



itests/hive-unit/src/main/java/org/apache/hive/jdbc/miniHS2/MiniHS2.java
https://reviews.apache.org/r/19206/#comment68625

Should this commented-out line and the ones in waitForStartup be removed?

If not, please add a comment saying when we would uncomment.



pom.xml
https://reviews.apache.org/r/19206/#comment68626

Can you create a property and add this version to the properties?


- Brock Noland


On March 14, 2014, 7:06 a.m., Prasad Mujumdar wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/19206/
 ---
 
 (Updated March 14, 2014, 7:06 a.m.)
 
 
 Review request for hive, Brock Noland and Thejas Nair.
 
 
 Bugs: HIVE-6657
 https://issues.apache.org/jira/browse/HIVE-6657
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 - Build changes 
   + pull required dependencies
   + new sub-module under itest for the hive miniKdc framework, enabled for 
 hadoop-2 only since hadoop-1 doesn't include miniKdc yet.
 - A wrapper class MiniHiveKdc on top of Hadoop miniKdc that manages the KDC 
 service and creates test principals, keytabs, etc.
 - Tests
   + simple unit test for MiniHiveKdc
   + Integrate MiniHiveKdc with MiniHS2
   + New HiveServer2 test that covers basic connection, as well as the token 
 and proxy auth support added by HIVE-5155
 
 Note that Hadoop does not provide a secure mini-cluster yet. Hence the new tests 
 do not execute any queries. At this point they are limited to the authentication 
 part only.
 
 
 Diffs
 -
 
   itests/hive-minikdc/pom.xml PRE-CREATION 
   itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/MiniHiveKdc.java 
 PRE-CREATION 
   
 itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/TestJdbcWithMiniKdc.java
  PRE-CREATION 
   
 itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/TestMiniHiveKdc.java
  PRE-CREATION 
   itests/hive-minikdc/src/test/resources/core-site.xml PRE-CREATION 
   itests/hive-unit/pom.xml 6aff277 
   
 itests/hive-unit/src/main/java/org/apache/hive/jdbc/miniHS2/AbstractHiveService.java
  PRE-CREATION 
   itests/hive-unit/src/main/java/org/apache/hive/jdbc/miniHS2/MiniHS2.java 
 PRE-CREATION 
   
 itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/AbstractHiveService.java
  095b989 
   itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/MiniHS2.java 
 d76750c 
   
 itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/TestHiveServer2.java
  84f508c 
   itests/pom.xml 589ba25 
   pom.xml 6503c94 
   service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java 6759903 
   
 service/src/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java
  6fbc847 
 
 Diff: https://reviews.apache.org/r/19206/diff/
 
 
 Testing
 ---
 
 New test cases using the miniKdc.
 
 
 Thanks,
 
 Prasad Mujumdar
 




[jira] [Commented] (HIVE-6647) Bump the thrift api version to V7 for HiveServer2

2014-03-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935020#comment-13935020
 ] 

Hive QA commented on HIVE-6647:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12634391/HIVE-6647.1.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5394 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hive.service.cli.thrift.TestThriftHttpCLIService.testExecuteStatementAsync
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1771/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1771/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12634391

 Bump the thrift api version to V7 for HiveServer2
 -

 Key: HIVE-6647
 URL: https://issues.apache.org/jira/browse/HIVE-6647
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Prasad Mujumdar
Priority: Blocker
 Fix For: 0.13.0

 Attachments: HIVE-6647.1.patch


 HIVE-5155 added new api for delegation token support. Per the convention 
 followed till now, we should update the version to 7. 
 Marking it as blocker for 13. cc [~prasadm] [~thejas]





[jira] [Commented] (HIVE-5837) SQL standard based secure authorization for hive

2014-03-14 Thread Alex Nastetsky (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935053#comment-13935053
 ] 

Alex Nastetsky commented on HIVE-5837:
--

Thanks Thejas. Should I create a ticket for show tables or does one already 
exist?

 SQL standard based secure authorization for hive
 

 Key: HIVE-5837
 URL: https://issues.apache.org/jira/browse/HIVE-5837
 Project: Hive
  Issue Type: New Feature
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: SQL standard authorization hive.pdf


 The current default authorization is incomplete and not secure. The 
 alternative of storage based authorization provides security but does not 
 provide fine grained authorization.
 The proposal is to support secure fine grained authorization in hive using 
 SQL standard based authorization model.





Re: Why isn't itests/ listed as submodule of root pom.xml?

2014-03-14 Thread Brock Noland
Good idea:

https://cwiki.apache.org/confluence/display/Hive/HiveDeveloperFAQ#HiveDeveloperFAQ-Whyisn'ttheitestspomconnectedtotherootpom
?


On Fri, Mar 14, 2014 at 3:58 AM, Navis류승우 navis@nexr.com wrote:

 Should we write this on wiki? :)

 2014-03-12 8:46 GMT+09:00 Brock Noland br...@cloudera.com:
  Hopefully this is the last time I have to say this :)
 
  The qfile tests in itests require the packaging phase. The maven test
 phase
  is after compile and before packaging. We could change the qfile tests to
  run during the integration-test phase using the failsafe plugin but the
  failsafe plugin is different than surefire and IMO is hard to use.
 
  If you'd like to give that a try, by all means, go ahead.
 
 
  On Tue, Mar 11, 2014 at 6:37 PM, Jason Dere jd...@hortonworks.com
 wrote:
 
  Noticed this since internally we set the version number to something
  different than simply 0.13.0, and mvn version:set doesn't really work
  correctly with itests because itests isn't listed as one of the root
 POM's
  submodules.  Is there a particular reason for it not being listed as a
  submodule when the mavenization was done?
 
  Having it as a submodule also allows you to run the qfile tests from
 root
  directory, so we could simplify the instructions for testing.
 
  Jason
  --
  CONFIDENTIALITY NOTICE
  NOTICE: This message is intended for the use of the individual or
 entity to
  which it is addressed and may contain information that is confidential,
  privileged and exempt from disclosure under applicable law. If the
 reader
  of this message is not the intended recipient, you are hereby notified
 that
  any printing, copying, dissemination, distribution, disclosure or
  forwarding of this communication is strictly prohibited. If you have
  received this communication in error, please contact the sender
 immediately
  and delete it from your system. Thank You.
 



[jira] [Commented] (HIVE-6635) Heartbeats are not being sent when DbLockMgr is used and an operation holds locks

2014-03-14 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935119#comment-13935119
 ] 

Ashutosh Chauhan commented on HIVE-6635:


+1

 Heartbeats are not being sent when DbLockMgr is used and an operation holds 
 locks
 -

 Key: HIVE-6635
 URL: https://issues.apache.org/jira/browse/HIVE-6635
 Project: Hive
  Issue Type: Bug
  Components: Locking
Affects Versions: 0.13.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0

 Attachments: HIVE-6635.2.patch, HIVE-6635.patch


 The new DbLockManager depends on heartbeats from the client in order to 
 determine that a lock has not timed out.  The client is not currently sending 
 those heartbeats.
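
To illustrate the kind of client-side heartbeat the description calls for, here is a small sketch using a scheduled executor. `HeartbeatTarget` and `LockHeartbeater` are hypothetical names invented for this example; they are not the classes in the attached patches, and the real lock manager call is represented only by the `heartbeat()` placeholder.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the call that tells the lock manager "still alive". */
interface HeartbeatTarget {
    void heartbeat();
}

/** Sends heartbeats at a fixed rate while an operation holds locks. */
final class LockHeartbeater implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();
    private final ScheduledFuture<?> task;

    LockHeartbeater(final HeartbeatTarget target, long intervalMillis) {
        Runnable beat = () -> {
            try {
                target.heartbeat();
            } catch (RuntimeException e) {
                // Swallow so the schedule keeps running; an uncaught exception
                // would suppress all later executions of a fixed-rate task.
            }
        };
        task = scheduler.scheduleAtFixedRate(beat, 0, intervalMillis, TimeUnit.MILLISECONDS);
    }

    /** Stops heartbeating, e.g. once the locks are released. */
    @Override
    public void close() {
        task.cancel(false);
        scheduler.shutdownNow();
    }
}
```

In use, the heartbeater would be opened when locks are acquired and closed when they are released, with an interval comfortably shorter than the lock timeout.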





[jira] [Commented] (HIVE-6547) normalize struct Role in metastore thrift interface

2014-03-14 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935120#comment-13935120
 ] 

Ashutosh Chauhan commented on HIVE-6547:


Looks good. Do you want to qualify fields in new structs with required ?

 normalize struct Role in metastore thrift interface
 ---

 Key: HIVE-6547
 URL: https://issues.apache.org/jira/browse/HIVE-6547
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Thrift API
Affects Versions: 0.13.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.13.0

 Attachments: HIVE-6547.thriftapi.patch


 As discussed in HIVE-5931, it will be cleaner to have the information about 
 Role to role member mapping removed from the Role object, as it is not part 
 of a logical Role. This information is not relevant for actions such as creating 
 a Role.
 As part of this change  get_role_grants_for_principal api will be added, so 
 that it can be used in place of  list_roles, when role mapping information is 
 desired.





[jira] [Commented] (HIVE-5837) SQL standard based secure authorization for hive

2014-03-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935121#comment-13935121
 ] 

Thejas M Nair commented on HIVE-5837:
-

[~terrasect] Please create one for 'show tables'.


 SQL standard based secure authorization for hive
 

 Key: HIVE-5837
 URL: https://issues.apache.org/jira/browse/HIVE-5837
 Project: Hive
  Issue Type: New Feature
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: SQL standard authorization hive.pdf


 The current default authorization is incomplete and not secure. The 
 alternative of storage based authorization provides security but does not 
 provide fine grained authorization.
 The proposal is to support secure fine grained authorization in hive using 
 SQL standard based authorization model.





[jira] [Commented] (HIVE-6666) Metastore init scripts should always populate the version information at the end

2014-03-14 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935124#comment-13935124
 ] 

Ashutosh Chauhan commented on HIVE-:


+1

 Metastore init scripts should always populate the version information at the 
 end
 

 Key: HIVE-
 URL: https://issues.apache.org/jira/browse/HIVE-
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.13.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Attachments: HIVE-.1.patch


 The metastore schema create scripts for 0.13 and 0.14 (current trunk) have 
 multiple other operations after setting the schema version. This is 
 problematic, as any failure in those later operations would leave the metastore 
 in an inconsistent state, and yet with valid version information. The schemaTool 
 depends on the schema version details.
 Recording the schema version should be the last step in the schema initialization 
 script.
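
The ordering argument above can be sketched in a few lines. This is only an illustration of "record the version last": the real init scripts are SQL, and `SchemaStep` and `SchemaInitializer` are hypothetical names made up for this example.

```java
import java.util.List;

/** Hypothetical unit of schema setup work (e.g. one CREATE TABLE). */
interface SchemaStep {
    void run() throws Exception;
}

/** Runs all setup steps and records the schema version only at the very end. */
final class SchemaInitializer {
    private String recordedVersion; // stays null until initialization fully succeeds

    void initialize(List<SchemaStep> steps, String version) throws Exception {
        for (SchemaStep step : steps) {
            step.run(); // any failure aborts before the version is written
        }
        recordedVersion = version; // last step: only reached if everything succeeded
    }

    String recordedVersion() {
        return recordedVersion;
    }
}
```

If any step fails, no version is recorded, so a tool that checks the version would see an uninitialized schema rather than a half-built one that claims to be valid.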



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6667) Need support for show tables authorization

2014-03-14 Thread Alex Nastetsky (JIRA)
Alex Nastetsky created HIVE-6667:


 Summary: Need support for show tables authorization
 Key: HIVE-6667
 URL: https://issues.apache.org/jira/browse/HIVE-6667
 Project: Hive
  Issue Type: Improvement
  Components: Authorization
Reporter: Alex Nastetsky


Need the ability to restrict access to show tables on a per database basis or 
globally.





[jira] [Commented] (HIVE-6659) Update log for list_bucket_* to add pre/post DB

2014-03-14 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935129#comment-13935129
 ] 

Ashutosh Chauhan commented on HIVE-6659:


+1

 Update log for list_bucket_* to add pre/post DB
 ---

 Key: HIVE-6659
 URL: https://issues.apache.org/jira/browse/HIVE-6659
 Project: Hive
  Issue Type: Bug
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Attachments: HIVE-6659.patch


 On Hadoop2 we now print out Database Name using pre/post hooks.





[jira] [Commented] (HIVE-5837) SQL standard based secure authorization for hive

2014-03-14 Thread Alex Nastetsky (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935125#comment-13935125
 ] 

Alex Nastetsky commented on HIVE-5837:
--

Done: https://issues.apache.org/jira/browse/HIVE-6667. Thanks.

 SQL standard based secure authorization for hive
 

 Key: HIVE-5837
 URL: https://issues.apache.org/jira/browse/HIVE-5837
 Project: Hive
  Issue Type: New Feature
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: SQL standard authorization hive.pdf


 The current default authorization is incomplete and not secure. The 
 alternative of storage based authorization provides security but does not 
 provide fine grained authorization.
 The proposal is to support secure fine grained authorization in hive using 
 SQL standard based authorization model.





Re: Review Request 18179: Support more generic way of using composite key for HBaseHandler

2014-03-14 Thread Xuefu Zhang


 On March 10, 2014, 9:25 p.m., Xuefu Zhang wrote:
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java, line 
  730
  https://reviews.apache.org/r/18179/diff/5/?file=513392#file513392line730
 
  Can we define a serialize() interface in HBaseKeyFactory and move the 
  existing implementation here to HBaseCompositeKeyFactory? serialize() 
  seems generic enough to expect from all key factories. Doing this will 
  eliminate HBaseWritableKeyFactory and the use of the class to detect which 
  method to call.
 
 Navis Ryu wrote:
 If the default serialization could be done by a simple, decent method call, I 
 would have done it that way. But the current implementation needs seven arguments 
 for that (+serdeParams), which made me think twice about it. 
 
 byte[] serialize(
 int i,
 List<ColumnMapping> mapping,
 List<? extends StructField> fields,
 List<Object> list,
 List<? extends StructField> declaredFields,
 boolean useJSONSerialize,
 ByteStream.Output serializeStream) throws IOException;
 
 Xuefu Zhang wrote:
 Yes, I agree that too many params for a method is ugly. In this case, 
 however, it doesn't seem too bad:
 
 1. i and the 4 lists can be reduced to 4 fields, as i is just the 
 index in the lists, which are derived from object inspector and serdeparams. 
 To further reduce the arg number, a struct can be defined to wrap the 4 
 items: keyMapping, keyField, keyObject, and keyDeclaredField. 
 (keyDeclaredField may not be needed as we are talking about row key here.)
 
 2. useJsonSerialize seems always false, so it can be removed.
 
 I understand that some refactoring is needed. However, I think it's worth 
 the effort for readability and maintenance.
 
 Navis Ryu wrote:
 I've tried, but we need
 
 Object f, ObjectInspector foi, StructField declared, ColumnMapping 
 colMap, SerDeParameters serdeParams, ByteStream.Output serializeStream
 
 If we hand over HBaseSerde, 
 
 Object f, ObjectInspector foi, StructField declared, ColumnMapping 
 colMap, HBaseSerde serde
 
 Should we really do this? It aches my belly.

If we're not ready for this, I think we should remove the HBaseWritableKeyFactory 
interface. We can add the serialize() interface in the future when we have 
a better idea.
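
For reference, the parameter-object idea raised earlier in this thread (wrapping the per-key items in one struct so serialize() takes fewer arguments) might look roughly like the sketch below. Everything here is hypothetical: `KeyContext`, `KeySerializer`, and `ToStringKeySerializer` are illustrative names, plain `Object` stands in for the Hive types ColumnMapping and StructField so the example is self-contained, and `ByteArrayOutputStream` stands in for ByteStream.Output.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical bundle of the per-key items named in the review comment. */
final class KeyContext {
    final Object keyMapping; // stands in for ColumnMapping
    final Object keyField;   // stands in for StructField
    final Object keyObject;  // the row-key value itself

    KeyContext(Object keyMapping, Object keyField, Object keyObject) {
        this.keyMapping = keyMapping;
        this.keyField = keyField;
        this.keyObject = keyObject;
    }
}

/** Sketch of a two-argument serialize() a key factory could expose. */
interface KeySerializer {
    byte[] serialize(KeyContext key, ByteArrayOutputStream out) throws IOException;
}

/** Trivial implementation: writes the key's string form as UTF-8 bytes. */
final class ToStringKeySerializer implements KeySerializer {
    @Override
    public byte[] serialize(KeyContext key, ByteArrayOutputStream out) throws IOException {
        out.reset(); // reuse the caller's buffer across rows
        out.write(String.valueOf(key.keyObject).getBytes(StandardCharsets.UTF_8));
        return out.toByteArray();
    }
}
```

Whether this is worth the refactoring is exactly the open question in the thread; the sketch only shows that the argument count can drop to two once the key-related values travel together.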


- Xuefu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18179/#review36688
---


On March 7, 2014, 7:46 a.m., Navis Ryu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18179/
 ---
 
 (Updated March 7, 2014, 7:46 a.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-6411
 https://issues.apache.org/jira/browse/HIVE-6411
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 HIVE-2599 introduced the use of a custom object for the row key. But it forces key 
 objects to extend HBaseCompositeKey, which is in turn an extension of LazyStruct. 
 If the user provides a proper Object and OI, we can replace the internal key and keyOI 
 with those. 
 
 Initial implementation is based on factory interface.
 {code}
 public interface HBaseKeyFactory {
   void init(SerDeParameters parameters, Properties properties) throws SerDeException;
   ObjectInspector createObjectInspector(TypeInfo type) throws SerDeException;
   LazyObjectBase createObject(ObjectInspector inspector) throws SerDeException;
 }
 {code}
 
 
 Diffs
 -
 
   hbase-handler/pom.xml 132af43 
   hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseCompositeKey.java 
 5008f15 
   
 hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseCompositeKeyFactory.java
  PRE-CREATION 
   hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseKeyFactory.java 
 PRE-CREATION 
   
 hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseLazyObjectFactory.java
  PRE-CREATION 
   hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseScanRange.java 
 PRE-CREATION 
   hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java 2cd65cb 
   
 hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java 
 29e5da5 
   
 hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseWritableKeyFactory.java
  PRE-CREATION 
   
 hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java
  704fcb9 
   hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseRow.java 
 fc40195 
   
 hbase-handler/src/test/org/apache/hadoop/hive/hbase/HBaseTestCompositeKey.java
  13c344b 
   
 hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestHBaseKeyFactory.java 
 PRE-CREATION 
   
 hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestHBaseKeyFactory2.java 
 PRE-CREATION 
   

[jira] [Comment Edited] (HIVE-6547) normalize struct Role in metastore thrift interface

2014-03-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935140#comment-13935140
 ] 

Thejas M Nair edited comment on HIVE-6547 at 3/14/14 3:23 PM:
--

Thanks for the feedback. Will make them required.
While I am at it, I will also make the fields in 
GetPrincipalsInRoleResponse and GetPrincipalsInRoleRequest required. With the 
semantics of the functions, it does not make any sense for them to be 
non-required.
Those are similar functions that were added recently.


was (Author: thejas):
Thanks for the feedback. Will make them required.
While I am at it, I will also make the fields in 
GetPrincipalsInRoleResponse and GetPrincipalsInRoleRequest required. With the 
semantics of the functions, it does not make any sense for them to be 
non-required.


 normalize struct Role in metastore thrift interface
 ---

 Key: HIVE-6547
 URL: https://issues.apache.org/jira/browse/HIVE-6547
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Thrift API
Affects Versions: 0.13.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.13.0

 Attachments: HIVE-6547.thriftapi.patch


 As discussed in HIVE-5931, it will be cleaner to have the information about 
 Role to role member mapping removed from the Role object, as it is not part 
 of a logical Role. This information is not relevant for actions such as creating 
 a Role.
 As part of this change  get_role_grants_for_principal api will be added, so 
 that it can be used in place of  list_roles, when role mapping information is 
 desired.





[jira] [Commented] (HIVE-6547) normalize struct Role in metastore thrift interface

2014-03-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935140#comment-13935140
 ] 

Thejas M Nair commented on HIVE-6547:
-

Thanks for the feedback. Will make them required.
While I am at it, I will also make the fields in 
GetPrincipalsInRoleResponse and GetPrincipalsInRoleRequest required. With the 
semantics of the functions, it does not make any sense for them to be 
non-required.


 normalize struct Role in metastore thrift interface
 ---

 Key: HIVE-6547
 URL: https://issues.apache.org/jira/browse/HIVE-6547
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Thrift API
Affects Versions: 0.13.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.13.0

 Attachments: HIVE-6547.thriftapi.patch


 As discussed in HIVE-5931, it will be cleaner to have the information about 
 Role to role member mapping removed from the Role object, as it is not part 
 of a logical Role. This information is not relevant for actions such as creating 
 a Role.
 As part of this change  get_role_grants_for_principal api will be added, so 
 that it can be used in place of  list_roles, when role mapping information is 
 desired.





Re: Timeline for the Hive 0.13 release?

2014-03-14 Thread Thejas Nair
Can you also add HIVE-6647 https://issues.apache.org/jira/browse/HIVE-6647 to
the list? It is marked as a blocker for 0.13.
It has a necessary version number upgrade for HS2. It is ready to be
committed.


On Fri, Mar 14, 2014 at 12:38 AM, Prasanth Jayachandran 
pjayachand...@hortonworks.com wrote:

 Harish

 Can you please make the following changes to my earlier request?

 HIVE-4177 is not required.. instead the same work is tracked under
 HIVE-6578.

 Can you also consider HIVE-6656?
 HIVE-6656 is a bug fix for the ORC reader when reading timestamp nanoseconds.
 This bug exists in earlier versions as well, so it would be good to have this
 fixed in 0.13.0.

 Thanks
 Prasanth Jayachandran

 On Mar 13, 2014, at 8:52 AM, Thejas Nair the...@hortonworks.com wrote:

  Harish,
  I think we should include the following -
  HIVE-6547 - This is a cleanup of metastore api changes introduced in
 0.13 .
  This can't be done post release. I will get a patch out in few hours.
  HIVE-6567 - fixes an NPE in 'show grant .. on all'
  HIVE-6629 - change in syntax for 'set role none' . marked as a blocker
 bug.
 
 
  On Tue, Mar 11, 2014 at 8:39 AM, Harish Butani hbut...@hortonworks.com
 wrote:
 
  yes sure.
 
 
  On Mar 10, 2014, at 3:55 PM, Gopal V gop...@apache.org wrote:
 
  Can I add HIVE-6518 as well to the merge queue on
 
 
 
 https://cwiki.apache.org/confluence/display/Hive/Hive+0.13+release+status
 
  It is a relatively simple OOM safety patch to vectorized group-by.
 
  Tests pass locally for vec group-by, but the pre-commit tests haven't
  fired even though it's been PA for a while now.
 
  Cheers,
  Gopal
 
 
  --
  CONFIDENTIALITY NOTICE
  NOTICE: This message is intended for the use of the individual or
 entity to
  which it is addressed and may contain information that is confidential,
  privileged and exempt from disclosure under applicable law. If the
 reader
  of this message is not the intended recipient, you are hereby notified
 that
  any printing, copying, dissemination, distribution, disclosure or
  forwarding of this communication is strictly prohibited. If you have
  received this communication in error, please contact the sender
 immediately
  and delete it from your system. Thank You.
 
 






[jira] [Commented] (HIVE-6647) Bump the thrift api version to V7 for HiveServer2

2014-03-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935162#comment-13935162
 ] 

Thejas M Nair commented on HIVE-6647:
-

testExecuteStatementAsync is a flaky test, documented in HIVE-6543.
I will commit this shortly.


 Bump the thrift api version to V7 for HiveServer2
 -

 Key: HIVE-6647
 URL: https://issues.apache.org/jira/browse/HIVE-6647
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Prasad Mujumdar
Priority: Blocker
 Fix For: 0.13.0

 Attachments: HIVE-6647.1.patch


 HIVE-5155 added new api for delegation token support. Per the convention 
 followed till now, we should update the version to 7. 
 Marking it as blocker for 13. cc [~prasadm] [~thejas]





[jira] [Commented] (HIVE-6543) TestEmbeddedThriftBinaryCLIService.testExecuteStatementAsync is failing sometimes

2014-03-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935165#comment-13935165
 ] 

Thejas M Nair commented on HIVE-6543:
-

Verified that the test passes with test patch. Will commit it shortly.

 TestEmbeddedThriftBinaryCLIService.testExecuteStatementAsync is failing 
 sometimes
 -

 Key: HIVE-6543
 URL: https://issues.apache.org/jira/browse/HIVE-6543
 Project: Hive
  Issue Type: Task
  Components: Tests
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-6543.1.patch.txt


 NO PRECOMMIT TESTS
 The test uses the query CREATE TABLE NON_EXISTING_TAB (ID STRING) location 
 'hdfs://localhost:1/a/b/c', which is intended to fail, but it does not seem to 
 fail quickly enough in the testbed. Making the query more clearly invalid 
 (replacing hdfs with an invalid scheme, etc.) should be enough.





[jira] [Updated] (HIVE-6543) TestEmbeddedThriftBinaryCLIService.testExecuteStatementAsync is failing sometimes

2014-03-14 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6543:


   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Patch committed to trunk.
Thanks Navis!


 TestEmbeddedThriftBinaryCLIService.testExecuteStatementAsync is failing 
 sometimes
 -

 Key: HIVE-6543
 URL: https://issues.apache.org/jira/browse/HIVE-6543
 Project: Hive
  Issue Type: Task
  Components: Tests
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Fix For: 0.14.0

 Attachments: HIVE-6543.1.patch.txt


 NO PRECOMMIT TESTS
 The test uses the query CREATE TABLE NON_EXISTING_TAB (ID STRING) location 
 'hdfs://localhost:1/a/b/c', which is intended to fail, but it does not seem to 
 fail quickly enough in the testbed. Making the query more clearly invalid 
 (replacing hdfs with an invalid scheme, etc.) should be enough.





Re: Review Request 19206: HIVE-6657: Add test coverage for Kerberos authentication implementation using Hadoop's miniKdc

2014-03-14 Thread Prasad Mujumdar


 On March 14, 2014, 1:43 p.m., Brock Noland wrote:
  This looks awesome! I had no idea we could enable a submodule for a 
  specific profile!
  
  Just curious, what is the felix plugin doing for us?

The felix plugin is needed for the ApacheDS dependency of Hadoop's miniKdc.


 On March 14, 2014, 1:43 p.m., Brock Noland wrote:
  itests/hive-unit/pom.xml, line 41
  https://reviews.apache.org/r/19206/diff/1/?file=519155#file519155line41
 
  This might be my pedantic side... but since we are moving test scope 
  from deps, can we move those under a <!-- intra-project --> comment? I 
  have tried to separate out test and non-test deps:
  
  https://github.com/apache/hive/blob/trunk/ql/pom.xml#L36

Sounds reasonable. Updated the pom to group the non-test dependencies together 
under 'intra-project' comment header.


 On March 14, 2014, 1:43 p.m., Brock Noland wrote:
  itests/hive-unit/src/main/java/org/apache/hive/jdbc/miniHS2/MiniHS2.java, 
  line 152
  https://reviews.apache.org/r/19206/diff/1/?file=519157#file519157line152
 
  Should this commented out line and the ones in waitForStartup be removed?
  
  If not, please add a comment saying when we would uncomment.

removed the comment.


 On March 14, 2014, 1:43 p.m., Brock Noland wrote:
  pom.xml, line 636
  https://reviews.apache.org/r/19206/diff/1/?file=519162#file519162line636
 
  Can you create a property and add this version to the properties?

Done


- Prasad


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19206/#review37195
---


On March 14, 2014, 7:06 a.m., Prasad Mujumdar wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/19206/
 ---
 
 (Updated March 14, 2014, 7:06 a.m.)
 
 
 Review request for hive, Brock Noland and Thejas Nair.
 
 
 Bugs: HIVE-6657
 https://issues.apache.org/jira/browse/HIVE-6657
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 - Build changes 
   + pull required dependencies
   + new sub-module under itests for the hive miniKdc framework, enabled for 
 hadoop-2 only since hadoop-1 doesn't include miniKdc yet.
 - A wrapper class MiniHiveKdc on top of Hadoop miniKdc that manages the KDC 
 services and creates test principals, keytabs, etc.
 - Tests
   + simple unit test for MiniHiveKdc
   + Integrate MiniHiveKdc with MiniHS2
   + New HiveServer2 test that covers basic connection, as well as the token 
 and proxy auth support added by HIVE-5155
 
 Note that Hadoop does not yet provide a secure mini-cluster. Hence the new tests 
 are not executing any queries. At this point they are limited to the authentication 
 part only.
 
 
 Diffs
 -
 
   itests/hive-minikdc/pom.xml PRE-CREATION
   itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/MiniHiveKdc.java PRE-CREATION
   itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/TestJdbcWithMiniKdc.java PRE-CREATION
   itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/TestMiniHiveKdc.java PRE-CREATION
   itests/hive-minikdc/src/test/resources/core-site.xml PRE-CREATION
   itests/hive-unit/pom.xml 6aff277
   itests/hive-unit/src/main/java/org/apache/hive/jdbc/miniHS2/AbstractHiveService.java PRE-CREATION
   itests/hive-unit/src/main/java/org/apache/hive/jdbc/miniHS2/MiniHS2.java PRE-CREATION
   itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/AbstractHiveService.java 095b989
   itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/MiniHS2.java d76750c
   itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/TestHiveServer2.java 84f508c
   itests/pom.xml 589ba25
   pom.xml 6503c94
   service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java 6759903
   service/src/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java 6fbc847
 
 Diff: https://reviews.apache.org/r/19206/diff/
 
 
 Testing
 ---
 
 New test cases using the miniKdc.
 
 
 Thanks,
 
 Prasad Mujumdar
 




Re: Review Request 19206: HIVE-6657: Add test coverage for Kerberos authentication implementation using Hadoop's miniKdc

2014-03-14 Thread Prasad Mujumdar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19206/
---

(Updated March 14, 2014, 5:12 p.m.)


Review request for hive, Brock Noland and Thejas Nair.


Changes
---

Changes per review feedback


Bugs: HIVE-6657
https://issues.apache.org/jira/browse/HIVE-6657


Repository: hive-git


Description
---

- Build changes 
  + pull required dependencies
  + new sub-module under itests for the hive miniKdc framework, enabled for 
hadoop-2 only since hadoop-1 doesn't include miniKdc yet.
- A wrapper class MiniHiveKdc on top of Hadoop miniKdc that manages the KDC 
services and creates test principals, keytabs, etc.
- Tests
  + simple unit test for MiniHiveKdc
  + Integrate MiniHiveKdc with MiniHS2
  + New HiveServer2 test that covers basic connection, as well as the token and 
proxy auth support added by HIVE-5155

Note that Hadoop does not yet provide a secure mini-cluster. Hence the new tests 
are not executing any queries. At this point they are limited to the authentication 
part only.


Diffs (updated)
-

  itests/hive-minikdc/pom.xml PRE-CREATION
  itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/MiniHiveKdc.java PRE-CREATION
  itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/TestJdbcWithMiniKdc.java PRE-CREATION
  itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/TestMiniHiveKdc.java PRE-CREATION
  itests/hive-minikdc/src/test/resources/core-site.xml PRE-CREATION
  itests/hive-unit/pom.xml 6aff277
  itests/hive-unit/src/main/java/org/apache/hive/jdbc/miniHS2/AbstractHiveService.java PRE-CREATION
  itests/hive-unit/src/main/java/org/apache/hive/jdbc/miniHS2/MiniHS2.java PRE-CREATION
  itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/AbstractHiveService.java 095b989
  itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/MiniHS2.java d76750c
  itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/TestHiveServer2.java 84f508c
  itests/pom.xml 589ba25
  pom.xml 6503c94
  service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java 6759903
  service/src/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java 6fbc847

Diff: https://reviews.apache.org/r/19206/diff/


Testing
---

New test cases using the miniKdc.


Thanks,

Prasad Mujumdar



Re: Review Request 19206: HIVE-6657: Add test coverage for Kerberos authentication implementation using Hadoop's miniKdc

2014-03-14 Thread Brock Noland

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19206/#review37222
---

Ship it!


LGTM pending tests

- Brock Noland


On March 14, 2014, 5:12 p.m., Prasad Mujumdar wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/19206/
 ---
 
 (Updated March 14, 2014, 5:12 p.m.)
 
 
 Review request for hive, Brock Noland and Thejas Nair.
 
 
 Bugs: HIVE-6657
 https://issues.apache.org/jira/browse/HIVE-6657
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 - Build changes 
   + pull required dependencies
   + new sub-module under itests for the hive miniKdc framework, enabled for 
 hadoop-2 only since hadoop-1 doesn't include miniKdc yet.
 - A wrapper class MiniHiveKdc on top of Hadoop miniKdc that manages the KDC 
 services and creates test principals, keytabs, etc.
 - Tests
   + simple unit test for MiniHiveKdc
   + Integrate MiniHiveKdc with MiniHS2
   + New HiveServer2 test that covers basic connection, as well as the token 
 and proxy auth support added by HIVE-5155
 
 Note that Hadoop does not yet provide a secure mini-cluster. Hence the new tests 
 are not executing any queries. At this point they are limited to the authentication 
 part only.
 
 
 Diffs
 -
 
   itests/hive-minikdc/pom.xml PRE-CREATION
   itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/MiniHiveKdc.java PRE-CREATION
   itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/TestJdbcWithMiniKdc.java PRE-CREATION
   itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/TestMiniHiveKdc.java PRE-CREATION
   itests/hive-minikdc/src/test/resources/core-site.xml PRE-CREATION
   itests/hive-unit/pom.xml 6aff277
   itests/hive-unit/src/main/java/org/apache/hive/jdbc/miniHS2/AbstractHiveService.java PRE-CREATION
   itests/hive-unit/src/main/java/org/apache/hive/jdbc/miniHS2/MiniHS2.java PRE-CREATION
   itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/AbstractHiveService.java 095b989
   itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/MiniHS2.java d76750c
   itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/TestHiveServer2.java 84f508c
   itests/pom.xml 589ba25
   pom.xml 6503c94
   service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java 6759903
   service/src/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java 6fbc847
 
 Diff: https://reviews.apache.org/r/19206/diff/
 
 
 Testing
 ---
 
 New test cases using the miniKdc.
 
 
 Thanks,
 
 Prasad Mujumdar
 




[jira] [Created] (HIVE-6668) When auto join convert is on and noconditionaltask is off, ConditionalResolverCommonJoin fails to resolve map joins.

2014-03-14 Thread Yin Huai (JIRA)
Yin Huai created HIVE-6668:
--

 Summary: When auto join convert is on and noconditionaltask is 
off, ConditionalResolverCommonJoin fails to resolve map joins.
 Key: HIVE-6668
 URL: https://issues.apache.org/jira/browse/HIVE-6668
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0, 0.14.0
Reporter: Yin Huai
Priority: Blocker
 Fix For: 0.13.0


I tried the following query today ...
{code:sql}
set mapred.job.map.memory.mb=2048;
set mapred.job.reduce.memory.mb=2048;
set mapred.map.child.java.opts=-server -Xmx3072m 
-Djava.net.preferIPv4Stack=true;
set mapred.reduce.child.java.opts=-server -Xmx3072m 
-Djava.net.preferIPv4Stack=true;

set mapred.reduce.tasks=60;

set hive.stats.autogather=false;
set hive.exec.parallel=false;
set hive.enforce.bucketing=true;
set hive.enforce.sorting=true;
set hive.map.aggr=true;
set hive.optimize.bucketmapjoin=true;
set hive.optimize.bucketmapjoin.sortedmerge=true;
set hive.mapred.reduce.tasks.speculative.execution=false;
set hive.auto.convert.join=true;
set hive.auto.convert.sortmerge.join=true;
set hive.auto.convert.sortmerge.join.noconditionaltask=false;
set hive.auto.convert.join.noconditionaltask=false;
set hive.auto.convert.join.noconditionaltask.size=1;
set hive.optimize.reducededuplication=true;
set hive.optimize.reducededuplication.min.reducer=1;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
set hive.mapjoin.smalltable.filesize=4500;

set hive.optimize.index.filter=false;
set hive.vectorized.execution.enabled=false;
set hive.optimize.correlation=false;
select
   i_item_id,
   s_state,
   avg(ss_quantity) agg1,
   avg(ss_list_price) agg2,
   avg(ss_coupon_amt) agg3,
   avg(ss_sales_price) agg4
FROM store_sales
JOIN date_dim on (store_sales.ss_sold_date_sk = date_dim.d_date_sk)
JOIN item on (store_sales.ss_item_sk = item.i_item_sk)
JOIN customer_demographics on (store_sales.ss_cdemo_sk = 
customer_demographics.cd_demo_sk)
JOIN store on (store_sales.ss_store_sk = store.s_store_sk)
where
   cd_gender = 'F' and
   cd_marital_status = 'U' and
   cd_education_status = 'Primary' and
   d_year = 2002 and
   s_state in ('GA','PA', 'LA', 'SC', 'MI', 'AL')
group by i_item_id, s_state with rollup
order by
   i_item_id,
   s_state
limit 100;
{code}

The log shows ...
{code}
14/03/14 17:05:02 INFO plan.ConditionalResolverCommonJoin: Failed to resolve 
driver alias (threshold : 4500, length mapping : {store=94175, 
store_sales=48713909726, item=39798667, customer_demographics=1660831, 
date_dim=2275902})
Stage-27 is filtered out by condition resolver.
14/03/14 17:05:02 INFO exec.Task: Stage-27 is filtered out by condition 
resolver.
Stage-28 is filtered out by condition resolver.
14/03/14 17:05:02 INFO exec.Task: Stage-28 is filtered out by condition 
resolver.
Stage-3 is selected by condition resolver.
{code}
Stage-3 is a reduce join. The resolver should have picked the map join instead.
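
To make the numbers in the log line easier to read, here is a minimal Java sketch of a size check of this kind. It is not Hive's ConditionalResolverCommonJoin: the class MapJoinSizeCheck and its methods are hypothetical, and the rule it applies (a map join is possible when all tables except the largest fit under the threshold in total) is an assumption made for illustration. The sizes are copied from the log above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not Hive's resolver. */
class MapJoinSizeCheck {

    /**
     * Returns the alias of the largest table if all other aliases
     * together fit under the threshold, or null otherwise.
     * (Assumed rule, not taken from the Hive source.)
     */
    static String pickBigTable(Map<String, Long> aliasToSize, long threshold) {
        if (aliasToSize.isEmpty()) {
            return null;
        }
        String biggest = null;
        long biggestSize = -1L;
        long total = 0L;
        for (Map.Entry<String, Long> e : aliasToSize.entrySet()) {
            total += e.getValue();
            if (e.getValue() > biggestSize) {
                biggestSize = e.getValue();
                biggest = e.getKey();
            }
        }
        long smallTablesSize = total - biggestSize;
        return smallTablesSize <= threshold ? biggest : null;
    }

    /** Alias sizes as they appear in the log line above. */
    static Map<String, Long> loggedSizes() {
        Map<String, Long> sizes = new LinkedHashMap<>();
        sizes.put("store", 94175L);
        sizes.put("store_sales", 48713909726L);
        sizes.put("item", 39798667L);
        sizes.put("customer_demographics", 1660831L);
        sizes.put("date_dim", 2275902L);
        return sizes;
    }

    public static void main(String[] args) {
        Map<String, Long> sizes = loggedSizes();
        // Threshold as logged.
        System.out.println(pickBigTable(sizes, 4500L));
        // A larger, purely illustrative threshold.
        System.out.println(pickBigTable(sizes, 50_000_000L));
    }
}
```

Under this assumed rule, the four smaller tables add up to 43,829,575 bytes, so a threshold at or above that value would select store_sales as the streamed table, while the threshold value shown in the log would not. Whether Hive's resolver uses the same rule is not something the log shows.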





Re: Timeline for the Hive 0.13 release?

2014-03-14 Thread Yin Huai
Guys,

Seems ConditionalResolverCommonJoin is not working correctly? I created
https://issues.apache.org/jira/browse/HIVE-6668 and set it as a blocker.

thanks,

Yin


On Fri, Mar 14, 2014 at 11:34 AM, Thejas Nair the...@hortonworks.com wrote:

 Can you also add HIVE-6647 
 https://issues.apache.org/jira/browse/HIVE-6647 to
 the list? It is marked as a blocker for 0.13.
 It has a necessary version number upgrade for HS2. It is ready to be
 committed.


 On Fri, Mar 14, 2014 at 12:38 AM, Prasanth Jayachandran 
 pjayachand...@hortonworks.com wrote:

  Harish
 
  Can you please make the following changes to my earlier request?
 
  HIVE-4177 is not required.. instead the same work is tracked under
  HIVE-6578.
 
  Can you also consider HIVE-6656?
  HIVE-6656 is a bug fix for the ORC reader when reading timestamp nanoseconds.
  This bug exists in earlier versions as well, so it will be good to have this
  fixed in 0.13.0
 
  Thanks
  Prasanth Jayachandran
 
  On Mar 13, 2014, at 8:52 AM, Thejas Nair the...@hortonworks.com wrote:
 
   Harish,
   I think we should include the following -
   HIVE-6547 - This is a cleanup of metastore api changes introduced in
  0.13 .
   This can't be done post release. I will get a patch out in few hours.
   HIVE-6567 - fixes an NPE in 'show grant .. on all'
   HIVE-6629 - change in syntax for 'set role none' . marked as a blocker
  bug.
  
  
   On Tue, Mar 11, 2014 at 8:39 AM, Harish Butani 
 hbut...@hortonworks.com
  wrote:
  
   yes sure.
  
  
   On Mar 10, 2014, at 3:55 PM, Gopal V gop...@apache.org wrote:
  
   Can I add HIVE-6518 as well to the merge queue on
  
  
  
 
 https://cwiki.apache.org/confluence/display/Hive/Hive+0.13+release+status
  
   It is a relatively simple OOM safety patch to vectorized group-by.
  
   Tests pass locally for vec group-by, but the pre-commit tests haven't
   fired even though it's been PA for a while now.
  
   Cheers,
   Gopal
  
  
  
  
 
 
 




[jira] [Commented] (HIVE-6668) When auto join convert is on and noconditionaltask is off, ConditionalResolverCommonJoin fails to resolve map joins.

2014-03-14 Thread Yin Huai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935290#comment-13935290
 ] 

Yin Huai commented on HIVE-6668:


I guess it was broken by HIVE-6403 or HIVE-6144.

 When auto join convert is on and noconditionaltask is off, 
 ConditionalResolverCommonJoin fails to resolve map joins.
 

 Key: HIVE-6668
 URL: https://issues.apache.org/jira/browse/HIVE-6668
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0, 0.14.0
Reporter: Yin Huai
Priority: Blocker
 Fix For: 0.13.0


 I tried the following query today ...
 {code:sql}
 set mapred.job.map.memory.mb=2048;
 set mapred.job.reduce.memory.mb=2048;
 set mapred.map.child.java.opts=-server -Xmx3072m 
 -Djava.net.preferIPv4Stack=true;
 set mapred.reduce.child.java.opts=-server -Xmx3072m 
 -Djava.net.preferIPv4Stack=true;
 set mapred.reduce.tasks=60;
 set hive.stats.autogather=false;
 set hive.exec.parallel=false;
 set hive.enforce.bucketing=true;
 set hive.enforce.sorting=true;
 set hive.map.aggr=true;
 set hive.optimize.bucketmapjoin=true;
 set hive.optimize.bucketmapjoin.sortedmerge=true;
 set hive.mapred.reduce.tasks.speculative.execution=false;
 set hive.auto.convert.join=true;
 set hive.auto.convert.sortmerge.join=true;
 set hive.auto.convert.sortmerge.join.noconditionaltask=false;
 set hive.auto.convert.join.noconditionaltask=false;
 set hive.auto.convert.join.noconditionaltask.size=1;
 set hive.optimize.reducededuplication=true;
 set hive.optimize.reducededuplication.min.reducer=1;
 set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
 set hive.mapjoin.smalltable.filesize=4500;
 set hive.optimize.index.filter=false;
 set hive.vectorized.execution.enabled=false;
 set hive.optimize.correlation=false;
 select
i_item_id,
s_state,
avg(ss_quantity) agg1,
avg(ss_list_price) agg2,
avg(ss_coupon_amt) agg3,
avg(ss_sales_price) agg4
 FROM store_sales
 JOIN date_dim on (store_sales.ss_sold_date_sk = date_dim.d_date_sk)
 JOIN item on (store_sales.ss_item_sk = item.i_item_sk)
 JOIN customer_demographics on (store_sales.ss_cdemo_sk = 
 customer_demographics.cd_demo_sk)
 JOIN store on (store_sales.ss_store_sk = store.s_store_sk)
 where
cd_gender = 'F' and
cd_marital_status = 'U' and
cd_education_status = 'Primary' and
d_year = 2002 and
s_state in ('GA','PA', 'LA', 'SC', 'MI', 'AL')
 group by i_item_id, s_state with rollup
 order by
i_item_id,
s_state
 limit 100;
 {code}
 The log shows ...
 {code}
 14/03/14 17:05:02 INFO plan.ConditionalResolverCommonJoin: Failed to resolve 
 driver alias (threshold : 4500, length mapping : {store=94175, 
 store_sales=48713909726, item=39798667, customer_demographics=1660831, 
 date_dim=2275902})
 Stage-27 is filtered out by condition resolver.
 14/03/14 17:05:02 INFO exec.Task: Stage-27 is filtered out by condition 
 resolver.
 Stage-28 is filtered out by condition resolver.
 14/03/14 17:05:02 INFO exec.Task: Stage-28 is filtered out by condition 
 resolver.
 Stage-3 is selected by condition resolver.
 {code}
 Stage-3 is a reduce join. The resolver should have picked the map join instead.





[jira] [Commented] (HIVE-6641) optimized HashMap keys won't work correctly with decimals

2014-03-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935289#comment-13935289
 ] 

Sergey Shelukhin commented on HIVE-6641:


[~gopalv] [~hagleitn] can you guys review? it's a small patch, most of it is a 
new .q.out

 optimized HashMap keys won't work correctly with decimals
 -

 Key: HIVE-6641
 URL: https://issues.apache.org/jira/browse/HIVE-6641
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-6641.patch


 Decimal values can be equal while having different byte representations 
 (different precision/scale), so comparing bytes is not enough. For a quick 
 fix, we can disable this for decimals.
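
The point about equal values with different bytes can be seen with plain java.math.BigDecimal. This is only a sketch: it does not use Hive's decimal type or its key serialization, and the class DecimalBytesDemo and its helper names are made up for illustration. The "bytes" here are the unscaled value plus the scale, standing in for whatever a byte-wise key comparison would look at.

```java
import java.math.BigDecimal;
import java.util.Arrays;

/** Hypothetical demo class; uses only java.math.BigDecimal. */
class DecimalBytesDemo {

    /** Bytes of the unscaled value, a stand-in for a serialized key. */
    static byte[] unscaledBytes(BigDecimal d) {
        return d.unscaledValue().toByteArray();
    }

    /** Byte-level comparison: same scale and same unscaled bytes. */
    static boolean sameBytes(BigDecimal a, BigDecimal b) {
        return a.scale() == b.scale()
                && Arrays.equals(unscaledBytes(a), unscaledBytes(b));
    }

    /** Numeric comparison that ignores scale. */
    static boolean sameValue(BigDecimal a, BigDecimal b) {
        return a.compareTo(b) == 0;
    }

    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("1.0");   // unscaled 10, scale 1
        BigDecimal b = new BigDecimal("1.00");  // unscaled 100, scale 2
        System.out.println("same bytes: " + sameBytes(a, b));
        System.out.println("same value: " + sameValue(a, b));
        // Normalizing the scale first makes the byte comparison agree.
        System.out.println("same bytes after normalizing: "
                + sameBytes(a.stripTrailingZeros(), b.stripTrailingZeros()));
    }
}
```

Normalizing before comparing is one possible direction; the quick fix proposed in this issue is simply to disable the optimized keys for decimals.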





[jira] [Updated] (HIVE-6657) Add test coverage for Kerberos authentication implementation using Hadoop's miniKdc

2014-03-14 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-6657:
--

Attachment: HIVE-6657.3.patch

Update patch per review feedback.

 Add test coverage for Kerberos authentication implementation using Hadoop's 
 miniKdc
 ---

 Key: HIVE-6657
 URL: https://issues.apache.org/jira/browse/HIVE-6657
 Project: Hive
  Issue Type: Improvement
  Components: Authentication, Testing Infrastructure, Tests
Affects Versions: 0.13.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Attachments: HIVE-6657.2.patch, HIVE-6657.3.patch


 Hadoop 2.3 includes the miniKdc module. This provides a KDC that can be used by 
 downstream projects to implement unit tests for Kerberos authentication code.
 Hive has a lot of code related to Kerberos and delegation tokens for 
 authentication, as well as for accessing secure hadoop resources. This has pretty 
 much no coverage in the unit tests. We need to add unit tests using the 
 miniKdc module.
 Note that Hadoop 2.3 doesn't include a secure mini-cluster. Until that is 
 available, we can at least test authentication for components like 
 HiveServer2, Metastore and WebHCat.





[jira] [Commented] (HIVE-6312) doAs with plain sasl auth should be session aware

2014-03-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935294#comment-13935294
 ] 

Hive QA commented on HIVE-6312:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12634672/HIVE-6312.4.patch.txt

{color:green}SUCCESS:{color} +1 5394 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1773/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1773/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12634672

 doAs with plain sasl auth should be session aware
 -

 Key: HIVE-6312
 URL: https://issues.apache.org/jira/browse/HIVE-6312
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Navis
Assignee: Navis
 Attachments: HIVE-6312.1.patch.txt, HIVE-6312.2.patch.txt, 
 HIVE-6312.3.patch.txt, HIVE-6312.4.patch.txt


 TUGIContainingProcessor creates a new Subject for each invocation, which induces 
 FileSystem leakage when the cache is enabled (true by default).
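
The leak pattern described here can be sketched without any Hadoop classes. The example below is an illustration under assumptions: CallerIdentity stands in for a per-invocation Subject, the map stands in for a handle cache, and it is assumed that the cache key distinguishes callers by object identity. All names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical demo of a cache keyed by identity-compared objects. */
class IdentityKeyCacheDemo {

    /** Stand-in for a Subject: no equals/hashCode override, so identity semantics. */
    static final class CallerIdentity {
        final String user;

        CallerIdentity(String user) {
            this.user = user;
        }
    }

    /** Stand-in for a cache of expensive handles, one per key. */
    private final Map<CallerIdentity, Object> cache = new HashMap<>();

    Object handleFor(CallerIdentity id) {
        return cache.computeIfAbsent(id, k -> new Object());
    }

    int size() {
        return cache.size();
    }

    public static void main(String[] args) {
        // New identity object per call: every call adds a cache entry.
        IdentityKeyCacheDemo perCall = new IdentityKeyCacheDemo();
        for (int i = 0; i < 3; i++) {
            perCall.handleFor(new CallerIdentity("alice"));
        }
        System.out.println("per-call entries: " + perCall.size());

        // One identity object reused for the session: a single entry.
        IdentityKeyCacheDemo perSession = new IdentityKeyCacheDemo();
        CallerIdentity session = new CallerIdentity("alice");
        for (int i = 0; i < 3; i++) {
            perSession.handleFor(session);
        }
        System.out.println("per-session entries: " + perSession.size());
    }
}
```

Reusing one identity object per session keeps the cache at one entry for that user, which is the sense in which the fix is "session aware".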





[jira] [Created] (HIVE-6669) sourcing txn-script from schema script results in failure for mysql & oracle

2014-03-14 Thread Prasad Mujumdar (JIRA)
Prasad Mujumdar created HIVE-6669:
-

 Summary: sourcing txn-script from schema script results in failure 
for mysql & oracle
 Key: HIVE-6669
 URL: https://issues.apache.org/jira/browse/HIVE-6669
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0
Reporter: Prasad Mujumdar
Priority: Blocker


This issue is addressed in 0.13 by in-lining the transaction schema 
statements in the schema initialization script (HIVE-6559).
The 0.14 schema initialization is not fixed. This is the followup ticket to 
address the problem in 0.14.





[jira] [Commented] (HIVE-6669) sourcing txn-script from schema script results in failure for mysql & oracle

2014-03-14 Thread Prasad Mujumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935299#comment-13935299
 ] 

Prasad Mujumdar commented on HIVE-6669:
---

[~ashutoshc] and [~alangates], this is the followup ticket to track the oracle 
and mysql schema initialization problem for trunk (0.14). Thanks!

 sourcing txn-script from schema script results in failure for mysql & oracle
 

 Key: HIVE-6669
 URL: https://issues.apache.org/jira/browse/HIVE-6669
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0
Reporter: Prasad Mujumdar
Priority: Blocker

 This issue is addressed in 0.13 by inlining the transaction schema 
 statements in the schema initialization script (HIVE-6559).
 The 0.14 schema initialization is not fixed. This is the follow-up ticket to 
 address the problem in 0.14. 





[jira] [Assigned] (HIVE-6669) sourcing txn-script from schema script results in failure for mysql and oracle

2014-03-14 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates reassigned HIVE-6669:


Assignee: Alan Gates

 sourcing txn-script from schema script results in failure for mysql and oracle
 

 Key: HIVE-6669
 URL: https://issues.apache.org/jira/browse/HIVE-6669
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0
Reporter: Prasad Mujumdar
Assignee: Alan Gates
Priority: Blocker

 This issue is addressed in 0.13 by inlining the transaction schema 
 statements in the schema initialization script (HIVE-6559).
 The 0.14 schema initialization is not fixed. This is the follow-up ticket to 
 address the problem in 0.14. 





[jira] [Created] (HIVE-6670) ClassNotFound with Serde

2014-03-14 Thread Abin Shahab (JIRA)
Abin Shahab created HIVE-6670:
-

 Summary: ClassNotFound with Serde
 Key: HIVE-6670
 URL: https://issues.apache.org/jira/browse/HIVE-6670
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Abin Shahab


We are seeing a ClassNotFoundException when we use 
CSVSerde (https://github.com/ogrodnek/csv-serde) to create a table.
This happens because MapredLocalTask does not pass the locally added jars to 
ExecDriver when it is launched.
ExecDriver's classpath does not include the added jars. Therefore, when the 
plan is deserialized, the deserialization code throws a ClassNotFoundException, 
and the result is a TableDesc object with a null DeserializerClass.
This results in an NPE during Fetch.
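
The failure chain described above can be sketched in plain Java. This is only an illustration under stated assumptions, not Hive's actual code: loadOrNull is a hypothetical helper that swallows the ClassNotFoundException and returns null, mirroring how the description says the plan ends up with a null DeserializerClass when the SerDe jar is missing from the classpath.

```java
// Hypothetical illustration; none of these names come from Hive.
final class SerdeLoadSketch {
    // Try to load a class through the given loader; return null when the
    // class is not visible, the way the report says the plan ends up with
    // a null deserializer class.
    static Class<?> loadOrNull(String className, ClassLoader loader) {
        try {
            return Class.forName(className, false, loader);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        // The SerDe class from the report is not on this JVM's classpath.
        Class<?> serde = loadOrNull("com.bizo.hive.serde.csv.CSVSerde", loader);
        System.out.println("deserializer class: " + serde);
        // Any later call on the null reference would throw an NPE.
    }
}
```

In this model the real error is the missing classpath entry; the NPE only surfaces later, when the null class is first used.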
Steps to reproduce:
wget 
https://drone.io/github.com/ogrodnek/csv-serde/files/target/csv-serde-1.1.2-0.11.0-all.jar
 to a local path, e.g. 
/home/soam/HiveSerdeIssue/csv-serde-1.1.2-0.11.0-all.jar.
Place the sample files attached to this ticket in HDFS as follows:
hdfs dfs -mkdir /user/soam/HiveSerdeIssue/sampleCSV/
hdfs dfs -put /home/soam/sampleCSV.csv /user/soam/HiveSerdeIssue/sampleCSV/
hdfs dfs -mkdir /user/soam/HiveSerdeIssue/sampleJoinTarget/
hdfs dfs -put /home/soam/sampleJoinTarget.csv 
/user/soam/HiveSerdeIssue/sampleJoinTarget/

Create the tables in Hive (this might cause a problem in dogfood since I've 
already created tables with those names, so you'll have to change the table names 
or delete mine):
ADD JAR /home/soam/HiveSerdeIssue/csv-serde-1.1.2-0.11.0-all.jar;
create external table sampleCSV (md5hash string, filepath string)
row format serde 'com.bizo.hive.serde.csv.CSVSerde'
stored as textfile
location '/user/soam/HiveSerdeIssue/sampleCSV/'
;
create external table sampleJoinTarget (md5hash string, filepath string, 
datestamp string, nblines string, nberrors string)
ROW FORMAT DELIMITED 
FIELDS TERMINATED BY ',' 
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/user/soam/HiveSerdeIssue/sampleJoinTarget/'
;
===
Now, try the following JOIN:
ADD JAR /home/soam/HiveSerdeIssue/csv-serde-1.1.2-0.11.0-all.jar;
SELECT 
sampleCSV.md5hash, 
sampleCSV.filepath 
FROM sampleCSV
JOIN sampleJoinTarget
ON (sampleCSV.md5hash = sampleJoinTarget.md5hash) 
;
—
This will fail with the error:
Execution log at: /tmp/soam/.log
java.lang.ClassNotFoundException: com/bizo/hive/serde/csv/CSVSerde
Continuing ...
2014-03-11 10:35:03 Starting to launch local task to process map join; maximum 
memory = 238551040
Execution failed with exit status: 2
Obtaining error information
Task failed!
Task ID:
Stage-4
Logs:
/var/log/hive/soam/hive.log
FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
Try the following LEFT JOIN. This will work:
SELECT 
sampleCSV.md5hash, 
sampleCSV.filepath 
FROM sampleCSV
LEFT JOIN sampleJoinTarget
ON (sampleCSV.md5hash = sampleJoinTarget.md5hash) 
;
==





[jira] [Updated] (HIVE-6670) ClassNotFound with Serde

2014-03-14 Thread Abin Shahab (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abin Shahab updated HIVE-6670:
--

Description: 
We are seeing a ClassNotFoundException when we use 
CSVSerde (https://github.com/ogrodnek/csv-serde) to create a table.
This happens because MapredLocalTask does not pass the locally added jars to 
ExecDriver when it is launched.
ExecDriver's classpath does not include the added jars. Therefore, when the 
plan is deserialized, the deserialization code throws a ClassNotFoundException, 
and the result is a TableDesc object with a null DeserializerClass.
This results in an NPE during Fetch.
Steps to reproduce:
wget 
https://drone.io/github.com/ogrodnek/csv-serde/files/target/csv-serde-1.1.2-0.11.0-all.jar
 to a local path, e.g. 
/home/soam/HiveSerdeIssue/csv-serde-1.1.2-0.11.0-all.jar.
Place some sample CSV files in HDFS as follows:
hdfs dfs -mkdir /user/soam/HiveSerdeIssue/sampleCSV/
hdfs dfs -put /home/soam/sampleCSV.csv /user/soam/HiveSerdeIssue/sampleCSV/
hdfs dfs -mkdir /user/soam/HiveSerdeIssue/sampleJoinTarget/
hdfs dfs -put /home/soam/sampleJoinTarget.csv 
/user/soam/HiveSerdeIssue/sampleJoinTarget/

Create the tables in Hive:
ADD JAR /home/soam/HiveSerdeIssue/csv-serde-1.1.2-0.11.0-all.jar;
create external table sampleCSV (md5hash string, filepath string)
row format serde 'com.bizo.hive.serde.csv.CSVSerde'
stored as textfile
location '/user/soam/HiveSerdeIssue/sampleCSV/'
;
create external table sampleJoinTarget (md5hash string, filepath string, 
datestamp string, nblines string, nberrors string)
ROW FORMAT DELIMITED 
FIELDS TERMINATED BY ',' 
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/user/soam/HiveSerdeIssue/sampleJoinTarget/'
;
===
Now, try the following JOIN:
ADD JAR /home/soam/HiveSerdeIssue/csv-serde-1.1.2-0.11.0-all.jar;
SELECT 
sampleCSV.md5hash, 
sampleCSV.filepath 
FROM sampleCSV
JOIN sampleJoinTarget
ON (sampleCSV.md5hash = sampleJoinTarget.md5hash) 
;
—
This will fail with the error:
Execution log at: /tmp/soam/.log
java.lang.ClassNotFoundException: com/bizo/hive/serde/csv/CSVSerde
Continuing ...
2014-03-11 10:35:03 Starting to launch local task to process map join; maximum 
memory = 238551040
Execution failed with exit status: 2
Obtaining error information
Task failed!
Task ID:
Stage-4
Logs:
/var/log/hive/soam/hive.log
FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
Try the following LEFT JOIN. This will work:
SELECT 
sampleCSV.md5hash, 
sampleCSV.filepath 
FROM sampleCSV
LEFT JOIN sampleJoinTarget
ON (sampleCSV.md5hash = sampleJoinTarget.md5hash) 
;
==

  was:
We are finding a ClassNotFound exception when we use 
CSVSerde(https://github.com/ogrodnek/csv-serde) to create a table.
This is happening because MapredLocalTask does not pass the local added jars to 
ExecDriver when that is launched.
ExecDriver's classpath does not include the added jars. Therefore, when the 
plan is deserialized, it throws a ClassNotFoundException in the deserialization 
code, and results in a TableDesc object with a Null DeserializerClass.
This results in an NPE during Fetch.
Steps to reproduce:
wget 
https://drone.io/github.com/ogrodnek/csv-serde/files/target/csv-serde-1.1.2-0.11.0-all.jar
 into somewhere local eg. 
/home/soam/HiveSerdeIssue/csv-serde-1.1.2-0.11.0-all.jar.
Place the sample files attached to this ticket in HDFS as follows:
hdfs dfs -mkdir /user/soam/HiveSerdeIssue/sampleCSV/
hdfs dfs -put /home/soam/sampleCSV.csv /user/soam/HiveSerdeIssue/sampleCSV/
hdfs dfs -mkdir /user/soam/HiveSerdeIssue/sampleJoinTarget/
hdfs dfs -put /home/soam/sampleJoinTarget.csv 
/user/soam/HiveSerdeIssue/sampleJoinTarget/

create the tables in hive (this might cause a problem in dogfood since i've 
already created tables in those names, so you'll have to change the table names 
or delete mine):
ADD JAR /home/soam/HiveSerdeIssue/csv-serde-1.1.2-0.11.0-all.jar;
create external table sampleCSV (md5hash string, filepath string)
row format serde 'com.bizo.hive.serde.csv.CSVSerde'
stored as textfile
location '/user/soam/HiveSerdeIssue/sampleCSV/'
;
create external table sampleJoinTarget (md5hash string, filepath string, 
datestamp string, nblines string, nberrors string)
ROW FORMAT DELIMITED 
FIELDS TERMINATED BY ',' 
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/user/soam/HiveSerdeIssue/sampleJoinTarget/'
;
===
Now, try the following JOIN:
ADD JAR /home/soam/HiveSerdeIssue/csv-serde-1.1.2-0.11.0-all.jar;
SELECT 
sampleCSV.md5hash, 
sampleCSV.filepath 
FROM sampleCSV
JOIN sampleJoinTarget
ON (sampleCSV.md5hash = sampleJoinTarget.md5hash) 
;
—
This will fail with the error:
Execution log at: /tmp/soam/.log
java.lang.ClassNotFoundException: com/bizo/hive/serde/csv/CSVSerde
Continuing ...
2014-03-11 10:35:03 Starting to launch local task to process map join; maximum 
memory = 238551040
Execution failed with exit 

[jira] [Updated] (HIVE-6636) /user/hive is a bad default for HDFS jars path for Tez

2014-03-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6636:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

committed to trunk and 0.13 branch

 /user/hive is a bad default for HDFS jars path for Tez
 --

 Key: HIVE-6636
 URL: https://issues.apache.org/jira/browse/HIVE-6636
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.13.0

 Attachments: HIVE-6636.01.patch, HIVE-6636.02.patch, HIVE-6636.patch


 If a user runs Hive under a user name other than hive, jobs will fail 
 until everyone is granted write access to /user/hive, which is undesirable.





[jira] [Updated] (HIVE-6312) doAs with plain sasl auth should be session aware

2014-03-14 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6312:


   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Patch committed to trunk and 0.13 branch (this is in the 0.13 jira list).
Thanks Navis!


 doAs with plain sasl auth should be session aware
 -

 Key: HIVE-6312
 URL: https://issues.apache.org/jira/browse/HIVE-6312
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Navis
Assignee: Navis
 Fix For: 0.13.0

 Attachments: HIVE-6312.1.patch.txt, HIVE-6312.2.patch.txt, 
 HIVE-6312.3.patch.txt, HIVE-6312.4.patch.txt


 TUGIContainingProcessor creates a new Subject for each invocation, which causes 
 FileSystem leakage when the cache is enabled (true by default).
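
A minimal sketch of why a fresh identity object per invocation can make a cache grow without bound. It assumes the cache key compares the identity object by reference, which is one reading of the report and not something it states. Identity, CACHE and handleFor are hypothetical stand-ins, not Hadoop or Hive classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a cache keyed by a per-call identity object.
final class CacheLeakSketch {
    // Deliberately no equals/hashCode override: two instances for the same
    // user never compare equal, so each one becomes a distinct cache key.
    static final class Identity {
        final String user;
        Identity(String user) { this.user = user; }
    }

    static final Map<Identity, String> CACHE = new HashMap<>();

    // Return the cached handle for this identity, creating it on first use.
    static String handleFor(Identity id) {
        return CACHE.computeIfAbsent(id, k -> "handle-for-" + k.user);
    }

    // Simulate a number of requests from one user and report the cache size.
    static int entriesAfter(int calls, boolean reuseIdentity) {
        CACHE.clear();
        Identity shared = new Identity("alice");
        for (int i = 0; i < calls; i++) {
            handleFor(reuseIdentity ? shared : new Identity("alice"));
        }
        return CACHE.size();
    }

    public static void main(String[] args) {
        System.out.println("new identity per call: " + entriesAfter(1000, false));
        System.out.println("one shared identity:   " + entriesAfter(1000, true));
    }
}
```

In this model the cache holds one entry per request when the identity is recreated each time, and a single entry when it is reused.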





[jira] [Commented] (HIVE-6658) Modify Alter_numbuckets* test to reflect hadoop2 changes

2014-03-14 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935362#comment-13935362
 ] 

Szehon Ho commented on HIVE-6658:
-

Yeah, I saw this too. Thanks for fixing this. One comment for consideration: 
looking at other q-tests with two versions, the original includes 0.20,0.20S and 
the new version excludes 0.20,0.20S. Do we want to do that instead for consistency, 
as future versions of Hadoop should probably adhere to the 'new' behavior (which is 
the proper number of buckets in this case)?

 Modify Alter_numbuckets* test to reflect hadoop2 changes
 

 Key: HIVE-6658
 URL: https://issues.apache.org/jira/browse/HIVE-6658
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Attachments: HIVE-6658.patch


 Hadoop2 now honors the number-of-reducers config while running in local mode. 
 This affects bucketing tests, as the data gets properly bucketed in Hadoop2 
 (in Hadoop1 all data ended up in the same bucket while in local mode).
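
A simplified model of the behavior change, assuming rows are routed to a bucket by hash modulo the number of reducers (the issue does not spell out the mechanism, so treat this as an assumption). bucketFor and distinctBuckets are hypothetical helpers.

```java
// Hypothetical model: one output bucket per reducer, chosen by hash.
final class BucketSketch {
    // Non-negative hash modulo the bucket count.
    static int bucketFor(Object key, int numBuckets) {
        return (key.hashCode() & Integer.MAX_VALUE) % numBuckets;
    }

    // Count how many different buckets a set of keys actually lands in.
    static int distinctBuckets(String[] keys, int numBuckets) {
        boolean[] used = new boolean[numBuckets];
        int count = 0;
        for (String k : keys) {
            int b = bucketFor(k, numBuckets);
            if (!used[b]) {
                used[b] = true;
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String[] keys = {"a", "b", "c", "d"};
        // A single reducer (the old local-mode behavior) yields one bucket.
        System.out.println("1 reducer:  " + distinctBuckets(keys, 1));
        // Honoring a reducer count of 2 spreads the keys over both buckets.
        System.out.println("2 reducers: " + distinctBuckets(keys, 2));
    }
}
```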





[jira] [Updated] (HIVE-6607) describe extended on a view fails with NPE

2014-03-14 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6607:
-

Priority: Blocker  (was: Major)

 describe extended on a view fails with NPE
 --

 Key: HIVE-6607
 URL: https://issues.apache.org/jira/browse/HIVE-6607
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.12.0, 0.13.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
Priority: Blocker
 Fix For: 0.14.0

 Attachments: HIVE-6607.patch


 STEPS TO REPRODUCE:
 Create a table called 'sample_08'
 Create a view of the table. From hive command line, please run:
 hive> create view sample_09 as select * from sample_08 ;
 ACTUAL BEHAVIOR:
 Run the following command in the browser:
 http://localhost:50111/templeton/v1/ddl/database/default/table/sample_09?format=extended
 It fails with the following exception:
 {errorDetail:org.apache.hadoop.hive.ql.metadata.HiveException: Exception 
 while processing show table status\n\tat 
 org.apache.hadoop.hive.ql.exec.DDLTask.showTableStatus(DDLTask.java:2707)\n\tat
  org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:343)\n\tat 
 org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)\n\tat 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)\n\tat
  org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1437)\n\tat 
 org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1215)\n\tat 
 org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1043)\n\tat 
 org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)\n\tat 
 org.apache.hive.hcatalog.cli.HCatDriver.run(HCatDriver.java:43)\n\tat 
 org.apache.hive.hcatalog.cli.HCatCli.processCmd(HCatCli.java:259)\n\tat 
 org.apache.hive.hcatalog.cli.HCatCli.processLine(HCatCli.java:213)\n\tat 
 org.apache.hive.hcatalog.cli.HCatCli.main(HCatCli.java:172)\n\tat 
 sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\tat 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)\n\tat
  
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)\n\tat
  java.lang.reflect.Method.invoke(Method.java:597)\n\tat 
 org.apache.hadoop.util.RunJar.main(RunJar.java:212)\nCaused by: 
 java.lang.NullPointerException\n\tat 
 org.apache.hadoop.hive.ql.metadata.formatting.JsonMetaDataFormatter.putFileSystemsStats(JsonMetaDataFormatter.java:264)\n\tat
  
 org.apache.hadoop.hive.ql.metadata.formatting.JsonMetaDataFormatter.makeOneTableStatus(JsonMetaDataFormatter.java:218)\n\tat
  
 org.apache.hadoop.hive.ql.metadata.formatting.JsonMetaDataFormatter.makeAllTableStatus(JsonMetaDataFormatter.java:170)\n\tat
  
 org.apache.hadoop.hive.ql.metadata.formatting.JsonMetaDataFormatter.showTableStatus(JsonMetaDataFormatter.java:153)\n\tat
  
 org.apache.hadoop.hive.ql.exec.DDLTask.showTableStatus(DDLTask.java:2702)\n\t...
  16 more\n,error:FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask. Exception while processing show table 
 status,sqlState:08S01,errorCode:4,database:default,table:sample_09}





[jira] [Updated] (HIVE-6576) sending user.name as a form parameter in POST doesn't work post HADOOP-10193

2014-03-14 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6576:
-

Priority: Blocker  (was: Major)

 sending user.name as a form parameter in POST doesn't work post HADOOP-10193
 

 Key: HIVE-6576
 URL: https://issues.apache.org/jira/browse/HIVE-6576
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.13.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
Priority: Blocker
 Fix For: 0.13.0

 Attachments: HIVE-6576.patch


 WebHCat uses AuthFilter to handle authentication.  In simple mode that means 
 using PseudoAuthenticationHandler.  Prior to HADOOP-10193, the latter handled 
 user.name as form parameter in a POST request.  Now it only handles it as a 
 query parameter.  
 To maintain WebHCat backwards compatibility, we need to make WebHCat still extract 
 it from the form parameter. This will be deprecated immediately and removed in 0.15.
 Also, all examples in the WebHCat reference manual should be updated to use 
 user.name in the query string instead of the current form parameter (curl -d user.name=foo).
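
A small sketch of the client-side change suggested above: passing user.name in the query string rather than in the POST body. withUserName is a hypothetical helper. The example URL is the WebHCat endpoint quoted in HIVE-6607 and is used purely as a placeholder.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper; not part of WebHCat or Hadoop.
final class UserNameQuerySketch {
    // Append user.name as a query parameter, keeping any existing query.
    static String withUserName(String url, String user) {
        try {
            String sep = url.contains("?") ? "&" : "?";
            return url + sep + "user.name=" + URLEncoder.encode(user, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String base = "http://localhost:50111/templeton/v1/ddl/database/default/table/sample_09";
        System.out.println(withUserName(base, "foo"));
        System.out.println(withUserName(base + "?format=extended", "foo"));
    }
}
```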





[jira] [Updated] (HIVE-6576) sending user.name as a form parameter in POST doesn't work post HADOOP-10193

2014-03-14 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6576:
-

Affects Version/s: 0.13.0

 sending user.name as a form parameter in POST doesn't work post HADOOP-10193
 

 Key: HIVE-6576
 URL: https://issues.apache.org/jira/browse/HIVE-6576
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.13.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
Priority: Blocker
 Fix For: 0.13.0

 Attachments: HIVE-6576.patch


 WebHCat uses AuthFilter to handle authentication.  In simple mode that means 
 using PseudoAuthenticationHandler.  Prior to HADOOP-10193, the latter handled 
 user.name as form parameter in a POST request.  Now it only handles it as a 
 query parameter.  
 To maintain WebHCat backwards compatibility, we need to make WebHCat still extract 
 it from the form parameter. This will be deprecated immediately and removed in 0.15.
 Also, all examples in the WebHCat reference manual should be updated to use 
 user.name in the query string instead of the current form parameter (curl -d user.name=foo).





[jira] [Updated] (HIVE-6607) describe extended on a view fails with NPE

2014-03-14 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6607:
-

Affects Version/s: 0.13.0

 describe extended on a view fails with NPE
 --

 Key: HIVE-6607
 URL: https://issues.apache.org/jira/browse/HIVE-6607
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.12.0, 0.13.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Fix For: 0.14.0

 Attachments: HIVE-6607.patch


 STEPS TO REPRODUCE:
 Create a table called 'sample_08'
 Create a view of the table. From hive command line, please run:
 hive> create view sample_09 as select * from sample_08 ;
 ACTUAL BEHAVIOR:
 Run the following command in the browser:
 http://localhost:50111/templeton/v1/ddl/database/default/table/sample_09?format=extended
 It fails with the following exception:
 {errorDetail:org.apache.hadoop.hive.ql.metadata.HiveException: Exception 
 while processing show table status\n\tat 
 org.apache.hadoop.hive.ql.exec.DDLTask.showTableStatus(DDLTask.java:2707)\n\tat
  org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:343)\n\tat 
 org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)\n\tat 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)\n\tat
  org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1437)\n\tat 
 org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1215)\n\tat 
 org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1043)\n\tat 
 org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)\n\tat 
 org.apache.hive.hcatalog.cli.HCatDriver.run(HCatDriver.java:43)\n\tat 
 org.apache.hive.hcatalog.cli.HCatCli.processCmd(HCatCli.java:259)\n\tat 
 org.apache.hive.hcatalog.cli.HCatCli.processLine(HCatCli.java:213)\n\tat 
 org.apache.hive.hcatalog.cli.HCatCli.main(HCatCli.java:172)\n\tat 
 sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\tat 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)\n\tat
  
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)\n\tat
  java.lang.reflect.Method.invoke(Method.java:597)\n\tat 
 org.apache.hadoop.util.RunJar.main(RunJar.java:212)\nCaused by: 
 java.lang.NullPointerException\n\tat 
 org.apache.hadoop.hive.ql.metadata.formatting.JsonMetaDataFormatter.putFileSystemsStats(JsonMetaDataFormatter.java:264)\n\tat
  
 org.apache.hadoop.hive.ql.metadata.formatting.JsonMetaDataFormatter.makeOneTableStatus(JsonMetaDataFormatter.java:218)\n\tat
  
 org.apache.hadoop.hive.ql.metadata.formatting.JsonMetaDataFormatter.makeAllTableStatus(JsonMetaDataFormatter.java:170)\n\tat
  
 org.apache.hadoop.hive.ql.metadata.formatting.JsonMetaDataFormatter.showTableStatus(JsonMetaDataFormatter.java:153)\n\tat
  
 org.apache.hadoop.hive.ql.exec.DDLTask.showTableStatus(DDLTask.java:2702)\n\t...
  16 more\n,error:FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask. Exception while processing show table 
 status,sqlState:08S01,errorCode:4,database:default,table:sample_09}





[jira] [Resolved] (HIVE-6663) remove TUGIContainingProcessor class as it is not used anymore

2014-03-14 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar resolved HIVE-6663.
---

Resolution: Not A Problem

Addressed as part of HIVE-6312

 remove TUGIContainingProcessor class as it is not used anymore
 --

 Key: HIVE-6663
 URL: https://issues.apache.org/jira/browse/HIVE-6663
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-6663.1.patch


 After HIVE-6312 changes, TUGIContainingProcessor class is unused. It should 
 be removed.





[jira] [Commented] (HIVE-4501) HS2 memory leak - FileSystem objects in FileSystem.CACHE

2014-03-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935377#comment-13935377
 ] 

Thejas M Nair commented on HIVE-4501:
-

The changes in HIVE-6312 (in 0.13) should fix this issue, as the unsecure mode 
now follows the same code path as secure mode, which calls closeAllForUGI.
I haven't verified this with a large number of requests and the cache enabled, but if 
there is still any leak, it's going to have a different root cause.


 HS2 memory leak - FileSystem objects in FileSystem.CACHE
 

 Key: HIVE-4501
 URL: https://issues.apache.org/jira/browse/HIVE-4501
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.11.0
Reporter: Thejas M Nair
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-4501.1.patch, HIVE-4501.1.patch, HIVE-4501.1.patch, 
 HIVE-4501.trunk.patch


 org.apache.hadoop.fs.FileSystem objects are getting accumulated in 
 FileSystem.CACHE, with HS2 in unsecure mode.
 As a workaround, it is possible to set fs.hdfs.impl.disable.cache and 
 fs.file.impl.disable.cache to true.
 Users should not have to bother with this extra configuration. 
 As a workaround disable impersonation by setting hive.server2.enable.doAs to 
 false.





[jira] [Resolved] (HIVE-4501) HS2 memory leak - FileSystem objects in FileSystem.CACHE

2014-03-14 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair resolved HIVE-4501.
-

Resolution: Duplicate

 HS2 memory leak - FileSystem objects in FileSystem.CACHE
 

 Key: HIVE-4501
 URL: https://issues.apache.org/jira/browse/HIVE-4501
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.11.0
Reporter: Thejas M Nair
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-4501.1.patch, HIVE-4501.1.patch, HIVE-4501.1.patch, 
 HIVE-4501.trunk.patch


 org.apache.hadoop.fs.FileSystem objects are getting accumulated in 
 FileSystem.CACHE, with HS2 in unsecure mode.
 As a workaround, it is possible to set fs.hdfs.impl.disable.cache and 
 fs.file.impl.disable.cache to true.
 Users should not have to bother with this extra configuration. 
 As a workaround disable impersonation by setting hive.server2.enable.doAs to 
 false.





[jira] [Commented] (HIVE-6663) remove TUGIContainingProcessor class as it is not used anymore

2014-03-14 Thread Prasad Mujumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935380#comment-13935380
 ] 

Prasad Mujumdar commented on HIVE-6663:
---

This change is rolled into HIVE-6312 patch.

 remove TUGIContainingProcessor class as it is not used anymore
 --

 Key: HIVE-6663
 URL: https://issues.apache.org/jira/browse/HIVE-6663
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-6663.1.patch


 After HIVE-6312 changes, TUGIContainingProcessor class is unused. It should 
 be removed.





[jira] [Updated] (HIVE-6157) Fetching column stats slower than the 101 during rush hour

2014-03-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6157:
---

Fix Version/s: 0.13.0

 Fetching column stats slower than the 101 during rush hour
 --

 Key: HIVE-6157
 URL: https://issues.apache.org/jira/browse/HIVE-6157
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Gunther Hagleitner
Assignee: Sergey Shelukhin
 Fix For: 0.13.0

 Attachments: HIVE-6157.01.patch, HIVE-6157.01.patch, 
 HIVE-6157.03.patch, HIVE-6157.03.patch, HIVE-6157.nogen.patch, 
 HIVE-6157.nogen.patch, HIVE-6157.prelim.patch


 hive.stats.fetch.column.stats controls whether the column stats for a table 
 are fetched during explain (in Tez: during query planning). On my setup (1 
 table, 4000 partitions, 24 columns) the time spent in semantic analysis goes 
 from ~1 second to ~66 seconds when turning the flag on, i.e. about 65 seconds 
 are spent fetching column stats.
 The reason is probably that the APIs force you to make separate metastore 
 calls for each column in each partition. That's probably the first thing that 
 has to change. The question is whether, in addition to that, we need to cache this 
 in the client or store the stats as a single blob in the database to further 
 cut down on the time. As it stands right now, however, column stats seem 
 unusable.
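
A back-of-the-envelope sketch of the numbers above, assuming (as the description only suspects) one metastore call per column per partition. The helper names are hypothetical. The inputs are the 4000 partitions, 24 columns and roughly 65 seconds quoted in the description.

```java
// Hypothetical arithmetic helper; it does not call any metastore API.
final class StatsCallCountSketch {
    // One call per (partition, column) pair.
    static long perColumnCalls(int partitions, int columns) {
        return (long) partitions * columns;
    }

    // Average cost of a single call, in milliseconds.
    static double millisPerCall(double totalSeconds, long calls) {
        return totalSeconds * 1000.0 / calls;
    }

    public static void main(String[] args) {
        long calls = perColumnCalls(4000, 24);
        System.out.println("calls: " + calls);
        System.out.println("ms per call: " + millisPerCall(65.0, calls));
    }
}
```

Under that assumption each individual call is cheap (well under a millisecond on average) and the cost comes from the sheer number of round trips, which is consistent with the suggestion to batch the calls first.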





[jira] [Commented] (HIVE-6668) When auto join convert is on and noconditionaltask is off, ConditionalResolverCommonJoin fails to resolve map joins.

2014-03-14 Thread Yin Huai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935397#comment-13935397
 ] 

Yin Huai commented on HIVE-6668:


It seems that the aliases returned from this line 
(https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverCommonJoin.java#L178)
 are an empty set.

 When auto join convert is on and noconditionaltask is off, 
 ConditionalResolverCommonJoin fails to resolve map joins.
 

 Key: HIVE-6668
 URL: https://issues.apache.org/jira/browse/HIVE-6668
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0, 0.14.0
Reporter: Yin Huai
Priority: Blocker
 Fix For: 0.13.0


 I tried the following query today ...
 {code:sql}
 set mapred.job.map.memory.mb=2048;
 set mapred.job.reduce.memory.mb=2048;
 set mapred.map.child.java.opts=-server -Xmx3072m -Djava.net.preferIPv4Stack=true;
 set mapred.reduce.child.java.opts=-server -Xmx3072m -Djava.net.preferIPv4Stack=true;
 set mapred.reduce.tasks=60;
 set hive.stats.autogather=false;
 set hive.exec.parallel=false;
 set hive.enforce.bucketing=true;
 set hive.enforce.sorting=true;
 set hive.map.aggr=true;
 set hive.optimize.bucketmapjoin=true;
 set hive.optimize.bucketmapjoin.sortedmerge=true;
 set hive.mapred.reduce.tasks.speculative.execution=false;
 set hive.auto.convert.join=true;
 set hive.auto.convert.sortmerge.join=true;
 set hive.auto.convert.sortmerge.join.noconditionaltask=false;
 set hive.auto.convert.join.noconditionaltask=false;
 set hive.auto.convert.join.noconditionaltask.size=1;
 set hive.optimize.reducededuplication=true;
 set hive.optimize.reducededuplication.min.reducer=1;
 set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
 set hive.mapjoin.smalltable.filesize=4500;
 set hive.optimize.index.filter=false;
 set hive.vectorized.execution.enabled=false;
 set hive.optimize.correlation=false;
 select
i_item_id,
s_state,
avg(ss_quantity) agg1,
avg(ss_list_price) agg2,
avg(ss_coupon_amt) agg3,
avg(ss_sales_price) agg4
 FROM store_sales
 JOIN date_dim on (store_sales.ss_sold_date_sk = date_dim.d_date_sk)
 JOIN item on (store_sales.ss_item_sk = item.i_item_sk)
 JOIN customer_demographics on (store_sales.ss_cdemo_sk = 
 customer_demographics.cd_demo_sk)
 JOIN store on (store_sales.ss_store_sk = store.s_store_sk)
 where
cd_gender = 'F' and
cd_marital_status = 'U' and
cd_education_status = 'Primary' and
d_year = 2002 and
s_state in ('GA','PA', 'LA', 'SC', 'MI', 'AL')
 group by i_item_id, s_state with rollup
 order by
i_item_id,
s_state
 limit 100;
 {code}
 The log shows ...
 {code}
 14/03/14 17:05:02 INFO plan.ConditionalResolverCommonJoin: Failed to resolve 
 driver alias (threshold : 4500, length mapping : {store=94175, 
 store_sales=48713909726, item=39798667, customer_demographics=1660831, 
 date_dim=2275902})
 Stage-27 is filtered out by condition resolver.
 14/03/14 17:05:02 INFO exec.Task: Stage-27 is filtered out by condition 
 resolver.
 Stage-28 is filtered out by condition resolver.
 14/03/14 17:05:02 INFO exec.Task: Stage-28 is filtered out by condition 
 resolver.
 Stage-3 is selected by condition resolver.
 {code}
 Stage-3 is a reduce join. The resolver should actually have picked the map join.





[jira] [Commented] (HIVE-6662) Vector Join operations with DATE columns fail

2014-03-14 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935401#comment-13935401
 ] 

Jitendra Nath Pandey commented on HIVE-6662:


Please use DateWritable#getDays; the date representation is the number of days 
since the epoch.
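
To illustrate the "days since the epoch" representation only: the sketch below uses java.time.LocalDate as a stand-in and does not touch Hive's DateWritable. daysSinceEpoch is a hypothetical helper name.

```java
import java.time.LocalDate;

// Stand-in illustration using java.time; not Hive's DateWritable.
final class EpochDaysSketch {
    // Number of days between 1970-01-01 and the given ISO date (yyyy-MM-dd).
    static long daysSinceEpoch(String isoDate) {
        return LocalDate.parse(isoDate).toEpochDay();
    }

    public static void main(String[] args) {
        long days = daysSinceEpoch("2014-03-14");
        System.out.println("days since epoch: " + days);
        // The value round-trips back to the same calendar date.
        System.out.println("round trip: " + LocalDate.ofEpochDay(days));
    }
}
```

A date stored this way fits in a long, which is why a long vector column can carry it once the right accessor is used.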

 Vector Join operations with DATE columns fail
 -

 Key: HIVE-6662
 URL: https://issues.apache.org/jira/browse/HIVE-6662
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V
Assignee: Gopal V
 Attachments: HIVE-6662.1.patch


 Trying to generate a DATE column as part of a JOIN's output throws an 
 exception
 {code}
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible 
 Long vector column and primitive category DATE
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildObjectAssign(VectorColumnAssignFactory.java:306)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildAssigners(VectorColumnAssignFactory.java:414)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:235)
 at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
 at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
 at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:229)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:292)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 {code}





Re: Review Request 19212: HIVE-6645: to_date()/to_unix_timestamp() fail with NPE if input is null

2014-03-14 Thread Jason Dere


 On March 14, 2014, 7:35 a.m., Mohammad Islam wrote:
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToUnixTimeStamp.java,
   line 132
  https://reviews.apache.org/r/19212/diff/1/?file=519528#file519528line132
 
  Do we need to check whether arguments[1].get() is null, as is done for 
  arguments[0]?
  Or will the converter handle it and return 'null'?

I'll add the testcase for this, and we'll find out :)
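
For reference, the behaviour being asked about can be sketched without any Hive classes. This is not the code in GenericUDFToUnixTimeStamp. The class NullSafeUnixTimestampSketch and its method are hypothetical, plain Strings stand in for the deferred arguments, and pinning the time zone to UTC is an assumption made only to keep the result deterministic.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

class NullSafeUnixTimestampSketch {

    // Returns seconds since the epoch, or null when either argument is null
    // or the input cannot be parsed with the given pattern.
    static Long toUnixTimestamp(String text, String pattern) {
        if (text == null || pattern == null) {
            return null; // both arguments are checked, not only the first
        }
        try {
            SimpleDateFormat format = new SimpleDateFormat(pattern);
            format.setTimeZone(TimeZone.getTimeZone("UTC"));
            return format.parse(text).getTime() / 1000L;
        } catch (ParseException | IllegalArgumentException e) {
            // Unparseable input or an invalid pattern yields null here.
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(toUnixTimestamp("1970-01-02", "yyyy-MM-dd"));
        System.out.println(toUnixTimestamp("1970-01-02", null));
        System.out.println(toUnixTimestamp(null, "yyyy-MM-dd"));
    }
}
```

A test case along these lines, with a null second argument, is what would show whether the converter already covers it.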


- Jason


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19212/#review37177
---


On March 14, 2014, 2:40 a.m., Jason Dere wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/19212/
 ---
 
 (Updated March 14, 2014, 2:40 a.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-6645
 https://issues.apache.org/jira/browse/HIVE-6645
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 - fix null inputs
 - allow char/varchar params
 - tests
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDate.java 
 c31174a 
   
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToUnixTimeStamp.java
  dc259c6 
   ql/src/test/org/apache/hadoop/hive/ql/udf/TestGenericUDFDate.java 384ce4e 
   
 ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFToUnixTimestamp.java
  PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/19212/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jason Dere
 




[jira] [Commented] (HIVE-4764) Support Kerberos HTTP authentication for HiveServer2 running in http mode

2014-03-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935413#comment-13935413
 ] 

Thejas M Nair commented on HIVE-4764:
-

I have committed HIVE-6312 because the tests ran on it first, so this needs 
rebasing again! (Sorry, the test run is the limiting factor!) [~vaibhavgumashta] 
Can you please rebase the patch? Hopefully you can do that before the tests run 
on this one.


 Support Kerberos HTTP authentication for HiveServer2 running in http mode
 -

 Key: HIVE-4764
 URL: https://issues.apache.org/jira/browse/HIVE-4764
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Thejas M Nair
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-4764.1.patch, HIVE-4764.2.patch, HIVE-4764.3.patch, 
 HIVE-4764.4.patch, HIVE-4764.5.patch


 Support Kerberos authentication for HiveServer2 running in http mode.





[jira] [Commented] (HIVE-6651) broken link in WebHCat doc: Job Information — GET queue/:jobid

2014-03-14 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935420#comment-13935420
 ] 

Eugene Koifman commented on HIVE-6651:
--

1. WebHCat just passes along to the user whatever information it finds in 
the JobProfile object.
2. In H2, JobProfile was annotated with 
InterfaceAudience.LimitedPrivate({MapReduce}) and 
InterfaceStability.Unstable.  In H1 it's not annotated with anything.
3. JobStatus is a public (via annotations) class so we can assume that it will 
be stable.


So I think we should remove the link for JobProfile and just put 
org.apache.hadoop.mapred.JobProfile class name there with a note that WebHCat 
just passes along the info in this object which is subject to change from one 
Hadoop version to another.


 broken link in WebHCat doc: Job Information — GET queue/:jobid
 --

 Key: HIVE-6651
 URL: https://issues.apache.org/jira/browse/HIVE-6651
 Project: Hive
  Issue Type: Bug
  Components: Documentation, WebHCat
Reporter: Eugene Koifman
Priority: Minor

 https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference+JobInfo#WebHCatReferenceJobInfo-Results
 the link in the table to Class JobProfile is broken





[jira] [Updated] (HIVE-6613) Control when specific Inputs / Outputs are started

2014-03-14 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-6613:
-

Status: Open  (was: Patch Available)

 Control when specific Inputs / Outputs are started
 -

 Key: HIVE-6613
 URL: https://issues.apache.org/jira/browse/HIVE-6613
 Project: Hive
  Issue Type: Improvement
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: TEZ-6613.1.txt


 When running with Tez - a couple of enhancements are possible
 1) Avoid re-fetching data in case of MapJoins - since the data is likely to 
 be cached after the first run (container re-use for the same query)
 2) Start Outputs only after required Inputs are ready - specifically useful 
 in case of Reduce - where shuffle requires a large amount of memory, and the 
 Output (if it's a sorted output) also requires a fair amount of memory.





[jira] [Updated] (HIVE-6613) Control when specific Inputs / Outputs are started

2014-03-14 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-6613:
-

Attachment: HIVE-6613.2.txt

Updated patch.

Changed cacheAccess to accept a configuration.

Haven't changed the way Inputs are cached - since this gives a way to iterate 
over cached inputs, which may be useful at some point.

Removed the LocalWork check. I'm not sure if a special check is required in 
case of a Bucketed Map Join.

 Control when specific Inputs / Outputs are started
 -

 Key: HIVE-6613
 URL: https://issues.apache.org/jira/browse/HIVE-6613
 Project: Hive
  Issue Type: Improvement
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: HIVE-6613.2.txt, TEZ-6613.1.txt


 When running with Tez - a couple of enhancements are possible
 1) Avoid re-fetching data in case of MapJoins - since the data is likely to 
 be cached after the first run (container re-use for the same query)
 2) Start Outputs only after required Inputs are ready - specifically useful 
 in case of Reduce - where shuffle requires a large amount of memory, and the 
 Output (if it's a sorted output) also requires a fair amount of memory.





[jira] [Updated] (HIVE-6613) Control when specific Inputs / Outputs are started

2014-03-14 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-6613:
-

Status: Patch Available  (was: Open)

 Control when specific Inputs / Outputs are started
 -

 Key: HIVE-6613
 URL: https://issues.apache.org/jira/browse/HIVE-6613
 Project: Hive
  Issue Type: Improvement
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: HIVE-6613.2.txt, TEZ-6613.1.txt


 When running with Tez - a couple of enhancements are possible
 1) Avoid re-fetching data in case of MapJoins - since the data is likely to 
 be cached after the first run (container re-use for the same query)
 2) Start Outputs only after required Inputs are ready - specifically useful 
 in case of Reduce - where shuffle requires a large amount of memory, and the 
 Output (if it's a sorted output) also requires a fair amount of memory.





[jira] [Updated] (HIVE-6060) Define API for RecordUpdater and UpdateReader

2014-03-14 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-6060:


Attachment: HIVE-6060.patch

Re-upload for jenkins.

 Define API for RecordUpdater and UpdateReader
 -

 Key: HIVE-6060
 URL: https://issues.apache.org/jira/browse/HIVE-6060
 Project: Hive
  Issue Type: Sub-task
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: HIVE-6060.patch, HIVE-6060.patch, HIVE-6060.patch, 
 HIVE-6060.patch, HIVE-6060.patch, acid-io.patch, h-5317.patch, h-5317.patch, 
 h-5317.patch, h-6060.patch, h-6060.patch


 We need to define some new APIs for how Hive interacts with the file formats 
 since it needs to be much richer than the current RecordReader and 
 RecordWriter.




