Re: Review Request 50359: HIVE-14270: Write temporary data to HDFS when doing inserts on tables located on S3

2016-07-28 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50359/#review143975
---




ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java (line 3217)
<https://reviews.apache.org/r/50359/#comment209915>

The code in both branches of the 'if/else' is identical except for the
'destination path'. Maybe factor that out?
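
A minimal sketch of that kind of refactoring, not the actual code at line 3217 of Hive.java: the class, method, and parameter names below are hypothetical, and the "move" is only recorded in a list so the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the reviewed code; no name here comes from Hive.java.
class MoveSketch {
    // Records each move so the sketch needs no real file system.
    static final List<String> moves = new ArrayList<>();

    // Before: both branches do the same work and differ only in the destination.
    static void moveBefore(String src, String tablePath, String stagingPath, boolean useStaging) {
        if (useStaging) {
            moves.add(src + " -> " + stagingPath);
        } else {
            moves.add(src + " -> " + tablePath);
        }
    }

    // After: choose the destination once, then run the shared steps a single time.
    static void moveAfter(String src, String tablePath, String stagingPath, boolean useStaging) {
        String destination = useStaging ? stagingPath : tablePath;
        moves.add(src + " -> " + destination);
    }
}
```

Both methods record the same move for the same inputs; the second keeps the shared logic in one place.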


- Reuben Kuhnert


On July 27, 2016, 10:56 p.m., Sergio Pena wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/50359/
> ---
> 
> (Updated July 27, 2016, 10:56 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-14270
> https://issues.apache.org/jira/browse/HIVE-14270
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> This patch will create a temporary directory for Hive intermediate data on 
> HDFS when S3 tables are used.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/BlobStorageUtils.java 
> PRE-CREATION 
>   common/src/test/org/apache/hadoop/hive/common/TestBlobStorageUtils.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/Context.java 
> ec5d693d28a40925c44f844a05ebf3f5c10173c9 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
> 9d927bd1a519f79bc7fa88c3b7e5c6cc2ef0637f 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> 2671cb1cf2ef74f9d6628f8cdf3f5ac99283dbd8 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestContext.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/50359/diff/
> 
> 
> Testing
> ---
> 
> NO PATCH
> 
> ** NON-PARTITIONED TABLE
> - create table dummy (id int);   3.651s
> - insert into table s3dummy values (1);   39.231s
> - insert overwrite table s3dummy values (1);   42.569s
> - insert overwrite directory 's3a://spena-bucket/dirs/s3dummy' select * from dummy;   30.136s
> 
> ** EXTERNAL TABLE
> - create table s3dummy_ext like s3dummy location 's3a://spena-bucket/user/hive/warehouse/s3dummy';   9.297s
> - insert into table s3dummy_ext values (1);   45.855s
> 
> WITH PATCH
> 
> ** NON-PARTITIONED TABLE
> - create table s3dummy (id int) location 's3a://spena-bucket/user/hive/warehouse/s3dummy';   3.945s
> - insert into table s3dummy values (1);   15.025s
> - insert overwrite table s3dummy values (1);   25.149s
> - insert overwrite directory 's3a://spena-bucket/dirs/s3dummy' select * from dummy;   19.158s
> - from dummy insert overwrite table s3dummy select *;   25.469s
> - from dummy insert into table s3dummy select *;   14.501s
> 
> ** EXTERNAL TABLE
> - create table s3dummy_ext like s3dummy location 's3a://spena-bucket/user/hive/warehouse/s3dummy';   4.827s
> - insert into table s3dummy_ext values (1);   16.070s
> 
> ** PARTITIONED TABLE
> - create table s3dummypart (id int) partitioned by (part int) location 's3a://spena-bucket/user/hive/warehouse/s3dummypart';   3.176s
> - alter table s3dummypart add partition (part=1);   3.229s
> - alter table s3dummypart add partition (part=2);   3.124s
> - insert into table s3dummypart partition (part=1) values (1);   14.876s
> - insert overwrite table s3dummypart partition (part=1) values (1);   27.594s
> - insert overwrite directory 's3a://spena-bucket/dirs/s3dummypart' select * from dummypart;   22.298s
> - from dummypart insert overwrite table s3dummypart partition (part=1) select id;   29.001s
> - from dummypart insert into table s3dummypart partition (part=1) select id;   14.869s
> 
> ** DYNAMIC PARTITIONS
> - insert into table s3dummypart partition (part) select id, 1 from dummypart;   15.185s
> - insert into table s3dummypart partition (part) select id, 1 from dummypart;   18.820s
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>



Re: Review Request 47040: Monitor changes to FairScheduler.xml file and automatically update / validate jobs submitted to fair-scheduler

2016-07-27 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47040/
---

(Updated July 27, 2016, 5:35 p.m.)


Review request for hive, Lenni Kuff, Mohit Sabharwal, and Sergio Pena.


Changes
---

Diff Rebase


Bugs: HIVE-13696
https://issues.apache.org/jira/browse/HIVE-13696


Repository: hive-git


Description
---

Ensure that jobs sent to YARN with impersonation off are correctly routed to 
the proper queue based on fair-scheduler.xml. Monitor this file for changes and 
validate that jobs can only be sent to queues authorized for the user.
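
The validation half of this description can be pictured with a rough sketch. It is not the code in the diff: every name below is hypothetical, the rules are passed in as a plain map where the real change would parse fair-scheduler.xml, and the file-watching part is reduced to a reload() call that a watcher would trigger.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; none of these names come from the patch or from YARN.
class QueueRoutingSketch {
    // user -> queues that user may submit to; swapped atomically on reload.
    private final AtomicReference<Map<String, Set<String>>> allowedQueuesByUser;

    QueueRoutingSketch(Map<String, Set<String>> initialRules) {
        this.allowedQueuesByUser = new AtomicReference<>(initialRules);
    }

    // Would be called by whatever notices that the scheduler file has changed.
    void reload(Map<String, Set<String>> updatedRules) {
        allowedQueuesByUser.set(updatedRules);
    }

    // Returns the requested queue if the user may use it, otherwise throws.
    String validate(String user, String requestedQueue) {
        Set<String> allowed = allowedQueuesByUser.get()
            .getOrDefault(user, Collections.<String>emptySet());
        if (!allowed.contains(requestedQueue)) {
            throw new IllegalArgumentException(
                "User '" + user + "' may not submit to queue '" + requestedQueue + "'");
        }
        return requestedQueue;
    }
}
```

Holding the rules in an AtomicReference is one way to let a reload take effect for the next submission without locking; whether the patch does this is not stated in the description.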


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 14f221a 
  ql/src/java/org/apache/hadoop/hive/ql/session/YarnFairScheduling.java 
PRE-CREATION 
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
72ad86c 
  shims/common/src/main/java/org/apache/hadoop/hive/shims/SchedulerShim.java 
63803b8 
  shims/scheduler/pom.xml 9141c1e 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerQueueAllocator.java
 PRE-CREATION 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerShim.java
 372244d 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/QueueAllocator.java
 PRE-CREATION 
  
shims/scheduler/src/test/java/org/apache/hadoop/hive/schshim/TestFairSchedulerQueueAllocator.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/47040/diff/


Testing
---


Thanks,

Reuben Kuhnert



Re: Review Request 48233: HIVE-13884: Disallow queries fetching more than a configured number of partitions in PartitionPruner

2016-06-09 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48233/#review136810
---


Fix it, then Ship it!




Mostly minor cleanup nitpicks. Might make sense in the future to refactor this 
into a separate class that handles this sort of check, but this is fine for 
now. Fix then ship.
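
As an illustration of the "separate class" idea only: the class and method names below are hypothetical, and treating a negative limit as "no limit" is an assumption of this sketch, not something stated in the review. The property name is the one quoted in the patch description.

```java
// Hypothetical helper; not part of the reviewed patch.
class PartitionLimitSketch {
    // Property name as quoted in the review request description.
    static final String MAX_PARTITIONS_KEY = "hive.limit.query.max.table.partition";

    // Throws if the query would touch more partitions than allowed.
    // Assumption of this sketch: a negative limit means the check is disabled.
    static void check(String table, int partitionCount, int maxPartitions) {
        if (maxPartitions >= 0 && partitionCount > maxPartitions) {
            throw new IllegalStateException(
                "Query on table '" + table + "' would fetch " + partitionCount
                + " partitions, above the limit of " + maxPartitions
                + " set by " + MAX_PARTITIONS_KEY);
        }
    }
}
```

Callers would run the check before fetching any partitions, so the limit logic lives in one place instead of being repeated per call site.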


metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java (line 
3131)
<https://reviews.apache.org/r/48233/#comment201896>

Can we create a 'public static final String' for this instead of using a 
comment?



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 2523)
<https://reviews.apache.org/r/48233/#comment201897>

Nit: Strange extra space at the end, is that needed?



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 2830)
<https://reviews.apache.org/r/48233/#comment201898>

Maybe StringUtils.isEmpty? I think it will do both of these checks for you.
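
For reference, a small stand-in that mirrors the documented behaviour of Apache Commons Lang's StringUtils.isEmpty, assuming the two checks in question are a null check and an empty-string check (the reviewed lines are not shown here, and which StringUtils the file imports is not stated). The helper is local so the example runs without the library.

```java
// Local stand-in mirroring Commons Lang StringUtils.isEmpty semantics:
// true for null or a zero-length string; whitespace-only strings are not empty.
class EmptyCheckSketch {
    static boolean isEmpty(String s) {
        return s == null || s.length() == 0;
    }

    // The hand-written form that a single isEmpty call would replace.
    static boolean manualCheck(String s) {
        if (s == null) {
            return true;
        }
        return s.equals("");
    }
}
```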


- Reuben Kuhnert


On June 6, 2016, 6:19 p.m., Sergio Pena wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48233/
> ---
> 
> (Updated June 6, 2016, 6:19 p.m.)
> 
> 
> Review request for hive, Mohit Sabharwal and Naveen Gangam.
> 
> 
> Bugs: HIVE-13884
> https://issues.apache.org/jira/browse/HIVE-13884
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The patch verifies the number of partitions a table has before fetching any from
> the metastore. It checks that count against the limit in
> 'hive.limit.query.max.table.partition'.
> 
> A limitation added here is that the variable must be set in hive-site.xml in
> order to work; it cannot be set through beeline because
> HiveMetaStore.java does not read the variables set through beeline. I think
> it is better to keep it this way to avoid users changing the value on the fly
> and crashing the metastore.
> 
> Another change is that EXPLAIN commands won't be executed either. EXPLAIN 
> commands need to fetch partitions in order to create the operator tree. If we 
> allow EXPLAIN to do that, then we may have the same OOM situations for large 
> partitions.
> 
> 
> Diffs
> -
> 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
> 94dd72e6624d13d2503f68d2fd2d2a84859a4500 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
> 8e0bba60cc73890c1566e0f5df965f0f0bcfe0ec 
>   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
> b6d5276e49356f30147cb4f10262a2730ba99566 
>   metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 
> a6d3f5385b33b8a4e31ee20ca5cb8f58c97c8702 
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/HBaseStore.java 
> 31f0d7b89670b8a749bbe8a7ff2b4ff9f059a8e2 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
>  3152e77c3c7152ac4dbe7e779ce35f28044fe3c9 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
>  86a243609b23e2ca9bb8849f0da863a95e477d5c 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> c3d903b8cc8197ba8bea17145bec1444ed14eb22 
> 
> Diff: https://reviews.apache.org/r/48233/diff/
> 
> 
> Testing
> ---
> 
> Waiting for HiveQA.
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>



Re: [discuss] jdk8 support

2016-06-03 Thread Reuben Kuhnert
+1 Drop Java7 support, start using JDK8 features.

On Fri, Jun 3, 2016 at 12:39 PM, Prasanth Jayachandran <pjayachand...@hortonworks.com> wrote:

> +1 for using jdk8 as minimum required version.
>
> Thanks
> Prasanth
>
>
>
>
> On Fri, Jun 3, 2016 at 10:21 AM -0700, "Siddharth Seth" wrote:
>
> +1. Drop Java7 support, and start using JDK8 features.
>
> On Fri, Jun 3, 2016 at 8:13 AM, Ashutosh Chauhan wrote:
>
> > What I meant was that we start compiling using jdk8 as well. That will
> > allow devs to use jdk8 only features (lambda functions etc.)
> >
> > On Fri, Jun 3, 2016 at 8:10 AM, Sergio Pena wrote:
> >
> > > Hey Ashutosh,
> > >
> > > I switched to JDK8 in master last weekend. Jenkins has been running all
> > > tests in Java8 successfully for a week now.
> > > There are still a few tests we need to fix, but so far it is looking good.
> > >
> > > Sergio
> > >
> > > On Fri, Jun 3, 2016 at 12:02 AM, Ashutosh Chauhan <hashut...@apache.org> wrote:
> > >
> > > > Now that branch-2.1 has been cut, I think it's an opportune time to drop
> > > > support of jdk7 altogether in master. That is, we compile using jdk8 as
> > > > well. What do others think?
> > > >
> > > > Ashutosh
> > > >
> > > > On Fri, May 27, 2016 at 7:41 PM, Sergio Pena <sergio.p...@cloudera.com> wrote:
> > > >
> > > > > I did the change to JDK8 on Ptest now.
> > > > >
> > > > > Please let me know if there are issues with Java8.
> > > > >
> > > > > - Sergio
> > > > >
> > > > > On Fri, May 27, 2016 at 2:45 PM, Sergio Pena <sergio.p...@cloudera.com> wrote:
> > > > >
> > > > > > Thanks Mohit.
> > > > > >
> > > > > > I will plan to do the JDK8 change on Jenkins today EOD and monitor any
> > > > > > issues through the weekend.
> > > > > >
> > > > > > - Sergio
> > > > > >
> > > > > >
> > > > > > On Fri, May 27, 2016 at 2:58 AM, Mohit Sabharwal <mo...@cloudera.com> wrote:
> > > > > >
> > > > > >> Update on moving Hive2 tests to JDK8: I've addressed almost all test
> > > > > >> failures in HIVE-13547 on java8 branch. There is one remaining
> > > > > >> open item (HIVE-13834) which is currently assigned. Given the current
> > > > > >> state of flaky test runs, there might be a few more.
> > > > > >>
> > > > > >> I will work with Sergio to merge the test fixes to master and switch
> > > > > >> the Hive2 pre-commit job to use JDK8, hopefully sometime tomorrow.
> > > > > >>
> > > > > >> After the Hive2 tests switch, if your patch sees ordering-related test
> > > > > >> failures in the pre-commit run, it's likely JDK version related and you'll
> > > > > >> need to build & re-run the test using JDK8. The number of such tests
> > > > > >> should be relatively small.
> > > > > >>
> > > > > >> On Tue, Apr 19, 2016 at 10:43 AM, Mohit Sabharwal <mo...@cloudera.com> wrote:
> > > > > >>
> > > > > >> > Created HIVE-13547 to track switching 2x tests to JDK8.
> > > > > >> >
> > > > > >> > On Wed, Apr 13, 2016 at 10:02 AM, Sergio Pena <sergio.p...@cloudera.com> wrote:
> > > > > >> >
> > > > > >> >> I agree with such a change, as JDK7 is no longer supported.
> > > > > >> >>
> > > > > >> >> Changes on Jenkins and Hive PTest shouldn't be hard. We just need to
> > > > > >> >> replace the path from java7 to java8. But I think we should fix all
> > > > > >> >> JDK8 issues, or most of them, before doing the change, or we will end up
> > > > > >> >> having a lot of failures on all JIRAs running pre-commit tests.
> > > > > >> >>
> > > > > >> >> +1 with the change.
> > > > > >> >>
> > > > > >> >> On Mon, Apr 11, 2016 at 9:34 PM, Siddharth Seth <ss...@apache.org> wrote:
> > > > > >> >>
> > > > > >> >> > Option 3 sounds good. I'd ideally like to make JDK8 the minimum
> > > > > >> >> > requirement soon as well.
> > > > > >> >> >
> > > > > >> >> > On Mon, Apr 11, 2016 at 4:59 PM, Szehon Ho <sze...@cloudera.com> wrote:
> > > > > >> >> >
> > > > > >> >> > > Sounds like a good plan, +1
> > > > > >> >> > >
> > > > > >> >> > > On Mon, Apr 11, 2016 at 4:31 PM, Mohit Sabharwal <mo...@cloudera.com> wrote:
> > > > > >> >> > >
> > > > > >> >> > > > Thanks, Ashutosh. Makes sense to keep the source and target as 1.7
> > > > > >> >> > > > since we're not using any JDK8 specific features yet. So, option (3)
> > > > > >> >> > > > essentially just means using JDK8 exclusively to build & test Hive2.
> > > > > >> >> > > >
> > > > > >> >> > > > On Sat, Apr 9, 2016 at 12:23 PM, Ashutosh Chauhan <hashut...@apache.org> wrote:
> > > > > >> >> > > >
> > > > > >> >> > > > > Hi Mohit,
> > > > > >> >> > > > >

Re: Review Request 47040: Monitor changes to FairScheduler.xml file and automatically update / validate jobs submitted to fair-scheduler

2016-05-24 Thread Reuben Kuhnert


> On May 17, 2016, 6:44 p.m., Yongzhi Chen wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 533
> > <https://reviews.apache.org/r/47040/diff/10/?file=1384158#file1384158line533>
> >
> > This if statement duplicates the Precondition. If you want to
> > throw an exception, use only the Precondition; otherwise, just use the if statement.
> > Using both ends up checking the same condition twice.

This is correct. One is for checking that we're in a valid state. The other is 
for throwing if the user tries to call the function in an invalid state. Thanks!
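
The two styles under discussion can be contrasted with a small sketch. It does not reproduce the code at Driver.java line 533; the names are hypothetical, and the local checkState helper only imitates a checkState-style precondition so the example needs no external library.

```java
// Hypothetical illustration of "precondition" versus "if" handling.
class GuardSketch {
    // Imitates a checkState-style precondition: throw when the state is invalid.
    static void checkState(boolean ok, String message) {
        if (!ok) {
            throw new IllegalStateException(message);
        }
    }

    // Precondition style: an invalid state is treated as a caller error.
    static String strict(String queue) {
        checkState(queue != null, "queue must be set before submission");
        return "submitted to " + queue;
    }

    // Guard style: an invalid state is tolerated and the work is skipped.
    static String lenient(String queue) {
        if (queue == null) {
            return "skipped";
        }
        return "submitted to " + queue;
    }
}
```

Either style alone tests the condition once; combining them in one method repeats the same check.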


- Reuben


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47040/#review133592
-------


On May 24, 2016, 11:56 a.m., Reuben Kuhnert wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/47040/
> ---
> 
> (Updated May 24, 2016, 11:56 a.m.)
> 
> 
> Review request for hive, Lenni Kuff, Mohit Sabharwal, and Sergio Pena.
> 
> 
> Bugs: HIVE-13696
> https://issues.apache.org/jira/browse/HIVE-13696
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Ensure that jobs sent to YARN with impersonation off are correctly routed to 
> the proper queue based on fair-scheduler.xml. Monitor this file for changes 
> and validate that jobs can only be sent to queues authorized for the user.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 
> 3fecc5c4ca2a06a031c0c4a711fb49e757c49062 
>   ql/src/java/org/apache/hadoop/hive/ql/session/YarnFairScheduling.java 
> PRE-CREATION 
>   service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
> a0015ebc655931f241b28c53fbb94cfe172841b1 
>   shims/common/src/main/java/org/apache/hadoop/hive/shims/SchedulerShim.java 
> 63803b8b0752745bd2fedaccc5d100befd97093b 
>   shims/scheduler/pom.xml b36c12325c588cdb609c6200b1edef73a2f79552 
>   
> shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerQueueAllocator.java
>  PRE-CREATION 
>   
> shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerShim.java
>  372244dc3c989d2a3ae2eb2bfb8cd0a235705e18 
>   
> shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/QueueAllocator.java
>  PRE-CREATION 
>   
> shims/scheduler/src/test/java/org/apache/hadoop/hive/schshim/TestFairSchedulerQueueAllocator.java
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/47040/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Reuben Kuhnert
> 
>



Re: Review Request 47040: Monitor changes to FairScheduler.xml file and automatically update / validate jobs submitted to fair-scheduler

2016-05-24 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47040/
---

(Updated May 24, 2016, 11:56 a.m.)


Review request for hive, Lenni Kuff, Mohit Sabharwal, and Sergio Pena.


Changes
---

Updates to fix test failures.


Bugs: HIVE-13696
https://issues.apache.org/jira/browse/HIVE-13696


Repository: hive-git


Description
---

Ensure that jobs sent to YARN with impersonation off are correctly routed to 
the proper queue based on fair-scheduler.xml. Monitor this file for changes and 
validate that jobs can only be sent to queues authorized for the user.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 
3fecc5c4ca2a06a031c0c4a711fb49e757c49062 
  ql/src/java/org/apache/hadoop/hive/ql/session/YarnFairScheduling.java 
PRE-CREATION 
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
a0015ebc655931f241b28c53fbb94cfe172841b1 
  shims/common/src/main/java/org/apache/hadoop/hive/shims/SchedulerShim.java 
63803b8b0752745bd2fedaccc5d100befd97093b 
  shims/scheduler/pom.xml b36c12325c588cdb609c6200b1edef73a2f79552 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerQueueAllocator.java
 PRE-CREATION 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerShim.java
 372244dc3c989d2a3ae2eb2bfb8cd0a235705e18 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/QueueAllocator.java
 PRE-CREATION 
  
shims/scheduler/src/test/java/org/apache/hadoop/hive/schshim/TestFairSchedulerQueueAllocator.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/47040/diff/


Testing
---


Thanks,

Reuben Kuhnert



Re: Review Request 47040: Monitor changes to FairScheduler.xml file and automatically update / validate jobs submitted to fair-scheduler

2016-05-14 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47040/
---

(Updated May 14, 2016, 5:51 p.m.)


Review request for hive, Lenni Kuff, Mohit Sabharwal, and Sergio Pena.


Bugs: HIVE-13696
https://issues.apache.org/jira/browse/HIVE-13696


Repository: hive-git


Description
---

Ensure that jobs sent to YARN with impersonation off are correctly routed to 
the proper queue based on fair-scheduler.xml. Monitor this file for changes and 
validate that jobs can only be sent to queues authorized for the user.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 
3fecc5c4ca2a06a031c0c4a711fb49e757c49062 
  ql/src/java/org/apache/hadoop/hive/ql/session/YarnFairScheduling.java 
PRE-CREATION 
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
a0015ebc655931f241b28c53fbb94cfe172841b1 
  shims/common/src/main/java/org/apache/hadoop/hive/shims/SchedulerShim.java 
63803b8b0752745bd2fedaccc5d100befd97093b 
  shims/scheduler/pom.xml b36c12325c588cdb609c6200b1edef73a2f79552 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerQueueAllocator.java
 PRE-CREATION 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerShim.java
 372244dc3c989d2a3ae2eb2bfb8cd0a235705e18 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/QueueAllocator.java
 PRE-CREATION 
  
shims/scheduler/src/test/java/org/apache/hadoop/hive/schshim/TestFairSchedulerQueueAllocator.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/47040/diff/


Testing
---


Thanks,

Reuben Kuhnert



Re: Review Request 47040: Monitor changes to FairScheduler.xml file and automatically update / validate jobs submitted to fair-scheduler

2016-05-14 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47040/
---

(Updated May 14, 2016, 5:47 p.m.)


Review request for hive, Lenni Kuff, Mohit Sabharwal, and Sergio Pena.


Changes
---

Remove stringutils changes.


Bugs: HIVE-13696
https://issues.apache.org/jira/browse/HIVE-13696


Repository: hive-git


Description
---

Ensure that jobs sent to YARN with impersonation off are correctly routed to 
the proper queue based on fair-scheduler.xml. Monitor this file for changes and 
validate that jobs can only be sent to queues authorized for the user.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 
3fecc5c4ca2a06a031c0c4a711fb49e757c49062 
  ql/src/java/org/apache/hadoop/hive/ql/session/YarnFairScheduling.java 
PRE-CREATION 
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
a0015ebc655931f241b28c53fbb94cfe172841b1 
  shims/common/src/main/java/org/apache/hadoop/hive/shims/SchedulerShim.java 
63803b8b0752745bd2fedaccc5d100befd97093b 
  shims/scheduler/pom.xml b36c12325c588cdb609c6200b1edef73a2f79552 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerQueueAllocator.java
 PRE-CREATION 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerShim.java
 372244dc3c989d2a3ae2eb2bfb8cd0a235705e18 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/QueueAllocator.java
 PRE-CREATION 
  
shims/scheduler/src/test/java/org/apache/hadoop/hive/schshim/TestFairSchedulerQueueAllocator.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/47040/diff/


Testing
---


Thanks,

Reuben Kuhnert



Re: Review Request 47040: Monitor changes to FairScheduler.xml file and automatically update / validate jobs submitted to fair-scheduler

2016-05-14 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47040/
---

(Updated May 14, 2016, 5:42 p.m.)


Review request for hive, Lenni Kuff, Mohit Sabharwal, and Sergio Pena.


Changes
---

Updated diff to include per submission checks and watching changes in 
fair-scheduler.xml location.


Bugs: HIVE-13696
https://issues.apache.org/jira/browse/HIVE-13696


Repository: hive-git


Description
---

Ensure that jobs sent to YARN with impersonation off are correctly routed to 
the proper queue based on fair-scheduler.xml. Monitor this file for changes and 
validate that jobs can only be sent to queues authorized for the user.


Diffs (updated)
-

  common/src/java/org/apache/hive/common/util/HiveStringUtils.java 
6d28396893532302fbbd66eace53ae32b71848c3 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 
3fecc5c4ca2a06a031c0c4a711fb49e757c49062 
  ql/src/java/org/apache/hadoop/hive/ql/session/YarnFairScheduling.java 
PRE-CREATION 
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
a0015ebc655931f241b28c53fbb94cfe172841b1 
  shims/common/src/main/java/org/apache/hadoop/hive/shims/SchedulerShim.java 
63803b8b0752745bd2fedaccc5d100befd97093b 
  shims/scheduler/pom.xml b36c12325c588cdb609c6200b1edef73a2f79552 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerQueueAllocator.java
 PRE-CREATION 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerShim.java
 372244dc3c989d2a3ae2eb2bfb8cd0a235705e18 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/QueueAllocator.java
 PRE-CREATION 
  
shims/scheduler/src/test/java/org/apache/hadoop/hive/schshim/TestFairSchedulerQueueAllocator.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/47040/diff/


Testing
---


Thanks,

Reuben Kuhnert



Re: Review Request 47040: Monitor changes to FairScheduler.xml file and automatically update / validate jobs submitted to fair-scheduler

2016-05-13 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47040/
---

(Updated May 13, 2016, 3:26 p.m.)


Review request for hive, Lenni Kuff, Mohit Sabharwal, and Sergio Pena.


Bugs: HIVE-13696
https://issues.apache.org/jira/browse/HIVE-13696


Repository: hive-git


Description
---

Ensure that jobs sent to YARN with impersonation off are correctly routed to 
the proper queue based on fair-scheduler.xml. Monitor this file for changes and 
validate that jobs can only be sent to queues authorized for the user.


Diffs (updated)
-

  shims/common/src/main/java/org/apache/hadoop/fs/FileWatchService.java 
PRE-CREATION 
  shims/scheduler/pom.xml b36c12325c588cdb609c6200b1edef73a2f79552 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerQueueAllocator.java
 PRE-CREATION 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerShim.java
 372244dc3c989d2a3ae2eb2bfb8cd0a235705e18 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/QueueAllocator.java
 PRE-CREATION 
  
shims/scheduler/src/test/java/org/apache/hadoop/hive/schshim/TestFairScheduler.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/47040/diff/


Testing
---


Thanks,

Reuben Kuhnert



Re: Review Request 47040: Monitor changes to FairScheduler.xml file and automatically update / validate jobs submitted to fair-scheduler

2016-05-12 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47040/
---

(Updated May 12, 2016, 1:16 p.m.)


Review request for hive, Lenni Kuff, Mohit Sabharwal, and Sergio Pena.


Changes
---

Update diff


Bugs: HIVE-13696
https://issues.apache.org/jira/browse/HIVE-13696


Repository: hive-git


Description
---

Ensure that jobs sent to YARN with impersonation off are correctly routed to 
the proper queue based on fair-scheduler.xml. Monitor this file for changes and 
validate that jobs can only be sent to queues authorized for the user.


Diffs (updated)
-

  common/src/java/org/apache/hive/common/util/HiveStringUtils.java 
6d28396893532302fbbd66eace53ae32b71848c3 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 
3fecc5c4ca2a06a031c0c4a711fb49e757c49062 
  ql/src/java/org/apache/hadoop/hive/ql/session/YarnFairScheduling.java 
PRE-CREATION 
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
a0015ebc655931f241b28c53fbb94cfe172841b1 
  shims/common/src/main/java/org/apache/hadoop/fs/FileSystemWatcher.java 
PRE-CREATION 
  shims/common/src/main/java/org/apache/hadoop/hive/shims/SchedulerShim.java 
63803b8b0752745bd2fedaccc5d100befd97093b 
  shims/scheduler/pom.xml b36c12325c588cdb609c6200b1edef73a2f79552 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerQueueAllocator.java
 PRE-CREATION 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerShim.java
 372244dc3c989d2a3ae2eb2bfb8cd0a235705e18 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/QueueAllocator.java
 PRE-CREATION 
  
shims/scheduler/src/test/java/org/apache/hadoop/hive/schshim/TestFairScheduler.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/47040/diff/


Testing
---


Thanks,

Reuben Kuhnert



Re: Review Request 47040: Monitor changes to FairScheduler.xml file and automatically update / validate jobs submitted to fair-scheduler

2016-05-12 Thread Reuben Kuhnert
tern in Hive code is to just check if conf.get() returns null. 
> > Unless there is an advantage in this use case, let's maintain the pattern. 
> > (Just like the queueConfig != null check elsewhere)
> > 
> > Same for other places where defaultString() is used.

Coming from F#, nulls don't make sense to me, heh. Changed though.


> On May 12, 2016, 7:02 a.m., Mohit Sabharwal wrote:
> > shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerShim.java,
> >  line 124
> > <https://reviews.apache.org/r/47040/diff/5/?file=1379895#file1379895line124>
> >
> > wait, refreshDefaultQueue() just calls attemptSetScheduleForUser().
> > 
> > Can you just inline attemptSetScheduleForUser here with 
> > YarnConfiguration.DEFAULT_QUEUE_NAME param ? It's cleaner without the extra 
> > indirection.

Same point as above. Currently the validation is weak/incorrectly defined, but
it still needs to exist. We will need to redefine the validation later (by
modifying this function). It's important, though, to keep the two concepts
separate for callers.


> On May 12, 2016, 7:02 a.m., Mohit Sabharwal wrote:
> > shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerShim.java,
> >  line 137
> > <https://reviews.apache.org/r/47040/diff/5/?file=1379895#file1379895line137>
> >
> > We should make this LOG.info. There is nothing to warn about here ...

Oops, thanks!


> On May 12, 2016, 7:02 a.m., Mohit Sabharwal wrote:
> > shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerShim.java,
> >  line 132
> > <https://reviews.apache.org/r/47040/diff/5/?file=1379895#file1379895line132>
> >
> > nit: cleaner with
> > if (queueConfig == null)
> >   return;

Hmm, don't like it :/

Not against returning mid-function, but it seems odd to change the cadence
(return on one null check, but not on others). Man, nulls are so weird, heh.
'Options' would be way better.
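
For comparison, a minimal sketch of the option style using java.util.Optional. This is only an illustration with hypothetical names: Optional requires JDK 8, and the jdk8 thread in this archive notes the source and target were still 1.7 at that point, so it is not something the patch could have used as written.

```java
import java.util.Optional;

// Hypothetical illustration; not code from the review.
class OptionalSketch {
    // Wraps a possibly-null configuration value instead of returning null.
    static Optional<String> queueFor(String configured) {
        return Optional.ofNullable(configured);
    }

    // The absent case is handled in the same expression as the present case,
    // so there is no separate null check to return early from.
    static String describe(String configured) {
        return queueFor(configured)
            .map(q -> "using queue " + q)
            .orElse("no queue configured");
    }
}
```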


- Reuben


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47040/#review132819
---


On May 11, 2016, 8:02 p.m., Reuben Kuhnert wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/47040/
> ---
> 
> (Updated May 11, 2016, 8:02 p.m.)
> 
> 
> Review request for hive, Lenni Kuff, Mohit Sabharwal, and Sergio Pena.
> 
> 
> Bugs: HIVE-13696
> https://issues.apache.org/jira/browse/HIVE-13696
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Ensure that jobs sent to YARN with impersonation off are correctly routed to 
> the proper queue based on fair-scheduler.xml. Monitor this file for changes 
> and validate that jobs can only be sent to queues authorized for the user.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 
> 3fecc5c4ca2a06a031c0c4a711fb49e757c49062 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 
> 926f6e883030b5a01d025994bd02c67f0f5a275c 
>   ql/src/java/org/apache/hadoop/hive/ql/session/YarnFairScheduling.java 
> PRE-CREATION 
>   service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
> a0015ebc655931f241b28c53fbb94cfe172841b1 
>   shims/common/src/main/java/org/apache/hadoop/fs/FileSystemWatcher.java 
> PRE-CREATION 
>   shims/common/src/main/java/org/apache/hadoop/hive/shims/SchedulerShim.java 
> 63803b8b0752745bd2fedaccc5d100befd97093b 
>   shims/scheduler/pom.xml b36c12325c588cdb609c6200b1edef73a2f79552 
>   
> shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerQueueAllocator.java
>  PRE-CREATION 
>   
> shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerShim.java
>  372244dc3c989d2a3ae2eb2bfb8cd0a235705e18 
>   
> shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/QueueAllocator.java
>  PRE-CREATION 
>   
> shims/scheduler/src/test/java/org/apache/hadoop/hive/schshim/TestFairScheduler.java
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/47040/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Reuben Kuhnert
> 
>



Re: Review Request 47040: Monitor changes to FairScheduler.xml file and automatically update / validate jobs submitted to fair-scheduler

2016-05-11 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47040/
---

(Updated May 11, 2016, 8:02 p.m.)


Review request for hive, Lenni Kuff, Mohit Sabharwal, and Sergio Pena.


Changes
---

Add file existence check to FileSystemWatcher


Bugs: HIVE-13696
https://issues.apache.org/jira/browse/HIVE-13696


Repository: hive-git


Description
---

Ensure that jobs sent to YARN with impersonation off are correctly routed to 
the proper queue based on fair-scheduler.xml. Monitor this file for changes and 
validate that jobs can only be sent to queues authorized for the user.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 
3fecc5c4ca2a06a031c0c4a711fb49e757c49062 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 
926f6e883030b5a01d025994bd02c67f0f5a275c 
  ql/src/java/org/apache/hadoop/hive/ql/session/YarnFairScheduling.java 
PRE-CREATION 
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
a0015ebc655931f241b28c53fbb94cfe172841b1 
  shims/common/src/main/java/org/apache/hadoop/fs/FileSystemWatcher.java 
PRE-CREATION 
  shims/common/src/main/java/org/apache/hadoop/hive/shims/SchedulerShim.java 
63803b8b0752745bd2fedaccc5d100befd97093b 
  shims/scheduler/pom.xml b36c12325c588cdb609c6200b1edef73a2f79552 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerQueueAllocator.java
 PRE-CREATION 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerShim.java
 372244dc3c989d2a3ae2eb2bfb8cd0a235705e18 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/QueueAllocator.java
 PRE-CREATION 
  
shims/scheduler/src/test/java/org/apache/hadoop/hive/schshim/TestFairScheduler.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/47040/diff/


Testing
---


Thanks,

Reuben Kuhnert



Re: Review Request 47040: Monitor changes to FairScheduler.xml file and automatically update / validate jobs submitted to fair-scheduler

2016-05-11 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47040/
---

(Updated May 11, 2016, 7:19 p.m.)


Review request for hive, Lenni Kuff, Mohit Sabharwal, and Sergio Pena.


Changes
---

Added Requested Changes


Bugs: HIVE-13696
https://issues.apache.org/jira/browse/HIVE-13696


Repository: hive-git


Description
---

Ensure that jobs sent to YARN with impersonation off are correctly routed to 
the proper queue based on fair-scheduler.xml. Monitor this file for changes and 
validate that jobs can only be sent to queues authorized for the user.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 3fecc5c 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 926f6e8 
  ql/src/java/org/apache/hadoop/hive/ql/session/YarnFairScheduling.java 
PRE-CREATION 
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
a0015eb 
  shims/common/src/main/java/org/apache/hadoop/fs/FileSystemWatcher.java 
PRE-CREATION 
  shims/common/src/main/java/org/apache/hadoop/hive/shims/SchedulerShim.java 
63803b8 
  shims/scheduler/pom.xml b36c123 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerQueueAllocator.java
 PRE-CREATION 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerShim.java
 372244d 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/QueueAllocator.java
 PRE-CREATION 
  
shims/scheduler/src/test/java/org/apache/hadoop/hive/schshim/TestFairScheduler.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/47040/diff/


Testing
---


Thanks,

Reuben Kuhnert



Re: Review Request 47040: Monitor changes to FairScheduler.xml file and automatically update / validate jobs submitted to fair-scheduler

2016-05-11 Thread Reuben Kuhnert


> On May 11, 2016, 2:23 a.m., Mohit Sabharwal wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java, line 502
> > <https://reviews.apache.org/r/47040/diff/3/?file=1378910#file1378910line502>
> >
> > Looks like you have the same conditional check both here and in the 
> > function you're calling (validateYarnQueue). In effect, we're doing the 
> > same check twice. Remove one ?

This is an assertion. It's exposed externally so callers can check when to 
perform this operation, but if called in an invalid state, we should fail. 
Ideally, an 'Optional' with a callback would be better though. C'est la vie.


> On May 11, 2016, 2:23 a.m., Mohit Sabharwal wrote:
> > shims/common/src/main/java/org/apache/hadoop/fs/FileSystemWatcher.java, 
> > line 136
> > <https://reviews.apache.org/r/47040/diff/3/?file=1378913#file1378913line136>
> >
> > Should this be synchronized at all ? The event is not shared state, 
> > right ? Are we mutating any shared state in this method ?
> > 
> > Also, make it private ?

'this.callbacks' can be modified from another thread if this function isn't 
monitored. Also, it's made protected for testing purposes (though if there's a 
better practice for this, let me know - not hyper familiar with Java).
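
A minimal sketch of the concern, assuming a hypothetical `CallbackRegistry` class (not the actual `FileSystemWatcher` code under review). Registration and dispatch share one monitor, so a callback added from another thread cannot change the list while it is being iterated:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the watcher's callback handling; names are assumptions.
class CallbackRegistry {
  private final List<Runnable> callbacks = new ArrayList<>();

  // Called from arbitrary threads to add a listener.
  synchronized void register(Runnable callback) {
    callbacks.add(callback);
  }

  // Protected (not private) so a subclass or same-package test can trigger it directly.
  // Returns how many callbacks were invoked.
  protected synchronized int fire() {
    for (Runnable callback : callbacks) {
      callback.run();
    }
    return callbacks.size();
  }
}
```

Package-private visibility would also be enough for a test that lives in the same package, which is a common alternative to `protected`.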


> On May 11, 2016, 2:23 a.m., Mohit Sabharwal wrote:
> > shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerShim.java,
> >  line 64
> > <https://reviews.apache.org/r/47040/diff/3/?file=1378917#file1378917line64>
> >
> > why synchronized ? Didn't see shared state getting modified anywhere...

Yeah, originally the file-system-watcher was static final (to ensure that if 
multiple instances of the shim exist for some reason, the configuration across 
them all is the same - and callbacks aren't fired multiple times). I ultimately 
decided against this though - this is just a legacy of that failed effort.


> On May 11, 2016, 2:23 a.m., Mohit Sabharwal wrote:
> > shims/common/src/main/java/org/apache/hadoop/fs/FileSystemWatcher.java, 
> > lines 74-76
> > <https://reviews.apache.org/r/47040/diff/3/?file=1378913#file1378913line74>
> >
> > This seems strange.
> > 
> > Why are we closing the watchService() and creating a new one here?
> > 
> > Can the existing watchService() watch more files?
> > 
> > Add comments explaining why this is necessary.

Overuse of 'functional'-style (Prefer new immutable objects over modifying 
existing ones, aka "state is evil"). This introduces a subtle bug though if the 
file is modified while the watcher is being re-generated. This can't be helped 
if we're removing a watch though (since there's no 'unregister' function).
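
For reference, `java.nio.file.WatchService` itself has no unregister method, but each registration returns a `WatchKey`, and `WatchKey.cancel()` ends that one registration while the service stays open. A minimal sketch, assuming a hypothetical helper class rather than the patch's `FileSystemWatcher`:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;

// Illustrative helper only; the class and method names are assumptions.
class WatchKeyCancelSketch {
  // Registers 'dir', then drops just that registration without closing the service.
  // Returns true if the key was valid before cancel() and invalid afterwards.
  static boolean registerThenCancel(Path dir) {
    try (WatchService service = FileSystems.getDefault().newWatchService()) {
      WatchKey key = dir.register(service, StandardWatchEventKinds.ENTRY_MODIFY);
      boolean validBefore = key.isValid();
      key.cancel();
      return validBefore && !key.isValid();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Whether keeping one long-lived service fits the watcher's design is a separate question; this only shows that a single watch can be dropped in place.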


> On May 11, 2016, 2:23 a.m., Mohit Sabharwal wrote:
> > shims/common/src/main/java/org/apache/hadoop/fs/FileSystemWatcher.java, 
> > line 125
> > <https://reviews.apache.org/r/47040/diff/3/?file=1378913#file1378913line125>
> >
> > Should this be a warning ?
> > 
> > Also, add the exception to the log.
> > 
> > LOG.warn("..." + ex, ex)
> > 
> > This catch block also doesn't have the close() call. Need a finally 
> > block?

A finally would close the service after each iteration. But I modified 
everything else.


> On May 11, 2016, 2:23 a.m., Mohit Sabharwal wrote:
> > shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerShim.java,
> >  line 125
> > <https://reviews.apache.org/r/47040/diff/3/?file=1378917#file1378917line125>
> >
> > rename to getConfigForUser(...)
> > 
> > Also, not quite sure why this needs to be synchronized.

The configuration resolvers are cached. But the cache is cleared if the 
location of 'fair-scheduler.xml' (YARN_SCHEDULER_FILE_PROPERTY) changes, or is 
modified. No good if one thread clears the cache while another is reading it.
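
Roughly, the pattern looks like the sketch below. `PerUserCache` and its loader are hypothetical names for illustration, not the shim's real types. Reads and invalidation go through the same lock, so a reader never sees the map while it is being cleared:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical per-user cache; not taken from FairSchedulerShim.
class PerUserCache<V> {
  private final Map<String, V> cache = new HashMap<>();
  private final Function<String, V> loader;

  PerUserCache(Function<String, V> loader) {
    this.loader = loader;
  }

  // Returns the cached value for 'user', loading it on first use.
  synchronized V get(String user) {
    return cache.computeIfAbsent(user, loader);
  }

  // Called when the scheduler file changes or its location is updated.
  synchronized void clear() {
    cache.clear();
  }
}
```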


- Reuben


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47040/#review132568
---


On May 10, 2016, 10:54 p.m., Reuben Kuhnert wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/47040/
> ---
> 
> (Updated May 10, 2016, 10:54 p.m.)
> 
> 
> Review request for hive, Lenni Kuff, Mohit Sabharwal, and Sergio Pena.
> 
> 
> Bugs: HIVE-13696
> https://issues.apache.org/jira/browse/HIVE-13696
> 
> 
> Repository: 

Re: Review Request 47040: Monitor changes to FairScheduler.xml file and automatically update / validate jobs submitted to fair-scheduler

2016-05-10 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47040/
---

(Updated May 10, 2016, 10:54 p.m.)


Review request for hive, Lenni Kuff, Mohit Sabharwal, and Sergio Pena.


Changes
---

Updated diff with unit test


Bugs: HIVE-13696
https://issues.apache.org/jira/browse/HIVE-13696


Repository: hive-git


Description
---

Ensure that jobs sent to YARN with impersonation off are correctly routed to 
the proper queue based on fair-scheduler.xml. Monitor this file for changes and 
validate that jobs can only be sent to queues authorized for the user.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 926f6e8 
  ql/src/java/org/apache/hadoop/hive/ql/session/YarnFairScheduling.java 
PRE-CREATION 
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
a0015eb 
  shims/common/src/main/java/org/apache/hadoop/fs/FileSystemWatcher.java 
PRE-CREATION 
  shims/common/src/main/java/org/apache/hadoop/hive/shims/SchedulerShim.java 
63803b8 
  shims/scheduler/pom.xml b36c123 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerQueueAllocator.java
 PRE-CREATION 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerShim.java
 372244d 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/QueueAllocator.java
 PRE-CREATION 
  
shims/scheduler/src/test/java/org/apache/hadoop/hive/schshim/TestFairScheduler.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/47040/diff/


Testing
---


Thanks,

Reuben Kuhnert



Re: Review Request 47040: Monitor changes to FairScheduler.xml file and automatically update / validate jobs submitted to fair-scheduler

2016-05-09 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47040/
---

(Updated May 10, 2016, 1:18 a.m.)


Review request for hive, Lenni Kuff, Mohit Sabharwal, and Sergio Pena.


Changes
---

Make requested changes to patch


Summary (updated)
-

Monitor changes to FairScheduler.xml file and automatically update / validate 
jobs submitted to fair-scheduler


Bugs: HIVE-13696
https://issues.apache.org/jira/browse/HIVE-13696


Repository: hive-git


Description (updated)
---

Ensure that jobs sent to YARN with impersonation off are correctly routed to 
the proper queue based on fair-scheduler.xml. Monitor this file for changes and 
validate that jobs can only be sent to queues authorized for the user.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 
926f6e883030b5a01d025994bd02c67f0f5a275c 
  ql/src/java/org/apache/hadoop/hive/ql/session/YarnFairScheduling.java 
PRE-CREATION 
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
a0015ebc655931f241b28c53fbb94cfe172841b1 
  shims/common/src/main/java/org/apache/hadoop/fs/FileSystemWatcher.java 
PRE-CREATION 
  shims/common/src/main/java/org/apache/hadoop/hive/shims/SchedulerShim.java 
63803b8b0752745bd2fedaccc5d100befd97093b 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerShim.java
 372244dc3c989d2a3ae2eb2bfb8cd0a235705e18 

Diff: https://reviews.apache.org/r/47040/diff/


Testing
---


Thanks,

Reuben Kuhnert



Re: Review Request 47040: Validate jobs submitted to fair-scheduler

2016-05-09 Thread Reuben Kuhnert


> On May 6, 2016, 4:21 p.m., Sergio Pena wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/session/YarnFairScheduling.java, line 
> > 24
> > <https://reviews.apache.org/r/47040/diff/1/?file=1373935#file1373935line24>
> >
> > There is another validation on Hadoop23Shims.refreshDefaultQueue(). Why 
> > don't we have that one too here?

I'm not sure what this means.


- Reuben


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47040/#review132054
-------


On May 5, 2016, 8:06 p.m., Reuben Kuhnert wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/47040/
> ---
> 
> (Updated May 5, 2016, 8:06 p.m.)
> 
> 
> Review request for hive, Lenni Kuff, Mohit Sabharwal, and Sergio Pena.
> 
> 
> Bugs: HIVE-13696
> https://issues.apache.org/jira/browse/HIVE-13696
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Ensure that jobs sent to yarn with impersonation off are correctly routed to 
> the proper queue based on fair-scheduler.xml. Validate that jobs can only be 
> sent to queues authorized for the user.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 
> 6a610cbcb1deb7f7f55bb8aff58020b057454b31 
>   ql/src/java/org/apache/hadoop/hive/ql/session/YarnFairScheduling.java 
> PRE-CREATION 
>   service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
> a0015ebc655931f241b28c53fbb94cfe172841b1 
>   shims/common/src/main/java/org/apache/hadoop/fs/FileSystemWatcher.java 
> PRE-CREATION 
>   shims/common/src/main/java/org/apache/hadoop/hive/shims/SchedulerShim.java 
> 63803b8b0752745bd2fedaccc5d100befd97093b 
>   
> shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerShim.java
>  372244dc3c989d2a3ae2eb2bfb8cd0a235705e18 
> 
> Diff: https://reviews.apache.org/r/47040/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Reuben Kuhnert
> 
>



Review Request 47040: Validate jobs submitted to fair-scheduler

2016-05-05 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47040/
---

Review request for hive, Lenni Kuff, Mohit Sabharwal, and Sergio Pena.


Bugs: HIVE-13696
https://issues.apache.org/jira/browse/HIVE-13696


Repository: hive-git


Description
---

Ensure that jobs sent to yarn with impersonation off are correctly routed to 
the proper queue based on fair-scheduler.xml. Validate that jobs can only be 
sent to queues authorized for the user.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 
6a610cbcb1deb7f7f55bb8aff58020b057454b31 
  ql/src/java/org/apache/hadoop/hive/ql/session/YarnFairScheduling.java 
PRE-CREATION 
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
a0015ebc655931f241b28c53fbb94cfe172841b1 
  shims/common/src/main/java/org/apache/hadoop/fs/FileSystemWatcher.java 
PRE-CREATION 
  shims/common/src/main/java/org/apache/hadoop/hive/shims/SchedulerShim.java 
63803b8b0752745bd2fedaccc5d100befd97093b 
  
shims/scheduler/src/main/java/org/apache/hadoop/hive/schshim/FairSchedulerShim.java
 372244dc3c989d2a3ae2eb2bfb8cd0a235705e18 

Diff: https://reviews.apache.org/r/47040/diff/


Testing
---


Thanks,

Reuben Kuhnert



[jira] [Created] (HIVE-13696) Validate jobs submitted to fair-scheduler

2016-05-05 Thread Reuben Kuhnert (JIRA)
Reuben Kuhnert created HIVE-13696:
-

 Summary: Validate jobs submitted to fair-scheduler
 Key: HIVE-13696
 URL: https://issues.apache.org/jira/browse/HIVE-13696
 Project: Hive
  Issue Type: Improvement
Reporter: Reuben Kuhnert
Assignee: Reuben Kuhnert


Ensure that jobs are placed into the correct queue according to 
{{fair-scheduler.xml}}. Jobs should be placed into the correct queue, and users 
should not be able to submit jobs to queues they do not have access to.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-13478) [Cleanup] Improve HookUtils performance

2016-04-11 Thread Reuben Kuhnert (JIRA)
Reuben Kuhnert created HIVE-13478:
-

 Summary: [Cleanup] Improve HookUtils performance
 Key: HIVE-13478
 URL: https://issues.apache.org/jira/browse/HIVE-13478
 Project: Hive
  Issue Type: Improvement
Reporter: Reuben Kuhnert
Assignee: Reuben Kuhnert
Priority: Minor


Minor cleanup. {{HookUtils.getHooks}} is called multiple times for every 
statement executed, performing nearly identical work each time. Cache the 
results of that work to improve performance. 

Also introduce the {{@CacheableHook}} annotation which can be appended to hooks 
that don't need to be re-instantiated using expensive reflection (such as 
Sentry hooks that load configuration on initialization).
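
A rough sketch of the idea, under stated assumptions: only the {{@CacheableHook}} name comes from this ticket, while the retention policy, the `HookCacheSketch` class, and the two example hook classes are hypothetical. Annotated hook types are instantiated once and reused; everything else is created fresh on each request.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Marker for hooks that are safe to reuse; runtime retention is an assumption.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface CacheableHook {}

// Hypothetical lookup helper, not the real HookUtils.
class HookCacheSketch {
  private static final Map<Class<?>, Object> CACHE = new ConcurrentHashMap<>();

  static <T> T getHook(Class<T> type) {
    if (type.isAnnotationPresent(CacheableHook.class)) {
      // Reflection runs at most once per annotated type.
      return type.cast(CACHE.computeIfAbsent(type, HookCacheSketch::instantiate));
    }
    return type.cast(instantiate(type));
  }

  private static Object instantiate(Class<?> type) {
    try {
      return type.getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot instantiate hook " + type.getName(), e);
    }
  }
}

// Example hook types, purely for illustration.
@CacheableHook
class ExpensiveHook {}

class PlainHook {}
```

Reusing an instance is only safe if the hook keeps no per-statement state, which is why the sketch makes it opt-in.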



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-13387) Beeline fails silently from missing dependency

2016-03-30 Thread Reuben Kuhnert (JIRA)
Reuben Kuhnert created HIVE-13387:
-

 Summary: Beeline fails silently from missing dependency
 Key: HIVE-13387
 URL: https://issues.apache.org/jira/browse/HIVE-13387
 Project: Hive
  Issue Type: Task
Reporter: Reuben Kuhnert
Assignee: Reuben Kuhnert
Priority: Minor


Beeline fails to connect because {{HiveSQLException}} dependency is not on 
classpath:

{code}
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:52)
at 
org.apache.hive.beeline.BeeLine.execCommandWithPrefix(BeeLine.java:1077)
at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1116)
at org.apache.hive.beeline.BeeLine.initArgs(BeeLine.java:762)
at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:841)
at 
org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:493)
at org.apache.hive.beeline.BeeLine.main(BeeLine.java:476)
Caused by: java.lang.NoClassDefFoundError: 
org/apache/hive/service/cli/HiveSQLException
at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:131)
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:187)
at 
org.apache.hive.beeline.DatabaseConnection.connect(DatabaseConnection.java:141)
at 
org.apache.hive.beeline.DatabaseConnection.getConnection(DatabaseConnection.java:205)
at org.apache.hive.beeline.Commands.connect(Commands.java:1393)
at org.apache.hive.beeline.Commands.connect(Commands.java:1314)
... 11 more
Caused by: java.lang.ClassNotFoundException: 
org.apache.hive.service.cli.HiveSQLException
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 19 more
{code}

This happens when trying to run beeline as a standalone java application:

{code}
sircodesalot@excalibur:~/Dev/Cloudera/hive/beeline$ mvn exec:java 
-Dexec.args='-u jdbc:hive2://localhost:1 sircodesalot' 
-Dexec.mainClass="org.apache.hive.beeline.BeeLine"
[INFO] Scanning for projects...
[INFO] 
[INFO] 
[INFO] Building Hive Beeline 2.1.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- exec-maven-plugin:1.4.0:java (default-cli) @ hive-beeline ---
Connecting to jdbc:hive2://localhost:1
ERROR StatusLogger No log4j2 configuration file found. Using default 
configuration: logging only errors to the console.
org/apache/hive/service/cli/HiveSQLException
Beeline version ??? by Apache Hive

// HERE: This will never connect because of ClassNotFoundException. 
0: jdbc:hive2://localhost:1 (closed)>
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-13385) [Cleanup] Streamline Beeline instantiation

2016-03-30 Thread Reuben Kuhnert (JIRA)
Reuben Kuhnert created HIVE-13385:
-

 Summary: [Cleanup] Streamline Beeline instantiation
 Key: HIVE-13385
 URL: https://issues.apache.org/jira/browse/HIVE-13385
 Project: Hive
  Issue Type: Task
Reporter: Reuben Kuhnert
Assignee: Reuben Kuhnert


Janitorial. Remove circular dependencies in {{BeelineCommandLineCompleter}}. 
Streamline the code and improve readability.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-13311) MetaDataFormatUtils throws NPE when HiveDecimal.create is null

2016-03-20 Thread Reuben Kuhnert (JIRA)
Reuben Kuhnert created HIVE-13311:
-

 Summary: MetaDataFormatUtils throws NPE when HiveDecimal.create is 
null
 Key: HIVE-13311
 URL: https://issues.apache.org/jira/browse/HIVE-13311
 Project: Hive
  Issue Type: Bug
Reporter: Reuben Kuhnert
Assignee: Reuben Kuhnert


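The {{MetaDataFormatUtils.convertToString}} functions have guards for when 
{{val}} is null; however, the {{Decimal}} overload does not guard against 
{{HiveDecimal.create}} returning null, so the chained {{toString()}} call 
throws an NPE: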

{code}
  private static String convertToString(Decimal val) {
    if (val == null) {
      return "";
    }

    return HiveDecimal.create(new BigInteger(val.getUnscaled()),
        val.getScale()).toString();
  }
{code}
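
One possible shape for the guard, sketched with `java.math.BigDecimal` as a stand-in so it runs without Hive on the classpath. `createOrNull` is a hypothetical factory that imitates a create call returning null for values it cannot represent, and returning an empty string in that case is an assumption, not necessarily the fix that was committed.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Stand-in types only; this is not MetaDataFormatUtils or HiveDecimal.
class NullSafeDecimalSketch {
  // Hypothetical factory: returns null when the value exceeds 'maxPrecision' digits.
  static BigDecimal createOrNull(BigInteger unscaled, int scale, int maxPrecision) {
    BigDecimal value = new BigDecimal(unscaled, scale);
    return value.precision() > maxPrecision ? null : value;
  }

  static String convertToString(BigInteger unscaled, int scale) {
    if (unscaled == null) {
      return "";
    }
    // Check the factory result before dereferencing it.
    BigDecimal result = createOrNull(unscaled, scale, 38);
    return result == null ? "" : result.toString();
  }
}
```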



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 44521: HIVE-13231: Show helpful error message on failure to create table in nested directory

2016-03-08 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44521/
---

Review request for hive, Aihua Xu, Sergio Pena, and Szehon Ho.


Repository: hive-git


Description
---

Show helpful error message on failure to create table in nested directory


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
ad170963c4044bdf71e3d498cab669014e452ab3 

Diff: https://reviews.apache.org/r/44521/diff/


Testing
---

Tested locally + Tested against Hive-Jenkins.


Thanks,

Reuben Kuhnert



[jira] [Created] (HIVE-13231) Show helpful error message on failure to create nested table in nested directory

2016-03-08 Thread Reuben Kuhnert (JIRA)
Reuben Kuhnert created HIVE-13231:
-

 Summary: Show helpful error message on failure to create nested 
table in nested directory
 Key: HIVE-13231
 URL: https://issues.apache.org/jira/browse/HIVE-13231
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Reuben Kuhnert
Assignee: Reuben Kuhnert
Priority: Minor


Hive cannot store data in a directory whose parent doesn't exist, even though 
the target dir does have an existing ancestor on HDFS.

{code}
0: jdbc:hive2://10.17.81.192:1/default> create table test3 location 
'/user/hive/data/yshi/nonexisting/test3' as select * from sample_07;
Error: Error while processing statement: FAILED: Execution Error, return code 1 
from org.apache.hadoop.hive.ql.exec.MoveTask (state=08S01,code=1)
Error message:
2015-10-29 19:04:46,323 ERROR org.apache.hadoop.hive.ql.exec.Task: Failed with 
exception Unable to rename: 
hdfs://host-10-17-81-192.coe.cloudera.com:8020/user/hive/warehouse/.hive-staging_hive_2015-10-29_19-04-08_375_5385987873542863570-3/-ext-10001
 to: /user/hive/data/yshi/nonexisting/test3
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename: 
hdfs://host-10-17-81-192.coe.cloudera.com:8020/user/hive/warehouse/.hive-staging_hive_2015-10-29_19-04-08_375_5385987873542863570-3/-ext-10001
 to: /user/hive/data/yshi/nonexisting/test3
at org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:101)
at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:209)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1554)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1321)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1139)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:962)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:957)
at 
org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:144)
at 
org.apache.hive.service.cli.operation.SQLOperation.access$000(SQLOperation.java:68)
at 
org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:199)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:502)
at 
org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:212)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
{code}
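
A sketch of the kind of pre-check that would produce a clearer message. It uses `java.nio.file` on a local filesystem instead of the Hadoop `FileSystem` API, and the `ParentCheckSketch` class and the message wording are assumptions for illustration only.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative only; the real change would live in the Hive move/rename path.
class ParentCheckSketch {
  // Fails early, naming the missing parent, instead of surfacing a bare "Unable to rename".
  static void requireParentExists(Path target) {
    Path parent = target.toAbsolutePath().getParent();
    if (parent != null && !Files.isDirectory(parent)) {
      throw new IllegalStateException(
          "Cannot move data to " + target + ": parent directory " + parent + " does not exist");
    }
  }
}
```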



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 42726: HIVE-12891: Hive fails when java.io.tmpdir is set to a relative location

2016-01-25 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42726/
---

Review request for hive, Sergio Pena, Szehon Ho, and Xuefu Zhang.


Repository: hive-git


Description
---

Create a tool for hooking into SystemVariable requests to validate/coerce 
values before sending them to the user.


Diffs
-

  common/src/java/org/apache/hadoop/hive/common/FileUtils.java 
5dd9f407c23316f0a52651446a6038b9b69c6f12 
  common/src/java/org/apache/hadoop/hive/conf/SystemVariables.java 
9f59f11ca6459853b15ca80fa9751db934befc71 
  
common/src/java/org/apache/hadoop/hive/conf/valcoersion/JavaIOTmpdirVariableCoercion.java
 PRE-CREATION 
  common/src/java/org/apache/hadoop/hive/conf/valcoersion/VariableCoercion.java 
PRE-CREATION 
  
common/src/java/org/apache/hadoop/hive/conf/valcoersion/VariableCoercionSet.java
 PRE-CREATION 
  common/src/test/org/apache/hadoop/hive/conf/TestValueCoersions.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/42726/diff/


Testing
---

Tested locally, includes unit-test, tested against jenkins.


Thanks,

Reuben Kuhnert



[jira] [Created] (HIVE-12891) Hive fails when java.io.tmpdir is set to a relative location

2016-01-19 Thread Reuben Kuhnert (JIRA)
Reuben Kuhnert created HIVE-12891:
-

 Summary: Hive fails when java.io.tmpdir is set to a relative 
location
 Key: HIVE-12891
 URL: https://issues.apache.org/jira/browse/HIVE-12891
 Project: Hive
  Issue Type: Bug
Reporter: Reuben Kuhnert
Assignee: Reuben Kuhnert


The function {{SessionState.createSessionDirs}} fails when trying to create 
directories where {{java.io.tmpdir}} is set to a relative location.

{code}
\[uber-SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: 
IllegalArgumentException java.net.URISyntaxException: Relative path in absolute 
URI: file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1
...
Minor variations:
\[uber-SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: SemanticException 
Exception while processing Exception while writing out the local file 
o.a.h.hive.ql/parse.SemanticException: Exception while processing exception 
while writing out local file 
... 
caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: 
Relative path in absolute URI: 
file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1 
at o.a.h.fs.Path.initialize (206) 
at o.a.h.fs.Path.<init>(197)... 
at o.a.h.hive.ql.context.getScratchDir(267) 
{code}
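
One way to avoid the relative-path URI is to resolve the property value to an absolute path before it is used. A minimal sketch, assuming a hypothetical `TmpDirCoercionSketch` helper (not the class names used in the eventual patch):

```java
import java.io.File;

// Hypothetical helper; shows the coercion only, not where Hive would apply it.
class TmpDirCoercionSketch {
  // Leaves absolute values untouched; resolves relative ones against the working directory.
  static String toAbsolute(String tmpDir) {
    File dir = new File(tmpDir);
    if (dir.isAbsolute()) {
      return tmpDir;
    }
    // normalize() drops "." segments such as the one in "./tmp".
    return dir.getAbsoluteFile().toPath().normalize().toString();
  }
}
```

Usage would be along the lines of `TmpDirCoercionSketch.toAbsolute(System.getProperty("java.io.tmpdir"))`.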





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12469) Bump Commons-Collections dependency from 3.2.1 to 3.2.2. to address vulnerability

2015-11-19 Thread Reuben Kuhnert (JIRA)
Reuben Kuhnert created HIVE-12469:
-

 Summary: Bump Commons-Collections dependency from 3.2.1 to 3.2.2. 
to address vulnerability
 Key: HIVE-12469
 URL: https://issues.apache.org/jira/browse/HIVE-12469
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Reuben Kuhnert
Assignee: Reuben Kuhnert
Priority: Blocker






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 36942: HIVE-11401: Predicate push down does not work with Parquet when partitions are in the expression

2015-07-30 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36942/#review93651
---


This looks good to me.

- Reuben Kuhnert


On July 30, 2015, 9:22 p.m., Sergio Pena wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/36942/
 ---
 
 (Updated July 30, 2015, 9:22 p.m.)
 
 
 Review request for hive, Aihua Xu, cheng xu, Dong Chen, and Szehon Ho.
 
 
 Bugs: HIVE-11401
 https://issues.apache.org/jira/browse/HIVE-11401
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 The following patch reviews the predicate created by Hive, and removes any 
 column that does not belong to the Parquet schema, such as partitioned 
 columns. This way Parquet can filter the columns correctly.
 
 
 Diffs
 -
 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetFilterPredicateConverter.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java
  49e52da2e26fd7213df1db88716eaee94cb536b8 
   
 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetRecordReaderWrapper.java
  87dd344534f09c7fc565fdc467ac82a51f37ebba 
   
 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/read/TestParquetFilterPredicate.java
  PRE-CREATION 
   
 ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestConvertAstToSearchArg.java 
 85e952fb6855a2a03902ed971f54191837b32dac 
   ql/src/test/queries/clientpositive/parquet_predicate_pushdown.q 
 PRE-CREATION 
   ql/src/test/results/clientpositive/parquet_predicate_pushdown.q.out 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/36942/diff/
 
 
 Testing
 ---
 
 Unit tests: TestParquetFilterPredicate.java
 Integration tests: parquet_predicate_pushdown.q
 
 
 Thanks,
 
 Sergio Pena
 




Re: Review Request 36942: HIVE-11401: Predicate push down does not work with Parquet when partitions are in the expression

2015-07-30 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36942/#review93587
---



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetFilterPredicateConverter.java
 (line 54)
https://reviews.apache.org/r/36942/#comment147977

If the goal here is to get just the top-level fields, can we do something 
like:

```
for (Type field : schema.getFields()) {  
  columns.add(field.getName());
}
``` 

This might be a little bit clearer.



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetFilterPredicateConverter.java
 (line 64)
https://reviews.apache.org/r/36942/#comment147969

Minor nit: Since we have the opportunity to fix it, can we change 'leafs' 
to 'leaves'.



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetFilterPredicateConverter.java
 (line 102)
https://reviews.apache.org/r/36942/#comment147978

List<T> has O(N) lookup time. Can we store this in a Set<T> (O(1)) instead?
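
Something along these lines, as a sketch. `ColumnLookupSketch` and its methods are hypothetical names, not code from the patch. The list is copied into a `HashSet` once, after which each `contains` call is expected constant time rather than a linear scan:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative helper; names are assumptions.
class ColumnLookupSketch {
  // One O(N) copy up front.
  static Set<String> toLookupSet(List<String> columns) {
    return new HashSet<>(columns);
  }

  // Keeps only the candidates that appear in 'known', preserving their order.
  static List<String> keepKnown(List<String> candidates, Set<String> known) {
    List<String> kept = new ArrayList<>();
    for (String candidate : candidates) {
      if (known.contains(candidate)) {
        kept.add(candidate);
      }
    }
    return kept;
  }
}
```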


- Reuben Kuhnert


On July 30, 2015, 3:43 p.m., Sergio Pena wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/36942/
 ---
 
 (Updated July 30, 2015, 3:43 p.m.)
 
 
 Review request for hive, Aihua Xu, cheng xu, Dong Chen, and Szehon Ho.
 
 
 Bugs: HIVE-11401
 https://issues.apache.org/jira/browse/HIVE-11401
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 The following patch reviews the predicate created by Hive, and removes any 
 column that does not belong to the Parquet schema, such as partitioned 
 columns. This way Parquet can filter the columns correctly.
 
 
 Diffs
 -
 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetFilterPredicateConverter.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java
  49e52da2e26fd7213df1db88716eaee94cb536b8 
   
 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetRecordReaderWrapper.java
  87dd344534f09c7fc565fdc467ac82a51f37ebba 
   
 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/read/TestParquetFilterPredicate.java
  PRE-CREATION 
   
 ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestConvertAstToSearchArg.java 
 85e952fb6855a2a03902ed971f54191837b32dac 
   ql/src/test/queries/clientpositive/parquet_predicate_pushdown.q 
 PRE-CREATION 
   ql/src/test/results/clientpositive/parquet_predicate_pushdown.q.out 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/36942/diff/
 
 
 Testing
 ---
 
 Unit tests: TestParquetFilterPredicate.java
 Integration tests: parquet_predicate_pushdown.q
 
 
 Thanks,
 
 Sergio Pena
 




[jira] [Created] (HIVE-10738) Beeline does not respect hive.cli.print.current.db

2015-05-18 Thread Reuben Kuhnert (JIRA)
Reuben Kuhnert created HIVE-10738:
-

 Summary: Beeline does not respect hive.cli.print.current.db
 Key: HIVE-10738
 URL: https://issues.apache.org/jira/browse/HIVE-10738
 Project: Hive
  Issue Type: Bug
Reporter: Reuben Kuhnert
Assignee: Reuben Kuhnert
Priority: Minor



Hive CLI (shows default database):
{code}
hive> set hive.cli.print.current.db=true;
set hive.cli.print.current.db=true;
hive (default)> 
{code}

Beeline (no change):
{code}
0: jdbc:hive2://localhost:1 set hive.cli.print.current.db=true;
set hive.cli.print.current.db=true;
No rows affected (3.016 seconds)
0: jdbc:hive2://localhost:1 
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 33806: Add Tree traversal tools to ParseUtil class that allow for checking node structures with general predicate

2015-05-10 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33806/
---

(Updated May 10, 2015, 2:05 p.m.)


Review request for hive and Sergio Pena.


Changes
---

Updated diff that is verifiably passing.


Bugs: HIVE-10190
https://issues.apache.org/jira/browse/HIVE-10190


Repository: hive-git


Description
---

HIVE-10190: CBO: AST mode checks for TABLESAMPLE with 
AST.toString().contains(TOK_TABLESPLITSAMPLE)


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveCalciteUtil.java 
372c93d9af01608538b2e2e5a50c45188acb04f9 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseUtils.java 
373429cbf666f1b19828c532aea3c07f08f95e1a 

Diff: https://reviews.apache.org/r/33806/diff/


Testing
---

Tested locally


Thanks,

Reuben Kuhnert



[jira] [Created] (HIVE-10656) Beeline set var=value not carrying over to queries

2015-05-08 Thread Reuben Kuhnert (JIRA)
Reuben Kuhnert created HIVE-10656:
-

 Summary: Beeline set var=value not carrying over to queries
 Key: HIVE-10656
 URL: https://issues.apache.org/jira/browse/HIVE-10656
 Project: Hive
  Issue Type: Bug
Reporter: Reuben Kuhnert
Priority: Minor


After performing a {{set name=value}}, I would expect the variable to carry 
over to every place it is used within the session. It appears to work when 
querying the value via {{set;}}, but not when running actual SQL 
statements: the query keeps using the old value.

Example:

{code}
0: jdbc:hive2://localhost:1 set foo;
+--+--+
|   set|
+--+--+
| foo=bar  |
+--+--+
1 row selected (0.932 seconds)

0: jdbc:hive2://localhost:1 select * from ${foo};
Error: Error while compiling statement: FAILED: SemanticException [Error 
10001]: Line 1:14 Table not found 'bar' (state=42S02,code=10001)

0: jdbc:hive2://localhost:1 show tables;
++--+
|  tab_name  |
++--+
| my |
| purchases  |
++--+
2 rows selected (0.437 seconds)
0: jdbc:hive2://localhost:1 set foo=my;

No rows affected (0.017 seconds)
0: jdbc:hive2://localhost:1 set foo;
+-+--+
|   set   |
+-+--+
| foo=my  |
+-+--+
1 row selected (0.02 seconds)

0: jdbc:hive2://localhost:1 select * from ${foo};
select * from ${foo};
Error: Error while compiling statement: FAILED: SemanticException [Error 
10001]: Line 1:14 Table not found 'bar' (state=42S02,code=10001)
{code}
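
The expected behaviour can be sketched as follows. This is only an illustration of the semantics described above, not Beeline's or HiveServer2's actual substitution code; the class and method names are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of the expected ${name} substitution semantics.
class VariableSubstitutionSketch {
  private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

  // Replaces each ${name} with the current session value; unknown names are left as-is.
  static String substitute(String statement, Map<String, String> sessionVars) {
    Matcher m = VAR.matcher(statement);
    StringBuffer out = new StringBuffer();
    while (m.find()) {
      String value = sessionVars.get(m.group(1));
      String replacement = (value != null) ? value : m.group(0);
      // quoteReplacement keeps '$' and '\' in the value from being treated specially.
      m.appendReplacement(out, Matcher.quoteReplacement(replacement));
    }
    m.appendTail(out);
    return out.toString();
  }

  public static void main(String[] args) {
    Map<String, String> vars = new HashMap<>();
    vars.put("foo", "bar");
    vars.put("foo", "my"); // a later `set foo=my` replaces the earlier value
    System.out.println(substitute("select * from ${foo}", vars)); // prints select * from my
  }
}
```

In the session above, the second query should therefore resolve to table 'my' rather than 'bar'.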



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 33816: HIVE-10597: Relative path doesn't work with CREATE TABLE LOCATION 'relative/path'

2015-05-07 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33816/#review82818
---



metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java
https://reviews.apache.org/r/33816/#comment133647

Originally I used the warehouse home directory as the base for relative 
paths. However, per Lenni's comment, it sounds like it's best to throw an 
exception if the user tries to use anything other than an absolute path.
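
A rough sketch of that check, assuming the location arrives as a string. The class name, method name, and exception type are hypothetical; the actual patch in Warehouse.java may use different types and a different exception:

```java
import java.net.URI;

// Hypothetical illustration of rejecting non-absolute locations.
class AbsoluteLocationSketch {
  // Returns the location unchanged if its path component is absolute; otherwise throws.
  static String requireAbsolute(String location) {
    String path = URI.create(location).getPath();
    if (path == null || !path.startsWith("/")) {
      // Assumption: IllegalArgumentException stands in for whatever the patch throws.
      throw new IllegalArgumentException("Location must be an absolute path: " + location);
    }
    return location;
  }

  public static void main(String[] args) {
    System.out.println(requireAbsolute("/user/hive/data/stock")); // accepted
    try {
      requireAbsolute("data/stock");
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Failing fast with a clear message replaces the NullPointerException reported in HIVE-10597.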


- Reuben Kuhnert


On May 5, 2015, 3:14 p.m., Reuben Kuhnert wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/33816/
 ---
 
 (Updated May 5, 2015, 3:14 p.m.)
 
 
 Review request for hive and Sergio Pena.
 
 
 Bugs: HIVE-10597
 https://issues.apache.org/jira/browse/HIVE-10597
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Allow warehouse to work with relative locations.
 
 
 Diffs
 -
 
   metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 
 25119abf97382df7c0615edbaff29ba20624a137 
   metastore/src/test/org/apache/hadoop/hive/metastore/TestWarehouse.java 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/33816/diff/
 
 
 Testing
 ---
 
 Tested locally
 
 
 Thanks,
 
 Reuben Kuhnert
 




Re: Review Request 33816: HIVE-10597: Relative path doesn't work with CREATE TABLE LOCATION 'relative/path'

2015-05-07 Thread Reuben Kuhnert


 On May 7, 2015, 1:03 a.m., cheng xu wrote:
  metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java, line 143
  https://reviews.apache.org/r/33816/diff/2/?file=950357#file950357line143
 
  Will this patch work with relative paths? It seems only a check was added.

Originally I used the warehouse home directory as the base for relative paths. 
However, per Lenni's comment, it sounds like it's best to throw an exception if 
the user tries to use anything other than an absolute path.


- Reuben


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33816/#review82781
---






Re: Review Request 33816: HIVE-10597: Relative path doesn't work with CREATE TABLE LOCATION 'relative/path'

2015-05-06 Thread Reuben Kuhnert


 On May 4, 2015, 9:03 p.m., Sergio Pena wrote:
  metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java, line 141
  https://reviews.apache.org/r/33816/diff/1/?file=949031#file949031line141
 
  does this line handle the relative path correctly?

This code works correctly.


- Reuben


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33816/#review82440
---






Re: Review Request 33816: HIVE-10597: Relative path doesn't work with CREATE TABLE LOCATION 'relative/path'

2015-05-05 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33816/
---

(Updated May 5, 2015, 3:14 p.m.)


Review request for hive and Sergio Pena.


Changes
---

Updated diff to throw exceptions on relative path, also updated comments and 
added unit tests.


Bugs: HIVE-10597
https://issues.apache.org/jira/browse/HIVE-10597


Repository: hive-git


Description
---

Allow warehouse to work with relative locations.


Diffs (updated)
-

  metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 
25119abf97382df7c0615edbaff29ba20624a137 
  metastore/src/test/org/apache/hadoop/hive/metastore/TestWarehouse.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/33816/diff/


Testing
---

Tested locally


Thanks,

Reuben Kuhnert



Re: Review Request 33680: Create new hive-site property for supporting port configuration

2015-05-04 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33680/
---

(Updated May 4, 2015, 12:30 p.m.)


Review request for hive and Sergio Pena.


Changes
---

Updated with Lefty's suggestions


Bugs: HIVE-9365
https://issues.apache.org/jira/browse/HIVE-9365


Repository: hive-git


Description
---

HIVE-9365: The Metastore should take port configuration from hive-site.xml


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
c9ee423cf73706feb2774dacca9fc7b94fe80617 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
3f267ff0eb20560c36a19b74353f9d6749c8b333 
  metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetastoreCli.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/33680/diff/


Testing
---

Unit tests attached. Also tested locally.


Thanks,

Reuben Kuhnert



Review Request 33806: Add Tree traversal tools to ParseUtil class that allow for checking node structures with general predicate

2015-05-04 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33806/
---

Review request for hive and Sergio Pena.


Bugs: HIVE-10190
https://issues.apache.org/jira/browse/HIVE-10190


Repository: hive-git


Description
---

HIVE-10190: CBO: AST mode checks for TABLESAMPLE with 
AST.toString().contains(TOK_TABLESPLITSAMPLE)


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveCalciteUtil.java 
7614463525262f01375c1336e89a18670862bb7d 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseUtils.java 
373429cbf666f1b19828c532aea3c07f08f95e1a 

Diff: https://reviews.apache.org/r/33806/diff/


Testing
---

Tested locally


Thanks,

Reuben Kuhnert



Re: Review Request 33806: Add Tree traversal tools to ParseUtil class that allow for checking node structures with general predicate

2015-05-04 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33806/
---

(Updated May 4, 2015, 4:09 p.m.)


Review request for hive and Sergio Pena.


Changes
---

Replaced Java 8 Predicate with 
org.apache.hadoop.hive.ql.exec.PTFUtils.Predicate (Java 7)
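
For illustration, a Java 7-compatible predicate-based traversal might look like the sketch below. The Node and NodePredicate types are hypothetical stand-ins, not Hive's ASTNode or PTFUtils.Predicate, and the tree shape is made up:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of checking a tree for a node that matches a predicate.
class TreePredicateSketch {
  // Java 7-friendly stand-in for java.util.function.Predicate.
  interface NodePredicate {
    boolean apply(Node node);
  }

  static final class Node {
    final String token;
    final List<Node> children;

    Node(String token, Node... children) {
      this.token = token;
      this.children = Arrays.asList(children);
    }
  }

  // Depth-first search: true if root or any descendant satisfies the predicate.
  static boolean containsMatch(Node root, NodePredicate predicate) {
    if (predicate.apply(root)) {
      return true;
    }
    for (Node child : root.children) {
      if (containsMatch(child, predicate)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    Node tree = new Node("TOK_QUERY",
        new Node("TOK_FROM",
            new Node("TOK_TABREF", new Node("TOK_TABLESPLITSAMPLE"))));
    boolean sampled = containsMatch(tree, new NodePredicate() {
      @Override
      public boolean apply(Node node) {
        return "TOK_TABLESPLITSAMPLE".equals(node.token);
      }
    });
    System.out.println(sampled); // prints true
  }
}
```

Matching on node tokens avoids the false positives that a substring check on AST.toString() can produce, for example when an identifier happens to contain the token text.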


Bugs: HIVE-10190
https://issues.apache.org/jira/browse/HIVE-10190


Repository: hive-git


Description
---

HIVE-10190: CBO: AST mode checks for TABLESAMPLE with 
AST.toString().contains(TOK_TABLESPLITSAMPLE)


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveCalciteUtil.java 
7614463525262f01375c1336e89a18670862bb7d 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseUtils.java 
373429cbf666f1b19828c532aea3c07f08f95e1a 

Diff: https://reviews.apache.org/r/33806/diff/


Testing
---

Tested locally


Thanks,

Reuben Kuhnert



Review Request 33813: Change default for ignorenonexistent

2015-05-04 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33813/
---

Review request for hive and Sergio Pena.


Bugs: HIVE-6754
https://issues.apache.org/jira/browse/HIVE-6754


Repository: hive-git


Description
---

HIVE-6754: Inconsistent default for hive.exec.drop.ignorenonexistent


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
e138800e6dadd6fe76345f21eb76c906165c438d 

Diff: https://reviews.apache.org/r/33813/diff/


Testing
---


Thanks,

Reuben Kuhnert



[jira] [Created] (HIVE-10597) Relative path doesn't work with CREATE TABLE LOCATION 'relative/path'

2015-05-04 Thread Reuben Kuhnert (JIRA)
Reuben Kuhnert created HIVE-10597:
-

 Summary: Relative path doesn't work with CREATE TABLE LOCATION 
'relative/path'
 Key: HIVE-10597
 URL: https://issues.apache.org/jira/browse/HIVE-10597
 Project: Hive
  Issue Type: Bug
Reporter: Reuben Kuhnert
Assignee: Reuben Kuhnert
Priority: Critical


{code}
0: jdbc:hive2://a2110.halxg.cloudera.com:1000 CREATE EXTERNAL TABLE IF NOT 
EXISTS mydb.employees3 like mydb.employees LOCATION 'data/stock';
Error: Error while processing statement: FAILED: Execution Error, return code 1 
from org.apache.hadoop.hive.ql.exec.DDLTask. 
MetaException(message:java.lang.NullPointerException) (state=08S01,code=1)

0: jdbc:hive2://a2110.halxg.cloudera.com:1000 CREATE EXTERNAL TABLE IF NOT 
EXISTS mydb.employees3 like mydb.employees LOCATION '/user/hive/data/stock';
No rows affected (0.369 seconds)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 33816: HIVE-10597: Relative path doesn't work with CREATE TABLE LOCATION 'relative/path'

2015-05-04 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33816/
---

Review request for hive and Sergio Pena.


Bugs: HIVE-10597
https://issues.apache.org/jira/browse/HIVE-10597


Repository: hive-git


Description
---

Allow warehouse to work with relative locations.


Diffs
-

  metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 
25119abf97382df7c0615edbaff29ba20624a137 

Diff: https://reviews.apache.org/r/33816/diff/


Testing
---

Tested locally


Thanks,

Reuben Kuhnert



Review Request 33680: Create new hive-site property for supporting port configuration

2015-04-29 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33680/
---

Review request for hive and Sergio Pena.


Bugs: HIVE-9365
https://issues.apache.org/jira/browse/HIVE-9365


Repository: hive-git


Description
---

HIVE-9365: The Metastore should take port configuration from hive-site.xml


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
c9ee423cf73706feb2774dacca9fc7b94fe80617 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
3f267ff0eb20560c36a19b74353f9d6749c8b333 
  metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetastoreCli.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/33680/diff/


Testing
---

Unit tests attached. Also tested locally.


Thanks,

Reuben Kuhnert



Re: Review Request 33680: Create new hive-site property for supporting port configuration

2015-04-29 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33680/
---

(Updated April 29, 2015, 6:35 p.m.)


Review request for hive and Sergio Pena.


Changes
---

Updated with @spena's suggestions


Bugs: HIVE-9365
https://issues.apache.org/jira/browse/HIVE-9365


Repository: hive-git


Description
---

HIVE-9365: The Metastore should take port configuration from hive-site.xml


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
c9ee423cf73706feb2774dacca9fc7b94fe80617 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
3f267ff0eb20560c36a19b74353f9d6749c8b333 
  metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetastoreCli.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/33680/diff/


Testing
---

Unit tests attached. Also tested locally.


Thanks,

Reuben Kuhnert



Add to Developer List

2015-04-10 Thread Reuben Kuhnert
Hi, can I be added to the Hive Developer List? My Apache ID is
'sircodesalot'.

Thank you


Add to Contributor List

2015-03-30 Thread Reuben Kuhnert
Hi,

My name is Reuben Kuhnert (Engineer, Cloudera). I would like to be
added to the Hive contributor list if possible.

Thank you