[jira] [Created] (HIVE-10846) LLAP: preemption in AM due to failures / out of order scheduling

2015-05-27 Thread Siddharth Seth (JIRA)
Siddharth Seth created HIVE-10846:
-

 Summary: LLAP: preemption in AM due to failures / out of order 
scheduling
 Key: HIVE-10846
 URL: https://issues.apache.org/jira/browse/HIVE-10846
 Project: Hive
  Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Fix For: llap








[jira] [Created] (HIVE-10845) TezJobMonitor uses killedTaskCount instead of killedTaskAttemptCount

2015-05-27 Thread Siddharth Seth (JIRA)
Siddharth Seth created HIVE-10845:
-

 Summary: TezJobMonitor uses killedTaskCount instead of 
killedTaskAttemptCount
 Key: HIVE-10845
 URL: https://issues.apache.org/jira/browse/HIVE-10845
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Fix For: 1.2.1








Review Request 34757: HIVE-10844: Combine equivalent Works for HoS[Spark Branch]

2015-05-27 Thread chengxiang li

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34757/
---

Review request for hive and Xuefu Zhang.


Bugs: HIVE-10844
https://issues.apache.org/jira/browse/HIVE-10844


Repository: hive-git


Description
---

Some Hive queries (like TPCDS Q39) may share the same subquery, which is 
translated into separate but equivalent Works in SparkWork. Combining these 
equivalent Works into a single one would help them benefit from the subsequent 
dynamic RDD caching optimization.
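
For illustration, a minimal hedged sketch of the pattern (hypothetical table 
and column names): both branches below contain the same aggregation subquery, 
which currently compiles into two separate but equivalent Works in SparkWork.

{code}
-- The identical subquery appears twice; combining the two equivalent Works
-- lets the dynamic RDD caching optimization cache its result once.
SELECT a.item, a.total
FROM (SELECT item, SUM(qty) AS total FROM sales GROUP BY item) a
WHERE a.total > 100
UNION ALL
SELECT b.item, b.total
FROM (SELECT item, SUM(qty) AS total FROM sales GROUP BY item) b
WHERE b.total < 10;
{code}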


Diffs
-

  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/CombineEquivalentWorkResolver.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 19aae70 

Diff: https://reviews.apache.org/r/34757/diff/


Testing
---


Thanks,

chengxiang li



Revise docs for Hive indexing

2015-05-27 Thread Lefty Leverenz
Will Hive indexing ever be fixed?  If not, should we remove the doc I
cobbled together (Indexing) or just revise it?  And should the design doc
be moved from the Completed section to Incomplete (Hive Design Docs)?

What about bitmap indexes, do they work (Bitmap Indexes -- HIVE-1803)?

-- Lefty


Review Request 34752: Beeline-CLI: Implement CLI source command using Beeline functionality

2015-05-27 Thread cheng xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34752/
---

Review request for hive and Xuefu Zhang.


Bugs: HIVE-10821
https://issues.apache.org/jira/browse/HIVE-10821


Repository: hive-git


Description
---

Add source command support for CLI using beeline
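
For reference, a minimal usage sketch (hypothetical script path); the source 
command executes the HiveQL statements in the given file from within the CLI:

{code}
hive> source /tmp/init.sql;
{code}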


Diffs
-

  beeline/src/java/org/apache/hive/beeline/BeeLine.java 4a82635 
  beeline/src/test/org/apache/hive/beeline/cli/TestHiveCli.java cc0b598 

Diff: https://reviews.apache.org/r/34752/diff/


Testing
---

Newly created UT passed


Thanks,

cheng xu



Re: Review Request 34455: HIVE-10550 Dynamic RDD caching optimization for HoS.[Spark Branch]

2015-05-27 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34455/#review85509
---

Ship it!


Ship It!

- Xuefu Zhang


On May 28, 2015, 3:30 a.m., chengxiang li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34455/
> ---
> 
> (Updated May 28, 2015, 3:30 a.m.)
> 
> 
> Review request for hive, Chao Sun, Jimmy Xiang, and Xuefu Zhang.
> 
> 
> Bugs: HIVE-10550
> https://issues.apache.org/jira/browse/HIVE-10550
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> see jira description
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/CacheTran.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/MapTran.java 2170243 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ReduceTran.java e60dfac 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlan.java ee5c78a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 
> 3f240f5 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java 
> e6c845c 
> 
> Diff: https://reviews.apache.org/r/34455/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> chengxiang li
> 
>



Re: Review Request 34455: HIVE-10550 Dynamic RDD caching optimization for HoS.[Spark Branch]

2015-05-27 Thread Xuefu Zhang


> On May 27, 2015, 10:13 p.m., Xuefu Zhang wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 2062
> > 
> >
> > Sorry for pointing this out late. I'm not certain if it's a good idea 
> > to expose these two configurations. Also this introduces a change of  
> > behavior. For now, can we get rid of them and change the persistency level 
> > back to MEM+DISK?
> > 
> > We can come back to revisit this later on. At this moment, I don't feel 
> > confident to make the call.
> 
> chengxiang li wrote:
> Persisting to MEM + DISK may hurt performance in certain cases; I 
> think at least we should have a switch to enable/disable this optimization.
> 
> Xuefu Zhang wrote:
> Agreed. However, before we find out more about the cases in which this 
> helps or hurts, I think it's better we keep the existing behavior. This 
> doesn't prevent us from adding a flag later on.
> 
> chengxiang li wrote:
> OK, I will remove these configurations from the patch for now; we can 
> discuss later when we know more about it.

Please feel free to create a follow-up JIRA to do more research. We can try 
different data sizes and persistency levels to see the result. At that time, we 
can decide if it makes sense to introduce configurations. Thanks.
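
For context, a minimal standalone sketch (plain Spark Java API, not Hive code) 
of the persistency levels under discussion; MEMORY_ONLY versus MEMORY_AND_DISK 
is a one-line difference:

{code}
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.storage.StorageLevel;

public class PersistencyLevelDemo {
  public static void main(String[] args) {
    JavaSparkContext sc = new JavaSparkContext(
        new SparkConf().setAppName("persistency-demo").setMaster("local"));
    JavaRDD<Integer> rdd = sc.parallelize(Arrays.asList(1, 2, 3, 4));

    // MEMORY_AND_DISK spills partitions to disk instead of dropping them,
    // avoiding recomputation at the cost of disk I/O.
    rdd.persist(StorageLevel.MEMORY_AND_DISK());

    // MEMORY_ONLY (what rdd.cache() uses) would instead drop partitions
    // that don't fit and recompute them on access.
    System.out.println(rdd.count());
    sc.close();
  }
}
{code}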


- Xuefu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34455/#review85451
---


On May 28, 2015, 3:30 a.m., chengxiang li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34455/
> ---
> 
> (Updated May 28, 2015, 3:30 a.m.)
> 
> 
> Review request for hive, Chao Sun, Jimmy Xiang, and Xuefu Zhang.
> 
> 
> Bugs: HIVE-10550
> https://issues.apache.org/jira/browse/HIVE-10550
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> see jira description
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/CacheTran.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/MapTran.java 2170243 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ReduceTran.java e60dfac 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlan.java ee5c78a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 
> 3f240f5 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java 
> e6c845c 
> 
> Diff: https://reviews.apache.org/r/34455/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> chengxiang li
> 
>



Review Request 34754: NumberFormatException while running analyze table partition compute statistics query

2015-05-27 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34754/
---

Review request for hive and pengcheng xiong.


Bugs: HIVE-10840
https://issues.apache.org/jira/browse/HIVE-10840


Repository: hive-git


Description
---

NumberFormatException while running analyze table partition compute statistics 
query
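
For reference, a statement of the kind named in the summary (hypothetical 
table and partition):

{code}
ANALYZE TABLE web_logs PARTITION (ds='2015-05-27') COMPUTE STATISTICS;
{code}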


Diffs
-

  itests/src/test/resources/testconfiguration.properties ae03283 
  ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java ad481bc 
  ql/src/test/queries/clientpositive/stats_only_null.q a91022c 
  ql/src/test/results/clientpositive/tez/stats_only_null.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/34754/diff/


Testing
---

Modified existing test to increase its coverage.


Thanks,

Ashutosh Chauhan



[jira] [Created] (HIVE-10844) Combine equivalent Works for HoS[Spark Branch]

2015-05-27 Thread Chengxiang Li (JIRA)
Chengxiang Li created HIVE-10844:


 Summary: Combine equivalent Works for HoS[Spark Branch]
 Key: HIVE-10844
 URL: https://issues.apache.org/jira/browse/HIVE-10844
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li


Some Hive queries (like [TPCDS 
Q39|https://github.com/hortonworks/hive-testbench/blob/hive14/sample-queries-tpcds/query39.sql]) 
may share the same subquery, which is translated into separate but equivalent 
Works in SparkWork. Combining these equivalent Works into a single one would 
help them benefit from the subsequent dynamic RDD caching optimization.





Re: Review Request 34455: HIVE-10550 Dynamic RDD caching optimization for HoS.[Spark Branch]

2015-05-27 Thread chengxiang li

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34455/
---

(Updated May 28, 2015, 3:30 a.m.)


Review request for hive, Chao Sun, Jimmy Xiang, and Xuefu Zhang.


Changes
---

remove configs, and move common parent match logic in SparkPlanGenerator 
directly.


Bugs: HIVE-10550
https://issues.apache.org/jira/browse/HIVE-10550


Repository: hive-git


Description
---

see jira description


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/CacheTran.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/MapTran.java 2170243 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ReduceTran.java e60dfac 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlan.java ee5c78a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 
3f240f5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java e6c845c 

Diff: https://reviews.apache.org/r/34455/diff/


Testing
---


Thanks,

chengxiang li



Re: Review Request 34455: HIVE-10550 Dynamic RDD caching optimization for HoS.[Spark Branch]

2015-05-27 Thread chengxiang li


> On May 27, 2015, 10:13 p.m., Xuefu Zhang wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 2062
> > 
> >
> > Sorry for pointing this out late. I'm not certain if it's a good idea 
> > to expose these two configurations. Also this introduces a change of  
> > behavior. For now, can we get rid of them and change the persistency level 
> > back to MEM+DISK?
> > 
> > We can come back to revisit this later on. At this moment, I don't feel 
> > confident to make the call.
> 
> chengxiang li wrote:
> Persisting to MEM + DISK may hurt performance in certain cases; I 
> think at least we should have a switch to enable/disable this optimization.
> 
> Xuefu Zhang wrote:
> Agreed. However, before we find out more about the cases in which this 
> helps or hurts, I think it's better we keep the existing behavior. This 
> doesn't prevent us from adding a flag later on.

OK, I will remove these configurations from the patch for now; we can discuss 
later when we know more about it.


- chengxiang


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34455/#review85451
---


On May 27, 2015, 1:50 a.m., chengxiang li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34455/
> ---
> 
> (Updated May 27, 2015, 1:50 a.m.)
> 
> 
> Review request for hive, Chao Sun, Jimmy Xiang, and Xuefu Zhang.
> 
> 
> Bugs: HIVE-10550
> https://issues.apache.org/jira/browse/HIVE-10550
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> see jira description
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43c53fc 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/CacheTran.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/MapTran.java 2170243 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ReduceTran.java e60dfac 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlan.java ee5c78a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 
> 3f240f5 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java 
> e6c845c 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SparkRddCachingResolver.java
>  PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 
> 19aae70 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/SparkWork.java bb5dd79 
> 
> Diff: https://reviews.apache.org/r/34455/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> chengxiang li
> 
>



Re: Review Request 34447: HIVE-10761 : Create codahale-based metrics system for Hive

2015-05-27 Thread Xuefu Zhang


> On May 27, 2015, 9:29 p.m., Xuefu Zhang wrote:
> > common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/Metrics.java,
> >  line 141
> > 
> >
> > If the synchronized block is for the whole method, we might just as 
> > well declare the whole method as synchronized.
> 
> Szehon Ho wrote:
> In this context, I think object synchronization makes more sense than 
> synchronizing on the class (synchronized method).

I think they are equivalent. A synchronized method is synchronizing on "this". 
It will be on the class if the method is static.
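
To make the equivalence concrete, a small standalone sketch (hypothetical 
class): the first two methods below have identical locking behavior, both 
acquiring the monitor of "this"; only the static variant locks the Class 
object.

{code}
public class SyncEquivalence {
  private int counter;

  // Declaring the method synchronized locks the monitor of 'this'.
  public synchronized void incrementA() {
    counter++;
  }

  // A synchronized (this) block covering the whole body is equivalent.
  public void incrementB() {
    synchronized (this) {
      counter++;
    }
  }

  // Only a *static* synchronized method locks the Class object instead.
  public static synchronized void staticWork() {
    // locks SyncEquivalence.class
  }
}
{code}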


- Xuefu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34447/#review85418
---


On May 28, 2015, 2:11 a.m., Szehon Ho wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34447/
> ---
> 
> (Updated May 28, 2015, 2:11 a.m.)
> 
> 
> Review request for hive, Chao Sun, Jimmy Xiang, and Xuefu Zhang.
> 
> 
> Bugs: HIVE-10761
> https://issues.apache.org/jira/browse/HIVE-10761
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> See JIRA for the motivation.  Summary: There is an existing metrics system 
> that uses a custom model hooked up to JMX reporting; a codahale-based 
> metrics system is desirable for a standard model and reporting.
> 
> This adds a codahale-based metrics system to HiveServer2 and HiveMetastore.  
> The metrics implementation is now internally pluggable, and the existing 
> metrics system can be re-enabled by configuration if desired for backward 
> compatibility.
> 
> The following metrics are supported by the metrics system:
> 1.  JVMPauseMonitor (used to call Hadoop's internal implementation, now 
> forked off to integrate with the metrics system)
> 2.  HMS API calls
> 3.  Standard JVM metrics (only for the new implementation, as it's free with 
> codahale).
> 
> The following metrics reporting options are supported by the new system 
> (configuration exposed):
> 1.  JMX
> 2.  CONSOLE
> 3.  JSON_FILE (a periodic file of metrics that gets overwritten).
> 
> A goal is to add a webserver that exposes the JSON metrics, but this is 
> deferred to a later implementation.
> 
> 
> Diffs
> -
> 
>   common/pom.xml a615c1e 
>   common/src/java/org/apache/hadoop/hive/common/JvmPauseMonitor.java 
> PRE-CREATION 
>   common/src/java/org/apache/hadoop/hive/common/metrics/LegacyMetrics.java 
> PRE-CREATION 
>   common/src/java/org/apache/hadoop/hive/common/metrics/Metrics.java 01c9d1d 
>   common/src/java/org/apache/hadoop/hive/common/metrics/common/Metrics.java 
> PRE-CREATION 
>   
> common/src/java/org/apache/hadoop/hive/common/metrics/common/MetricsFactory.java
>  PRE-CREATION 
>   
> common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/CodahaleMetrics.java
>  PRE-CREATION 
>   
> common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/MetricsReporting.java
>  PRE-CREATION 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 49b8f97 
>   
> common/src/test/org/apache/hadoop/hive/common/metrics/TestLegacyMetrics.java 
> PRE-CREATION 
>   common/src/test/org/apache/hadoop/hive/common/metrics/TestMetrics.java 
> e85d3f8 
>   
> common/src/test/org/apache/hadoop/hive/common/metrics/metrics2/TestCodahaleMetrics.java
>  PRE-CREATION 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetaStoreMetrics.java
>  PRE-CREATION 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
> d81c856 
>   pom.xml b21d894 
>   service/src/java/org/apache/hive/service/server/HiveServer2.java 58e8e49 
>   shims/0.20S/src/main/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java 
> 6d8166c 
>   shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java 
> 19324b8 
>   shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java 
> 5a6bc44 
> 
> Diff: https://reviews.apache.org/r/34447/diff/
> 
> 
> Testing
> ---
> 
> New unit test added.  Manually tested.
> 
> 
> Thanks,
> 
> Szehon Ho
> 
>



Re: Review Request 34248: HIVE-10684 Fix the unit test failures for HIVE-7553 after HIVE-10674 removed the binary jar files

2015-05-27 Thread cheng xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34248/
---

(Updated May 28, 2015, 2:31 a.m.)


Review request for hive and Sushanth Sowmyan.


Bugs: HIVE-10684
https://issues.apache.org/jira/browse/HIVE-10684


Repository: hive-git


Description
---

Remove binaries from source and fix the failed cases


Diffs (updated)
-

  ql/src/test/org/apache/hadoop/hive/ql/session/TestSessionState.java 45ba07e 
  ql/src/test/resources/RefreshedJarClassV1.txt PRE-CREATION 
  ql/src/test/resources/RefreshedJarClassV2.txt PRE-CREATION 

Diff: https://reviews.apache.org/r/34248/diff/


Testing
---

UT passed


Thanks,

cheng xu



Re: Review Request 34447: HIVE-10761 : Create codahale-based metrics system for Hive

2015-05-27 Thread Szehon Ho


> On May 27, 2015, 9:29 p.m., Xuefu Zhang wrote:
> > common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/Metrics.java,
> >  line 141
> > 
> >
> > If the synchronized block is for the whole method, we might just as 
> > well declare the whole method as synchronized.

In this context, I think object synchronization makes more sense than 
synchronizing on the class (synchronized method).


- Szehon


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34447/#review85418
---


On May 28, 2015, 2:11 a.m., Szehon Ho wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34447/
> ---
> 
> (Updated May 28, 2015, 2:11 a.m.)
> 
> 
> Review request for hive, Chao Sun, Jimmy Xiang, and Xuefu Zhang.
> 
> 
> Bugs: HIVE-10761
> https://issues.apache.org/jira/browse/HIVE-10761
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> See JIRA for the motivation.  Summary: There is an existing metrics system 
> that uses a custom model hooked up to JMX reporting; a codahale-based 
> metrics system is desirable for a standard model and reporting.
> 
> This adds a codahale-based metrics system to HiveServer2 and HiveMetastore.  
> The metrics implementation is now internally pluggable, and the existing 
> metrics system can be re-enabled by configuration if desired for backward 
> compatibility.
> 
> The following metrics are supported by the metrics system:
> 1.  JVMPauseMonitor (used to call Hadoop's internal implementation, now 
> forked off to integrate with the metrics system)
> 2.  HMS API calls
> 3.  Standard JVM metrics (only for the new implementation, as it's free with 
> codahale).
> 
> The following metrics reporting options are supported by the new system 
> (configuration exposed):
> 1.  JMX
> 2.  CONSOLE
> 3.  JSON_FILE (a periodic file of metrics that gets overwritten).
> 
> A goal is to add a webserver that exposes the JSON metrics, but this is 
> deferred to a later implementation.
> 
> 
> Diffs
> -
> 
>   common/pom.xml a615c1e 
>   common/src/java/org/apache/hadoop/hive/common/JvmPauseMonitor.java 
> PRE-CREATION 
>   common/src/java/org/apache/hadoop/hive/common/metrics/LegacyMetrics.java 
> PRE-CREATION 
>   common/src/java/org/apache/hadoop/hive/common/metrics/Metrics.java 01c9d1d 
>   common/src/java/org/apache/hadoop/hive/common/metrics/common/Metrics.java 
> PRE-CREATION 
>   
> common/src/java/org/apache/hadoop/hive/common/metrics/common/MetricsFactory.java
>  PRE-CREATION 
>   
> common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/CodahaleMetrics.java
>  PRE-CREATION 
>   
> common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/MetricsReporting.java
>  PRE-CREATION 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 49b8f97 
>   
> common/src/test/org/apache/hadoop/hive/common/metrics/TestLegacyMetrics.java 
> PRE-CREATION 
>   common/src/test/org/apache/hadoop/hive/common/metrics/TestMetrics.java 
> e85d3f8 
>   
> common/src/test/org/apache/hadoop/hive/common/metrics/metrics2/TestCodahaleMetrics.java
>  PRE-CREATION 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetaStoreMetrics.java
>  PRE-CREATION 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
> d81c856 
>   pom.xml b21d894 
>   service/src/java/org/apache/hive/service/server/HiveServer2.java 58e8e49 
>   shims/0.20S/src/main/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java 
> 6d8166c 
>   shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java 
> 19324b8 
>   shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java 
> 5a6bc44 
> 
> Diff: https://reviews.apache.org/r/34447/diff/
> 
> 
> Testing
> ---
> 
> New unit test added.  Manually tested.
> 
> 
> Thanks,
> 
> Szehon Ho
> 
>



Re: Review Request 34447: HIVE-10761 : Create codahale-based metrics system for Hive

2015-05-27 Thread Szehon Ho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34447/
---

(Updated May 28, 2015, 2:11 a.m.)


Review request for hive, Chao Sun, Jimmy Xiang, and Xuefu Zhang.


Changes
---

Address review comments.


Bugs: HIVE-10761
https://issues.apache.org/jira/browse/HIVE-10761


Repository: hive-git


Description
---

See JIRA for the motivation.  Summary: There is an existing metrics system 
that uses a custom model hooked up to JMX reporting; a codahale-based metrics 
system is desirable for a standard model and reporting.

This adds a codahale-based metrics system to HiveServer2 and HiveMetastore.  
The metrics implementation is now internally pluggable, and the existing 
metrics system can be re-enabled by configuration if desired for backward 
compatibility.

The following metrics are supported by the metrics system:
1.  JVMPauseMonitor (used to call Hadoop's internal implementation, now forked 
off to integrate with the metrics system)
2.  HMS API calls
3.  Standard JVM metrics (only for the new implementation, as it's free with 
codahale).

The following metrics reporting options are supported by the new system 
(configuration exposed):
1.  JMX
2.  CONSOLE
3.  JSON_FILE (a periodic file of metrics that gets overwritten).

A goal is to add a webserver that exposes the JSON metrics, but this is 
deferred to a later implementation.
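
As a general illustration of the codahale (Dropwizard Metrics) model being 
adopted -- a minimal standalone sketch, not Hive's actual CodahaleMetrics 
class -- a registry holds named metrics, and reporters (JMX, console) publish 
them:

{code}
import java.util.concurrent.TimeUnit;

import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.JmxReporter;
import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.Timer;

public class CodahaleSketch {
  public static void main(String[] args) throws InterruptedException {
    MetricRegistry registry = new MetricRegistry();

    // Publish all registered metrics over JMX and to the console every 10s.
    JmxReporter.forRegistry(registry).build().start();
    ConsoleReporter.forRegistry(registry).build().start(10, TimeUnit.SECONDS);

    // Time a hypothetical API call; rates and latency percentiles come free.
    Timer apiCalls = registry.timer("hms.api_calls");
    Timer.Context ctx = apiCalls.time();
    Thread.sleep(50);  // stand-in for real work
    ctx.stop();
  }
}
{code}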


Diffs (updated)
-

  common/pom.xml a615c1e 
  common/src/java/org/apache/hadoop/hive/common/JvmPauseMonitor.java 
PRE-CREATION 
  common/src/java/org/apache/hadoop/hive/common/metrics/LegacyMetrics.java 
PRE-CREATION 
  common/src/java/org/apache/hadoop/hive/common/metrics/Metrics.java 01c9d1d 
  common/src/java/org/apache/hadoop/hive/common/metrics/common/Metrics.java 
PRE-CREATION 
  
common/src/java/org/apache/hadoop/hive/common/metrics/common/MetricsFactory.java
 PRE-CREATION 
  
common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/CodahaleMetrics.java
 PRE-CREATION 
  
common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/MetricsReporting.java
 PRE-CREATION 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 49b8f97 
  common/src/test/org/apache/hadoop/hive/common/metrics/TestLegacyMetrics.java 
PRE-CREATION 
  common/src/test/org/apache/hadoop/hive/common/metrics/TestMetrics.java 
e85d3f8 
  
common/src/test/org/apache/hadoop/hive/common/metrics/metrics2/TestCodahaleMetrics.java
 PRE-CREATION 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetaStoreMetrics.java
 PRE-CREATION 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
d81c856 
  pom.xml b21d894 
  service/src/java/org/apache/hive/service/server/HiveServer2.java 58e8e49 
  shims/0.20S/src/main/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java 
6d8166c 
  shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java 
19324b8 
  shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java 
5a6bc44 

Diff: https://reviews.apache.org/r/34447/diff/


Testing
---

New unit test added.  Manually tested.


Thanks,

Szehon Ho



Re: Review Request 34455: HIVE-10550 Dynamic RDD caching optimization for HoS.[Spark Branch]

2015-05-27 Thread Xuefu Zhang


> On May 27, 2015, 10:13 p.m., Xuefu Zhang wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 2062
> > 
> >
> > Sorry for pointing this out late. I'm not certain if it's a good idea 
> > to expose these two configurations. Also this introduces a change of  
> > behavior. For now, can we get rid of them and change the persistency level 
> > back to MEM+DISK?
> > 
> > We can come back to revisit this later on. At this moment, I don't feel 
> > confident to make the call.
> 
> chengxiang li wrote:
> Persisting to MEM + DISK may hurt performance in certain cases; I 
> think at least we should have a switch to enable/disable this optimization.

Agreed. However, before we find out more about the cases in which this helps 
or hurts, I think it's better we keep the existing behavior. This doesn't 
prevent us from adding a flag later on.


- Xuefu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34455/#review85451
---


On May 27, 2015, 1:50 a.m., chengxiang li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34455/
> ---
> 
> (Updated May 27, 2015, 1:50 a.m.)
> 
> 
> Review request for hive, Chao Sun, Jimmy Xiang, and Xuefu Zhang.
> 
> 
> Bugs: HIVE-10550
> https://issues.apache.org/jira/browse/HIVE-10550
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> see jira description
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43c53fc 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/CacheTran.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/MapTran.java 2170243 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ReduceTran.java e60dfac 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlan.java ee5c78a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 
> 3f240f5 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java 
> e6c845c 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SparkRddCachingResolver.java
>  PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 
> 19aae70 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/SparkWork.java bb5dd79 
> 
> Diff: https://reviews.apache.org/r/34455/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> chengxiang li
> 
>



[GitHub] hive pull request: HIVE-10843

2015-05-27 Thread thejasmn
GitHub user thejasmn opened a pull request:

https://github.com/apache/hive/pull/40

HIVE-10843



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/thejasmn/hive HIVE-10843

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/40.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #40


commit 9a99f25a0acc1d5b8e611fead6f5dffa985176e8
Author: Thejas Nair 
Date:   2015-05-27T18:12:52Z

show tables now passes the current db name

commit 574e3da1220500d1548d4b2431883db8a7da6028
Author: Thejas Nair 
Date:   2015-05-28T00:35:21Z

add db info in describe db command






[GitHub] hive pull request: Hive 10843

2015-05-27 Thread thejasmn
GitHub user thejasmn opened a pull request:

https://github.com/apache/hive/pull/39

Hive 10843



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/thejasmn/hive HIVE-10843

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/39.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #39


commit 9a99f25a0acc1d5b8e611fead6f5dffa985176e8
Author: Thejas Nair 
Date:   2015-05-27T18:12:52Z

show tables now passes the current db name

commit 574e3da1220500d1548d4b2431883db8a7da6028
Author: Thejas Nair 
Date:   2015-05-28T00:35:21Z

add db info in describe db command






[GitHub] hive pull request: Hive 10843

2015-05-27 Thread thejasmn
Github user thejasmn closed the pull request at:

https://github.com/apache/hive/pull/39




[jira] [Created] (HIVE-10843) desc database and show tables commands don't pass db to HiveAuthorizer check

2015-05-27 Thread Thejas M Nair (JIRA)
Thejas M Nair created HIVE-10843:


 Summary: desc database and show tables commands don't pass db to 
HiveAuthorizer check
 Key: HIVE-10843
 URL: https://issues.apache.org/jira/browse/HIVE-10843
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair


The 'show tables' and 'describe database' commands should pass the database 
information for the command to HiveAuthorizer. This is needed for any auditing 
the Hive authorizer might implement, or any authorization check it might 
decide to perform based on the given database name.
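
For illustration, a hedged fragment (not the actual patch) of what an 
authorizer could do once the database object is passed in; everything except 
the plugin interface types is hypothetical:

{code}
import java.util.List;

import org.apache.hadoop.hive.ql.security.authorization.plugin.HiveOperationType;
import org.apache.hadoop.hive.ql.security.authorization.plugin.HivePrivilegeObject;

public class AuthAuditSketch {
  // With this fix, a 'show tables' / 'describe database' call should surface
  // the target database among the input privilege objects, so an authorizer
  // can audit or check it here.
  static void auditDatabases(HiveOperationType opType,
                             List<HivePrivilegeObject> inputObjs) {
    for (HivePrivilegeObject obj : inputObjs) {
      System.out.println("audit: " + opType + " on db " + obj.getDbname());
    }
  }
}
{code}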






Re: Review Request 34455: HIVE-10550 Dynamic RDD caching optimization for HoS.[Spark Branch]

2015-05-27 Thread chengxiang li


> On May 27, 2015, 10:13 p.m., Xuefu Zhang wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 2062
> > 
> >
> > Sorry for pointing this out late. I'm not certain if it's a good idea 
> > to expose these two configurations. Also this introduces a change of  
> > behavior. For now, can we get rid of them and change the persistency level 
> > back to MEM+DISK?
> > 
> > We can come back to revisit this later on. At this moment, I don't feel 
> > confident to make the call.

Persisting to MEM + DISK may hurt performance in certain cases; I think at 
least we should have a switch to enable/disable this optimization.


- chengxiang


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34455/#review85451
---


On May 27, 2015, 1:50 a.m., chengxiang li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34455/
> ---
> 
> (Updated May 27, 2015, 1:50 a.m.)
> 
> 
> Review request for hive, Chao Sun, Jimmy Xiang, and Xuefu Zhang.
> 
> 
> Bugs: HIVE-10550
> https://issues.apache.org/jira/browse/HIVE-10550
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> see jira description
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43c53fc 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/CacheTran.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/MapTran.java 2170243 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ReduceTran.java e60dfac 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlan.java ee5c78a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 
> 3f240f5 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java 
> e6c845c 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SparkRddCachingResolver.java
>  PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 
> 19aae70 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/SparkWork.java bb5dd79 
> 
> Diff: https://reviews.apache.org/r/34455/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> chengxiang li
> 
>



[jira] [Created] (HIVE-10842) LLAP: DAGs get stuck in yet another way

2015-05-27 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-10842:
---

 Summary: LLAP: DAGs get stuck in yet another way
 Key: HIVE-10842
 URL: https://issues.apache.org/jira/browse/HIVE-10842
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin


Looks exactly like HIVE-10744. The last comment there has internal app IDs; 
logs available upon request.
6 tasks (the number of slots) from one machine are stuck.
jstack for the target daemon sayeth:
{noformat}
Found one Java-level deadlock:
==============================

"IPC Server handler 4 on 15001":
  waiting to lock Monitor@0x7f3cb0005cb8 (Object@0x8cc3ce98, a java/lang/Object),
  which is held by "Wait-Queue-Scheduler-0"
"Wait-Queue-Scheduler-0":
  waiting to lock Monitor@0x7f3cb0004d98 (Object@0x9234cf58, a org/apache/hadoop/hive/llap/daemon/impl/QueryInfo$FinishableStateTracker),
  which is held by "IPC Server handler 4 on 15001"
{noformat}
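
The jstack above shows the classic two-lock cycle. A minimal standalone Java 
sketch of the same shape (hypothetical, unrelated to the LLAP code): each 
thread holds one monitor while waiting for the other's.

{code}
public class DeadlockShape {
  private static final Object lockA = new Object();  // e.g. the handler lock
  private static final Object lockB = new Object();  // e.g. the state tracker

  public static void main(String[] args) {
    new Thread(() -> {
      synchronized (lockA) {
        pause();                    // give main time to acquire lockB
        synchronized (lockB) { }    // blocks: lockB is held by main
      }
    }).start();
    synchronized (lockB) {
      pause();                      // give the thread time to acquire lockA
      synchronized (lockA) { }      // blocks: lockA is held by the thread
    }
  }

  private static void pause() {
    try { Thread.sleep(100); } catch (InterruptedException ignored) { }
  }
}
{code}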





Re: How to debug hive unit test in eclipse ?

2015-05-27 Thread Lefty Leverenz
This is great -- thanks Bob!

Would you be willing to contribute it to the Hive wiki, or at least allow
us to link to it from the Testing Docs overview
?

-- Lefty


On Mon, May 25, 2015 at 12:02 PM, Bob Freitas 
wrote:

> Hi Jeff,
>
> I recently needed to figure out how to do unit testing of Hive scripts, and
> it turned out to be something of an adventure.  I had done some previous
> work in this area but things have changed with MR2 and YARN, gee go
> figure...
>
> What I ended up doing was going through the Hive source code to figure out
> how the dev team was doing the testing.  To help out people who come after
> me, I put together an article and github repo
>
> http://www.lopakalogic.com/articles/hadoop-articles/hive-testing/
>
> With this I was able to step through my script, the Hadoop code, the Hive
> code, it was pretty cool!
>
> Hope it helps!
>


Re: Big Lock in Driver.compileInternal

2015-05-27 Thread Sergey Shelukhin
Hi. As luck would have it, we are currently looking at this issue :)
I have a small patch up at
https://issues.apache.org/jira/browse/HIVE-4239; I tested it a bit w/a
unit test and some manual cluster testing. Would you be willing to test it
on your setup?

On 15/5/25, 20:54, "Loudongfeng"  wrote:

>Hi, All
>
>I notice that there is a big lock in org.apache.hadoop.hive.ql.Driver
>Following is a piece of code from Apache Hive 1.2.0
>
>private static final Object compileMonitor = new Object();
>
>private int compileInternal(String command) {
>  int ret;
>  synchronized (compileMonitor) {
>    ret = compile(command);
>  }
>  ...
>}
>
>This means HQLs submitted concurrently from the client side will be compiled
>one by one on the HiveServer side.
>This will cause problems when the compile phase is slow.
>
>My question is, what does this lock protect? Is it possible to remove it?
>
>Best Regards
>Nemon



Re: [VOTE] Stable releases from branch-1 and experimental releases from master

2015-05-27 Thread Lefty Leverenz
+1

-- Lefty

On Wed, May 27, 2015 at 3:21 PM, Alexander Pivovarov 
wrote:

> +1
> On May 27, 2015 10:45 AM, "Vikram Dixit K"  wrote:
>
> > +1 for all the reasons outlined.
> >
> > On Tue, May 26, 2015 at 6:13 PM, Thejas Nair 
> > wrote:
> > > +1
> > > - This is great for users who want to take longer to upgrade from
> > > hadoop-1 and care mainly for bug fixes and incremental features,
> > > rather than radical new features.
> > > - The ability to release initial 2.x releases marked as alpha/beta
> > > also helps to get users to try it out, and also lets them choose what
> > > is right for them.
> > > - This also lets developers focus on major new features without the
> > > burden of maintaining hadoop-1 compatibility.
> > >
> > > On Tue, May 26, 2015 at 11:41 AM, Alan Gates 
> > wrote:
> > >> We have discussed this for several weeks now.  Some concerns have been
> > >> raised which I have tried to address.  I think it is time to vote on
> it
> > as
> > >> our release plan.  To be specific, I propose:
> > >>
> > >> Hive makes a branch-1 from the current master.  This would be used for
> > 1.3
> > >> and future 1.x releases.  This branch would not deprecate existing
> > >> functionality.  Any new features in this branch would also need to be
> > put on
> > >> master.  An upgrade path for users will be maintained from one 1.x
> > release
> > >> to the next, as well as from the latest 1.x release to the latest 2.x
> > >> release.
> > >>
> > >> Going forward releases numbered 2.x will be made from master.  The
> > purpose
> > >> of these releases will be to enable users to get access to new
> features
> > >> being developed in Hive and allow developers to get feedback.  It is
> > >> expected that for a while these releases will not be production ready
> > and
> > >> will be clearly so labeled.  Some legacy features, such as Hadoop 1
> and
> > >> MapReduce, will no longer be supported in the master.  Any critical
> bug
> > >> fixes (security, incorrect results, crashes) fixed in master will also
> > be
> > >> ported to branch-1 for at least a year.  This time period may be
> > extended in
> > >> the future based on the stability and adoption of 2.x releases.
> > >>
> > >> Based on Hive's bylaws this release plan vote will be open for 3 days
> > and
> > >> all active committers have binding votes.
> > >>
> > >> Here's my +1.
> > >>
> > >> Alan.
> >
> >
> >
> > --
> > Nothing better than when appreciated for hard work.
> > -Mark
> >
>


[jira] [Created] (HIVE-10841) [WHERE col is not null] does not work for large queries

2015-05-27 Thread Alexander Pivovarov (JIRA)
Alexander Pivovarov created HIVE-10841:
--

 Summary: [WHERE col is not null] does not work for large queries
 Key: HIVE-10841
 URL: https://issues.apache.org/jira/browse/HIVE-10841
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Alexander Pivovarov


The result of the following SELECT query is 3 rows, but it should be 1 row.
I checked it in MySQL - it returned 1 row.

To reproduce the issue in Hive
1. prepare tables
{code}
drop table if exists L;
drop table if exists LA;
drop table if exists FR;
drop table if exists A;
drop table if exists PI;
drop table if exists acct;

create table L as select 4436 id;
create table LA as select 4436 loan_id, 4748 aid, 4415 pi_id;
create table FR as select 4436 loan_id;
create table A as select 4748 id;
create table PI as select 4415 id;

create table acct as select 4748 aid, 10 acc_n, 122 brn;
insert into table acct values(4748, null, null);
insert into table acct values(4748, null, null);
{code}

2. run SELECT query
{code}
select
  acct.ACC_N,
  acct.brn
FROM L
JOIN LA ON L.id = LA.loan_id
JOIN FR ON L.id = FR.loan_id
JOIN A ON LA.aid = A.id
JOIN PI ON PI.id = LA.pi_id
JOIN acct ON A.id = acct.aid
WHERE
  L.id = 4436
  and acct.brn is not null;
{code}

the result is 3 rows
{code}
10      122
NULL    NULL
NULL    NULL
{code}

but it should be 1 row

{code}
10  122
{code}

3. workaround is to put "acct.brn is not null" to join condition
{code}
select
  acct.ACC_N,
  acct.brn
FROM L
JOIN LA ON L.id = LA.loan_id
JOIN FR ON L.id = FR.loan_id
JOIN A ON LA.aid = A.id
JOIN PI ON PI.id = LA.pi_id
JOIN acct ON A.id = acct.aid and acct.brn is not null
WHERE
  L.id = 4436;
{code}





JIRA: sort attachments by date

2015-05-27 Thread Lefty Leverenz
Is there any way to change the default for JIRA attachments to "Sort By
Date" instead of "Sort By Name"?

"Manage Attachments" doesn't have anything useful.

-- Lefty


[jira] [Created] (HIVE-10840) NumberFormatException while running analyze table partition compute statistics query

2015-05-27 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-10840:
---

 Summary: NumberFormatException while running analyze table 
partition compute statistics query
 Key: HIVE-10840
 URL: https://issues.apache.org/jira/browse/HIVE-10840
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Affects Versions: 1.2.0
Reporter: Jagruti Varia
Assignee: Ashutosh Chauhan








[jira] [Created] (HIVE-10839) TestHCatLoaderEncryption.* tests fail on Windows because of path-related issues

2015-05-27 Thread Hari Sankar Sivarama Subramaniyan (JIRA)
Hari Sankar Sivarama Subramaniyan created HIVE-10839:


 Summary: TestHCatLoaderEncryption.* tests fail on Windows because 
of path-related issues
 Key: HIVE-10839
 URL: https://issues.apache.org/jira/browse/HIVE-10839
 Project: Hive
  Issue Type: Bug
  Components: Tests
 Environment: Windows OS
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan


I am getting the following errors while trying to run 
org.apache.hive.hcatalog.pig.TestHCatLoaderEncryption.* tests on Windows.

{code}
Encryption key created: 'key_128'
(1,Encryption Processor Helper Failed:Pathname 
/D:/w/hv/hcatalog/hcatalog-pig-adapter/target/tmp/org.apache.hive.hcatalog.pig.TestHCatLoader-1432579852919/warehouse/encryptedTable
 from 
D:/w/hv/hcatalog/hcatalog-pig-adapter/target/tmp/org.apache.hive.hcatalog.pig.TestHCatLoader-1432579852919/warehouse/encryptedTable
 is not a valid DFS filename.,null)
Encryption key deleted: 'key_128'
{code}

{code}
Error Message

Could not fully delete 
D:\w\hv\hcatalog\hcatalog-pig-adapter\target\tmp\dfs\name1
Stacktrace

java.io.IOException: Could not fully delete 
D:\w\hv\hcatalog\hcatalog-pig-adapter\target\tmp\dfs\name1
at 
org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:940)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:811)
at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:742)
at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:612)
at 
org.apache.hadoop.hive.shims.Hadoop23Shims.getMiniDfs(Hadoop23Shims.java:523)
at 
org.apache.hive.hcatalog.pig.TestHCatLoaderEncryption.initEncryptionShim(TestHCatLoaderEncryption.java:242)
at 
org.apache.hive.hcatalog.pig.TestHCatLoaderEncryption.setup(TestHCatLoaderEncryption.java:190)
{code}





[jira] [Created] (HIVE-10838) Allow Hive metastore client to use a different hostname when it has multiple hostnames and security is enabled

2015-05-27 Thread HeeSoo Kim (JIRA)
HeeSoo Kim created HIVE-10838:
-

 Summary: Allow Hive metastore client to use a different hostname 
when it has multiple hostnames and security is enabled
 Key: HIVE-10838
 URL: https://issues.apache.org/jira/browse/HIVE-10838
 Project: Hive
  Issue Type: Task
Reporter: HeeSoo Kim
Assignee: HeeSoo Kim


Currently, if a Hive metastore client (e.g. HS2, Oozie) tries to connect to the 
Hive metastore when security is enabled, the client will fail to connect with 
an error like the following:
{code}
2015-05-21 23:17:59,554 ERROR metadata.Hive 
(Hive.java:getDelegationToken(2638)) - MetaException(message:Unauthorized 
connection for super-user: 
hiveserver/hiveserver-dpci.s3s.altiscale@test.altiscale.com from IP 
10.250.16.43)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_delegation_token_result$get_delegation_token_resultStandardScheme.read(ThriftHiveMetastore.java)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_delegation_token_result$get_delegation_token_resultStandardScheme.read(ThriftHiveMetastore.java)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_delegation_token_result.read(ThriftHiveMetastore.java)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_delegation_token(ThriftHiveMetastore.java:3293)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_delegation_token(ThriftHiveMetastore.java:3279)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDelegationToken(HiveMetaStoreClient.java:1559)
{code}
This is the case when the Hive metastore client's default IP address is 
different from the hostname in the client's Kerberos principal, and the client 
has multiple IP addresses. We need to set the bind address, based on the 
hostname of the Kerberos principal, when the Hive metastore client connects to 
the Hive metastore.
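
For illustration, a minimal sketch of binding the client side of a connection 
to a chosen local address before connecting (plain java.net, hypothetical 
hostnames), which is the kind of control the description asks for:

{code}
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;

public class BindAddressSketch {
  public static void main(String[] args) throws Exception {
    // Hypothetical: the hostname from the client's Kerberos principal, which
    // may differ from the machine's default outbound interface.
    InetAddress local = InetAddress.getByName("hiveserver.example.com");

    Socket socket = new Socket();
    socket.bind(new InetSocketAddress(local, 0));  // choose the source IP
    socket.connect(new InetSocketAddress("metastore.example.com", 9083));
    System.out.println("connected from " + socket.getLocalAddress());
    socket.close();
  }
}
{code}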






Re: [VOTE] Stable releases from branch-1 and experimental releases from master

2015-05-27 Thread Alexander Pivovarov
+1
On May 27, 2015 10:45 AM, "Vikram Dixit K"  wrote:

> +1 for all the reasons outlined.
>
> On Tue, May 26, 2015 at 6:13 PM, Thejas Nair 
> wrote:
> > +1
> > - This is great for users who want to take longer to upgrade from
> > hadoop-1 and care mainly for bug fixes and incremental features,
> > rather than radical new features.
> > - The ability to release initial 2.x releases marked as alpha/beta
> > also helps to get users to try it out, and also lets them choose what
> > is right for them.
> > - This also lets developers focus on major new features without the
> > burden of maintaining hadoop-1 compatibility.
> >
> > On Tue, May 26, 2015 at 11:41 AM, Alan Gates 
> wrote:
> >> We have discussed this for several weeks now.  Some concerns have been
> >> raised which I have tried to address.  I think it is time to vote on it
> as
> >> our release plan.  To be specific, I propose:
> >>
> >> Hive makes a branch-1 from the current master.  This would be used for
> 1.3
> >> and future 1.x releases.  This branch would not deprecate existing
> >> functionality.  Any new features in this branch would also need to be
> put on
> >> master.  An upgrade path for users will be maintained from one 1.x
> release
> >> to the next, as well as from the latest 1.x release to the latest 2.x
> >> release.
> >>
> >> Going forward releases numbered 2.x will be made from master.  The
> purpose
> >> of these releases will be to enable users to get access to new features
> >> being developed in Hive and allow developers to get feedback.  It is
> >> expected that for a while these releases will not be production ready
> and
> >> will be clearly so labeled.  Some legacy features, such as Hadoop 1 and
> >> MapReduce, will no longer be supported in the master.  Any critical bug
> >> fixes (security, incorrect results, crashes) fixed in master will also
> be
> >> ported to branch-1 for at least a year.  This time period may be
> extended in
> >> the future based on the stability and adoption of 2.x releases.
> >>
> >> Based on Hive's bylaws this release plan vote will be open for 3 days
> and
> >> all active committers have binding votes.
> >>
> >> Here's my +1.
> >>
> >> Alan.
>
>
>
> --
> Nothing better than when appreciated for hard work.
> -Mark
>


Re: Review Request 34455: HIVE-10550 Dynamic RDD caching optimization for HoS.[Spark Branch]

2015-05-27 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34455/#review85451
---



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


Sorry for pointing this out late. I'm not certain if it's a good idea to 
expose these two configurations. Also this introduces a change of  behavior. 
For now, can we get rid of them and change the persistency level back to 
MEM+DISK?

We can come back to revisit this later on. At this moment, I don't feel 
confident to make the call.


- Xuefu Zhang


On May 27, 2015, 1:50 a.m., chengxiang li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34455/
> ---
> 
> (Updated May 27, 2015, 1:50 a.m.)
> 
> 
> Review request for hive, Chao Sun, Jimmy Xiang, and Xuefu Zhang.
> 
> 
> Bugs: HIVE-10550
> https://issues.apache.org/jira/browse/HIVE-10550
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> see jira description
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43c53fc 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/CacheTran.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/MapTran.java 2170243 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ReduceTran.java e60dfac 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlan.java ee5c78a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 
> 3f240f5 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java 
> e6c845c 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SparkRddCachingResolver.java
>  PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 
> 19aae70 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/SparkWork.java bb5dd79 
> 
> Diff: https://reviews.apache.org/r/34455/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> chengxiang li
> 
>



Re: Review Request 34447: HIVE-10761 : Create codahale-based metrics system for Hive

2015-05-27 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34447/#review85418
---



common/src/java/org/apache/hadoop/hive/common/JvmPauseMonitor.java


Maybe a more informational message



common/src/java/org/apache/hadoop/hive/common/JvmPauseMonitor.java


should we check isStarted()?



common/src/java/org/apache/hadoop/hive/common/metrics/MetricsLegacy.java


LegacyMetrics?



common/src/java/org/apache/hadoop/hive/common/metrics/common/MetricsFactory.java


This should be also synchronized.



common/src/java/org/apache/hadoop/hive/common/metrics/common/MetricsFactory.java


Should we call it deinit()?



common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/Metrics.java


Could we rename the class so that we don't have to handle the duplicated 
class/interface names?



common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/Metrics.java


Could we rename the class so that we don't have to handle the duplicated 
class/interface names?



common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/Metrics.java


If the synchronized block is for the whole method, we might just as well 
declare the whole method as synchronized.



common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/Metrics.java


Same as above.



common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/Metrics.java


Shouldn't this be private?



common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/Metrics.java


I think fd needs to be closed properly in a finally block.
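
A minimal sketch of the pattern being requested (hypothetical method and 
variable names): the stream is closed in a finally block so an exception 
mid-write cannot leak the descriptor.

{code}
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;

public class CloseInFinallySketch {
  static void dumpJson(String path, String json) throws IOException {
    BufferedWriter fd = new BufferedWriter(new FileWriter(path));
    try {
      fd.write(json);
    } finally {
      fd.close();  // always runs, even if write() throws
    }
  }
}
{code}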



common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/Metrics.java


I think checking initialized needs to be synchronized.



common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/Metrics.java


Same as above.



metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java


Where do we call uninit() or it doesn't matter? Same for HS2.


- Xuefu Zhang


On May 27, 2015, 6:25 p.m., Szehon Ho wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34447/
> ---
> 
> (Updated May 27, 2015, 6:25 p.m.)
> 
> 
> Review request for hive, Chao Sun, Jimmy Xiang, and Xuefu Zhang.
> 
> 
> Bugs: HIVE-10761
> https://issues.apache.org/jira/browse/HIVE-10761
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> See JIRA for the motivation.  Summary: There is an existing metrics system 
> that uses a custom model hooked up to JMX reporting; a codahale-based 
> metrics system is desirable for a standard model and reporting.
> 
> This adds a codahale-based metrics system to HiveServer2 and HiveMetastore.  
> The metrics implementation is now internally pluggable, and the existing 
> metrics system can be re-enabled by configuration if desired for backward 
> compatibility.
> 
> The following metrics are supported by the metrics system:
> 1.  JVMPauseMonitor (used to call Hadoop's internal implementation, now 
> forked off to integrate with the metrics system)
> 2.  HMS API calls
> 3.  Standard JVM metrics (only for the new implementation, as it's free with 
> codahale).
> 
> The following metrics reporting options are supported by the new system 
> (configuration exposed):
> 1.  JMX
> 2.  CONSOLE
> 3.  JSON_FILE (a periodic file of metrics that gets overwritten).
> 
> A goal is to add a webserver that exposes the JSON metrics, but this is 
> deferred to a later implementation.
> 
> 
> Diffs
> -
> 
>   common/pom.xml a615c1e 
>   common/src/java/org/apache/hadoop/hive/common/JvmPauseMonitor.java 
> PRE-CREATION 
>   common/src/java/org/apache/hadoop/hive/common/metrics/Metrics.java 01c9d1d 
>   common/src/java/org/apache/hadoop/hive/common/metrics/MetricsLegacy.java 
> PRE-CREATION 
>   common/src/java/org/apache/hadoop/hive/common/metrics/common/Metrics.java 
> PRE-CREATION 
>   
> common/src/java/org/apache/hadoop/hive/common/metrics/common/MetricsFactory.java
>  PRE-CREATION 
>   common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/Metrics.java 
> PRE-

[jira] [Created] (HIVE-10837) Running large queries (inserts) fails and crashes hiveserver2

2015-05-27 Thread Patrick McAnneny (JIRA)
Patrick McAnneny created HIVE-10837:
---

 Summary: Running large queries (inserts) fails and crashes 
hiveserver2
 Key: HIVE-10837
 URL: https://issues.apache.org/jira/browse/HIVE-10837
 Project: Hive
  Issue Type: Bug
 Environment: Hive 1.1.0 on RHEL with Cloudera (cdh5.4.0)
Reporter: Patrick McAnneny
Priority: Critical


When running a large insert statement through beeline or pyhs2, a thrift error 
is returned and hiveserver2 crashes.

I ran into this with large insert statements -- my initial failing query was 
around 6 million characters. After further testing, however, it seems the 
failure threshold is based on the number of inserted rows rather than the 
query's size in characters. My testing puts the failure threshold between 
199,000 and 230,000 inserted rows.

The thrift error is as follows:

Error: org.apache.thrift.transport.TTransportException: 
java.net.SocketException: Broken pipe (state=08S01,code=0)


Also note for anyone that tests this issue - when testing different queries I 
ran into https://issues.apache.org/jira/browse/HIVE-10836






Re: Review Request 34696: HIVE-686 add UDF substring_index

2015-05-27 Thread Swarnim Kulkarni


> On May 27, 2015, 4:42 a.m., Swarnim Kulkarni wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSubstringIndex.java,
> >  line 45
> > 
> >
> > Worth mentinoning in your example what the expected output would look 
> > like?
> 
> Alexander Pivovarov wrote:
> Not sure I got the issue...
> 
> --- desc output
> hive> desc function extended substring_index;
> OK
> ...
> Example:
>  > SELECT substring_index('www.apache.org', '.', 2);
>  'www.apache'
> 
> 
> -- actual select
> hive> SELECT substring_index('www.apache.org', '.', 2);
> OK
> www.apache
> 
> Swarnim Kulkarni wrote:
> My point was just: why not also include a sample result that users could 
> expect to see after the command is executed? It might improve the 
> readability a bit.
> 
> Alexander Pivovarov wrote:
> It's included. The result is 'www.apache' - right after the \n symbol

Ah ok. Sorry missed that.


- Swarnim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34696/#review85318
---


On May 27, 2015, 3:35 a.m., Alexander Pivovarov wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34696/
> ---
> 
> (Updated May 27, 2015, 3:35 a.m.)
> 
> 
> Review request for hive, Hao Cheng, Jason Dere, namit jain, and Thejas Nair.
> 
> 
> Bugs: HIVE-686
> https://issues.apache.org/jira/browse/HIVE-686
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-686 add UDF substring_index
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 
> 94a3b1787e2b3571eb7a8102c28f7334ae3fa829 
>   
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSubstringIndex.java
>  PRE-CREATION 
>   
> ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFSubstringIndex.java
>  PRE-CREATION 
>   ql/src/test/queries/clientpositive/udf_substring_index.q PRE-CREATION 
>   ql/src/test/results/clientpositive/show_functions.q.out 
> 16820ca887320da13a42bebe0876f29eec373c8f 
>   ql/src/test/results/clientpositive/udf_substring_index.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/34696/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Alexander Pivovarov
> 
>
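
As a side note for readers of this thread, the MySQL-style semantics being 
documented above (a positive count returns everything before the count-th 
occurrence of the delimiter) can be sketched roughly as follows. This is an 
illustration only, not the actual GenericUDFSubstringIndex implementation, 
and it handles positive counts only:

{code}
// Sketch of MySQL-style substring_index semantics for positive counts;
// illustrative only, not the actual GenericUDFSubstringIndex code.
public class SubstringIndexSketch {
  public static String substringIndex(String str, String delim, int count) {
    if (count <= 0 || delim.isEmpty()) {
      return "";  // negative counts (count from the right) omitted here
    }
    int idx = -1;
    for (int i = 0; i < count; i++) {
      idx = str.indexOf(delim, idx + 1);
      if (idx < 0) {
        return str;  // fewer than 'count' delimiters: return whole string
      }
    }
    return str.substring(0, idx);
  }

  public static void main(String[] args) {
    // Prints "www.apache", matching the example in the thread.
    System.out.println(substringIndex("www.apache.org", ".", 2));
  }
}
{code}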



Re: Review Request 34576: Bucketized Table feature fails in some cases

2015-05-27 Thread John Pullokkaran


> On May 24, 2015, 2:03 a.m., Xuefu Zhang wrote:
> > Have you thought of what if the client is not interactive, such as JDBC or 
> > thrift?
> 
> pengcheng xiong wrote:
> I am sorry that we have not thought about it yet. We admit that the 
> patch will not cover the case when the client is not interactive. Do you have 
> any good ideas that you can share with us? Do you think logging this, besides 
> printing a warning msg, is good enough? Thanks.
> 
> Xuefu Zhang wrote:
> There are all kinds of issues with data loading into bucketed tables. 
> While advanced users might be able to load data correctly, I think that's 
> really rare. The data in a bucketed table needs to be generated by Hive. 
> Therefore, I think we should disable "insert into" and "load data 
> into|overwrite" for a bucketed table. We should also disallow external tables 
> for the same reason.
> 
> To allow the advanced user to achieve what they used to do, we can have a 
> flag, such as "hive.enforce.strict.bucketing", which defaults to true. Those 
> users can proceed by turning this off.
> 
> Another option for "insert into" would be supporting appending new data, 
> such as proposed in HIVE-3244.
> 
> Gopal V wrote:
> Why would you disable "insert into" bucketed tables? How else would ACID 
> work?
> 
> Xuefu Zhang wrote:
> Yeah, but I guess we were talking about things out of the context of 
> ACID. Even before ACID, a user can do "insert into" a bucketed table, which can 
> be very harmful.

This patch is only addressing the "Load" path, which I think we all agree is a 
problem.


- John


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34576/#review85082
---


On May 23, 2015, 5:47 p.m., pengcheng xiong wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34576/
> ---
> 
> (Updated May 23, 2015, 5:47 p.m.)
> 
> 
> Review request for hive and John Pullokkaran.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The bucketized table feature fails in some cases. If src & destination are 
> bucketed on the same key, and if the actual data in the src is not bucketed 
> (because the data got loaded using LOAD DATA LOCAL INPATH), then the data 
> won't be bucketed while writing to the destination.
> Example
> --
> CREATE TABLE P1(key STRING, val STRING)
> CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
> LOAD DATA LOCAL INPATH '/Users/jp/apache-hive1/data/files/P1.txt' INTO TABLE 
> P1;
> – perform an insert to make sure there are 2 files
> INSERT OVERWRITE TABLE P1 select key, val from P1;
> --
> This is not a regression. This has never worked.
> It was only discovered due to Hadoop2 changes.
> In Hadoop1, in local mode, the number of reducers will always be 1, regardless 
> of what is requested by the app. Hadoop2 now honors the requested number of 
> reducers in local mode (by spawning threads).
> The long-term solution seems to be to prevent load data for bucketed tables.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java e53933e 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
> 1a9b42b 
>   ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out 623c2e8 
>   
> ql/src/test/results/clientnegative/bucket_mapjoin_wrong_table_metadata_1.q.out
>  f4522d2 
>   
> ql/src/test/results/clientnegative/bucket_mapjoin_wrong_table_metadata_2.q.out
>  9aa9b5d 
>   ql/src/test/results/clientnegative/exim_11_nonpart_noncompat_sorting.q.out 
> 9220c8e 
>   ql/src/test/results/clientpositive/auto_join32.q.out bfc8be8 
>   ql/src/test/results/clientpositive/auto_join_filters.q.out a6720d9 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out 383defd 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out e6e7ef3 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out e9fb705 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out c089419 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out 6e443fa 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out feaea04 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out f64ecf0 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out e89f548 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out 44c037f 
>   ql/src/test/results/clientpositive/bucket_map_join_1.q.out d778203 
>   ql/src/test/results/clientpositive/bucket_map_join_2.q.out aef77aa 
>   ql/src/test/results/clientpositive/bucket_map_join_spark1.q.out 870ecdd 
>   ql/src/test/results/clientpositive/buck

Re: Review Request 34576: Bucketized Table feature fails in some cases

2015-05-27 Thread John Pullokkaran


> On May 24, 2015, 1:50 a.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java, line 
> > 226
> > 
> >
> > A warning is proper, but I think the words should say "might", because the 
> > source data might already be bucketed and match the target, in which 
> > case there is no problem.

The Load command doesn't exercise bucketing. IMO "will not" is correct.


- John


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34576/#review85081
---


On May 23, 2015, 5:47 p.m., pengcheng xiong wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34576/
> ---
> 
> (Updated May 23, 2015, 5:47 p.m.)
> 
> 
> Review request for hive and John Pullokkaran.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The bucketized table feature fails in some cases. If src & destination are 
> bucketed on the same key, and if the actual data in the src is not bucketed 
> (because the data got loaded using LOAD DATA LOCAL INPATH), then the data 
> won't be bucketed while writing to the destination.
> Example
> --
> CREATE TABLE P1(key STRING, val STRING)
> CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
> LOAD DATA LOCAL INPATH '/Users/jp/apache-hive1/data/files/P1.txt' INTO TABLE 
> P1;
> – perform an insert to make sure there are 2 files
> INSERT OVERWRITE TABLE P1 select key, val from P1;
> --
> This is not a regression. This has never worked.
> It was only discovered due to Hadoop2 changes.
> In Hadoop1, in local mode, the number of reducers will always be 1, regardless 
> of what is requested by the app. Hadoop2 now honors the requested number of 
> reducers in local mode (by spawning threads).
> The long-term solution seems to be to prevent load data for bucketed tables.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java e53933e 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
> 1a9b42b 
>   ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out 623c2e8 
>   
> ql/src/test/results/clientnegative/bucket_mapjoin_wrong_table_metadata_1.q.out
>  f4522d2 
>   
> ql/src/test/results/clientnegative/bucket_mapjoin_wrong_table_metadata_2.q.out
>  9aa9b5d 
>   ql/src/test/results/clientnegative/exim_11_nonpart_noncompat_sorting.q.out 
> 9220c8e 
>   ql/src/test/results/clientpositive/auto_join32.q.out bfc8be8 
>   ql/src/test/results/clientpositive/auto_join_filters.q.out a6720d9 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out 383defd 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out e6e7ef3 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out e9fb705 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out c089419 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out 6e443fa 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out feaea04 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out f64ecf0 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out e89f548 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out 44c037f 
>   ql/src/test/results/clientpositive/bucket_map_join_1.q.out d778203 
>   ql/src/test/results/clientpositive/bucket_map_join_2.q.out aef77aa 
>   ql/src/test/results/clientpositive/bucket_map_join_spark1.q.out 870ecdd 
>   ql/src/test/results/clientpositive/bucket_map_join_spark2.q.out 33f5c46 
>   ql/src/test/results/clientpositive/bucket_map_join_spark3.q.out 067d1ff 
>   ql/src/test/results/clientpositive/bucketcontext_1.q.out 77bfcf9 
>   ql/src/test/results/clientpositive/bucketcontext_2.q.out a9db13d 
>   ql/src/test/results/clientpositive/bucketcontext_3.q.out 9ba3e0c 
>   ql/src/test/results/clientpositive/bucketcontext_4.q.out a2b37a8 
>   ql/src/test/results/clientpositive/bucketcontext_5.q.out 3ee1f0e 
>   ql/src/test/results/clientpositive/bucketcontext_6.q.out d2304fa 
>   ql/src/test/results/clientpositive/bucketcontext_7.q.out 1a105ed 
>   ql/src/test/results/clientpositive/bucketcontext_8.q.out 138e415 
>   ql/src/test/results/clientpositive/bucketizedhiveinputformat_auto.q.out 
> 215efdd 
>   ql/src/test/results/clientpositive/bucketmapjoin1.q.out 72f2a07 
>   ql/src/test/results/clientpositive/bucketmapjoin10.q.out b0e849d 
>   ql/src/test/results/clientpositive/bucketmapjoin11.q.out 4263cab 
>   ql/src/test/results/clientpositive/bucketmapjoin12.q.out bcd7394 
>   ql/src/test/results/clientpositive/bucketmapjoin2.q.out a8d9e9d 
>   ql/src/test/

[jira] [Created] (HIVE-10836) Beeline OutOfMemoryError due to large history

2015-05-27 Thread Patrick McAnneny (JIRA)
Patrick McAnneny created HIVE-10836:
---

 Summary: Beeline OutOfMemoryError due to large history
 Key: HIVE-10836
 URL: https://issues.apache.org/jira/browse/HIVE-10836
 Project: Hive
  Issue Type: Bug
 Environment: Hive 1.1.0 on RHEL with Cloudera (cdh5.4.0)
Reporter: Patrick McAnneny


Attempting to run beeline via the command line fails with the error below due to 
large commands in the ~/.beeline/history file. Not sure if the problem also 
exists with many lines in the history or just big lines.

I had a few lines in my history file with over 1 million characters each. 
Deleting said lines from the history file resolved the issue.

Beeline version 1.1.0-cdh5.4.0 by Apache Hive
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2367)
at 
java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
at 
java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
at 
java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:535)
at java.lang.StringBuffer.append(StringBuffer.java:322)
at java.io.BufferedReader.readLine(BufferedReader.java:363)
at java.io.BufferedReader.readLine(BufferedReader.java:382)
at jline.console.history.FileHistory.load(FileHistory.java:69)
at jline.console.history.FileHistory.load(FileHistory.java:61)
at org.apache.hive.beeline.BeeLine.getConsoleReader(BeeLine.java:869)
at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:766)
at 
org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:480)
at org.apache.hive.beeline.BeeLine.main(BeeLine.java:463)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 34727: HIVE-10835: Concurrency issues in JDBC driver

2015-05-27 Thread Chaoyu Tang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34727/
---

Review request for hive, Szehon Ho, Thejas Nair, and Xuefu Zhang.


Bugs: HIVE-10835
https://issues.apache.org/jira/browse/HIVE-10835


Repository: hive-git


Description
---

There exist race conditions between DatabaseMetaData, Statement and ResultSet 
when they make RPC calls to HS2 using the same Thrift transport, which happens 
within the same connection. 
The patch adds a connection-level lock to serialize the RPC calls within a 
single connection.
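
As a rough illustration, the locking scheme amounts to something like the 
sketch below. The class and field names here are illustrative only; the 
actual patch modifies the HiveConnection/HiveStatement classes listed under 
Diffs:

{code}
// Sketch of the connection-level lock (illustrative names only).
class SketchConnection {
  // One lock per connection; every RPC over this connection's Thrift
  // transport must hold it.
  final Object transportLock = new Object();
}

class SketchStatement {
  private final SketchConnection conn;

  SketchStatement(SketchConnection conn) {
    this.conn = conn;
  }

  void execute(String sql) {
    // Two statements of the same connection can no longer interleave
    // their Thrift calls on the shared transport.
    synchronized (conn.transportLock) {
      // the client.ExecuteStatement(...) RPC would go here
    }
  }
}
{code}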


Diffs
-

  jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java 1b2891b 
  jdbc/src/java/org/apache/hive/jdbc/HiveDatabaseMetaData.java 13e42b5 
  jdbc/src/java/org/apache/hive/jdbc/HivePreparedStatement.java 8a0671f 
  jdbc/src/java/org/apache/hive/jdbc/HiveQueryResultSet.java e93795a 
  jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java 6b3d05c 

Diff: https://reviews.apache.org/r/34727/diff/


Testing
---

Some multi-thread tests.


Thanks,

Chaoyu Tang



Re: Review Request 34666: HIVE-9152 - Dynamic Partition Pruning [Spark Branch]

2015-05-27 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34666/#review85230
---


This is a big patch, for a big feature. It's hard to review offline. Here I 
commented on the things that are obvious. For better understanding, I think an 
in-person review would be more effective.


ql/if/queryplan.thrift


I'm not sure if it matters, but it's probably better if we add it as the 
last one.



ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java


Did you make any changes in this file? If not, let's leave it as it is.



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkDynamicPartitionPruner.java


The file descriptor needs to be closed in a finally block. In addition, closing 
"in" is not sufficient, as "in" might be null even while 
fs.open(fstatus.getPath()) returns non-null.
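
The pattern being asked for looks roughly like the following sketch, under 
assumed names: readPruningData, fs and fstatus stand in for whatever 
SparkDynamicPartitionPruner actually uses.

{code}
import java.io.ObjectInputStream;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;

class CloseInFinallySketch {
  // If the ObjectInputStream constructor throws, fs.open() has already
  // succeeded, so the raw stream must be closed independently of "in".
  static Object readPruningData(FileSystem fs, FileStatus fstatus) throws Exception {
    FSDataInputStream rawIn = null;
    try {
      rawIn = fs.open(fstatus.getPath());
      ObjectInputStream in = new ObjectInputStream(rawIn);
      return in.readObject();
    } finally {
      if (rawIn != null) {
        rawIn.close();  // covers the case where "in" was never assigned
      }
    }
  }
}
{code}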



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java


Any chance that an op might be visited multiple times?



ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java


numThread could be <= 0?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java


what's this change about?



ql/src/test/results/clientpositive/spark/smb_mapjoin_11.q.out


why are the stats gone?


- Xuefu Zhang


On May 26, 2015, 4:28 p.m., Chao Sun wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34666/
> ---
> 
> (Updated May 26, 2015, 4:28 p.m.)
> 
> 
> Review request for hive, chengxiang li and Xuefu Zhang.
> 
> 
> Bugs: HIVE-9152
> https://issues.apache.org/jira/browse/HIVE-9152
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Tez implemented dynamic partition pruning in HIVE-7826. This is a nice 
> optimization and we should implement the same in HOS.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43c53fc 
>   itests/src/test/resources/testconfiguration.properties 2a5f7e3 
>   metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 0f86117 
>   metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp a0b34cb 
>   metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h 55e0385 
>   metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp 749c97a 
>   metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 
> 4cc54e8 
>   ql/if/queryplan.thrift c8dfa35 
>   ql/src/gen/thrift/gen-cpp/queryplan_types.h ac73bc5 
>   ql/src/gen/thrift/gen-cpp/queryplan_types.cpp 19d4806 
>   
> ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/OperatorType.java
>  e18f935 
>   ql/src/gen/thrift/gen-php/Types.php 7121ed4 
>   ql/src/gen/thrift/gen-py/queryplan/ttypes.py 53c0106 
>   ql/src/gen/thrift/gen-rb/queryplan_types.rb c2c4220 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 9867739 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 91e8a02 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 
> 21398d8 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkDynamicPartitionPruner.java
>  PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java 
> e6c845c 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorSparkPartitionPruningSinkOperator.java
>  PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java 
> 1de7e40 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 9d5730d 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java ea5efe5 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/SparkDynamicPartitionPruningOptimization.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/SparkRemoveDynamicPruningBySize.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SparkMapJoinResolver.java
>  8e56263 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 
> 5f731d7 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkPartitionPruningSinkDesc.java
>  PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 
> 447f104 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 
> e27ce0d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/parse/spark/OptimizeSparkProcContext.java
>  f7586a4 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompil

[jira] [Created] (HIVE-10835) Concurrency issues in JDBC driver

2015-05-27 Thread Chaoyu Tang (JIRA)
Chaoyu Tang created HIVE-10835:
--

 Summary: Concurrency issues in JDBC driver
 Key: HIVE-10835
 URL: https://issues.apache.org/jira/browse/HIVE-10835
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 1.2.0
Reporter: Chaoyu Tang
Assignee: Chaoyu Tang


The JDBC specification says that "Each Connection object can create 
multiple Statement objects that may be used concurrently by the program", but 
that does not work in the current Hive JDBC driver. In addition, there also exist 
race conditions between DatabaseMetaData, Statement and ResultSet, as they all 
make RPC calls to HS2 using the same Thrift transport, which happens within a 
connection.
So we need a connection-level lock to serialize all these RPC calls in a 
connection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 34726: HIVE-10533

2015-05-27 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34726/
---

Review request for hive and Ashutosh Chauhan.


Bugs: HIVE-10533
https://issues.apache.org/jira/browse/HIVE-10533


Repository: hive-git


Description
---

CBO (Calcite Return Path): Join to MultiJoin support for outer joins


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveCalciteUtil.java 
f4e7c45242cd7e714148da281a08fbf90552d720 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveMultiJoin.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveInsertExchange4JoinRule.java
 30db8fd75a716442b1ae3c3e9c2e42b36d4fea9f 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveJoinToMultiJoinRule.java
 532d7d3b56377946f6a9ad883d7b7dbf1325a8c7 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/HiveOpConverter.java
 efc254297df51756e555fb75d015a49b0ae11a71 

Diff: https://reviews.apache.org/r/34726/diff/


Testing
---


Thanks,

Jesús Camacho Rodríguez



Re: normalizing spark tarball dependency in Hive build

2015-05-27 Thread Sergey Shelukhin
It’s possible to publish binaries to central.
For example, the YourKit redistributable is published this way:
http://search.maven.org/#browse|928812221


On 15/5/26, 21:35, "Xuefu Zhang"  wrote:

>We thought of that, but unfortunately this is a binary which isn't
>published anywhere in public maven repositories. That's why we hosted it
>at cloudfront.
>
>I think this is a general problem for any binaries required by tests. We
>are open to suggestions though.
>
>Thanks,
>Xuefu
>
>On Tue, May 26, 2015 at 1:35 PM, Sergey Shelukhin 
>wrote:
>
>> Hi.
>> I was trying to build Hive on a slow connection (or I could have no
>> connection for that matter), and pulling
>> "
>> 
>>http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-1.3.0-bin-hadoop
>>2
>> -without-hive.tgz” was taking forever (I ctrl-c-ed it eventually).
>> On a good note it did appear to respect “-o” on rebuild attempt (either
>> that, or whatever was remaining from the canceled build sufficed for the
>> mvn install -o … build that followed).
>> Is it possible to get this dependency via some more conventional means
>> like maven?
>>
>>



[jira] [Created] (HIVE-10834) Support First_value()/last_value() over x preceding and y preceding windowing

2015-05-27 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-10834:
---

 Summary: Support First_value()/last_value() over x preceding and y 
preceding windowing
 Key: HIVE-10834
 URL: https://issues.apache.org/jira/browse/HIVE-10834
 Project: Hive
  Issue Type: Sub-task
  Components: PTF-Windowing
Reporter: Aihua Xu
Assignee: Aihua Xu


Currently the following query
{noformat}
select ts, f, first_value(f) over (partition by ts order by t rows between 2 
preceding and 1 preceding) from over10k limit 100;
{noformat}
throws exception:
{noformat}
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
Hive Runtime Error while processing row (tag=0) 
{"key":{"reducesinkkey0":"2013-03-01 
09:11:58.703071","reducesinkkey1":-3},"value":{"_col3":0.83}}
at 
org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:256)
at 
org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:449)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing row (tag=0) {"key":{"reducesinkkey0":"2013-03-01 
09:11:58.703071","reducesinkkey1":-3},"value":{"_col3":0.83}}
at 
org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
... 3 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Internal Error: 
cannot generate all output rows for a Partition
at 
org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:519)
at 
org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337)
at 
org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:114)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
at 
org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
at 
org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:235)
{noformat}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 34447: HIVE-10761 : Create codahale-based metrics system for Hive

2015-05-27 Thread Szehon Ho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34447/
---

(Updated May 27, 2015, 6:25 p.m.)


Review request for hive, Chao Sun, Jimmy Xiang, and Xuefu Zhang.


Changes
---

Rebase the patch.


Bugs: HIVE-10761
https://issues.apache.org/jira/browse/HIVE-10761


Repository: hive-git


Description
---

See JIRA for the motivation.  Summary: There is an existing metrics system that 
uses a custom model hooked up to JMX reporting; a codahale-based metrics 
system is desirable for a standard model and reporting.

This adds a codahale-based metrics system to HiveServer2 and HiveMetastore.  
Metrics implementation is now internally pluggable, and the existing Metrics 
system can be re-enabled by configuration if desired for backward-compatibility.

Following metrics are supported by Metrics system:
1.  JVMPauseMonitor (used to call Hadoop's internal implementation, now forked 
off to integrate with Metrics system)
2.  HMS API calls
3.  Standard JVM metrics (only for the new implementation, as it's free with 
codahale).

The following metrics reporting options are supported by the new system 
(configuration exposed):
1.  JMX
2.  CONSOLE
3.  JSON_FILE (periodic file of metrics that gets overwritten).

A goal is to add a webserver that exposes the JSON metrics, but this is deferred 
to a later implementation.
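
For readers unfamiliar with codahale, the wiring amounts to roughly the 
following sketch against the com.codahale.metrics 3.x API. This is an 
illustration, not the patch itself, and the JSON_FILE reporter is not shown:

{code}
import java.util.concurrent.TimeUnit;
import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.JmxReporter;
import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.Timer;

// Sketch of the codahale wiring (metrics-core 3.x), mirroring the JMX and
// CONSOLE reporting options above.
public class MetricsWiringSketch {
  public static void main(String[] args) {
    MetricRegistry registry = new MetricRegistry();

    // JMX reporting.
    JmxReporter jmx = JmxReporter.forRegistry(registry).build();
    jmx.start();

    // Console reporting every 60 seconds.
    ConsoleReporter console = ConsoleReporter.forRegistry(registry)
        .convertRatesTo(TimeUnit.SECONDS)
        .convertDurationsTo(TimeUnit.MILLISECONDS)
        .build();
    console.start(60, TimeUnit.SECONDS);

    // Timing an HMS API call would look roughly like this.
    Timer apiCalls = registry.timer("api_get_table");
    Timer.Context ctx = apiCalls.time();
    try {
      // ... the actual metastore call ...
    } finally {
      ctx.stop();
    }
  }
}
{code}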


Diffs (updated)
-

  common/pom.xml a615c1e 
  common/src/java/org/apache/hadoop/hive/common/JvmPauseMonitor.java 
PRE-CREATION 
  common/src/java/org/apache/hadoop/hive/common/metrics/Metrics.java 01c9d1d 
  common/src/java/org/apache/hadoop/hive/common/metrics/MetricsLegacy.java 
PRE-CREATION 
  common/src/java/org/apache/hadoop/hive/common/metrics/common/Metrics.java 
PRE-CREATION 
  
common/src/java/org/apache/hadoop/hive/common/metrics/common/MetricsFactory.java
 PRE-CREATION 
  common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/Metrics.java 
PRE-CREATION 
  
common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/MetricsReporting.java
 PRE-CREATION 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 49b8f97 
  common/src/test/org/apache/hadoop/hive/common/metrics/TestMetrics.java 
e85d3f8 
  common/src/test/org/apache/hadoop/hive/common/metrics/TestMetricsLegacy.java 
PRE-CREATION 
  
common/src/test/org/apache/hadoop/hive/common/metrics/metrics2/TestMetrics.java 
PRE-CREATION 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetaStoreMetrics.java
 PRE-CREATION 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
d81c856 
  pom.xml b21d894 
  service/src/java/org/apache/hive/service/server/HiveServer2.java 58e8e49 
  shims/0.20S/src/main/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java 
6d8166c 
  shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java 
19324b8 
  shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java 
5a6bc44 

Diff: https://reviews.apache.org/r/34447/diff/


Testing
---

New unit test added.  Manually tested.


Thanks,

Szehon Ho



Hive-0.14 - Build # 967 - Failure

2015-05-27 Thread Apache Jenkins Server
Changes for Build #967



No tests ran.

The Apache Jenkins build system has built Hive-0.14 (build #967)

Status: Failure

Check console output at https://builds.apache.org/job/Hive-0.14/967/ to view 
the results.

Re: Review Request 34696: HIVE-686 add UDF substring_index

2015-05-27 Thread Alexander Pivovarov


> On May 27, 2015, 4:42 a.m., Swarnim Kulkarni wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSubstringIndex.java,
> >  line 45
> > 
> >
> > Worth mentioning in your example what the expected output would look 
> > like?
> 
> Alexander Pivovarov wrote:
> Not sure I got the issue...
> 
> --- desc output
> hive> desc function extended substring_index;
> OK
> ...
> Example:
>  > SELECT substring_index('www.apache.org', '.', 2);
>  'www.apache'
> 
> 
> -- actual select
> hive> SELECT substring_index('www.apache.org', '.', 2);
> OK
> www.apache
> 
> Swarnim Kulkarni wrote:
> My point was just: why not also include a sample result of what users 
> could expect to see after this command is executed? It might improve the 
> readability a bit.

It's included. The result is 'www.apache' - right after the \n symbol.


- Alexander


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34696/#review85318
---


On May 27, 2015, 3:35 a.m., Alexander Pivovarov wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34696/
> ---
> 
> (Updated May 27, 2015, 3:35 a.m.)
> 
> 
> Review request for hive, Hao Cheng, Jason Dere, namit jain, and Thejas Nair.
> 
> 
> Bugs: HIVE-686
> https://issues.apache.org/jira/browse/HIVE-686
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-686 add UDF substring_index
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 
> 94a3b1787e2b3571eb7a8102c28f7334ae3fa829 
>   
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSubstringIndex.java
>  PRE-CREATION 
>   
> ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFSubstringIndex.java
>  PRE-CREATION 
>   ql/src/test/queries/clientpositive/udf_substring_index.q PRE-CREATION 
>   ql/src/test/results/clientpositive/show_functions.q.out 
> 16820ca887320da13a42bebe0876f29eec373c8f 
>   ql/src/test/results/clientpositive/udf_substring_index.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/34696/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Alexander Pivovarov
> 
>



Re: [VOTE] Stable releases from branch-1 and experimental releases from master

2015-05-27 Thread Vikram Dixit K
+1 for all the reasons outlined.

On Tue, May 26, 2015 at 6:13 PM, Thejas Nair  wrote:
> +1
> - This is great for users who want to take longer to upgrade from
> hadoop-1 and care mainly for bug fixes and incremental features,
> rather than radical new features.
> - The ability to release initial 2.x releases marked as alpha/beta
> also helps to get users to try it out, and also lets them choose what
> is right for them.
> - This also lets developers focus on major new features without the
> burden of maintaining hadoop-1 compatibility.
>
> On Tue, May 26, 2015 at 11:41 AM, Alan Gates  wrote:
>> We have discussed this for several weeks now.  Some concerns have been
>> raised which I have tried to address.  I think it is time to vote on it as
>> our release plan.  To be specific, I propose:
>>
>> Hive makes a branch-1 from the current master.  This would be used for 1.3
>> and future 1.x releases.  This branch would not deprecate existing
>> functionality.  Any new features in this branch would also need to be put on
>> master.  An upgrade path for users will be maintained from one 1.x release
>> to the next, as well as from the latest 1.x release to the latest 2.x
>> release.
>>
>> Going forward releases numbered 2.x will be made from master.  The purpose
>> of these releases will be to enable users to get access to new features
>> being developed in Hive and allow developers to get feedback.  It is
>> expected that for a while these releases will not be production ready and
>> will be clearly so labeled.  Some legacy features, such as Hadoop 1 and
>> MapReduce, will no longer be supported in the master.  Any critical bug
>> fixes (security, incorrect results, crashes) fixed in master will also be
>> ported to branch-1 for at least a year.  This time period may be extended in
>> the future based on the stability and adoption of 2.x releases.
>>
>> Based on Hive's bylaws this release plan vote will be open for 3 days and
>> all active committers have binding votes.
>>
>> Here's my +1.
>>
>> Alan.



-- 
Nothing better than when appreciated for hard work.
-Mark


Re: Caching metastore objects

2015-05-27 Thread Scott C Gray


Great, that is perfect (I think :)). The only thing it appears to be
missing is the ability to chain multiple listeners together, but that
would be a relatively simple patch.

Thanks for pointing me to it!




From:   Ashutosh Chauhan 
To: "dev@hive.apache.org" 
Date:   05/27/2015 01:25 AM
Subject:Re: Caching metastore objects



Siva / Scott,

Such a framework exists in some form:
https://issues.apache.org/jira/browse/HIVE-2038
To make it even more generic there was a proposal,
https://issues.apache.org/jira/browse/HIVE-2147, but there was resistance
from the community for it. Maybe now the community is ready for it :)

Ashutosh

On Tue, May 26, 2015 at 10:12 PM, Sivaramakrishnan Narayanan <
tarb...@gmail.com> wrote:

> Thanks for the replies.
>
> @Ashutosh - thanks for the pointer! Yes I was running 0.11 metastore. Let
> me try with 0.13 metastore! Maybe my woes will be gone. If they don't then
> I'll continue working along these lines.
>
> @Alan - agreed. Caching MTables seems like a better approach if 0.13
> metastore perf is not as good as I'd like.
>
> @Scott - a pluggable hook for metastore calls would be super useful. If you
> want to generate events for client-side actions, I suppose you could just
> implement a dynamic proxy class over the metastore client class which does
> whatever you need it to. Similar technique could work in the server side -
> I believe there is already a RetryingMetaStoreClient proxy class in place.
>
>
> On Wed, May 27, 2015 at 7:32 AM, Ashutosh Chauhan 
> wrote:
>
> > Are you running pre-0.12 or with hive.metastore.try.direct.sql = false;
> >
> > Work done on https://issues.apache.org/jira/browse/HIVE-4051 should
> > alleviate some of your problems.
> >
> >
> > On Mon, May 25, 2015 at 8:19 PM, Sivaramakrishnan Narayanan <
> > tarb...@gmail.com> wrote:
> >
> > > Apologies if this has been discussed in the past - my searches did not
> > > pull up any relevant threads. If there are better solutions available
> > > out of the box, please let me know!
> > >
> > > Problem statement
> > > --
> > >
> > > We have a setup where a single metastoredb is used by Hive, Presto and
> > > SparkSQL. In addition, there are 1000s of hive queries submitted in
> > > batch form from multiple machines. Oftentimes, the metastoredb ends up
> > > being remote (in a different region in AWS etc) and round-trip latency
> > > is high. We've seen single thrift calls getting translated into lots
> > > of small SQL calls by datanucleus and the roundtrip latency ends up
> > > killing performance. Furthermore, any of these systems may create /
> > > modify a hive table and this should be reflected in the other system.
> > > Example, I may create a table in hive and query it using Presto or
> > > vice versa. In our setup, there may be multiple thrift metastore
> > > servers pointing to the same metastore db.
> > >
> > > Investigation
> > > ---
> > >
> > > Basically, we've been looking at caching to solve this problem (will
> > > come to invalidation in a bit). I looked briefly at DN's support for
> > > caching - these two parameters seem to be switched off by default.
> > >
> > > METASTORE_CACHE_LEVEL2("datanucleus.cache.level2", false),
> > > METASTORE_CACHE_LEVEL2_TYPE("datanucleus.cache.level2.type", "none"),
> > >
> > > Furthermore, my reading of
> > > http://www.datanucleus.org/products/datanucleus/jdo/cache.html suggests
> > > that there is no sophistication in invalidation - seems like only
> > > time-based invalidation is supported and it can't work across multiple
> > > PMFs (therefore, multiple thrift metastore servers)
> > >
> > > Solution Outline
> > > ---
> > >
> > >    - Every table / partition will have an additional property called
> > >    'version'
> > >    - Any call that modifies table or partition will bump up the version
> > >    of the table / partition
> > >    - Guava based cache of thrift objects that come from metastore calls
> > >    (sketched below, after this thread)
> > >    - We fire a single SQL matching versions before returning from cache
> > >    - It is conceivable to have a mode wherein invalidation based on
> > >    version happens in a background thread (for higher performance,
> > >    lower fidelity)
> > >    - Not proposing any locking (not shooting for world peace here :) )
> > >    - We could extend HiveMetaStore class or create a new server
> > >    altogether
> > >
> > > Is this something that would be interesting to the community? Is this
> > > problem already solved and should I spend my time watching GoT instead?
> > >
> > > Thanks
> > > Siva
> > >
> >
>
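
To make the Solution Outline concrete, here is a minimal sketch of the 
version-checked Guava cache referenced in the outline above. Table, 
fetchVersion() and fetchTable() are hypothetical stand-ins for the thrift 
objects and the metastore round trips:

{code}
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

// Sketch of the version-checked cache from the Solution Outline above.
// Table, fetchVersion() and fetchTable() are hypothetical stand-ins.
public class VersionedTableCache {
  private final Cache<String, Table> cache =
      CacheBuilder.newBuilder().maximumSize(10000).build();

  public Table getTable(String name) {
    Table cached = cache.getIfPresent(name);
    // One cheap version-matching query before trusting the cached object,
    // instead of the many small SQL calls datanucleus would fire.
    if (cached != null && cached.version == fetchVersion(name)) {
      return cached;
    }
    Table fresh = fetchTable(name);  // full metastore fetch on miss/stale
    cache.put(name, fresh);
    return fresh;
  }

  // Hypothetical helpers; a real server would hit the metastore db here.
  private long fetchVersion(String name) { return 0L; }
  private Table fetchTable(String name) { return new Table(); }

  static class Table {
    long version;
  }
}
{code}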


Review Request 34716: HIVE-10826 Support min()/max() functions over x preceding and y preceding windowing

2015-05-27 Thread Aihua Xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34716/
---

Review request for hive.


Repository: hive-git


Description
---

HIVE-10826 Support min()/max() functions over x preceding and y preceding 
windowing


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMax.java 
6b7808aa6e1104a0acff3bc0fe89fc92bb200803 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMin.java 
d931d52d0235fcd19571d317715f8a6663aeb49c 
  ql/src/test/queries/clientpositive/windowing_windowspec2.q 
d85cea987462e4c15129334aa4aed9263ef8cc01 
  ql/src/test/results/clientpositive/windowing_windowspec2.q.out 
bf916398b2d7b0198713623d23d27c2a76551bcb 

Diff: https://reviews.apache.org/r/34716/diff/


Testing
---


Thanks,

Aihua Xu



[jira] [Created] (HIVE-10833) RowResolver looks mangled with CBO

2015-05-27 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-10833:
-

 Summary: RowResolver looks mangled with CBO 
 Key: HIVE-10833
 URL: https://issues.apache.org/jira/browse/HIVE-10833
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.3.0
Reporter: Eugene Koifman
Assignee: Laljo John Pullokkaran


While working on HIVE-10828 I noticed that internal state of RowResolver looks 
odd when CBO is enabled.
Consider the script below.
{noformat}
set hive.enforce.bucketing=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.cbo.enable=false;

drop table if exists acid_partitioned;
create table acid_partitioned (a int, c string)
  partitioned by (p int)
  clustered by (a) into 1 buckets;
  
insert into acid_partitioned partition (p) (a,p) values(1,1);
{noformat}

With CBO on,
if you put a break point in {noformat}SemanticAnalyzer.genSelectPlan(String 
dest, ASTNode selExprList, QB qb, Operator input,
  Operator inputForSelectStar, boolean outerLV){noformat} at line 

_selectStar = selectStar && exprList.getChildCount() == posn + 1;_

(currently 3865) and examine _out_rwsch.rslvMap_, the variable looks like 
{noformat}{null={values__tmp__table__1.tmp_values_col1=_col0: string, 
values__tmp__table__1.tmp_values_col2=_col1: string}}{noformat}

with CBO disabled, the same _out_rwsch.rslvMap_ looks like
{noformat}{values__tmp__table__1={tmp_values_col1=_col0: string, 
tmp_values_col2=_col1: string}}{noformat}

The _out_rwsch.invRslvMap_ also differs in the same way.

It seems that the version you get with CBO off is the correct one since
_insert into acid_partitioned partition (p) (a,p) values(1,1)_ is rewritten to
_insert into acid_partitioned partition (p) (a,p) select * from 
values__tmp__table__1_

CC [~ashutoshc]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 34696: HIVE-686 add UDF substring_index

2015-05-27 Thread Swarnim Kulkarni


> On May 27, 2015, 4:42 a.m., Swarnim Kulkarni wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSubstringIndex.java,
> >  line 45
> > 
> >
> > Worth mentioning in your example what the expected output would look 
> > like?
> 
> Alexander Pivovarov wrote:
> Not sure I got the issue...
> 
> --- desc output
> hive> desc function extended substring_index;
> OK
> ...
> Example:
>  > SELECT substring_index('www.apache.org', '.', 2);
>  'www.apache'
> 
> 
> -- actual select
> hive> SELECT substring_index('www.apache.org', '.', 2);
> OK
> www.apache

My point was just: why not also include a sample result of what users 
could expect to see after this command is executed? It might improve the 
readability a bit.


- Swarnim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34696/#review85318
---


On May 27, 2015, 3:35 a.m., Alexander Pivovarov wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34696/
> ---
> 
> (Updated May 27, 2015, 3:35 a.m.)
> 
> 
> Review request for hive, Hao Cheng, Jason Dere, namit jain, and Thejas Nair.
> 
> 
> Bugs: HIVE-686
> https://issues.apache.org/jira/browse/HIVE-686
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-686 add UDF substring_index
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 
> 94a3b1787e2b3571eb7a8102c28f7334ae3fa829 
>   
> ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSubstringIndex.java
>  PRE-CREATION 
>   
> ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFSubstringIndex.java
>  PRE-CREATION 
>   ql/src/test/queries/clientpositive/udf_substring_index.q PRE-CREATION 
>   ql/src/test/results/clientpositive/show_functions.q.out 
> 16820ca887320da13a42bebe0876f29eec373c8f 
>   ql/src/test/results/clientpositive/udf_substring_index.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/34696/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Alexander Pivovarov
> 
>



[jira] [Created] (HIVE-10832) ColumnStatsTask failure when processing large amount of partitions

2015-05-27 Thread Chao Sun (JIRA)
Chao Sun created HIVE-10832:
---

 Summary: ColumnStatsTask failure when processing large amount of 
partitions
 Key: HIVE-10832
 URL: https://issues.apache.org/jira/browse/HIVE-10832
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Affects Versions: 1.1.0
Reporter: Chao Sun


We are trying to populate column stats for a TPC-DS 4TB dataset, and every 
time we try to do:

{code}
analyze table catalog_sales partition(cs_sold_date_sk) compute statistics for 
columns;
{code}

it ends up with the failure:

{noformat}
2015-05-26 12:14:53,128 WARN 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient: MetaStoreClient lost 
connection. Attempting to reconnect.
org.apache.thrift.transport.TTransportException: 
java.net.SocketTimeoutException: Read timed out
at 
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
at 
org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
at 
org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
at 
org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_set_aggr_stats_for(ThriftHiveMetastore.java:2974)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.set_aggr_stats_for(ThriftHiveMetastore.java:2961)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.setPartitionColumnStatistics(HiveMetaStoreClient.java:1376)
at sun.reflect.GeneratedMethodAccessor44.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:91)
at com.sun.proxy.$Proxy10.setPartitionColumnStatistics(Unknown Source)
at 
org.apache.hadoop.hive.ql.metadata.Hive.setPartitionColumnStatistics(Hive.java:2921)
at 
org.apache.hadoop.hive.ql.exec.ColumnStatsTask.persistPartitionStats(ColumnStatsTask.java:349)
at org.apache.hadoop.hive.ql.exec.ColumnStatsTask.execute(Write failed: 
Broken pipe
~ $ at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1638)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1397)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1181)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1047)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1042)
at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:145)
at 
org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:70)
at 
org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at 
org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:209)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at 
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
... 35 more
{noformat}

We didn't see this issue for smaller amounts of partitions, and it seems like 
ColumnStatsTask has a scalability issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 34713: Invalidate basic stats for insert queries if autogather=false

2015-05-27 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34713/
---

Review request for hive and Gopal V.


Bugs: HIVE-10807
https://issues.apache.org/jira/browse/HIVE-10807


Repository: hive-git


Description
---

Invalidate basic stats for insert queries if autogather=false


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/QueryProperties.java e8f7fba 
  ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 2a8167a 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java e5b9c2b 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java acd9bf5 
  ql/src/java/org/apache/hadoop/hive/ql/parse/QBParseInfo.java 14a7e9c 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 7f355e5 
  ql/src/test/queries/clientpositive/insert_into1.q f19506a 
  ql/src/test/results/clientnegative/stats_partialscan_autogether.q.out 321ebe5 
  ql/src/test/results/clientpositive/auto_join_filters.q.out a6720d9 
  ql/src/test/results/clientpositive/auto_join_nulls.q.out 4416f3e 
  ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out 5114038 
  ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out e6e7ef3 
  ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out b2e782f 
  ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out 210f1ab 
  ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out a307b13 
  ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out f4ceee7 
  ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out 3c2951a 
  ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out e1f3888 
  ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out 38ecdbe 
  ql/src/test/results/clientpositive/bucket_map_join_1.q.out 42e6a3f 
  ql/src/test/results/clientpositive/bucket_map_join_2.q.out af73309 
  ql/src/test/results/clientpositive/bucket_map_join_spark1.q.out 870ecdd 
  ql/src/test/results/clientpositive/bucket_map_join_spark2.q.out 33f5c46 
  ql/src/test/results/clientpositive/bucket_map_join_spark3.q.out 067d1ff 
  ql/src/test/results/clientpositive/bucketcontext_1.q.out 77bfcf9 
  ql/src/test/results/clientpositive/bucketcontext_2.q.out a9db13d 
  ql/src/test/results/clientpositive/bucketcontext_3.q.out 9ba3e0c 
  ql/src/test/results/clientpositive/bucketcontext_4.q.out a2b37a8 
  ql/src/test/results/clientpositive/bucketcontext_5.q.out 3ee1f0e 
  ql/src/test/results/clientpositive/bucketcontext_6.q.out d2304fa 
  ql/src/test/results/clientpositive/bucketcontext_7.q.out 1a105ed 
  ql/src/test/results/clientpositive/bucketcontext_8.q.out 138e415 
  ql/src/test/results/clientpositive/bucketmapjoin1.q.out 471ff73 
  ql/src/test/results/clientpositive/bucketmapjoin10.q.out b0e849d 
  ql/src/test/results/clientpositive/bucketmapjoin11.q.out 4263cab 
  ql/src/test/results/clientpositive/bucketmapjoin12.q.out bcd7394 
  ql/src/test/results/clientpositive/bucketmapjoin2.q.out a8d9e9d 
  ql/src/test/results/clientpositive/bucketmapjoin3.q.out c759f05 
  ql/src/test/results/clientpositive/bucketmapjoin4.q.out f61500c 
  ql/src/test/results/clientpositive/bucketmapjoin5.q.out 0cb2825 
  ql/src/test/results/clientpositive/bucketmapjoin7.q.out 667a9db 
  ql/src/test/results/clientpositive/bucketmapjoin8.q.out 252b377 
  ql/src/test/results/clientpositive/bucketmapjoin9.q.out 5e28dc3 
  ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 6ae127d 
  ql/src/test/results/clientpositive/bucketmapjoin_negative2.q.out 4c9f54a 
  ql/src/test/results/clientpositive/bucketmapjoin_negative3.q.out 9a0bfc4 
  ql/src/test/results/clientpositive/columnstats_partlvl.q.out e0c4cfe 
  ql/src/test/results/clientpositive/columnstats_tbllvl.q.out 19283bb 
  ql/src/test/results/clientpositive/display_colstats_tbllvl.q.out 7c91248 
  
ql/src/test/results/clientpositive/encrypted/encryption_insert_partition_dynamic.q.out
 939e206 
  
ql/src/test/results/clientpositive/encrypted/encryption_insert_partition_static.q.out
 fd7932e 
  
ql/src/test/results/clientpositive/encrypted/encryption_join_unencrypted_tbl.q.out
 9b6f750 
  ql/src/test/results/clientpositive/groupby_sort_6.q.out 0169430 
  ql/src/test/results/clientpositive/insert_into1.q.out 9e5f3bb 
  ql/src/test/results/clientpositive/join_filters.q.out 4f112bd 
  ql/src/test/results/clientpositive/join_nulls.q.out 46e0170 
  ql/src/test/results/clientpositive/list_bucket_dml_8.q.java1.7.out a9522e0 
  ql/src/test/results/clientpositive/parquet_serde.q.out e753180 
  ql/src/test/results/clientpositive/ql_rewrite_gbtoidx_cbo_2.q.out 3ee2e0f 
  ql/src/test/results/clientpositive/skewjoin_union_remove_1.q.out 1f21877 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_1.q.out 09d2692 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_12.q.out a70b161 
  ql/src/test/results/clientpositive/spark/auto

[jira] [Created] (HIVE-10831) HiveQL Parse error in 1.1.1

2015-05-27 Thread JIRA
Zoltán Szatmári created HIVE-10831:
--

 Summary: HiveQL Parse error in 1.1.1
 Key: HIVE-10831
 URL: https://issues.apache.org/jira/browse/HIVE-10831
 Project: Hive
  Issue Type: Bug
  Components: Hive, HiveServer2
Affects Versions: 1.1.1
 Environment: CentOS 6.4, Apache Hadoop 2.7 and Hive 1.1.1 based on the 
following binaries:
- https://archive.apache.org/dist/hive/hive-1.1.1/apache-hive-1.1.1-bin.tar.gz
- http://www.eu.apache.org/dist/hadoop/common/hadoop-2.7.0/hadoop-2.7.0.tar.gz

Reporter: Zoltán Szatmári


The "create table ... stored as textfile" query fails with AssertionError 
during parsing the query text. Without "stored as something" it works. These 
query is ok in 1.0.0, 1.0.1, 1.1.0 and 1.2.0 (with the exactly same 
configuration), but fails in 1.1.1.

We tried using both Hive CLI and also beeline. Almost the same stacktrace is 
shown in Hive CLI or in the HiveServer log (when using beeline). The 
interesting is that the Hive CLI crashes. 

hive> CREATE TABLE r3 (a1 DOUBLE , a2 DOUBLE) stored as textfile;
Exception in thread "main" java.lang.AssertionError: Unknown token: 
[@-1,0:0='TOK_FILEFORMAT_GENERIC',<679>,0:-1]
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeCreateTable(SemanticAnalyzer.java:10895)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genResolvedParseTree(SemanticAnalyzer.java:10103)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10147)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:192)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:421)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1112)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1160)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039)
at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
at 
org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:754)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
bash-4.1# 




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


RE: Build hive failure on ubuntu 15.04 with oracle java 1.8

2015-05-27 Thread 煜 韦
This is a known issue. https://issues.apache.org/jira/browse/HIVE-10674

> From: yu20...@hotmail.com
> To: dev@hive.apache.org
> Subject: Build hive failure on ubuntu 15.04 with oracle java 1.8
> Date: Tue, 26 May 2015 11:58:45 +0800
> 
> Hi guys,
> I tried to built hive 1.2.0 on ubuntu 15.04 with oracle Java 1.8. Then I 
> encountered following problem.What should I do to fix this issue?
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; 
> support was removed in 8.0Running 
> org.apache.hadoop.hive.metastore.TestMetastoreExprTests run: 1, Failures: 0, 
> Errors: 0, Skipped: 0, Time elapsed: 8.719 sec - in 
> org.apache.hadoop.hive.metastore.TestMetastoreExprResults :Failed tests:  
> TestExecDriver.testMapRedPlan1:513->executePlan:487 expected: but 
> was:  TestExecDriver.testMapRedPlan2:522->executePlan:487 
> expected: but was:  
> TestExecDriver.testMapRedPlan3:531->executePlan:487 expected: but 
> was:  TestExecDriver.testMapRedPlan4:540->executePlan:487 
> expected: but was:  
> TestExecDriver.testMapRedPlan5:549->executePlan:487 expected: but 
> was:  TestExecDriver.testMapRedPlan6:558->executePlan:487 
> expected: but was:  
> TestExecDriver.testMapPlan1:496->executePlan:487 expected: but 
> was:  TestExecDriver.testMapPlan2:504->executePlan:487 expected: 
> but was:  TestSessionState.testReloadExistingAuxJars2:234 Could not 
> find SessionStateTest.jar.v1  TestSessionState.testReloadAuxJars2:191 Could 
> not find SessionStateTest.jar.v1  
> TestSessionState.testReloadExistingAuxJars2:234 Could not find 
> SessionStateTest.jar.v1  TestSessionState.testReloadAuxJars2:191 Could not 
> find SessionStateTest.jar.v1Tests run: 3545, Failures: 12, Errors: 0, 
> Skipped: 1[ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-surefire-plugin:2.16:test (default-test) on 
> project hive-exec: There are test failures.[ERROR][ERROR] Please refer to 
> /home/hadoop/apache-hive-1.2.0-src/ql/target/surefire-reports for the 
> individual test results.[ERROR] -> [Help 
> 1]org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute 
> goal org.apache.maven.plugins:maven-surefire-plugin:2.16:test (default-test) 
> on project hive-exec: There are test failures.Please refer to 
> /home/hadoop/apache-hive-1.2.0-src/ql/target/surefire-reports for the 
> individual test results.at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:213)
> at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
> at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
> at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84)
> at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59)
> at 
> org.apache.maven.lifecycle.internal.LifecycleStarter.singleThreadedBuild(LifecycleStarter.java:183)
> at 
> org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:161)
> at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:320) 
>at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:156)at 
> org.apache.maven.cli.MavenCli.execute(MavenCli.java:537)at 
> org.apache.maven.cli.MavenCli.doMain(MavenCli.java:196)at 
> org.apache.maven.cli.MavenCli.main(MavenCli.java:141)at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
>at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)at 
> org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
> at 
> org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)   
>  at 
> org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
> at 
> org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)Caused
>  by: org.apache.maven.plugin.MojoFailureException: There are test 
> failures.Thanks,Jared