[jira] [Created] (HIVE-10831) HiveQL Parse error in 1.1.1

2015-05-27 Thread JIRA
Zoltán Szatmári created HIVE-10831:
--

 Summary: HiveQL Parse error in 1.1.1
 Key: HIVE-10831
 URL: https://issues.apache.org/jira/browse/HIVE-10831
 Project: Hive
  Issue Type: Bug
  Components: Hive, HiveServer2
Affects Versions: 1.1.1
 Environment: CentOS 6.4, Apache Hadoop 2.7 and Hive 1.1.1 based on the 
following binaries:
- https://archive.apache.org/dist/hive/hive-1.1.1/apache-hive-1.1.1-bin.tar.gz
- http://www.eu.apache.org/dist/hadoop/common/hadoop-2.7.0/hadoop-2.7.0.tar.gz

Reporter: Zoltán Szatmári


The "create table ... stored as textfile" query fails with an AssertionError 
while parsing the query text. Without the "stored as" clause it works. The same 
query works in 1.0.0, 1.0.1, 1.1.0 and 1.2.0 (with exactly the same 
configuration), but fails in 1.1.1.

We tried both the Hive CLI and beeline. Almost the same stacktrace is shown in 
the Hive CLI and in the HiveServer2 log (when using beeline). The interesting 
part is that the Hive CLI crashes. 

hive> CREATE TABLE r3 (a1 DOUBLE , a2 DOUBLE) stored as textfile;
Exception in thread "main" java.lang.AssertionError: Unknown token: 
[@-1,0:0='TOK_FILEFORMAT_GENERIC',679,0:-1]
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeCreateTable(SemanticAnalyzer.java:10895)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genResolvedParseTree(SemanticAnalyzer.java:10103)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10147)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:192)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:421)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1112)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1160)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039)
at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
at 
org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:754)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
bash-4.1# 




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 34447: HIVE-10761 : Create codahale-based metrics system for Hive

2015-05-27 Thread Szehon Ho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34447/
---

(Updated May 27, 2015, 6:25 p.m.)


Review request for hive, Chao Sun, Jimmy Xiang, and Xuefu Zhang.


Changes
---

Rebase the patch.


Bugs: HIVE-10761
https://issues.apache.org/jira/browse/HIVE-10761


Repository: hive-git


Description
---

See the JIRA for the motivation. Summary: there is an existing metrics system 
that uses a custom model and is hooked up to JMX reporting; a codahale-based 
metrics system is desirable for a standard model and standard reporting.

This adds a codahale-based metrics system to HiveServer2 and HiveMetastore.  
The metrics implementation is now internally pluggable, and the existing metrics 
system can be re-enabled by configuration if desired for backward compatibility.

The following metrics are supported by the metrics system:
1.  JVMPauseMonitor (used to call Hadoop's internal implementation, now forked 
off to integrate with the metrics system)
2.  HMS API calls
3.  Standard JVM metrics (only for the new implementation, as it comes free with 
codahale).

The following metrics reporters are supported by the new system (exposed via 
configuration):
1.  JMX
2.  CONSOLE
3.  JSON_FILE (a periodic file of metrics that gets overwritten).

A goal is to add a webserver that exposes the JSON metrics, but that is 
deferred to a later implementation.
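The "internally pluggable" design described above can be sketched with a minimal facade. The interface and class names below are illustrative stand-ins, not Hive's actual classes; the real patch wires the implementation choice through HiveConf.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical pluggable metrics facade in the spirit of the patch.
interface Metrics {
    void incrementCounter(String name);
    long getCounter(String name);
}

// Stand-in for the new codahale-backed implementation.
class CodahaleStyleMetrics implements Metrics {
    private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();
    public void incrementCounter(String name) {
        counters.computeIfAbsent(name, k -> new AtomicLong()).incrementAndGet();
    }
    public long getCounter(String name) {
        AtomicLong c = counters.get(name);
        return c == null ? 0 : c.get();
    }
}

// Factory selects the implementation from configuration, which is how the
// legacy system could be re-enabled for backward compatibility.
class MetricsFactory {
    static Metrics init(String implClass) {
        // Real code would reflectively load implClass; here we just branch.
        return new CodahaleStyleMetrics();
    }
}

public class MetricsDemo {
    public static void main(String[] args) {
        Metrics m = MetricsFactory.init("metrics2.Metrics");
        m.incrementCounter("hms.api.get_table");
        m.incrementCounter("hms.api.get_table");
        System.out.println(m.getCounter("hms.api.get_table")); // prints 2
    }
}
```

A JMX, console, or JSON-file reporter would then periodically read the registry held by the chosen implementation.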


Diffs (updated)
-

  common/pom.xml a615c1e 
  common/src/java/org/apache/hadoop/hive/common/JvmPauseMonitor.java 
PRE-CREATION 
  common/src/java/org/apache/hadoop/hive/common/metrics/Metrics.java 01c9d1d 
  common/src/java/org/apache/hadoop/hive/common/metrics/MetricsLegacy.java 
PRE-CREATION 
  common/src/java/org/apache/hadoop/hive/common/metrics/common/Metrics.java 
PRE-CREATION 
  
common/src/java/org/apache/hadoop/hive/common/metrics/common/MetricsFactory.java
 PRE-CREATION 
  common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/Metrics.java 
PRE-CREATION 
  
common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/MetricsReporting.java
 PRE-CREATION 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 49b8f97 
  common/src/test/org/apache/hadoop/hive/common/metrics/TestMetrics.java 
e85d3f8 
  common/src/test/org/apache/hadoop/hive/common/metrics/TestMetricsLegacy.java 
PRE-CREATION 
  
common/src/test/org/apache/hadoop/hive/common/metrics/metrics2/TestMetrics.java 
PRE-CREATION 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetaStoreMetrics.java
 PRE-CREATION 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
d81c856 
  pom.xml b21d894 
  service/src/java/org/apache/hive/service/server/HiveServer2.java 58e8e49 
  shims/0.20S/src/main/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java 
6d8166c 
  shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java 
19324b8 
  shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java 
5a6bc44 

Diff: https://reviews.apache.org/r/34447/diff/


Testing
---

New unit test added.  Manually tested.


Thanks,

Szehon Ho



[jira] [Created] (HIVE-10834) Support First_value()/last_value() over x preceding and y preceding windowing

2015-05-27 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-10834:
---

 Summary: Support First_value()/last_value() over x preceding and y 
preceding windowing
 Key: HIVE-10834
 URL: https://issues.apache.org/jira/browse/HIVE-10834
 Project: Hive
  Issue Type: Sub-task
  Components: PTF-Windowing
Reporter: Aihua Xu
Assignee: Aihua Xu


Currently the following query
{noformat}
select ts, f, first_value(f) over (partition by ts order by t rows between 2 
preceding and 1 preceding) from over10k limit 100;
{noformat}
throws exception:
{noformat}
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
Hive Runtime Error while processing row (tag=0) 
{key:{reducesinkkey0:2013-03-01 
09:11:58.703071,reducesinkkey1:-3},value:{_col3:0.83}}
at 
org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:256)
at 
org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:449)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing row (tag=0) {key:{reducesinkkey0:2013-03-01 
09:11:58.703071,reducesinkkey1:-3},value:{_col3:0.83}}
at 
org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
... 3 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Internal Error: 
cannot generate all output rows for a Partition
at 
org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:519)
at 
org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337)
at 
org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:114)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
at 
org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
at 
org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:235)
{noformat}
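The behavior the sub-task asks for can be modeled outside Hive: with "rows between x preceding and y preceding" the frame for row i is rows [i-x, i-y], which lies entirely before the current row and is empty for the first y rows. This is an illustrative sketch of the intended semantics, not the PTF implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class FirstValueWindow {
    // first_value(v) over (rows between x preceding and y preceding), x >= y >= 1.
    // For row i the frame is [i-x, i-y]; the result is the first row of the
    // frame (clamped to row 0), or null when the frame is empty.
    static List<Double> firstValue(List<Double> vals, int x, int y) {
        List<Double> out = new ArrayList<>();
        for (int i = 0; i < vals.size(); i++) {
            int start = i - x, end = i - y;          // inclusive frame bounds
            if (end < 0) { out.add(null); continue; } // frame before row 0: empty
            out.add(vals.get(Math.max(0, start)));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(firstValue(List.of(1.0, 2.0, 3.0, 4.0), 2, 1));
        // [null, 1.0, 1.0, 2.0]
    }
}
```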






Re: normalizing spark tarball dependency in Hive build

2015-05-27 Thread Sergey Shelukhin
It’s possible to publish binaries to Maven Central.
For example, the YourKit redistributable is published this way:
http://search.maven.org/#browse|928812221


On 15/5/26, 21:35, Xuefu Zhang xzh...@cloudera.com wrote:

We thought of that, but unfortunately this is a binary which isn't
published anywhere in public Maven repositories. That's why we hosted it
on CloudFront.

I think this is a general problem for any binaries required by tests. We
are open to suggestions, though.

Thanks,
Xuefu

On Tue, May 26, 2015 at 1:35 PM, Sergey Shelukhin ser...@hortonworks.com
wrote:

 Hi.
 I was trying to build Hive on a slow connection (or I could have no
 connection for that matter), and pulling
 
 
http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-1.3.0-bin-hadoop2-without-hive.tgz”
 was taking forever (I ctrl-c-ed it eventually).
 On a good note it did appear to respect “-o” on rebuild attempt (either
 that, or whatever was remaining from the canceled build sufficed for the
 mvn install -o … build that followed).
 Is it possible to get this dependency via some more conventional means
 like maven?





Review Request 34726: HIVE-10533

2015-05-27 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34726/
---

Review request for hive and Ashutosh Chauhan.


Bugs: HIVE-10533
https://issues.apache.org/jira/browse/HIVE-10533


Repository: hive-git


Description
---

CBO (Calcite Return Path): Join to MultiJoin support for outer joins


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveCalciteUtil.java 
f4e7c45242cd7e714148da281a08fbf90552d720 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveMultiJoin.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveInsertExchange4JoinRule.java
 30db8fd75a716442b1ae3c3e9c2e42b36d4fea9f 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveJoinToMultiJoinRule.java
 532d7d3b56377946f6a9ad883d7b7dbf1325a8c7 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/HiveOpConverter.java
 efc254297df51756e555fb75d015a49b0ae11a71 

Diff: https://reviews.apache.org/r/34726/diff/


Testing
---


Thanks,

Jesús Camacho Rodríguez



[jira] [Created] (HIVE-10835) Concurrency issues in JDBC driver

2015-05-27 Thread Chaoyu Tang (JIRA)
Chaoyu Tang created HIVE-10835:
--

 Summary: Concurrency issues in JDBC driver
 Key: HIVE-10835
 URL: https://issues.apache.org/jira/browse/HIVE-10835
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 1.2.0
Reporter: Chaoyu Tang
Assignee: Chaoyu Tang


Though the JDBC specification says that "Each Connection object can create 
multiple Statement objects that may be used concurrently by the program," that 
does not work in the current Hive JDBC driver. In addition, race conditions 
exist between DatabaseMetaData, Statement and ResultSet, since they make RPC 
calls to HS2 using the same Thrift transport, which is shared within a 
connection.
So we need a connection-level lock to serialize all these RPC calls within a 
connection.
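The proposed connection-level lock can be sketched as follows. The classes are simplified stand-ins for the Hive JDBC classes (not the actual code), assuming one shared Thrift transport per connection: every statement takes the connection's lock around each RPC, so calls over the single transport never interleave.

```java
import java.util.concurrent.locks.ReentrantLock;

// Stand-in for the single Thrift transport shared by a connection.
class FakeTransport {
    private boolean busy = false;
    String call(String rpc) {
        if (busy) throw new IllegalStateException("concurrent use of transport");
        busy = true;
        String reply = "ok:" + rpc;   // stand-in for a Thrift round trip
        busy = false;
        return reply;
    }
}

class Connection {
    final ReentrantLock transportLock = new ReentrantLock();
    final FakeTransport transport = new FakeTransport();
    Statement createStatement() { return new Statement(this); }
}

class Statement {
    private final Connection conn;
    Statement(Connection conn) { this.conn = conn; }
    String execute(String sql) {
        conn.transportLock.lock();    // serialize all RPCs per connection
        try {
            return conn.transport.call(sql);
        } finally {
            conn.transportLock.unlock();
        }
    }
}

public class JdbcLockDemo {
    public static void main(String[] args) throws InterruptedException {
        Connection c = new Connection();
        Runnable r = () -> {
            for (int i = 0; i < 1000; i++) c.createStatement().execute("q");
        };
        Thread t1 = new Thread(r), t2 = new Thread(r);
        t1.start(); t2.start(); t1.join(); t2.join();
        System.out.println("no concurrent transport use");
    }
}
```

DatabaseMetaData and ResultSet would take the same lock, which is why the lock lives on the connection rather than on each statement.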





Re: Review Request 34666: HIVE-9152 - Dynamic Partition Pruning [Spark Branch]

2015-05-27 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34666/#review85230
---


This is a big patch, for a big feature. It's hard to review offline. Here I 
offered comments on the things that are obvious. For better understanding, I 
think an in-person review would be more effective.


ql/if/queryplan.thrift
https://reviews.apache.org/r/34666/#comment136752

I'm not sure if it matters, but it's probably better if we add it as the 
last entry.



ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java
https://reviews.apache.org/r/34666/#comment136753

Did you make any changes in this file? If not, let's leave it as it is.



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkDynamicPartitionPruner.java
https://reviews.apache.org/r/34666/#comment136942

The file descriptor needs to be closed in a finally block. In addition, closing 
'in' is not sufficient, as 'in' might be null while fs.open(fstatus.getPath()) 
returns non-null.
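The pattern the comment asks for, close in a finally block with a null check because open() may fail before the stream is assigned, looks roughly like this. The opener interface is a hypothetical stand-in for fs.open(); this is a generic sketch, not the patched Hive code.

```java
import java.io.IOException;
import java.io.InputStream;

public class SafeClose {

    interface StreamOpener {
        InputStream open() throws IOException;
    }

    // Reads and counts all bytes; the stream is closed in a finally block,
    // guarded against open() having thrown before 'in' was assigned.
    static long countBytes(StreamOpener opener) throws IOException {
        InputStream in = null;
        try {
            in = opener.open();        // may throw before 'in' is assigned
            long n = 0;
            while (in.read() != -1) n++;
            return n;
        } finally {
            if (in != null) {          // null check: open() may have failed
                in.close();
            }
        }
    }
}
```

On Java 7+ a try-with-resources block achieves the same guarantee with less ceremony, provided the open call is the resource expression itself.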



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java
https://reviews.apache.org/r/34666/#comment136943

Any chance that an op might be visited multiple times?



ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java
https://reviews.apache.org/r/34666/#comment136946

numThread could be <= 0?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java
https://reviews.apache.org/r/34666/#comment136948

what's this change about?



ql/src/test/results/clientpositive/spark/smb_mapjoin_11.q.out
https://reviews.apache.org/r/34666/#comment136976

why the stats are gone?


- Xuefu Zhang


On May 26, 2015, 4:28 p.m., Chao Sun wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34666/
 ---
 
 (Updated May 26, 2015, 4:28 p.m.)
 
 
 Review request for hive, chengxiang li and Xuefu Zhang.
 
 
 Bugs: HIVE-9152
 https://issues.apache.org/jira/browse/HIVE-9152
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Tez implemented dynamic partition pruning in HIVE-7826. This is a nice 
 optimization and we should implement the same in HOS.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43c53fc 
   itests/src/test/resources/testconfiguration.properties 2a5f7e3 
   metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 0f86117 
   metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp a0b34cb 
   metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h 55e0385 
   metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp 749c97a 
   metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 
 4cc54e8 
   ql/if/queryplan.thrift c8dfa35 
   ql/src/gen/thrift/gen-cpp/queryplan_types.h ac73bc5 
   ql/src/gen/thrift/gen-cpp/queryplan_types.cpp 19d4806 
   
 ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/OperatorType.java
  e18f935 
   ql/src/gen/thrift/gen-php/Types.php 7121ed4 
   ql/src/gen/thrift/gen-py/queryplan/ttypes.py 53c0106 
   ql/src/gen/thrift/gen-rb/queryplan_types.rb c2c4220 
   ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 9867739 
   ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 91e8a02 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 
 21398d8 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkDynamicPartitionPruner.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java 
 e6c845c 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorSparkPartitionPruningSinkOperator.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java 
 1de7e40 
   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 9d5730d 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java ea5efe5 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/SparkDynamicPartitionPruningOptimization.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/SparkRemoveDynamicPruningBySize.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SparkMapJoinResolver.java
  8e56263 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 
 5f731d7 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkPartitionPruningSinkDesc.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 
 447f104 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 
 e27ce0d 
   
 ql/src/java/org/apache/hadoop/hive/ql/parse/spark/OptimizeSparkProcContext.java
  f7586a4 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 
 19aae70 
   
 

Review Request 34727: HIVE-10835: Concurrency issues in JDBC driver

2015-05-27 Thread Chaoyu Tang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34727/
---

Review request for hive, Szehon Ho, Thejas Nair, and Xuefu Zhang.


Bugs: HIVE-10835
https://issues.apache.org/jira/browse/HIVE-10835


Repository: hive-git


Description
---

There exist race conditions between DatabaseMetaData, Statement and ResultSet 
when they make RPC calls to HS2 using the same Thrift transport, which is 
shared within the same connection. 
The patch adds a connection-level lock to serialize the RPC calls within 
a single connection.


Diffs
-

  jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java 1b2891b 
  jdbc/src/java/org/apache/hive/jdbc/HiveDatabaseMetaData.java 13e42b5 
  jdbc/src/java/org/apache/hive/jdbc/HivePreparedStatement.java 8a0671f 
  jdbc/src/java/org/apache/hive/jdbc/HiveQueryResultSet.java e93795a 
  jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java 6b3d05c 

Diff: https://reviews.apache.org/r/34727/diff/


Testing
---

Some multi-thread tests.


Thanks,

Chaoyu Tang



Hive-0.14 - Build # 967 - Failure

2015-05-27 Thread Apache Jenkins Server
Changes for Build #967



No tests ran.

The Apache Jenkins build system has built Hive-0.14 (build #967)

Status: Failure

Check console output at https://builds.apache.org/job/Hive-0.14/967/ to view 
the results.

Re: Review Request 34696: HIVE-686 add UDF substring_index

2015-05-27 Thread Alexander Pivovarov


 On May 27, 2015, 4:42 a.m., Swarnim Kulkarni wrote:
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSubstringIndex.java,
   line 45
  https://reviews.apache.org/r/34696/diff/1/?file=972489#file972489line45
 
  Worth mentioning in your example what the expected output would look 
  like?
 
 Alexander Pivovarov wrote:
 Not sure I got the issue...
 
 --- desc output
 hive desc function extended substring_index;
 OK
 ...
 Example:
   SELECT substring_index('www.apache.org', '.', 2);
  'www.apache'
 
 
 -- actual select
 hive SELECT substring_index('www.apache.org', '.', 2);
 OK
 www.apache
 
 Swarnim Kulkarni wrote:
  My point was just: why not also include a sample result of what users 
  could expect to see after this command is executed? It might improve the 
  readability a bit.

It's included. The result is 'www.apache' - right after the \n symbol.
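For readers following along, the substring_index semantics under discussion (MySQL-compatible, which the UDF follows) can be sketched in plain Java. This is an illustrative model of the behavior, not the GenericUDF code itself.

```java
public class SubstringIndexSketch {
    // MySQL-style substring_index(str, delim, count):
    //   count > 0 -> prefix before the count-th delimiter from the left;
    //   count < 0 -> suffix after the |count|-th delimiter from the right;
    //   fewer delimiters than |count| -> the whole string;
    //   count == 0 -> empty string.
    static String substringIndex(String str, String delim, int count) {
        if (count == 0 || delim.isEmpty()) return "";
        int idx = count > 0 ? -1 : str.length();
        int remaining = Math.abs(count);
        while (remaining-- > 0) {
            idx = count > 0 ? str.indexOf(delim, idx + 1)
                            : str.lastIndexOf(delim, idx - 1);
            if (idx < 0) return str;   // not enough occurrences
        }
        return count > 0 ? str.substring(0, idx)
                         : str.substring(idx + delim.length());
    }

    public static void main(String[] args) {
        System.out.println(substringIndex("www.apache.org", ".", 2));  // www.apache
        System.out.println(substringIndex("www.apache.org", ".", -2)); // apache.org
    }
}
```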


- Alexander


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34696/#review85318
---


On May 27, 2015, 3:35 a.m., Alexander Pivovarov wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34696/
 ---
 
 (Updated May 27, 2015, 3:35 a.m.)
 
 
 Review request for hive, Hao Cheng, Jason Dere, namit jain, and Thejas Nair.
 
 
 Bugs: HIVE-686
 https://issues.apache.org/jira/browse/HIVE-686
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 HIVE-686 add UDF substring_index
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 
 94a3b1787e2b3571eb7a8102c28f7334ae3fa829 
   
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSubstringIndex.java
  PRE-CREATION 
   
 ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFSubstringIndex.java
  PRE-CREATION 
   ql/src/test/queries/clientpositive/udf_substring_index.q PRE-CREATION 
   ql/src/test/results/clientpositive/show_functions.q.out 
 16820ca887320da13a42bebe0876f29eec373c8f 
   ql/src/test/results/clientpositive/udf_substring_index.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/34696/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Alexander Pivovarov
 




Review Request 34713: Invalidate basic stats for insert queries if autogather=false

2015-05-27 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34713/
---

Review request for hive and Gopal V.


Bugs: HIVE-10807
https://issues.apache.org/jira/browse/HIVE-10807


Repository: hive-git


Description
---

Invalidate basic stats for insert queries if autogather=false


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/QueryProperties.java e8f7fba 
  ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 2a8167a 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java e5b9c2b 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java acd9bf5 
  ql/src/java/org/apache/hadoop/hive/ql/parse/QBParseInfo.java 14a7e9c 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 7f355e5 
  ql/src/test/queries/clientpositive/insert_into1.q f19506a 
  ql/src/test/results/clientnegative/stats_partialscan_autogether.q.out 321ebe5 
  ql/src/test/results/clientpositive/auto_join_filters.q.out a6720d9 
  ql/src/test/results/clientpositive/auto_join_nulls.q.out 4416f3e 
  ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out 5114038 
  ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out e6e7ef3 
  ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out b2e782f 
  ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out 210f1ab 
  ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out a307b13 
  ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out f4ceee7 
  ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out 3c2951a 
  ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out e1f3888 
  ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out 38ecdbe 
  ql/src/test/results/clientpositive/bucket_map_join_1.q.out 42e6a3f 
  ql/src/test/results/clientpositive/bucket_map_join_2.q.out af73309 
  ql/src/test/results/clientpositive/bucket_map_join_spark1.q.out 870ecdd 
  ql/src/test/results/clientpositive/bucket_map_join_spark2.q.out 33f5c46 
  ql/src/test/results/clientpositive/bucket_map_join_spark3.q.out 067d1ff 
  ql/src/test/results/clientpositive/bucketcontext_1.q.out 77bfcf9 
  ql/src/test/results/clientpositive/bucketcontext_2.q.out a9db13d 
  ql/src/test/results/clientpositive/bucketcontext_3.q.out 9ba3e0c 
  ql/src/test/results/clientpositive/bucketcontext_4.q.out a2b37a8 
  ql/src/test/results/clientpositive/bucketcontext_5.q.out 3ee1f0e 
  ql/src/test/results/clientpositive/bucketcontext_6.q.out d2304fa 
  ql/src/test/results/clientpositive/bucketcontext_7.q.out 1a105ed 
  ql/src/test/results/clientpositive/bucketcontext_8.q.out 138e415 
  ql/src/test/results/clientpositive/bucketmapjoin1.q.out 471ff73 
  ql/src/test/results/clientpositive/bucketmapjoin10.q.out b0e849d 
  ql/src/test/results/clientpositive/bucketmapjoin11.q.out 4263cab 
  ql/src/test/results/clientpositive/bucketmapjoin12.q.out bcd7394 
  ql/src/test/results/clientpositive/bucketmapjoin2.q.out a8d9e9d 
  ql/src/test/results/clientpositive/bucketmapjoin3.q.out c759f05 
  ql/src/test/results/clientpositive/bucketmapjoin4.q.out f61500c 
  ql/src/test/results/clientpositive/bucketmapjoin5.q.out 0cb2825 
  ql/src/test/results/clientpositive/bucketmapjoin7.q.out 667a9db 
  ql/src/test/results/clientpositive/bucketmapjoin8.q.out 252b377 
  ql/src/test/results/clientpositive/bucketmapjoin9.q.out 5e28dc3 
  ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 6ae127d 
  ql/src/test/results/clientpositive/bucketmapjoin_negative2.q.out 4c9f54a 
  ql/src/test/results/clientpositive/bucketmapjoin_negative3.q.out 9a0bfc4 
  ql/src/test/results/clientpositive/columnstats_partlvl.q.out e0c4cfe 
  ql/src/test/results/clientpositive/columnstats_tbllvl.q.out 19283bb 
  ql/src/test/results/clientpositive/display_colstats_tbllvl.q.out 7c91248 
  
ql/src/test/results/clientpositive/encrypted/encryption_insert_partition_dynamic.q.out
 939e206 
  
ql/src/test/results/clientpositive/encrypted/encryption_insert_partition_static.q.out
 fd7932e 
  
ql/src/test/results/clientpositive/encrypted/encryption_join_unencrypted_tbl.q.out
 9b6f750 
  ql/src/test/results/clientpositive/groupby_sort_6.q.out 0169430 
  ql/src/test/results/clientpositive/insert_into1.q.out 9e5f3bb 
  ql/src/test/results/clientpositive/join_filters.q.out 4f112bd 
  ql/src/test/results/clientpositive/join_nulls.q.out 46e0170 
  ql/src/test/results/clientpositive/list_bucket_dml_8.q.java1.7.out a9522e0 
  ql/src/test/results/clientpositive/parquet_serde.q.out e753180 
  ql/src/test/results/clientpositive/ql_rewrite_gbtoidx_cbo_2.q.out 3ee2e0f 
  ql/src/test/results/clientpositive/skewjoin_union_remove_1.q.out 1f21877 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_1.q.out 09d2692 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_12.q.out a70b161 
  

[jira] [Created] (HIVE-10832) ColumnStatsTask failure when processing large amount of partitions

2015-05-27 Thread Chao Sun (JIRA)
Chao Sun created HIVE-10832:
---

 Summary: ColumnStatsTask failure when processing large amount of 
partitions
 Key: HIVE-10832
 URL: https://issues.apache.org/jira/browse/HIVE-10832
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Affects Versions: 1.1.0
Reporter: Chao Sun


We are trying to populate column stats for a TPC-DS 4TB dataset, and every 
time we try to do:

{code}
analyze table catalog_sales partition(cs_sold_date_sk) compute statistics for 
columns;
{code}

it ends up with the failure:

{noformat}
2015-05-26 12:14:53,128 WARN 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient: MetaStoreClient lost 
connection. Attempting to reconnect.
org.apache.thrift.transport.TTransportException: 
java.net.SocketTimeoutException: Read timed out
at 
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
at 
org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
at 
org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
at 
org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_set_aggr_stats_for(ThriftHiveMetastore.java:2974)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.set_aggr_stats_for(ThriftHiveMetastore.java:2961)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.setPartitionColumnStatistics(HiveMetaStoreClient.java:1376)
at sun.reflect.GeneratedMethodAccessor44.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:91)
at com.sun.proxy.$Proxy10.setPartitionColumnStatistics(Unknown Source)
at 
org.apache.hadoop.hive.ql.metadata.Hive.setPartitionColumnStatistics(Hive.java:2921)
at 
org.apache.hadoop.hive.ql.exec.ColumnStatsTask.persistPartitionStats(ColumnStatsTask.java:349)
at org.apache.hadoop.hive.ql.exec.ColumnStatsTask.execute(...)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1638)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1397)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1181)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1047)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1042)
at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:145)
at 
org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:70)
at 
org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at 
org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:209)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at 
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
... 35 more
{noformat}

We didn't see this issue with a smaller number of partitions; it seems like 
ColumnStatsTask has a scalability issue.
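One plausible mitigation, sketched here as an assumption rather than the actual fix, is to split the per-partition stats into bounded batches so that no single set_aggr_stats_for RPC grows with the partition count and outlives the client socket timeout.

```java
import java.util.ArrayList;
import java.util.List;

public class StatsBatcher {
    // Splits a list of per-partition stats objects into batches of at most
    // batchSize; each batch would then be one metastore RPC instead of a
    // single huge call covering every partition.
    static <T> List<List<T>> batches(List<T> items, int batchSize) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            out.add(items.subList(i, Math.min(i + batchSize, items.size())));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> partitions = new ArrayList<>();
        for (int i = 0; i < 2500; i++) partitions.add(i);
        System.out.println(batches(partitions, 1000).size()); // prints 3
    }
}
```

The batch size would need tuning against the metastore's RPC cost; too small and the round-trip overhead dominates, too large and the timeout returns.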





Re: Caching metastore objects

2015-05-27 Thread Scott C Gray


Great, that is perfect (I think :)). The only thing it appears to be
missing is the ability to chain multiple listeners together, but that
would be a relatively simple patch.

Thanks for pointing me to it!




From:   Ashutosh Chauhan hashut...@apache.org
To: dev@hive.apache.org dev@hive.apache.org
Date:   05/27/2015 01:25 AM
Subject:Re: Caching metastore objects



Siva / Scott,

Such a framework exists in some form:
https://issues.apache.org/jira/browse/HIVE-2038
To make it even more generic there was a proposal:
https://issues.apache.org/jira/browse/HIVE-2147
But there was resistance from the community to it. Maybe now the community is 
ready for it :)

Ashutosh

On Tue, May 26, 2015 at 10:12 PM, Sivaramakrishnan Narayanan 
tarb...@gmail.com wrote:

 Thanks for the replies.

 @Ashutosh - thanks for the pointer! Yes, I was running the 0.11 metastore. Let
 me try with the 0.13 metastore! Maybe my woes will be gone. If they aren't,
 I'll continue working along these lines.

 @Alan - agreed. Caching MTables seems like a better approach if 0.13
 metastore perf is not as good as I'd like.

 @Scott - a pluggable hook for metastore calls would be super useful. If
you
 want to generate events for client-side actions, I suppose you could just
 implement a dynamic proxy class over the metastore client class which
does
 whatever you need it to. Similar technique could work in the server side
-
 I believe there is already a RetryingMetaStoreClient proxy class in
place.


 On Wed, May 27, 2015 at 7:32 AM, Ashutosh Chauhan hashut...@apache.org
 wrote:

  Are you running pre-0.12 or with hive.metastore.try.direct.sql = false;
 
  Work done on https://issues.apache.org/jira/browse/HIVE-4051 should
  alleviate some of your problems.
 
 
  On Mon, May 25, 2015 at 8:19 PM, Sivaramakrishnan Narayanan 
  tarb...@gmail.com wrote:
 
   Apologies if this has been discussed in the past - my searches did
not
  pull
   up any relevant threads. If there are better solutions available out
of
  the
   box, please let me know!
  
   Problem statement
   --
  
   We have a setup where a single metastoredb is used by Hive, Presto and
   SparkSQL. In addition, there are 1000s of hive queries submitted in batch
   form from multiple machines. Oftentimes, the metastoredb ends up being
   remote (in a different region in AWS etc) and round-trip latency is high.
   We've seen single thrift calls getting translated into lots of small SQL
   calls by datanucleus and the roundtrip latency ends up killing
   performance.
   Furthermore, any of these systems may create / modify a hive table and
   this should be reflected in the other systems. For example, I may create
   a table in Hive and query it using Presto or vice versa. In our setup,
   there may be multiple thrift metastore servers pointing to the same
   metastore db.
  
   Investigation
   ---
  
   Basically, we've been looking at caching to solve this problem (will come
   to invalidation in a bit). I looked briefly at DN's support for caching -
   these two parameters seem to be switched off by default.
   
   METASTORE_CACHE_LEVEL2("datanucleus.cache.level2", false),
   METASTORE_CACHE_LEVEL2_TYPE("datanucleus.cache.level2.type", "none"),
   
   Furthermore, my reading of
   http://www.datanucleus.org/products/datanucleus/jdo/cache.html suggests
   that there is no sophistication in invalidation - seems like only
   time-based invalidation is supported and it can't work across multiple
   PMFs (therefore, multiple thrift metastore servers)
  
   Solution Outline
   ---
  
  - Every table / partition will have an additional property called 'version'
  - Any call that modifies a table or partition will bump up the version of
    the table / partition
  - Guava based cache of thrift objects that come from metastore calls
  - We fire a single SQL matching versions before returning from cache
  - It is conceivable to have a mode wherein invalidation based on version
    happens in a background thread (for higher performance, lower fidelity)
  - Not proposing any locking (not shooting for world peace here :) )
  - We could extend HiveMetaStore class or create a new server altogether
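The version-check idea in the outline can be sketched roughly as below. All names here are hypothetical: a real implementation would use a bounded Guava Cache rather than a plain map, `fetchVersion` would be a single cheap SQL query against the metastore db, and `fetchValue` the expensive multi-roundtrip DataNucleus fetch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch of a version-validated cache for metastore thrift objects.
class VersionedCache<V> {
    private static final class Entry<V> {
        final long version;
        final V value;
        Entry(long version, V value) { this.version = version; this.value = value; }
    }

    private final Map<String, Entry<V>> cache = new ConcurrentHashMap<>();

    // fetchVersion: one cheap round trip to read the current version.
    // fetchValue: the expensive full fetch, only done on miss or stale hit.
    V get(String key, Function<String, Long> fetchVersion, Function<String, V> fetchValue) {
        Entry<V> e = cache.get(key);
        long current = fetchVersion.apply(key);
        if (e != null && e.version == current) {
            return e.value;               // cache hit, version still valid
        }
        V fresh = fetchValue.apply(key);  // miss or stale: refetch and repopulate
        cache.put(key, new Entry<>(current, fresh));
        return fresh;
    }
}
```

Because validity is decided by the version column alone, the same cache works across multiple thrift metastore servers sharing one metastoredb, which is exactly what the per-PMF DataNucleus L2 cache cannot do.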
  
   Is this something that would be interesting to the community? Is this
   problem already solved and should I spend my time watching GoT
instead?
  
   Thanks
   Siva
  
 



Re: [VOTE] Stable releases from branch-1 and experimental releases from master

2015-05-27 Thread Vikram Dixit K
+1 for all the reasons outlined.

On Tue, May 26, 2015 at 6:13 PM, Thejas Nair thejas.n...@gmail.com wrote:
 +1
 - This is great for users who want to take longer to upgrade from
 hadoop-1 and care mainly for bug fixes and incremental features,
 rather than radical new features.
 - The ability to release initial 2.x releases marked as alpha/beta
 also helps to get users to try it out, and also lets them choose what
 is right for them.
 - This also lets developers focus on major new features without the
 burden of maintaining hadoop-1 compatibility.

 On Tue, May 26, 2015 at 11:41 AM, Alan Gates alanfga...@gmail.com wrote:
 We have discussed this for several weeks now.  Some concerns have been
 raised which I have tried to address.  I think it is time to vote on it as
 our release plan.  To be specific, I propose:

 Hive makes a branch-1 from the current master.  This would be used for 1.3
 and future 1.x releases.  This branch would not deprecate existing
 functionality.  Any new features in this branch would also need to be put on
 master.  An upgrade path for users will be maintained from one 1.x release
 to the next, as well as from the latest 1.x release to the latest 2.x
 release.

 Going forward releases numbered 2.x will be made from master.  The purpose
 of these releases will be to enable users to get access to new features
 being developed in Hive and allow developers to get feedback.  It is
 expected that for a while these releases will not be production ready and
 will be clearly so labeled.  Some legacy features, such as Hadoop 1 and
 MapReduce, will no longer be supported in the master.  Any critical bug
 fixes (security, incorrect results, crashes) fixed in master will also be
 ported to branch-1 for at least a year.  This time period may be extended in
 the future based on the stability and adoption of 2.x releases.

 Based on Hive's bylaws this release plan vote will be open for 3 days and
 all active committers have binding votes.

 Here's my +1.

 Alan.



-- 
Nothing better than when appreciated for hard work.
-Mark


Re: Review Request 34696: HIVE-686 add UDF substring_index

2015-05-27 Thread Swarnim Kulkarni


 On May 27, 2015, 4:42 a.m., Swarnim Kulkarni wrote:
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSubstringIndex.java,
   line 45
  https://reviews.apache.org/r/34696/diff/1/?file=972489#file972489line45
 
  Worth mentioning in your example what the expected output would look 
  like?
 
 Alexander Pivovarov wrote:
 Not sure I got the issue...
 
 --- desc output
 hive> desc function extended substring_index;
 OK
 ...
 Example:
   SELECT substring_index('www.apache.org', '.', 2);
  'www.apache'
 
 
 -- actual select
 hive> SELECT substring_index('www.apache.org', '.', 2);
 OK
 www.apache

My point was just: why not also include a sample result that users could 
expect to see after this command is executed? It might improve the 
readability a bit.
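For context, the MySQL-style semantics the UDF implements (everything before the count-th occurrence of the delimiter; a negative count works from the right) can be sketched as a plain static method. This is a standalone illustration, not the actual GenericUDFSubstringIndex code, and the handling of count 0 and empty delimiter is an assumption modeled on MySQL's behavior.

```java
// Standalone sketch of substring_index semantics.
class SubstringIndexSketch {
    static String substringIndex(String str, String delim, int count) {
        if (str == null || delim == null || delim.isEmpty() || count == 0) {
            return "";  // assumed edge-case behavior, mirroring MySQL
        }
        if (count > 0) {
            // scan forward for the count-th occurrence of delim
            int idx = -1;
            while (count-- > 0) {
                idx = str.indexOf(delim, idx + 1);
                if (idx < 0) return str;  // fewer occurrences than count: whole string
            }
            return str.substring(0, idx);
        } else {
            // negative count: scan backward from the end
            int idx = str.length();
            while (count++ < 0) {
                idx = str.lastIndexOf(delim, idx - 1);
                if (idx < 0) return str;
            }
            return str.substring(idx + delim.length());
        }
    }
}
```

With this sketch, `substringIndex("www.apache.org", ".", 2)` yields `www.apache`, matching the example output in the desc text above.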


- Swarnim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34696/#review85318
---


On May 27, 2015, 3:35 a.m., Alexander Pivovarov wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34696/
 ---
 
 (Updated May 27, 2015, 3:35 a.m.)
 
 
 Review request for hive, Hao Cheng, Jason Dere, namit jain, and Thejas Nair.
 
 
 Bugs: HIVE-686
 https://issues.apache.org/jira/browse/HIVE-686
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 HIVE-686 add UDF substring_index
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 
 94a3b1787e2b3571eb7a8102c28f7334ae3fa829 
   
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSubstringIndex.java
  PRE-CREATION 
   
 ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFSubstringIndex.java
  PRE-CREATION 
   ql/src/test/queries/clientpositive/udf_substring_index.q PRE-CREATION 
   ql/src/test/results/clientpositive/show_functions.q.out 
 16820ca887320da13a42bebe0876f29eec373c8f 
   ql/src/test/results/clientpositive/udf_substring_index.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/34696/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Alexander Pivovarov
 




[jira] [Created] (HIVE-10833) RowResolver looks mangled with CBO

2015-05-27 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-10833:
-

 Summary: RowResolver looks mangled with CBO 
 Key: HIVE-10833
 URL: https://issues.apache.org/jira/browse/HIVE-10833
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.3.0
Reporter: Eugene Koifman
Assignee: Laljo John Pullokkaran


While working on HIVE-10828 I noticed that internal state of RowResolver looks 
odd when CBO is enabled.
Consider the script below.
{noformat}
set hive.enforce.bucketing=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.cbo.enable=false;

drop table if exists acid_partitioned;
create table acid_partitioned (a int, c string)
  partitioned by (p int)
  clustered by (a) into 1 buckets;
  
insert into acid_partitioned partition (p) (a,p) values(1,1);
{noformat}

With CBO on,
if you put a break point in {noformat}SemanticAnalyzer.genSelectPlan(String 
dest, ASTNode selExprList, QB qb, Operator<?> input,
  Operator<?> inputForSelectStar, boolean outerLV){noformat} at the line 

_selectStar = selectStar && exprList.getChildCount() == posn + 1;_

(currently 3865) and examine the _out_rwsch.rslvMap_ variable, it looks like 
{noformat}{null={values__tmp__table__1.tmp_values_col1=_col0: string, 
values__tmp__table__1.tmp_values_col2=_col1: string}}{noformat}

with CBO disabled, the same _out_rwsch.rslvMap_ looks like
{noformat}{values__tmp__table__1={tmp_values_col1=_col0: string, 
tmp_values_col2=_col1: string}}{noformat}

The _out_rwsch.invRslvMap_ also differs in the same way.

It seems that the version you get with CBO off is the correct one since
_insert into acid_partitioned partition (p) (a,p) values(1,1)_ is rewritten to
_insert into acid_partitioned partition (p) (a,p) select * from 
values__tmp__table__1_

CC [~ashutoshc]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 34716: HIVE-10826 Support min()/max() functions over x preceding and y preceding windowing

2015-05-27 Thread Aihua Xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34716/
---

Review request for hive.


Repository: hive-git


Description
---

HIVE-10826 Support min()/max() functions over x preceding and y preceding 
windowing


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMax.java 
6b7808aa6e1104a0acff3bc0fe89fc92bb200803 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMin.java 
d931d52d0235fcd19571d317715f8a6663aeb49c 
  ql/src/test/queries/clientpositive/windowing_windowspec2.q 
d85cea987462e4c15129334aa4aed9263ef8cc01 
  ql/src/test/results/clientpositive/windowing_windowspec2.q.out 
bf916398b2d7b0198713623d23d27c2a76551bcb 

Diff: https://reviews.apache.org/r/34716/diff/


Testing
---


Thanks,

Aihua Xu



[jira] [Created] (HIVE-10836) Beeline OutOfMemoryError due to large history

2015-05-27 Thread Patrick McAnneny (JIRA)
Patrick McAnneny created HIVE-10836:
---

 Summary: Beeline OutOfMemoryError due to large history
 Key: HIVE-10836
 URL: https://issues.apache.org/jira/browse/HIVE-10836
 Project: Hive
  Issue Type: Bug
 Environment: Hive 1.1.0 on RHEL with Cloudera (cdh5.4.0)
Reporter: Patrick McAnneny


Attempting to run beeline via the command line fails with the error below due to 
large commands in the ~/.beeline/history file. Not sure if the problem also 
exists with many lines in the history or just big lines.

I had a few lines in my history file with over 1 million characters each. 
Deleting said lines from the history file resolved the issue.
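Until this is fixed, the deletion step can be automated. A hedged sketch (the path and the 100,000-character limit are assumptions to adjust; this is not a Hive utility):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

// Drop oversized entries from the beeline history file before starting beeline,
// so jline's FileHistory.load() never has to buffer a multi-MB line.
class TrimBeelineHistory {
    // Keep only history lines shorter than maxLen characters.
    static List<String> trim(List<String> lines, int maxLen) {
        return lines.stream()
                .filter(l -> l.length() < maxLen)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        Path history = Paths.get(System.getProperty("user.home"), ".beeline", "history");
        if (Files.exists(history)) {
            List<String> kept = trim(Files.readAllLines(history), 100_000);
            Files.write(history, kept);
        }
    }
}
```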

Beeline version 1.1.0-cdh5.4.0 by Apache Hive
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2367)
at 
java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
at 
java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
at 
java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:535)
at java.lang.StringBuffer.append(StringBuffer.java:322)
at java.io.BufferedReader.readLine(BufferedReader.java:363)
at java.io.BufferedReader.readLine(BufferedReader.java:382)
at jline.console.history.FileHistory.load(FileHistory.java:69)
at jline.console.history.FileHistory.load(FileHistory.java:61)
at org.apache.hive.beeline.BeeLine.getConsoleReader(BeeLine.java:869)
at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:766)
at 
org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:480)
at org.apache.hive.beeline.BeeLine.main(BeeLine.java:463)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 34576: Bucketized Table feature fails in some cases

2015-05-27 Thread John Pullokkaran


 On May 24, 2015, 1:50 a.m., Xuefu Zhang wrote:
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java, line 
  226
  https://reviews.apache.org/r/34576/diff/2/?file=971006#file971006line226
 
  Warning is proper, but I think the words should say "might" because the 
  source data might already be bucketed and match the target, in which 
  case there is no problem.

The load command doesn't exercise bucketing. IMO "will not" is correct.


- John


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34576/#review85081
---


On May 23, 2015, 5:47 p.m., pengcheng xiong wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34576/
 ---
 
 (Updated May 23, 2015, 5:47 p.m.)
 
 
 Review request for hive and John Pullokkaran.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Bucketized Table feature fails in some cases. If src & destination are 
 bucketed on the same key, and if the actual data in the src is not bucketed 
 (because data got loaded using LOAD DATA LOCAL INPATH ), then the data won't 
 be bucketed while writing to the destination.
 Example
 --
 CREATE TABLE P1(key STRING, val STRING)
 CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
 LOAD DATA LOCAL INPATH '/Users/jp/apache-hive1/data/files/P1.txt' INTO TABLE 
 P1;
 -- perform an insert to make sure there are 2 files
 INSERT OVERWRITE TABLE P1 select key, val from P1;
 --
 This is not a regression. This has never worked.
 This got only discovered due to Hadoop2 changes.
 In Hadoop1, in local mode, the number of reducers will always be 1, regardless 
 of what is requested by the app. Hadoop2 now honors the number-of-reducers 
 setting in local mode (by spawning threads).
 Long term solution seems to be to prevent load data for bucketed table.
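For background on why LOAD DATA breaks the feature: Hive's bucketing contract is that every row in bucket file i hashes to i under (roughly) `hash(key) mod numBuckets`, and map-side joins rely on that invariant. A simplified sketch of the invariant (real Hive hashes through ObjectInspectors, not String.hashCode, so treat these names as illustrative):

```java
import java.util.List;

// Simplified model of Hive's bucket placement invariant.
class BucketingSketch {
    // Bucket assignment, roughly: (hash & Integer.MAX_VALUE) % numBuckets.
    static int bucketFor(String key, int numBuckets) {
        return (key.hashCode() & Integer.MAX_VALUE) % numBuckets;
    }

    // A bucket file is consistent only if every key in it hashes to that
    // bucket. Files copied in verbatim by LOAD DATA were never partitioned
    // this way, so this check generally fails for them.
    static boolean isConsistentBucketFile(List<String> keys, int bucketId, int numBuckets) {
        for (String k : keys) {
            if (bucketFor(k, numBuckets) != bucketId) {
                return false;
            }
        }
        return true;
    }
}
```

An INSERT goes through a reducer (or, per the note above, Hadoop2 local-mode threads) that enforces this placement; LOAD DATA just moves the file, which is why the long-term fix is to disallow it for bucketed tables.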
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java e53933e 
   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
 1a9b42b 
   ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out 623c2e8 
   
 ql/src/test/results/clientnegative/bucket_mapjoin_wrong_table_metadata_1.q.out
  f4522d2 
   
 ql/src/test/results/clientnegative/bucket_mapjoin_wrong_table_metadata_2.q.out
  9aa9b5d 
   ql/src/test/results/clientnegative/exim_11_nonpart_noncompat_sorting.q.out 
 9220c8e 
   ql/src/test/results/clientpositive/auto_join32.q.out bfc8be8 
   ql/src/test/results/clientpositive/auto_join_filters.q.out a6720d9 
   ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out 383defd 
   ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out e6e7ef3 
   ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out e9fb705 
   ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out c089419 
   ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out 6e443fa 
   ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out feaea04 
   ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out f64ecf0 
   ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out e89f548 
   ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out 44c037f 
   ql/src/test/results/clientpositive/bucket_map_join_1.q.out d778203 
   ql/src/test/results/clientpositive/bucket_map_join_2.q.out aef77aa 
   ql/src/test/results/clientpositive/bucket_map_join_spark1.q.out 870ecdd 
   ql/src/test/results/clientpositive/bucket_map_join_spark2.q.out 33f5c46 
   ql/src/test/results/clientpositive/bucket_map_join_spark3.q.out 067d1ff 
   ql/src/test/results/clientpositive/bucketcontext_1.q.out 77bfcf9 
   ql/src/test/results/clientpositive/bucketcontext_2.q.out a9db13d 
   ql/src/test/results/clientpositive/bucketcontext_3.q.out 9ba3e0c 
   ql/src/test/results/clientpositive/bucketcontext_4.q.out a2b37a8 
   ql/src/test/results/clientpositive/bucketcontext_5.q.out 3ee1f0e 
   ql/src/test/results/clientpositive/bucketcontext_6.q.out d2304fa 
   ql/src/test/results/clientpositive/bucketcontext_7.q.out 1a105ed 
   ql/src/test/results/clientpositive/bucketcontext_8.q.out 138e415 
   ql/src/test/results/clientpositive/bucketizedhiveinputformat_auto.q.out 
 215efdd 
   ql/src/test/results/clientpositive/bucketmapjoin1.q.out 72f2a07 
   ql/src/test/results/clientpositive/bucketmapjoin10.q.out b0e849d 
   ql/src/test/results/clientpositive/bucketmapjoin11.q.out 4263cab 
   ql/src/test/results/clientpositive/bucketmapjoin12.q.out bcd7394 
   ql/src/test/results/clientpositive/bucketmapjoin2.q.out a8d9e9d 
   ql/src/test/results/clientpositive/bucketmapjoin3.q.out c759f05 
   

Re: Review Request 34696: HIVE-686 add UDF substring_index

2015-05-27 Thread Swarnim Kulkarni


 On May 27, 2015, 4:42 a.m., Swarnim Kulkarni wrote:
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSubstringIndex.java,
   line 45
  https://reviews.apache.org/r/34696/diff/1/?file=972489#file972489line45
 
  Worth mentioning in your example what the expected output would look 
  like?
 
 Alexander Pivovarov wrote:
 Not sure I got the issue...
 
 --- desc output
 hive> desc function extended substring_index;
 OK
 ...
 Example:
   SELECT substring_index('www.apache.org', '.', 2);
  'www.apache'
 
 
 -- actual select
 hive> SELECT substring_index('www.apache.org', '.', 2);
 OK
 www.apache
 
 Swarnim Kulkarni wrote:
 My point was just: why not also include a sample result that 
 users could expect to see after this command is executed? Might improve the 
 readability a bit.
 
 Alexander Pivovarov wrote:
 it's included. The result is 'www.apache' - right after the \n symbol

Ah ok. Sorry missed that.


- Swarnim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34696/#review85318
---


On May 27, 2015, 3:35 a.m., Alexander Pivovarov wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34696/
 ---
 
 (Updated May 27, 2015, 3:35 a.m.)
 
 
 Review request for hive, Hao Cheng, Jason Dere, namit jain, and Thejas Nair.
 
 
 Bugs: HIVE-686
 https://issues.apache.org/jira/browse/HIVE-686
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 HIVE-686 add UDF substring_index
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 
 94a3b1787e2b3571eb7a8102c28f7334ae3fa829 
   
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSubstringIndex.java
  PRE-CREATION 
   
 ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFSubstringIndex.java
  PRE-CREATION 
   ql/src/test/queries/clientpositive/udf_substring_index.q PRE-CREATION 
   ql/src/test/results/clientpositive/show_functions.q.out 
 16820ca887320da13a42bebe0876f29eec373c8f 
   ql/src/test/results/clientpositive/udf_substring_index.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/34696/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Alexander Pivovarov
 




Re: Review Request 34576: Bucketized Table feature fails in some cases

2015-05-27 Thread John Pullokkaran


 On May 24, 2015, 2:03 a.m., Xuefu Zhang wrote:
  Have you thought of what if the client is not interactive, such as JDBC or 
  thrift?
 
 pengcheng xiong wrote:
 I am sorry that we have not thought about it yet. We admit that the 
 patch will not cover the case when the client is not interactive. Do you have 
 any good ideas that you can share with us? Do you think logging this besides 
 printing a warning msg is good enough? Thanks.
 
 Xuefu Zhang wrote:
 There are all kinds of issues with data loading into bucketed tables. 
 While advanced users might be able to load data correctly, I think that's 
 really rare. The data in a bucketed table needs to be generated by Hive. 
 Therefore, I think we should disable "insert into" and "load data 
 into|overwrite" for a bucketed table. We should also disallow external tables 
 for the same reason.
 
 To allow the advanced user to achieve what they used to do, we can have a 
 flag, such as hive.enforce.strict.bucketing, which defaults to true. Those 
 users can proceed by turning this off.
 
 Another option for insert into would be supporting appending new data, 
 such as proposed in HIVE-3244.
 
 Gopal V wrote:
 Why would you disable insert into bucketed tables? How else would ACID 
 work?
 
 Xuefu Zhang wrote:
 Yeah, but I guess we were talking about things out of the context of 
 ACID. Even before ACID, a user could do insert into a bucketed table, which 
 can be very harmful.

This patch is only addressing the Load path, which I think we all agree is a 
problem.


- John


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34576/#review85082
---


On May 23, 2015, 5:47 p.m., pengcheng xiong wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34576/
 ---
 
 (Updated May 23, 2015, 5:47 p.m.)
 
 
 Review request for hive and John Pullokkaran.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Bucketized Table feature fails in some cases. If src & destination are 
 bucketed on the same key, and if the actual data in the src is not bucketed 
 (because data got loaded using LOAD DATA LOCAL INPATH ), then the data won't 
 be bucketed while writing to the destination.
 Example
 --
 CREATE TABLE P1(key STRING, val STRING)
 CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
 LOAD DATA LOCAL INPATH '/Users/jp/apache-hive1/data/files/P1.txt' INTO TABLE 
 P1;
 -- perform an insert to make sure there are 2 files
 INSERT OVERWRITE TABLE P1 select key, val from P1;
 --
 This is not a regression. This has never worked.
 This got only discovered due to Hadoop2 changes.
 In Hadoop1, in local mode, the number of reducers will always be 1, regardless 
 of what is requested by the app. Hadoop2 now honors the number-of-reducers 
 setting in local mode (by spawning threads).
 Long term solution seems to be to prevent load data for bucketed table.
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java e53933e 
   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
 1a9b42b 
   ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out 623c2e8 
   
 ql/src/test/results/clientnegative/bucket_mapjoin_wrong_table_metadata_1.q.out
  f4522d2 
   
 ql/src/test/results/clientnegative/bucket_mapjoin_wrong_table_metadata_2.q.out
  9aa9b5d 
   ql/src/test/results/clientnegative/exim_11_nonpart_noncompat_sorting.q.out 
 9220c8e 
   ql/src/test/results/clientpositive/auto_join32.q.out bfc8be8 
   ql/src/test/results/clientpositive/auto_join_filters.q.out a6720d9 
   ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out 383defd 
   ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out e6e7ef3 
   ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out e9fb705 
   ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out c089419 
   ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out 6e443fa 
   ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out feaea04 
   ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out f64ecf0 
   ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out e89f548 
   ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out 44c037f 
   ql/src/test/results/clientpositive/bucket_map_join_1.q.out d778203 
   ql/src/test/results/clientpositive/bucket_map_join_2.q.out aef77aa 
   ql/src/test/results/clientpositive/bucket_map_join_spark1.q.out 870ecdd 
   ql/src/test/results/clientpositive/bucket_map_join_spark2.q.out 33f5c46 
   ql/src/test/results/clientpositive/bucket_map_join_spark3.q.out 067d1ff 
   

[jira] [Created] (HIVE-10837) Running large queries (inserts) fails and crashes hiveserver2

2015-05-27 Thread Patrick McAnneny (JIRA)
Patrick McAnneny created HIVE-10837:
---

 Summary: Running large queries (inserts) fails and crashes 
hiveserver2
 Key: HIVE-10837
 URL: https://issues.apache.org/jira/browse/HIVE-10837
 Project: Hive
  Issue Type: Bug
 Environment: Hive 1.1.0 on RHEL with Cloudera (cdh5.4.0)
Reporter: Patrick McAnneny
Priority: Critical


When running a large insert statement through beeline or pyhs2, a thrift error 
is returned and hiveserver2 crashes.

I ran into this with large insert statements -- my initial failing query was 
around 6 million characters. After further testing, however, it seems like the 
failure threshold is based on number of inserted rows rather than the query's 
size in characters. My testing shows the failure threshold between 199,000 and 
230,000 inserted rows.

The thrift error is as follows:

Error: org.apache.thrift.transport.TTransportException: 
java.net.SocketException: Broken pipe (state=08S01,code=0)


Also note for anyone that tests this issue - when testing different queries I 
ran into https://issues.apache.org/jira/browse/HIVE-10836




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 34447: HIVE-10761 : Create codahale-based metrics system for Hive

2015-05-27 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34447/#review85418
---



common/src/java/org/apache/hadoop/hive/common/JvmPauseMonitor.java
https://reviews.apache.org/r/34447/#comment136981

Maybe a more informational message



common/src/java/org/apache/hadoop/hive/common/JvmPauseMonitor.java
https://reviews.apache.org/r/34447/#comment136983

should we check isStarted()?



common/src/java/org/apache/hadoop/hive/common/metrics/MetricsLegacy.java
https://reviews.apache.org/r/34447/#comment137016

LegacyMetrics?



common/src/java/org/apache/hadoop/hive/common/metrics/common/MetricsFactory.java
https://reviews.apache.org/r/34447/#comment136986

This should be also synchronized.



common/src/java/org/apache/hadoop/hive/common/metrics/common/MetricsFactory.java
https://reviews.apache.org/r/34447/#comment137006

Should we call it deinit()?



common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/Metrics.java
https://reviews.apache.org/r/34447/#comment137008

Could we rename the class so that we don't have to handle the duplicated 
class/interface names?



common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/Metrics.java
https://reviews.apache.org/r/34447/#comment137010

Could we rename the class so that we don't have to handle the duplicated 
class/interface names?



common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/Metrics.java
https://reviews.apache.org/r/34447/#comment137009

If the synchronized block is for the whole method, we might just as well 
declare the whole method as synchronized.



common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/Metrics.java
https://reviews.apache.org/r/34447/#comment137011

Same as above.



common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/Metrics.java
https://reviews.apache.org/r/34447/#comment137013

Shouldn't this be private?



common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/Metrics.java
https://reviews.apache.org/r/34447/#comment137012

I think fd needs to be closed properly in final block.



common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/Metrics.java
https://reviews.apache.org/r/34447/#comment137014

I think checking initialized needs to be synchronized.



common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/Metrics.java
https://reviews.apache.org/r/34447/#comment137015

Same as above.



metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
https://reviews.apache.org/r/34447/#comment137007

Where do we call uninit(), or does it not matter? Same for HS2.


- Xuefu Zhang


On May 27, 2015, 6:25 p.m., Szehon Ho wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34447/
 ---
 
 (Updated May 27, 2015, 6:25 p.m.)
 
 
 Review request for hive, Chao Sun, Jimmy Xiang, and Xuefu Zhang.
 
 
 Bugs: HIVE-10761
 https://issues.apache.org/jira/browse/HIVE-10761
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 See JIRA for the motivation.  Summary: There is an existing metrics system 
 that uses a custom model hooked up to JMX reporting; a codahale-based 
 metrics system is desirable for a standard model and reporting.
 
 This adds a codahale-based metrics system to HiveServer2 and HiveMetastore.  
 Metrics implementation is now internally pluggable, and the existing Metrics 
 system can be re-enabled by configuration if desired for 
 backward-compatibility.
 
 Following metrics are supported by Metrics system:
 1.  JVMPauseMonitor (used to call Hadoop's internal implementation, now 
 forked off to integrate with Metrics system)
 2.  HMS API calls
 3.  Standard JVM metrics (only for the new implementation, as it's free with 
 codahale).
 
 The following metrics reporting are supported by new system (configuration 
 exposed)
 1.  JMX
 2.  CONSOLE
 3.  JSON_FILE (periodic file of metrics that gets overwritten).
 
 A goal is to add a webserver that exposes the JSON metrics, but this will 
 defer to a later implementation.
 
 
 Diffs
 -
 
   common/pom.xml a615c1e 
   common/src/java/org/apache/hadoop/hive/common/JvmPauseMonitor.java 
 PRE-CREATION 
   common/src/java/org/apache/hadoop/hive/common/metrics/Metrics.java 01c9d1d 
   common/src/java/org/apache/hadoop/hive/common/metrics/MetricsLegacy.java 
 PRE-CREATION 
   common/src/java/org/apache/hadoop/hive/common/metrics/common/Metrics.java 
 PRE-CREATION 
   
 common/src/java/org/apache/hadoop/hive/common/metrics/common/MetricsFactory.java
  PRE-CREATION 
   common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/Metrics.java 
 PRE-CREATION 
   
 

Re: Review Request 34586: HIVE-10704

2015-05-27 Thread Mostafa Mokhtar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34586/
---

(Updated May 27, 2015, 6:32 a.m.)


Review request for hive.


Repository: hive-git


Description
---

fix biggest small table selection when table sizes are 0
fallback to dividing memory equally if any tables have invalid size
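The two-line description can be illustrated with a simplified sketch (hypothetical names; the actual change lives in Tez's HashTableLoader and also covers picking the biggest small table): give each small table a memory share proportional to its reported size, but if any size is invalid (zero or negative), fall back to an equal split rather than dividing by a bogus total.

```java
import java.util.Arrays;

// Simplified model of the described fallback for hash-table memory division.
class MemoryDivisionSketch {
    static long[] divideMemory(long totalMemory, long[] tableSizes) {
        long[] budgets = new long[tableSizes.length];
        long sum = 0;
        boolean invalid = false;
        for (long s : tableSizes) {
            if (s <= 0) invalid = true;  // size 0 / unknown makes proportions meaningless
            sum += s;
        }
        if (invalid || sum <= 0) {
            // fallback: divide the memory budget equally across tables
            Arrays.fill(budgets, totalMemory / tableSizes.length);
        } else {
            // normal path: proportional to reported table size
            for (int i = 0; i < tableSizes.length; i++) {
                budgets[i] = (long) (totalMemory * (tableSizes[i] / (double) sum));
            }
        }
        return budgets;
    }
}
```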


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 
536b92c5dd03abe9ff57bf64d87be0f3ef34aa7a 

Diff: https://reviews.apache.org/r/34586/diff/


Testing
---


Thanks,

Mostafa Mokhtar



Re: Review Request 34586: HIVE-10704

2015-05-27 Thread Mostafa Mokhtar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34586/
---

(Updated May 27, 2015, 6:33 a.m.)


Review request for hive.


Repository: hive-git


Description
---

fix biggest small table selection when table sizes are 0
fallback to dividing memory equally if any tables have invalid size


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 536b92c 

Diff: https://reviews.apache.org/r/34586/diff/


Testing
---


Thanks,

Mostafa Mokhtar



Re: Review Request 34586: HIVE-10704

2015-05-27 Thread Mostafa Mokhtar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34586/
---

(Updated May 27, 2015, 6:30 a.m.)


Review request for hive.


Repository: hive-git


Description
---

fix biggest small table selection when table sizes are 0
fallback to dividing memory equally if any tables have invalid size


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 
536b92c5dd03abe9ff57bf64d87be0f3ef34aa7a 

Diff: https://reviews.apache.org/r/34586/diff/


Testing
---


File Attachments (updated)


HIVE-10704.3.patch
  
https://reviews.apache.org/media/uploaded/files/2015/05/27/4a999c9c-1c3f-44dd-a321-a4157a067300__HIVE-10704.3.patch


Thanks,

Mostafa Mokhtar



RE: Build hive failure on ubuntu 15.04 with oracle java 1.8

2015-05-27 Thread 煜 韦
This is a known issue. https://issues.apache.org/jira/browse/HIVE-10674

 From: yu20...@hotmail.com
 To: dev@hive.apache.org
 Subject: Build hive failure on ubuntu 15.04 with oracle java 1.8
 Date: Tue, 26 May 2015 11:58:45 +0800
 
 Hi guys,
 I tried to build Hive 1.2.0 on Ubuntu 15.04 with Oracle Java 1.8. Then I 
 encountered the following problem. What should I do to fix this issue?
 Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; 
 support was removed in 8.0
 Running org.apache.hadoop.hive.metastore.TestMetastoreExpr
 Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 8.719 sec - 
 in org.apache.hadoop.hive.metastore.TestMetastoreExpr
 Results :
 Failed tests:
   TestExecDriver.testMapRedPlan1:513->executePlan:487 expected:<true> but was:<false>
   TestExecDriver.testMapRedPlan2:522->executePlan:487 expected:<true> but was:<false>
   TestExecDriver.testMapRedPlan3:531->executePlan:487 expected:<true> but was:<false>
   TestExecDriver.testMapRedPlan4:540->executePlan:487 expected:<true> but was:<false>
   TestExecDriver.testMapRedPlan5:549->executePlan:487 expected:<true> but was:<false>
   TestExecDriver.testMapRedPlan6:558->executePlan:487 expected:<true> but was:<false>
   TestExecDriver.testMapPlan1:496->executePlan:487 expected:<true> but was:<false>
   TestExecDriver.testMapPlan2:504->executePlan:487 expected:<true> but was:<false>
   TestSessionState.testReloadExistingAuxJars2:234 Could not find SessionStateTest.jar.v1
   TestSessionState.testReloadAuxJars2:191 Could not find SessionStateTest.jar.v1
   TestSessionState.testReloadExistingAuxJars2:234 Could not find SessionStateTest.jar.v1
   TestSessionState.testReloadAuxJars2:191 Could not find SessionStateTest.jar.v1
 Tests run: 3545, Failures: 12, Errors: 0, Skipped: 1
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-surefire-plugin:2.16:test (default-test) on 
 project hive-exec: There are test failures.
 [ERROR]
 [ERROR] Please refer to 
 /home/hadoop/apache-hive-1.2.0-src/ql/target/surefire-reports for the 
 individual test results.
 [ERROR] -> [Help 1]
 org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute 
 goal org.apache.maven.plugins:maven-surefire-plugin:2.16:test (default-test) 
 on project hive-exec: There are test failures.
 Please refer to 
 /home/hadoop/apache-hive-1.2.0-src/ql/target/surefire-reports for the 
 individual test results.
 at 
 org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:213)
 at 
 org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
 at 
 org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
 at 
 org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84)
 at 
 org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59)
 at 
 org.apache.maven.lifecycle.internal.LifecycleStarter.singleThreadedBuild(LifecycleStarter.java:183)
 at 
 org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:161)
 at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:320) 
at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:156)at 
 org.apache.maven.cli.MavenCli.execute(MavenCli.java:537)at 
 org.apache.maven.cli.MavenCli.doMain(MavenCli.java:196)at 
 org.apache.maven.cli.MavenCli.main(MavenCli.java:141)at 
 sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:497)at 
 org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
 at 
 org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)   
  at 
 org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
 at 
 org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)Caused
  by: org.apache.maven.plugin.MojoFailureException: There are test 
 failures.Thanks,Jared 
  

Review Request 34754: NumberFormatException while running analyze table partition compute statistics query

2015-05-27 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34754/
---

Review request for hive and pengcheng xiong.


Bugs: HIVE-10840
https://issues.apache.org/jira/browse/HIVE-10840


Repository: hive-git


Description
---

NumberFormatException while running analyze table partition compute statistics 
query


Diffs
-

  itests/src/test/resources/testconfiguration.properties ae03283 
  ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java ad481bc 
  ql/src/test/queries/clientpositive/stats_only_null.q a91022c 
  ql/src/test/results/clientpositive/tez/stats_only_null.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/34754/diff/


Testing
---

Modified existing test to increase its coverage.


Thanks,

Ashutosh Chauhan



Re: Review Request 34455: HIVE-10550 Dynamic RDD caching optimization for HoS.[Spark Branch]

2015-05-27 Thread Xuefu Zhang


 On May 27, 2015, 10:13 p.m., Xuefu Zhang wrote:
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 2062
  https://reviews.apache.org/r/34455/diff/3/?file=972428#file972428line2062
 
  Sorry for pointing this out late. I'm not certain it's a good idea 
  to expose these two configurations. Also, this introduces a change of 
  behavior. For now, can we get rid of them and change the persistence level 
  back to MEM+DISK?
  
  We can come back and revisit this later on. At this moment, I don't feel 
  confident making the call.
 
 chengxiang li wrote:
 Persisting to MEM + DISK may hurt performance in certain cases; I 
 think at least we should have a switch to enable/disable this optimization.
 
 Xuefu Zhang wrote:
 Agreed. However, before we find out more about the cases in which this helps 
 or hurts, I think it's better to keep the existing behavior. This doesn't 
 prevent us from adding a flag later on.
 
 chengxiang li wrote:
 Ok, I will remove these configurations from the patch for now; we can 
 discuss later when we know more about it.

Please feel free to create a follow-up JIRA to do more research. We can try 
different data sizes and persistence levels to see the results. At that time, we 
can decide whether it makes sense to introduce configurations. Thanks.


- Xuefu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34455/#review85451
---


On May 28, 2015, 3:30 a.m., chengxiang li wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34455/
 ---
 
 (Updated May 28, 2015, 3:30 a.m.)
 
 
 Review request for hive, Chao Sun, Jimmy Xiang, and Xuefu Zhang.
 
 
 Bugs: HIVE-10550
 https://issues.apache.org/jira/browse/HIVE-10550
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 see jira description
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/CacheTran.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/MapTran.java 2170243 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ReduceTran.java e60dfac 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlan.java ee5c78a 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 
 3f240f5 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java 
 e6c845c 
 
 Diff: https://reviews.apache.org/r/34455/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 chengxiang li
 




Re: Big Lock in Driver.compileInternal

2015-05-27 Thread Sergey Shelukhin
Hi. As luck would have it, we are currently looking at this issue :)
I have a small patch up at
https://issues.apache.org/jira/browse/HIVE-4239; I tested it a bit with a
unit test and some manual cluster testing. Would you be willing to test it
on your setup?

On 15/5/25, 20:54, Loudongfeng loudongf...@huawei.com wrote:

Hi, All

I notice that there is a big lock in org.apache.hadoop.hive.ql.Driver
Following is a piece of code from Apache Hive 1.2.0

private static final Object compileMonitor = new Object();

private int compileInternal(String command) {
  int ret;
  synchronized (compileMonitor) {
ret = compile(command);
  }
...
}

This means HQL statements submitted concurrently from the client side will be
compiled one by one on the HiveServer side.
This causes problems when the compile phase is slow.

My question is: what does this lock protect? Is it possible to remove
it?
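The serializing effect can be illustrated with a minimal stand-in (hypothetical names, not Hive code) for the static compileMonitor:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class GlobalLockDemo {
    private static final Object compileMonitor = new Object();
    private static final AtomicInteger inside = new AtomicInteger();
    private static final AtomicInteger maxInside = new AtomicInteger();

    // Stand-in for Driver.compileInternal: all callers contend on one monitor.
    static void compile() throws InterruptedException {
        synchronized (compileMonitor) {
            int now = inside.incrementAndGet();
            maxInside.accumulateAndGet(now, Math::max);
            Thread.sleep(50); // simulate a slow compile phase
            inside.decrementAndGet();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                try { compile(); } catch (InterruptedException ignored) { }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        // A single static monitor means at most one compile runs at a time,
        // regardless of how many clients submit queries concurrently.
        System.out.println("max concurrent compiles: " + maxInside.get());
    }
}
```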

Best Regards
Nemon



Re: Review Request 34455: HIVE-10550 Dynamic RDD caching optimization for HoS.[Spark Branch]

2015-05-27 Thread Xuefu Zhang


 On May 27, 2015, 10:13 p.m., Xuefu Zhang wrote:
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 2062
  https://reviews.apache.org/r/34455/diff/3/?file=972428#file972428line2062
 
  Sorry for pointing this out late. I'm not certain it's a good idea 
  to expose these two configurations. Also, this introduces a change of 
  behavior. For now, can we get rid of them and change the persistence level 
  back to MEM+DISK?
  
  We can come back and revisit this later on. At this moment, I don't feel 
  confident making the call.
 
 chengxiang li wrote:
 Persisting to MEM + DISK may hurt performance in certain cases; I 
 think at least we should have a switch to enable/disable this optimization.

Agreed. However, before we find out more about the cases in which this helps or 
hurts, I think it's better to keep the existing behavior. This doesn't prevent 
us from adding a flag later on.


- Xuefu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34455/#review85451
---


On May 27, 2015, 1:50 a.m., chengxiang li wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34455/
 ---
 
 (Updated May 27, 2015, 1:50 a.m.)
 
 
 Review request for hive, Chao Sun, Jimmy Xiang, and Xuefu Zhang.
 
 
 Bugs: HIVE-10550
 https://issues.apache.org/jira/browse/HIVE-10550
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 see jira description
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43c53fc 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/CacheTran.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/MapTran.java 2170243 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ReduceTran.java e60dfac 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlan.java ee5c78a 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 
 3f240f5 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java 
 e6c845c 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SparkRddCachingResolver.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 
 19aae70 
   ql/src/java/org/apache/hadoop/hive/ql/plan/SparkWork.java bb5dd79 
 
 Diff: https://reviews.apache.org/r/34455/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 chengxiang li
 




Re: Review Request 34455: HIVE-10550 Dynamic RDD caching optimization for HoS.[Spark Branch]

2015-05-27 Thread chengxiang li

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34455/
---

(Updated May 28, 2015, 3:30 a.m.)


Review request for hive, Chao Sun, Jimmy Xiang, and Xuefu Zhang.


Changes
---

Remove configs, and move the common-parent matching logic into SparkPlanGenerator 
directly.


Bugs: HIVE-10550
https://issues.apache.org/jira/browse/HIVE-10550


Repository: hive-git


Description
---

see jira description


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/CacheTran.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/MapTran.java 2170243 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ReduceTran.java e60dfac 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlan.java ee5c78a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 
3f240f5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java e6c845c 

Diff: https://reviews.apache.org/r/34455/diff/


Testing
---


Thanks,

chengxiang li



Review Request 34757: HIVE-10844: Combine equivalent Works for HoS[Spark Branch]

2015-05-27 Thread chengxiang li

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34757/
---

Review request for hive and Xuefu Zhang.


Bugs: HIVE-10844
https://issues.apache.org/jira/browse/HIVE-10844


Repository: hive-git


Description
---

Some Hive queries (like TPCDS Q39) may share the same subquery, which is translated 
into separate but equivalent Works in SparkWork. Combining these equivalent 
Works into a single one would help them benefit from the subsequent dynamic RDD 
caching optimization.


Diffs
-

  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/CombineEquivalentWorkResolver.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 19aae70 

Diff: https://reviews.apache.org/r/34757/diff/


Testing
---


Thanks,

chengxiang li



Re: Review Request 34455: HIVE-10550 Dynamic RDD caching optimization for HoS.[Spark Branch]

2015-05-27 Thread chengxiang li


 On May 27, 2015, 10:13 p.m., Xuefu Zhang wrote:
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 2062
  https://reviews.apache.org/r/34455/diff/3/?file=972428#file972428line2062
 
  Sorry for pointing this out late. I'm not certain it's a good idea 
  to expose these two configurations. Also, this introduces a change of 
  behavior. For now, can we get rid of them and change the persistence level 
  back to MEM+DISK?
  
  We can come back and revisit this later on. At this moment, I don't feel 
  confident making the call.

Persisting to MEM + DISK may hurt performance in certain cases; I think at 
least we should have a switch to enable/disable this optimization.


- chengxiang


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34455/#review85451
---


On May 27, 2015, 1:50 a.m., chengxiang li wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34455/
 ---
 
 (Updated May 27, 2015, 1:50 a.m.)
 
 
 Review request for hive, Chao Sun, Jimmy Xiang, and Xuefu Zhang.
 
 
 Bugs: HIVE-10550
 https://issues.apache.org/jira/browse/HIVE-10550
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 see jira description
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43c53fc 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/CacheTran.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/MapTran.java 2170243 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ReduceTran.java e60dfac 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlan.java ee5c78a 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 
 3f240f5 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java 
 e6c845c 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SparkRddCachingResolver.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 
 19aae70 
   ql/src/java/org/apache/hadoop/hive/ql/plan/SparkWork.java bb5dd79 
 
 Diff: https://reviews.apache.org/r/34455/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 chengxiang li
 




Re: Review Request 34248: HIVE-10684 Fix the unit test failures for HIVE-7553 after HIVE-10674 removed the binary jar files

2015-05-27 Thread cheng xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34248/
---

(Updated May 28, 2015, 2:31 a.m.)


Review request for hive and Sushanth Sowmyan.


Bugs: HIVE-10684
https://issues.apache.org/jira/browse/HIVE-10684


Repository: hive-git


Description
---

Remove binaries from source and fix the failed cases


Diffs (updated)
-

  ql/src/test/org/apache/hadoop/hive/ql/session/TestSessionState.java 45ba07e 
  ql/src/test/resources/RefreshedJarClassV1.txt PRE-CREATION 
  ql/src/test/resources/RefreshedJarClassV2.txt PRE-CREATION 

Diff: https://reviews.apache.org/r/34248/diff/


Testing
---

UT passed


Thanks,

cheng xu



Re: Review Request 34455: HIVE-10550 Dynamic RDD caching optimization for HoS.[Spark Branch]

2015-05-27 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34455/#review85509
---

Ship it!


Ship It!

- Xuefu Zhang


On May 28, 2015, 3:30 a.m., chengxiang li wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34455/
 ---
 
 (Updated May 28, 2015, 3:30 a.m.)
 
 
 Review request for hive, Chao Sun, Jimmy Xiang, and Xuefu Zhang.
 
 
 Bugs: HIVE-10550
 https://issues.apache.org/jira/browse/HIVE-10550
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 see jira description
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/CacheTran.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/MapTran.java 2170243 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ReduceTran.java e60dfac 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlan.java ee5c78a 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 
 3f240f5 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java 
 e6c845c 
 
 Diff: https://reviews.apache.org/r/34455/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 chengxiang li
 




Re: How to debug hive unit test in eclipse ?

2015-05-27 Thread Lefty Leverenz
This is great -- thanks Bob!

Would you be willing to contribute it to the Hive wiki, or at least allow
us to link to it from the Testing Docs overview
https://cwiki.apache.org/confluence/display/Hive/TestingDocs?

-- Lefty


On Mon, May 25, 2015 at 12:02 PM, Bob Freitas bob.e.frei...@gmail.com
wrote:

 Hi Jeff,

 I recently needed to figure out how to do unit testing of Hive scripts, and
 it turned out to be something of an adventure.  I had done some previous
 work in this area but things have changed with MR2 and YARN, gee go
 figure...

 What I ended up doing was going through the Hive source code to figure out
 how the dev team was doing the testing.  To help out people who come after
 me, I put together an article and github repo

 http://www.lopakalogic.com/articles/hadoop-articles/hive-testing/

 With this I was able to step through my script, the Hadoop code, the Hive
 code, it was pretty cool!

 Hope it helps!



[GitHub] hive pull request: Hive 10843

2015-05-27 Thread thejasmn
GitHub user thejasmn opened a pull request:

https://github.com/apache/hive/pull/39

Hive 10843



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/thejasmn/hive HIVE-10843

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/39.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #39


commit 9a99f25a0acc1d5b8e611fead6f5dffa985176e8
Author: Thejas Nair the...@hortonworks.com
Date:   2015-05-27T18:12:52Z

show tables now passes the current db name

commit 574e3da1220500d1548d4b2431883db8a7da6028
Author: Thejas Nair the...@hortonworks.com
Date:   2015-05-28T00:35:21Z

add db info in describe db command




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] hive pull request: HIVE-10843

2015-05-27 Thread thejasmn
GitHub user thejasmn opened a pull request:

https://github.com/apache/hive/pull/40

HIVE-10843



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/thejasmn/hive HIVE-10843

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/40.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #40


commit 9a99f25a0acc1d5b8e611fead6f5dffa985176e8
Author: Thejas Nair the...@hortonworks.com
Date:   2015-05-27T18:12:52Z

show tables now passes the current db name

commit 574e3da1220500d1548d4b2431883db8a7da6028
Author: Thejas Nair the...@hortonworks.com
Date:   2015-05-28T00:35:21Z

add db info in describe db command






Re: Review Request 34447: HIVE-10761 : Create codahale-based metrics system for Hive

2015-05-27 Thread Xuefu Zhang


 On May 27, 2015, 9:29 p.m., Xuefu Zhang wrote:
  common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/Metrics.java,
   line 141
  https://reviews.apache.org/r/34447/diff/3/?file=972974#file972974line141
 
  If the synchronized block is for the whole method, we might just as 
  well declare the whole method as synchronized.
 
 Szehon Ho wrote:
 In this context, I think object synchronization makes more sense than 
 synchronizing on the class (synchronized method).

I think they are equivalent. A synchronized method is synchronizing on this. 
It will be on the class if the method is static.
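The equivalence can be seen in a minimal sketch (hypothetical class, for illustration only): a synchronized instance method and a synchronized (this) block lock the same monitor, so increments from both forms are mutually exclusive:

```java
public class SyncDemo {
    private int counter = 0;

    // Synchronized instance method: the monitor is 'this'.
    public synchronized void incrementA() {
        counter++;
    }

    // Equivalent form: explicit synchronized block on 'this'.
    public void incrementB() {
        synchronized (this) {
            counter++;
        }
    }

    public int get() {
        return counter;
    }

    public static void main(String[] args) throws InterruptedException {
        SyncDemo d = new SyncDemo();
        Runnable work = () -> {
            for (int i = 0; i < 10000; i++) {
                d.incrementA();
                d.incrementB();
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join(); t2.join();
        // Both forms share one monitor, so no increments are lost:
        // 2 threads x 20000 increments each = 40000.
        System.out.println(d.get());
    }
}
```

(For a static method, the monitor would instead be the Class object.)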


- Xuefu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34447/#review85418
---


On May 28, 2015, 2:11 a.m., Szehon Ho wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34447/
 ---
 
 (Updated May 28, 2015, 2:11 a.m.)
 
 
 Review request for hive, Chao Sun, Jimmy Xiang, and Xuefu Zhang.
 
 
 Bugs: HIVE-10761
 https://issues.apache.org/jira/browse/HIVE-10761
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 See JIRA for the motivation. Summary: there is an existing metrics system 
 that uses a custom model and is hooked up to JMX reporting; a codahale-based 
 metrics system is desirable for a standard model and reporting.
 
 This adds a codahale-based metrics system to HiveServer2 and HiveMetastore. 
 The Metrics implementation is now internally pluggable, and the existing 
 Metrics system can be re-enabled by configuration if desired for 
 backward compatibility.
 
 The following metrics are supported by the Metrics system:
 1.  JVMPauseMonitor (used to call Hadoop's internal implementation, now 
 forked off to integrate with the Metrics system)
 2.  HMS API calls
 3.  Standard JVM metrics (only for the new implementation, as it's free with 
 codahale).
 
 The following metrics reporters are supported by the new system (configuration 
 exposed):
 1.  JMX
 2.  CONSOLE
 3.  JSON_FILE (a periodic file of metrics that gets overwritten).
 
 A goal is to add a webserver that exposes the JSON metrics, but this is 
 deferred to a later implementation.
 
 
 Diffs
 -
 
   common/pom.xml a615c1e 
   common/src/java/org/apache/hadoop/hive/common/JvmPauseMonitor.java 
 PRE-CREATION 
   common/src/java/org/apache/hadoop/hive/common/metrics/LegacyMetrics.java 
 PRE-CREATION 
   common/src/java/org/apache/hadoop/hive/common/metrics/Metrics.java 01c9d1d 
   common/src/java/org/apache/hadoop/hive/common/metrics/common/Metrics.java 
 PRE-CREATION 
   
 common/src/java/org/apache/hadoop/hive/common/metrics/common/MetricsFactory.java
  PRE-CREATION 
   
 common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/CodahaleMetrics.java
  PRE-CREATION 
   
 common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/MetricsReporting.java
  PRE-CREATION 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 49b8f97 
   
 common/src/test/org/apache/hadoop/hive/common/metrics/TestLegacyMetrics.java 
 PRE-CREATION 
   common/src/test/org/apache/hadoop/hive/common/metrics/TestMetrics.java 
 e85d3f8 
   
 common/src/test/org/apache/hadoop/hive/common/metrics/metrics2/TestCodahaleMetrics.java
  PRE-CREATION 
   
 itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetaStoreMetrics.java
  PRE-CREATION 
   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
 d81c856 
   pom.xml b21d894 
   service/src/java/org/apache/hive/service/server/HiveServer2.java 58e8e49 
   shims/0.20S/src/main/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java 
 6d8166c 
   shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java 
 19324b8 
   shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java 
 5a6bc44 
 
 Diff: https://reviews.apache.org/r/34447/diff/
 
 
 Testing
 ---
 
 New unit test added.  Manually tested.
 
 
 Thanks,
 
 Szehon Ho
 




[jira] [Created] (HIVE-10844) Combine equivalent Works for HoS[Spark Branch]

2015-05-27 Thread Chengxiang Li (JIRA)
Chengxiang Li created HIVE-10844:


 Summary: Combine equivalent Works for HoS[Spark Branch]
 Key: HIVE-10844
 URL: https://issues.apache.org/jira/browse/HIVE-10844
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li


Some Hive queries (like [TPCDS 
Q39|https://github.com/hortonworks/hive-testbench/blob/hive14/sample-queries-tpcds/query39.sql])
 may share the same subquery, which is translated into separate but equivalent 
Works in SparkWork. Combining these equivalent Works into a single one would 
help them benefit from the subsequent dynamic RDD caching optimization.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Revise docs for Hive indexing

2015-05-27 Thread Lefty Leverenz
Will Hive indexing ever be fixed?  If not, should we remove the doc I
cobbled together (Indexing
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Indexing)
or just revise it?  And should the design doc be moved from the Completed
section to Incomplete (Hive Design Docs
https://cwiki.apache.org/confluence/display/Hive/DesignDocs)?

What about bitmap indexes, do they work (Bitmap Indexes
https://cwiki.apache.org/confluence/display/Hive/IndexDev+Bitmap --
HIVE-1803 https://issues.apache.org/jira/browse/HIVE-1803)?

-- Lefty


[jira] [Created] (HIVE-10842) LLAP: DAGs get stuck in yet another way

2015-05-27 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-10842:
---

 Summary: LLAP: DAGs get stuck in yet another way
 Key: HIVE-10842
 URL: https://issues.apache.org/jira/browse/HIVE-10842
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin


Looks exactly like HIVE-10744. Last comment there has internal app IDs. Logs 
upon request.
6 (number of slots) tasks from a machine are stuck.
jstack for target daemon sayeth:
{noformat}
Found one Java-level deadlock:
=

IPC Server handler 4 on 15001:
  waiting to lock Monitor@0x7f3cb0005cb8 (Object@0x8cc3ce98, a java/lang/Object),
  which is held by Wait-Queue-Scheduler-0
Wait-Queue-Scheduler-0:
  waiting to lock Monitor@0x7f3cb0004d98 (Object@0x9234cf58, a org/apache/hadoop/hive/llap/daemon/impl/QueryInfo$FinishableStateTracker),
  which is held by IPC Server handler 4 on 15001
{noformat}





[GitHub] hive pull request: Hive 10843

2015-05-27 Thread thejasmn
Github user thejasmn closed the pull request at:

https://github.com/apache/hive/pull/39




[jira] [Created] (HIVE-10843) desc database and show tables commands don't pass db to HiveAuthorizer check

2015-05-27 Thread Thejas M Nair (JIRA)
Thejas M Nair created HIVE-10843:


 Summary: desc database and show tables commands don't pass db to 
HiveAuthorizer check
 Key: HIVE-10843
 URL: https://issues.apache.org/jira/browse/HIVE-10843
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair


The 'show tables' and 'describe database' commands should pass the database 
information for the command to HiveAuthorizer. This is needed for any auditing 
the Hive authorizer might implement, or any authorization check it might decide 
to do based on the given database name.






Re: Review Request 34455: HIVE-10550 Dynamic RDD caching optimization for HoS.[Spark Branch]

2015-05-27 Thread chengxiang li


 On May 27, 2015, 10:13 p.m., Xuefu Zhang wrote:
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 2062
  https://reviews.apache.org/r/34455/diff/3/?file=972428#file972428line2062
 
  Sorry for pointing this out late. I'm not certain it's a good idea 
  to expose these two configurations. Also, this introduces a change of 
  behavior. For now, can we get rid of them and change the persistence level 
  back to MEM+DISK?
  
  We can come back and revisit this later on. At this moment, I don't feel 
  confident making the call.
 
 chengxiang li wrote:
 Persisting to MEM + DISK may hurt performance in certain cases; I 
 think at least we should have a switch to enable/disable this optimization.
 
 Xuefu Zhang wrote:
 Agreed. However, before we find out more about the cases in which this helps 
 or hurts, I think it's better to keep the existing behavior. This doesn't 
 prevent us from adding a flag later on.

Ok, I will remove these configurations from the patch for now; we can discuss 
later when we know more about it.


- chengxiang


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34455/#review85451
---


On May 27, 2015, 1:50 a.m., chengxiang li wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34455/
 ---
 
 (Updated May 27, 2015, 1:50 a.m.)
 
 
 Review request for hive, Chao Sun, Jimmy Xiang, and Xuefu Zhang.
 
 
 Bugs: HIVE-10550
 https://issues.apache.org/jira/browse/HIVE-10550
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 see jira description
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43c53fc 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/CacheTran.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/MapTran.java 2170243 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ReduceTran.java e60dfac 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlan.java ee5c78a 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 
 3f240f5 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java 
 e6c845c 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SparkRddCachingResolver.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 
 19aae70 
   ql/src/java/org/apache/hadoop/hive/ql/plan/SparkWork.java bb5dd79 
 
 Diff: https://reviews.apache.org/r/34455/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 chengxiang li
 




Review Request 34752: Beeline-CLI: Implement CLI source command using Beeline functionality

2015-05-27 Thread cheng xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34752/
---

Review request for hive and Xuefu Zhang.


Bugs: HIVE-10821
https://issues.apache.org/jira/browse/HIVE-10821


Repository: hive-git


Description
---

Add source command support for CLI using beeline


Diffs
-

  beeline/src/java/org/apache/hive/beeline/BeeLine.java 4a82635 
  beeline/src/test/org/apache/hive/beeline/cli/TestHiveCli.java cc0b598 

Diff: https://reviews.apache.org/r/34752/diff/


Testing
---

Newly created UT passed


Thanks,

cheng xu



Re: Review Request 34447: HIVE-10761 : Create codahale-based metrics system for Hive

2015-05-27 Thread Szehon Ho


 On May 27, 2015, 9:29 p.m., Xuefu Zhang wrote:
  common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/Metrics.java,
   line 141
  https://reviews.apache.org/r/34447/diff/3/?file=972974#file972974line141
 
  If the synchronized block is for the whole method, we might just as 
  well declare the whole method as synchronized.

In this context, I think object synchronization makes more sense than 
synchronizing on the class (synchronized method).


- Szehon


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34447/#review85418
---


On May 28, 2015, 2:11 a.m., Szehon Ho wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34447/
 ---
 
 (Updated May 28, 2015, 2:11 a.m.)
 
 
 Review request for hive, Chao Sun, Jimmy Xiang, and Xuefu Zhang.
 
 
 Bugs: HIVE-10761
 https://issues.apache.org/jira/browse/HIVE-10761
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 See JIRA for the motivation.  Summary: There is an existing metrics system 
 that uses a custom model hooked up to JMX reporting; a codahale-based 
 metrics system is desirable for a standard model and reporting.
 
 This adds a codahale-based metrics system to HiveServer2 and HiveMetastore.  
 Metrics implementation is now internally pluggable, and the existing Metrics 
 system can be re-enabled by configuration if desired for 
 backward-compatibility.
 
 The following metrics are supported by the Metrics system:
 1.  JVMPauseMonitor (used to call Hadoop's internal implementation, now 
 forked off to integrate with the Metrics system)
 2.  HMS API calls
 3.  Standard JVM metrics (only for the new implementation, as it's free with 
 codahale).
 
 The following metrics reporters are supported by the new system (exposed via 
 configuration):
 1.  JMX
 2.  CONSOLE
 3.  JSON_FILE (a periodic file of metrics that gets overwritten).
 
 A goal is to add a webserver that exposes the JSON metrics, but this is 
 deferred to a later implementation.
 
 
 Diffs
 -
 
   common/pom.xml a615c1e 
   common/src/java/org/apache/hadoop/hive/common/JvmPauseMonitor.java 
 PRE-CREATION 
   common/src/java/org/apache/hadoop/hive/common/metrics/LegacyMetrics.java 
 PRE-CREATION 
   common/src/java/org/apache/hadoop/hive/common/metrics/Metrics.java 01c9d1d 
   common/src/java/org/apache/hadoop/hive/common/metrics/common/Metrics.java 
 PRE-CREATION 
   
 common/src/java/org/apache/hadoop/hive/common/metrics/common/MetricsFactory.java
  PRE-CREATION 
   
 common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/CodahaleMetrics.java
  PRE-CREATION 
   
 common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/MetricsReporting.java
  PRE-CREATION 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 49b8f97 
   
 common/src/test/org/apache/hadoop/hive/common/metrics/TestLegacyMetrics.java 
 PRE-CREATION 
   common/src/test/org/apache/hadoop/hive/common/metrics/TestMetrics.java 
 e85d3f8 
   
 common/src/test/org/apache/hadoop/hive/common/metrics/metrics2/TestCodahaleMetrics.java
  PRE-CREATION 
   
 itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetaStoreMetrics.java
  PRE-CREATION 
   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
 d81c856 
   pom.xml b21d894 
   service/src/java/org/apache/hive/service/server/HiveServer2.java 58e8e49 
   shims/0.20S/src/main/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java 
 6d8166c 
   shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java 
 19324b8 
   shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java 
 5a6bc44 
 
 Diff: https://reviews.apache.org/r/34447/diff/
 
 
 Testing
 ---
 
 New unit test added.  Manually tested.
 
 
 Thanks,
 
 Szehon Ho
 




Re: Review Request 34447: HIVE-10761 : Create codahale-based metrics system for Hive

2015-05-27 Thread Szehon Ho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34447/
---

(Updated May 28, 2015, 2:11 a.m.)


Review request for hive, Chao Sun, Jimmy Xiang, and Xuefu Zhang.


Changes
---

Address review comments.


Bugs: HIVE-10761
https://issues.apache.org/jira/browse/HIVE-10761


Repository: hive-git


Description
---

See JIRA for the motivation.  Summary: There is an existing metrics system that 
uses a custom model hooked up to JMX reporting; a codahale-based metrics 
system is desirable for a standard model and reporting.

This adds a codahale-based metrics system to HiveServer2 and HiveMetastore.  
Metrics implementation is now internally pluggable, and the existing Metrics 
system can be re-enabled by configuration if desired for backward-compatibility.

The following metrics are supported by the Metrics system:
1.  JVMPauseMonitor (used to call Hadoop's internal implementation, now forked 
off to integrate with the Metrics system)
2.  HMS API calls
3.  Standard JVM metrics (only for the new implementation, as it's free with 
codahale).

The following metrics reporters are supported by the new system (exposed via 
configuration):
1.  JMX
2.  CONSOLE
3.  JSON_FILE (a periodic file of metrics that gets overwritten).

A goal is to add a webserver that exposes the JSON metrics, but this is 
deferred to a later implementation.
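
To make the pluggability concrete, here is a minimal stdlib-only sketch of the 
idea. The interface and class names below are invented for illustration; per 
the diff, the actual facade in this patch lives in 
common/src/java/org/apache/hadoop/hive/common/metrics/common/Metrics.java and 
is backed by either LegacyMetrics or CodahaleMetrics via MetricsFactory.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical facade: callers increment named counters without knowing
// which metrics implementation is configured behind the interface.
interface MetricsSketch {
    void incrementCounter(String name);
    String reportJson();
}

// One possible implementation; a codahale-backed one would delegate to a
// MetricRegistry instead of keeping its own map.
class InMemoryMetricsSketch implements MetricsSketch {
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    @Override
    public void incrementCounter(String name) {
        counters.computeIfAbsent(name, k -> new LongAdder()).increment();
    }

    // Mimics the JSON_FILE reporter idea: a snapshot of current values
    // that would be periodically written out and overwritten.
    @Override
    public String reportJson() {
        StringBuilder sb = new StringBuilder("{");
        counters.forEach((k, v) ->
            sb.append('"').append(k).append("\":").append(v.sum()).append(','));
        if (sb.charAt(sb.length() - 1) == ',') {
            sb.setLength(sb.length() - 1);
        }
        return sb.append('}').toString();
    }
}
```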


Diffs (updated)
-

  common/pom.xml a615c1e 
  common/src/java/org/apache/hadoop/hive/common/JvmPauseMonitor.java 
PRE-CREATION 
  common/src/java/org/apache/hadoop/hive/common/metrics/LegacyMetrics.java 
PRE-CREATION 
  common/src/java/org/apache/hadoop/hive/common/metrics/Metrics.java 01c9d1d 
  common/src/java/org/apache/hadoop/hive/common/metrics/common/Metrics.java 
PRE-CREATION 
  
common/src/java/org/apache/hadoop/hive/common/metrics/common/MetricsFactory.java
 PRE-CREATION 
  
common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/CodahaleMetrics.java
 PRE-CREATION 
  
common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/MetricsReporting.java
 PRE-CREATION 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 49b8f97 
  common/src/test/org/apache/hadoop/hive/common/metrics/TestLegacyMetrics.java 
PRE-CREATION 
  common/src/test/org/apache/hadoop/hive/common/metrics/TestMetrics.java 
e85d3f8 
  
common/src/test/org/apache/hadoop/hive/common/metrics/metrics2/TestCodahaleMetrics.java
 PRE-CREATION 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetaStoreMetrics.java
 PRE-CREATION 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
d81c856 
  pom.xml b21d894 
  service/src/java/org/apache/hive/service/server/HiveServer2.java 58e8e49 
  shims/0.20S/src/main/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java 
6d8166c 
  shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java 
19324b8 
  shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java 
5a6bc44 

Diff: https://reviews.apache.org/r/34447/diff/


Testing
---

New unit test added.  Manually tested.


Thanks,

Szehon Ho



JIRA: sort attachments by date

2015-05-27 Thread Lefty Leverenz
Is there any way to change the default for JIRA attachments to Sort By
Date instead of Sort By Name?

Manage Attachments doesn't have anything useful.

-- Lefty


[jira] [Created] (HIVE-10841) [WHERE col is not null] does not work for large queries

2015-05-27 Thread Alexander Pivovarov (JIRA)
Alexander Pivovarov created HIVE-10841:
--

 Summary: [WHERE col is not null] does not work for large queries
 Key: HIVE-10841
 URL: https://issues.apache.org/jira/browse/HIVE-10841
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Alexander Pivovarov


The result of the following SELECT query is 3 rows, but it should be 1 row.
I checked it in MySQL - it returned 1 row.

To reproduce the issue in Hive:
1. prepare tables
{code}
drop table if exists L;
drop table if exists LA;
drop table if exists FR;
drop table if exists A;
drop table if exists PI;
drop table if exists acct;

create table L as select 4436 id;
create table LA as select 4436 loan_id, 4748 aid, 4415 pi_id;
create table FR as select 4436 loan_id;
create table A as select 4748 id;
create table PI as select 4415 id;

create table acct as select 4748 aid, 10 acc_n, 122 brn;
insert into table acct values(4748, null, null);
insert into table acct values(4748, null, null);
{code}

2. run SELECT query
{code}
select
  acct.ACC_N,
  acct.brn
FROM L
JOIN LA ON L.id = LA.loan_id
JOIN FR ON L.id = FR.loan_id
JOIN A ON LA.aid = A.id
JOIN PI ON PI.id = LA.pi_id
JOIN acct ON A.id = acct.aid
WHERE
  L.id = 4436
  and acct.brn is not null;
{code}

the result is 3 rows
{code}
10  122
NULLNULL
NULLNULL
{code}

but it should be 1 row

{code}
10  122
{code}

3. The workaround is to move the acct.brn is not null predicate into the join condition
{code}
select
  acct.ACC_N,
  acct.brn
FROM L
JOIN LA ON L.id = LA.loan_id
JOIN FR ON L.id = FR.loan_id
JOIN A ON LA.aid = A.id
JOIN PI ON PI.id = LA.pi_id
JOIN acct ON A.id = acct.aid and acct.brn is not null
WHERE
  L.id = 4436;
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] Stable releases from branch-1 and experimental releases from master

2015-05-27 Thread Lefty Leverenz
+1

-- Lefty

On Wed, May 27, 2015 at 3:21 PM, Alexander Pivovarov apivova...@gmail.com
wrote:

 +1
 On May 27, 2015 10:45 AM, Vikram Dixit K vikram.di...@gmail.com wrote:

  +1 for all the reasons outlined.
 
  On Tue, May 26, 2015 at 6:13 PM, Thejas Nair thejas.n...@gmail.com
  wrote:
   +1
   - This is great for users who want to take longer to upgrade from
   hadoop-1 and care mainly for bug fixes and incremental features,
   rather than radical new features.
   - The ability to release initial 2.x releases marked as alpha/beta
   also helps to get users to try it out, and also lets them choose what
   is right for them.
   - This also lets developers focus on major new features without the
   burden of maintaining hadoop-1 compatibility.
  
   On Tue, May 26, 2015 at 11:41 AM, Alan Gates alanfga...@gmail.com
  wrote:
   We have discussed this for several weeks now.  Some concerns have been
   raised which I have tried to address.  I think it is time to vote on
 it
  as
   our release plan.  To be specific, I propose:
  
   Hive makes a branch-1 from the current master.  This would be used for
  1.3
   and future 1.x releases.  This branch would not deprecate existing
   functionality.  Any new features in this branch would also need to be
  put on
   master.  An upgrade path for users will be maintained from one 1.x
  release
   to the next, as well as from the latest 1.x release to the latest 2.x
   release.
  
   Going forward releases numbered 2.x will be made from master.  The
  purpose
   of these releases will be to enable users to get access to new
 features
   being developed in Hive and allow developers to get feedback.  It is
   expected that for a while these releases will not be production ready
  and
   will be clearly so labeled.  Some legacy features, such as Hadoop 1
 and
   MapReduce, will no longer be supported in the master.  Any critical
 bug
   fixes (security, incorrect results, crashes) fixed in master will also
  be
   ported to branch-1 for at least a year.  This time period may be
  extended in
   the future based on the stability and adoption of 2.x releases.
  
   Based on Hive's bylaws this release plan vote will be open for 3 days
  and
   all active committers have binding votes.
  
   Here's my +1.
  
   Alan.
 
 
 
  --
  Nothing better than when appreciated for hard work.
  -Mark
 



Re: [VOTE] Stable releases from branch-1 and experimental releases from master

2015-05-27 Thread Alexander Pivovarov
+1
On May 27, 2015 10:45 AM, Vikram Dixit K vikram.di...@gmail.com wrote:

 +1 for all the reasons outlined.

 On Tue, May 26, 2015 at 6:13 PM, Thejas Nair thejas.n...@gmail.com
 wrote:
  +1
  - This is great for users who want to take longer to upgrade from
  hadoop-1 and care mainly for bug fixes and incremental features,
  rather than radical new features.
  - The ability to release initial 2.x releases marked as alpha/beta
  also helps to get users to try it out, and also lets them choose what
  is right for them.
  - This also lets developers focus on major new features without the
  burden of maintaining hadoop-1 compatibility.
 
  On Tue, May 26, 2015 at 11:41 AM, Alan Gates alanfga...@gmail.com
 wrote:
  We have discussed this for several weeks now.  Some concerns have been
  raised which I have tried to address.  I think it is time to vote on it
 as
  our release plan.  To be specific, I propose:
 
  Hive makes a branch-1 from the current master.  This would be used for
 1.3
  and future 1.x releases.  This branch would not deprecate existing
  functionality.  Any new features in this branch would also need to be
 put on
  master.  An upgrade path for users will be maintained from one 1.x
 release
  to the next, as well as from the latest 1.x release to the latest 2.x
  release.
 
  Going forward releases numbered 2.x will be made from master.  The
 purpose
  of these releases will be to enable users to get access to new features
  being developed in Hive and allow developers to get feedback.  It is
  expected that for a while these releases will not be production ready
 and
  will be clearly so labeled.  Some legacy features, such as Hadoop 1 and
  MapReduce, will no longer be supported in the master.  Any critical bug
  fixes (security, incorrect results, crashes) fixed in master will also
 be
  ported to branch-1 for at least a year.  This time period may be
 extended in
  the future based on the stability and adoption of 2.x releases.
 
  Based on Hive's bylaws this release plan vote will be open for 3 days
 and
  all active committers have binding votes.
 
  Here's my +1.
 
  Alan.



 --
 Nothing better than when appreciated for hard work.
 -Mark



[jira] [Created] (HIVE-10839) TestHCatLoaderEncryption.* tests fail in windows because of path related issues

2015-05-27 Thread Hari Sankar Sivarama Subramaniyan (JIRA)
Hari Sankar Sivarama Subramaniyan created HIVE-10839:


 Summary: TestHCatLoaderEncryption.* tests fail in windows because 
of path related issues
 Key: HIVE-10839
 URL: https://issues.apache.org/jira/browse/HIVE-10839
 Project: Hive
  Issue Type: Bug
  Components: Tests
 Environment: Windows OS
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan


I am getting the following errors while trying to run the 
org.apache.hive.hcatalog.pig.TestHCatLoaderEncryption.* tests on Windows.

{code}
Encryption key created: 'key_128'
(1,Encryption Processor Helper Failed:Pathname 
/D:/w/hv/hcatalog/hcatalog-pig-adapter/target/tmp/org.apache.hive.hcatalog.pig.TestHCatLoader-1432579852919/warehouse/encryptedTable
 from 
D:/w/hv/hcatalog/hcatalog-pig-adapter/target/tmp/org.apache.hive.hcatalog.pig.TestHCatLoader-1432579852919/warehouse/encryptedTable
 is not a valid DFS filename.,null)
Encryption key deleted: 'key_128'
{code}

{code}
Error Message

Could not fully delete 
D:\w\hv\hcatalog\hcatalog-pig-adapter\target\tmp\dfs\name1
Stacktrace

java.io.IOException: Could not fully delete 
D:\w\hv\hcatalog\hcatalog-pig-adapter\target\tmp\dfs\name1
at 
org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:940)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:811)
at org.apache.hadoop.hdfs.MiniDFSCluster.init(MiniDFSCluster.java:742)
at org.apache.hadoop.hdfs.MiniDFSCluster.init(MiniDFSCluster.java:612)
at 
org.apache.hadoop.hive.shims.Hadoop23Shims.getMiniDfs(Hadoop23Shims.java:523)
at 
org.apache.hive.hcatalog.pig.TestHCatLoaderEncryption.initEncryptionShim(TestHCatLoaderEncryption.java:242)
at 
org.apache.hive.hcatalog.pig.TestHCatLoaderEncryption.setup(TestHCatLoaderEncryption.java:190)
{code}





Re: Review Request 34455: HIVE-10550 Dynamic RDD caching optimization for HoS.[Spark Branch]

2015-05-27 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34455/#review85451
---



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
https://reviews.apache.org/r/34455/#comment137023

Sorry for pointing this out late. I'm not certain it's a good idea to 
expose these two configurations. Also, this introduces a behavior change. 
For now, can we get rid of them and change the persistence level back to 
MEM+DISK?

We can come back and revisit this later on. At this moment, I don't feel 
confident making the call.


- Xuefu Zhang


On May 27, 2015, 1:50 a.m., chengxiang li wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34455/
 ---
 
 (Updated May 27, 2015, 1:50 a.m.)
 
 
 Review request for hive, Chao Sun, Jimmy Xiang, and Xuefu Zhang.
 
 
 Bugs: HIVE-10550
 https://issues.apache.org/jira/browse/HIVE-10550
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 see jira description
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43c53fc 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/CacheTran.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/MapTran.java 2170243 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/ReduceTran.java e60dfac 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlan.java ee5c78a 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 
 3f240f5 
   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java 
 e6c845c 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SparkRddCachingResolver.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 
 19aae70 
   ql/src/java/org/apache/hadoop/hive/ql/plan/SparkWork.java bb5dd79 
 
 Diff: https://reviews.apache.org/r/34455/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 chengxiang li
 




[jira] [Created] (HIVE-10838) Allow Hive metastore client can use different hostname which has multiple hostnames when security is enable

2015-05-27 Thread HeeSoo Kim (JIRA)
HeeSoo Kim created HIVE-10838:
-

 Summary: Allow Hive metastore client can use different hostname 
which has multiple hostnames when security is enable
 Key: HIVE-10838
 URL: https://issues.apache.org/jira/browse/HIVE-10838
 Project: Hive
  Issue Type: Task
Reporter: HeeSoo Kim
Assignee: HeeSoo Kim


Currently, if a Hive metastore client (e.g. HS2, Oozie) tries to connect to the 
Hive metastore when security is enabled, it can fail to connect with an error 
like the following:
{code}
2015-05-21 23:17:59,554 ERROR metadata.Hive 
(Hive.java:getDelegationToken(2638)) - MetaException(message:Unauthorized 
connection for super-user: 
hiveserver/hiveserver-dpci.s3s.altiscale@test.altiscale.com from IP 
10.250.16.43)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_delegation_token_result$get_delegation_token_resultStandardScheme.read(ThriftHiveMetastore.java)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_delegation_token_result$get_delegation_token_resultStandardScheme.read(ThriftHiveMetastore.java)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_delegation_token_result.read(ThriftHiveMetastore.java)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_delegation_token(ThriftHiveMetastore.java:3293)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_delegation_token(ThriftHiveMetastore.java:3279)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDelegationToken(HiveMetaStoreClient.java:1559)
{code}
This happens when the Hive metastore client has multiple IP addresses and its 
default IP address differs from the hostname in its Kerberos principal. We 
need to set the bind address, based on the Kerberos principal's hostname, when 
the Hive metastore client connects to the Hive metastore.
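
A hedged sketch of the bind-address idea using plain java.net (this is not 
Hive's Thrift transport code, and the hostname parameter is a hypothetical 
stand-in for the address resolved from the Kerberos principal): on a 
multi-homed host, explicitly binding the client socket to a chosen local 
address makes the peer see that source IP instead of whatever interface the 
OS picks by default.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;

// Illustrative only: bind an outgoing client socket to a specific local
// address before connecting, so the server sees the expected source IP.
class BindAddressSketch {
    public static Socket boundSocket(String localHostname) throws Exception {
        Socket socket = new Socket();
        // localHostname is assumed to come from the client's Kerberos
        // principal; port 0 lets the OS pick any free local port.
        InetAddress local = InetAddress.getByName(localHostname);
        socket.bind(new InetSocketAddress(local, 0));
        // The caller would then connect() to the metastore host and port.
        return socket;
    }
}
```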






[jira] [Created] (HIVE-10840) NumberFormatException while running analyze table partition compute statics query

2015-05-27 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-10840:
---

 Summary: NumberFormatException while running analyze table 
partition compute statics query
 Key: HIVE-10840
 URL: https://issues.apache.org/jira/browse/HIVE-10840
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Affects Versions: 1.2.0
Reporter: Jagruti Varia
Assignee: Ashutosh Chauhan





