hive git commit: HIVE-21028: Adding a JDO fetch plan for getTableMeta get_table_meta to avoid race condition (Karthik Manamcheri, reviewed by Adam Holley, Vihang K and Naveen G)

2018-12-14 Thread ngangam
Repository: hive
Updated Branches:
  refs/heads/branch-3 a7b3cf4bd -> 3db928668


 HIVE-21028: Adding a JDO fetch plan for getTableMeta get_table_meta to avoid race condition (Karthik Manamcheri, reviewed by Adam Holley, Vihang K and Naveen G)


Project: http://git-wip-us.apache.org/repos/asf/hive/repo
Commit: http://git-wip-us.apache.org/repos/asf/hive/commit/3db92866
Tree: http://git-wip-us.apache.org/repos/asf/hive/tree/3db92866
Diff: http://git-wip-us.apache.org/repos/asf/hive/diff/3db92866

Branch: refs/heads/branch-3
Commit: 3db928668f109452f26fb487204c9489379f0bc9
Parents: a7b3cf4
Author: Naveen Gangam 
Authored: Fri Dec 14 18:16:58 2018 -0500
Committer: Naveen Gangam 
Committed: Fri Dec 14 18:16:58 2018 -0500

--
 .../hadoop/hive/metastore/ObjectStore.java  |   9 +
 .../hive/metastore/model/FetchGroups.java   |  26 ++
 .../src/main/resources/package.jdo  |   5 +
 .../hive/metastore/StatementVerifyingDerby.java | 345 +++
 .../TestObjectStoreStatementVerify.java | 159 +
 5 files changed, 544 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/hive/blob/3db92866/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
--
diff --git a/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java b/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
index 59c4d22..ccb2ddb 100644
--- a/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
+++ b/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
@@ -154,6 +154,7 @@ import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
 import org.apache.hadoop.hive.metastore.conf.MetastoreConf.ConfVars;
 import org.apache.hadoop.hive.metastore.metrics.Metrics;
 import org.apache.hadoop.hive.metastore.metrics.MetricsConstants;
+import org.apache.hadoop.hive.metastore.model.FetchGroups;
 import org.apache.hadoop.hive.metastore.model.MCatalog;
 import org.apache.hadoop.hive.metastore.model.MColumnDescriptor;
 import org.apache.hadoop.hive.metastore.model.MConstraint;
@@ -1530,6 +1531,13 @@ public class ObjectStore implements RawStore, Configurable {
 LOG.debug("getTableMeta with filter " + filterBuilder.toString() + " params: " +
 StringUtils.join(parameterVals, ", "));
   }
+  // Add the fetch group here which retrieves the database object along with the MTable
+  // objects. If we don't prefetch the database object, we could end up in a situation where
+  // the database gets dropped while we are looping through the tables throwing a
+  // JDOObjectNotFoundException. This causes HMS to go into a retry loop which greatly degrades
+  // performance of this function when called with dbNames="*" and tableNames="*" (fetch all
+  // tables in all databases, essentially a full dump)
+  pm.getFetchPlan().addGroup(FetchGroups.FETCH_DATABASE_ON_MTABLE);
   query = pm.newQuery(MTable.class, filterBuilder.toString());
   Collection<MTable> tables = (Collection<MTable>) query.executeWithArray(parameterVals.toArray(new String[parameterVals.size()]));
   for (MTable table : tables) {
@@ -1540,6 +1548,7 @@ public class ObjectStore implements RawStore, Configurable {
   }
   commited = commitTransaction();
 } finally {
+  pm.getFetchPlan().removeGroup(FetchGroups.FETCH_DATABASE_ON_MTABLE);
   rollbackAndCleanup(commited, query);
 }
 return metas;

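For readers unfamiliar with JDO fetch groups, the pattern applied in the hunk above looks roughly like the sketch below. It is illustrative only: the group name string, the helper class and the package.jdo wiring are assumptions, while the real definitions live in the FetchGroups class and package.jdo added by this patch.

// Illustrative sketch of the fetch-group pattern; the group name string and the
// package.jdo wiring are assumptions, not copied from the patch.
import javax.jdo.PersistenceManager;
import javax.jdo.Query;

import org.apache.hadoop.hive.metastore.model.MTable;

public final class FetchGroupSketch {

  // A named fetch group; package.jdo must declare a group with the same name on MTable,
  // listing its "database" field, so the database row is loaded eagerly with each table.
  public static final String FETCH_DATABASE_ON_MTABLE = "fetchDatabaseOnMTable";

  public static void queryTablesWithDatabasePrefetch(PersistenceManager pm, String filter,
      Object[] params) {
    // Activate the group for this PersistenceManager before running the query ...
    pm.getFetchPlan().addGroup(FETCH_DATABASE_ON_MTABLE);
    try {
      Query query = pm.newQuery(MTable.class, filter);
      for (Object o : (java.util.Collection<?>) query.executeWithArray(params)) {
        MTable table = (MTable) o;
        // table.getDatabase() is already materialized here, so a concurrent DROP DATABASE
        // can no longer surface as a JDOObjectNotFoundException while iterating.
      }
    } finally {
      // ... and remove it afterwards, since the fetch plan is state kept on the PersistenceManager.
      pm.getFetchPlan().removeGroup(FETCH_DATABASE_ON_MTABLE);
    }
  }
}
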
http://git-wip-us.apache.org/repos/asf/hive/blob/3db92866/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/model/FetchGroups.java
--
diff --git a/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/model/FetchGroups.java b/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/model/FetchGroups.java
new file mode 100644
index 000..83fd2dd
--- /dev/null
+++ b/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/model/FetchGroups.java
@@ -0,0 +1,26 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF 

hive git commit: HIVE-21030 : Add credential store env properties redaction in JobConf (Denys Kuzmenko reviewed by Vihang Karajgaonkar)

2018-12-14 Thread vihangk1
Repository: hive
Updated Branches:
  refs/heads/master 01ed46b4b -> 4e415609c


HIVE-21030 : Add credential store env properties redaction in JobConf (Denys Kuzmenko reviewed by Vihang Karajgaonkar)


Project: http://git-wip-us.apache.org/repos/asf/hive/repo
Commit: http://git-wip-us.apache.org/repos/asf/hive/commit/4e415609
Tree: http://git-wip-us.apache.org/repos/asf/hive/tree/4e415609
Diff: http://git-wip-us.apache.org/repos/asf/hive/diff/4e415609

Branch: refs/heads/master
Commit: 4e415609ce333fd17c1dd5d4bf44ca9a3897ec42
Parents: 01ed46b
Author: denys kuzmenko 
Authored: Fri Dec 14 13:29:03 2018 -0800
Committer: Vihang Karajgaonkar 
Committed: Fri Dec 14 13:29:41 2018 -0800

--
 .../apache/hadoop/hive/conf/HiveConfUtil.java   | 35 ++-
 .../ql/exec/TestHiveCredentialProviders.java| 36 
 2 files changed, 62 insertions(+), 9 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/hive/blob/4e415609/common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java
--
diff --git a/common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java b/common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java
index 2ad5f9e..ae6fa43 100644
--- a/common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java
+++ b/common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java
@@ -24,12 +24,14 @@ import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.hive.common.classification.InterfaceAudience.Private;
 import org.apache.hadoop.hive.conf.HiveConf.ConfVars;
 import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapreduce.MRJobConfig;
 import org.apache.hive.common.util.HiveStringUtils;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
 import java.io.File;
 import java.util.ArrayList;
+import java.util.Collection;
 import java.util.Collections;
 import java.util.Comparator;
 import java.util.HashSet;
@@ -38,6 +40,7 @@ import java.util.List;
 import java.util.Map;
 import java.util.Set;
 import java.util.StringTokenizer;
+import java.util.stream.Stream;
 
 /**
  * Hive Configuration utils
@@ -182,23 +185,37 @@ public class HiveConfUtil {
 
 String jobKeyStoreLocation = jobConf.get(HiveConf.ConfVars.HIVE_SERVER2_JOB_CREDENTIAL_PROVIDER_PATH.varname);
 String oldKeyStoreLocation = jobConf.get(Constants.HADOOP_CREDENTIAL_PROVIDER_PATH_CONFIG);
+
 if (StringUtils.isNotBlank(jobKeyStoreLocation)) {
   jobConf.set(Constants.HADOOP_CREDENTIAL_PROVIDER_PATH_CONFIG, jobKeyStoreLocation);
   LOG.debug("Setting job conf credstore location to " + jobKeyStoreLocation
   + " previous location was " + oldKeyStoreLocation);
 }
 
-String credStorepassword = getJobCredentialProviderPassword(jobConf);
-if (credStorepassword != null) {
-  // if the execution engine is MR set the map/reduce env with the credential store password
+String credstorePassword = getJobCredentialProviderPassword(jobConf);
+if (credstorePassword != null) {
   String execEngine = jobConf.get(ConfVars.HIVE_EXECUTION_ENGINE.varname);
+
   if ("mr".equalsIgnoreCase(execEngine)) {
-addKeyValuePair(jobConf, JobConf.MAPRED_MAP_TASK_ENV,
-Constants.HADOOP_CREDENTIAL_PASSWORD_ENVVAR, credStorepassword);
-addKeyValuePair(jobConf, JobConf.MAPRED_REDUCE_TASK_ENV,
-Constants.HADOOP_CREDENTIAL_PASSWORD_ENVVAR, credStorepassword);
-addKeyValuePair(jobConf, "yarn.app.mapreduce.am.admin.user.env",
-Constants.HADOOP_CREDENTIAL_PASSWORD_ENVVAR, credStorepassword);
+// if the execution engine is MR set the map/reduce env with the credential store password
+
+Collection<String> redactedProperties =
+jobConf.getStringCollection(MRJobConfig.MR_JOB_REDACTED_PROPERTIES);
+
+Stream.of(
+JobConf.MAPRED_MAP_TASK_ENV,
+JobConf.MAPRED_REDUCE_TASK_ENV,
+"yarn.app.mapreduce.am.admin.user.env")
+
+.forEach(property -> {
+  addKeyValuePair(jobConf, property,
+  Constants.HADOOP_CREDENTIAL_PASSWORD_ENVVAR, credstorePassword);
+  redactedProperties.add(property);
+});
+
+// Hide sensitive configuration values from MR HistoryUI by telling MR to redact the following list.
+jobConf.set(MRJobConfig.MR_JOB_REDACTED_PROPERTIES,
+StringUtils.join(redactedProperties, ","));
   }
 }
   }

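The core of the change above is that every property whose value now carries the credential store password is also appended to MRJobConfig.MR_JOB_REDACTED_PROPERTIES, so the MR JobHistory UI masks it. Below is a minimal, self-contained sketch of that append pattern on a plain Hadoop Configuration; the class and method names are made up for illustration, and only the property key mirrors what MRJobConfig.MR_JOB_REDACTED_PROPERTIES resolves to.

// Illustrative sketch only; RedactionSketch and redact() are hypothetical helpers.
import java.util.Collection;

import org.apache.commons.lang3.StringUtils;
import org.apache.hadoop.conf.Configuration;

public final class RedactionSketch {

  // Same key that MRJobConfig.MR_JOB_REDACTED_PROPERTIES resolves to in Hadoop MapReduce.
  private static final String REDACTED_PROPS = "mapreduce.job.redacted-properties";

  /** Append property names to the job's redaction list, keeping any entries already present. */
  static void redact(Configuration conf, String... properties) {
    Collection<String> redacted = conf.getStringCollection(REDACTED_PROPS);
    for (String property : properties) {
      redacted.add(property);
    }
    conf.set(REDACTED_PROPS, StringUtils.join(redacted, ","));
  }
}

Called with the map, reduce and AM env properties, as the patch does, this keeps the password out of the rendered job configuration while the tasks still receive it through their environment.
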
http://git-wip-us.apache.org/repos/asf/hive/blob/4e415609/ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java
--
diff --git a/ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java 

hive git commit: HIVE-20860 : Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit] (addendum)

2018-12-14 Thread vihangk1
Repository: hive
Updated Branches:
  refs/heads/master c7b5454aa -> 01ed46b4b


HIVE-20860 : Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit] (addendum)


Project: http://git-wip-us.apache.org/repos/asf/hive/repo
Commit: http://git-wip-us.apache.org/repos/asf/hive/commit/01ed46b4
Tree: http://git-wip-us.apache.org/repos/asf/hive/tree/01ed46b4
Diff: http://git-wip-us.apache.org/repos/asf/hive/diff/01ed46b4

Branch: refs/heads/master
Commit: 01ed46b4bb1ddc46c13f3daf3678e887fe74ee5c
Parents: c7b5454
Author: Vihang Karajgaonkar 
Authored: Mon Dec 3 15:09:59 2018 -0800
Committer: Vihang Karajgaonkar 
Committed: Fri Dec 14 13:16:44 2018 -0800

--
 .../main/java/org/apache/hadoop/hive/cli/control/CliConfigs.java   | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/hive/blob/01ed46b4/itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CliConfigs.java
--
diff --git a/itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CliConfigs.java b/itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CliConfigs.java
index 2017c94..ed8ae54 100644
--- a/itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CliConfigs.java
+++ b/itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CliConfigs.java
@@ -157,7 +157,6 @@ public class CliConfigs {
 
 includesFrom(testConfigProps, "minillap.query.files");
 includesFrom(testConfigProps, "minillap.shared.query.files");
-excludeQuery("cbo_limit.q"); //Disabled in HIVE-20860
 
 setResultsDir("ql/src/test/results/clientpositive/llap");
 setLogDir("itests/qtest/target/qfile-results/clientpositive");
@@ -256,6 +255,7 @@ public class CliConfigs {
 excludeQuery("schema_evol_orc_acidvec_part.q"); // Disabled in HIVE-19509
 excludeQuery("schema_evol_orc_vec_part_llap_io.q"); // Disabled in HIVE-19509
 excludeQuery("load_dyn_part3.q"); // Disabled in HIVE-20662. Enable in HIVE-20663.
+excludeQuery("cbo_limit.q"); //Disabled in HIVE-20860. Enable in HIVE-20972
 
 setResultsDir("ql/src/test/results/clientpositive/llap");
 setLogDir("itests/qtest/target/qfile-results/clientpositive");



[4/4] hive git commit: HIVE-21021: Scalar subquery with only aggregate in subquery (no group by) has unnecessary sq_count_check branch (Vineet Garg, reviewed by Ashutosh Chauhan)

2018-12-14 Thread vgarg
HIVE-21021: Scalar subquery with only aggregate in subquery (no group by) has unnecessary sq_count_check branch (Vineet Garg, reviewed by Ashutosh Chauhan)


Project: http://git-wip-us.apache.org/repos/asf/hive/repo
Commit: http://git-wip-us.apache.org/repos/asf/hive/commit/c7b5454a
Tree: http://git-wip-us.apache.org/repos/asf/hive/tree/c7b5454a
Diff: http://git-wip-us.apache.org/repos/asf/hive/diff/c7b5454a

Branch: refs/heads/master
Commit: c7b5454aa57edec171f7f254b6e342bb4b9cbb82
Parents: 64930f8
Author: Vineet Garg 
Authored: Fri Dec 14 09:43:54 2018 -0800
Committer: Vineet Garg 
Committed: Fri Dec 14 09:43:54 2018 -0800

--
 .../calcite/rules/HiveRemoveSqCountCheck.java   |   45 +-
 .../queries/clientpositive/subquery_scalar.q|   35 +
 .../clientpositive/llap/subquery_scalar.q.out   |  138 ++
 .../clientpositive/perf/tez/cbo_query14.q.out   |  285 +--
 .../perf/tez/constraints/cbo_query14.q.out  |  285 +--
 .../perf/tez/constraints/query14.q.out  | 1589 +++--
 .../clientpositive/perf/tez/query14.q.out   | 1623 --
 .../clientpositive/spark/subquery_scalar.q.out  |  138 ++
 8 files changed, 1845 insertions(+), 2293 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/hive/blob/c7b5454a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRemoveSqCountCheck.java
--
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRemoveSqCountCheck.java b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRemoveSqCountCheck.java
index 0100395..f0f7094 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRemoveSqCountCheck.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRemoveSqCountCheck.java
@@ -40,9 +40,11 @@ import java.util.NavigableMap;
 import java.util.TreeMap;
 
 /**
- * Planner rule that removes UDF sq_count_check from a
- * plan if group by keys in a subquery are constant
- * and there is no windowing or grouping sets
+ * Planner rule that removes UDF sq_count_check from a plan if
+ *  1) either group by keys in a subquery are constant and there is no windowing or grouping sets
+ *  2) OR there are no group by keys but only aggregate
+ *  Both of the above case will produce at most one row, therefore it is safe to remove sq_count_check
+ *    which was introduced earlier in the plan to ensure that this condition is met at run time
  */
 public class HiveRemoveSqCountCheck extends RelOptRule {
 
@@ -97,25 +99,17 @@ public class HiveRemoveSqCountCheck extends RelOptRule {
 return false;
   }
 
+  private boolean isAggregateWithoutGbyKeys(final Aggregate agg) {
+return agg.getGroupCount() == 0 ? true : false;
+  }
 
-  @Override public void onMatch(RelOptRuleCall call) {
-final Join topJoin= call.rel(0);
-final Join join = call.rel(2);
-final Aggregate aggregate = call.rel(6);
-
-// in presence of grouping sets we can't remove sq_count_check
-if(aggregate.indicator) {
-  return;
-}
-
-final int groupCount = aggregate.getGroupCount();
-
+  private boolean isAggWithConstantGbyKeys(final Aggregate aggregate, RelOptRuleCall call) {
 final RexBuilder rexBuilder = aggregate.getCluster().getRexBuilder();
 final RelMetadataQuery mq = call.getMetadataQuery();
 final RelOptPredicateList predicates =
 mq.getPulledUpPredicates(aggregate.getInput());
 if (predicates == null) {
-  return;
+  return false;
 }
 final NavigableMap map = new TreeMap<>();
 for (int key : aggregate.getGroupSet()) {
@@ -128,15 +122,30 @@ public class HiveRemoveSqCountCheck extends RelOptRule {
 
 // None of the group expressions are constant. Nothing to do.
 if (map.isEmpty()) {
-  return;
+  return false;
 }
 
+final int groupCount = aggregate.getGroupCount();
 if (groupCount == map.size()) {
+  return true;
+}
+return false;
+  }
+
+  @Override public void onMatch(RelOptRuleCall call) {
+final Join topJoin= call.rel(0);
+final Join join = call.rel(2);
+final Aggregate aggregate = call.rel(6);
+
+// in presence of grouping sets we can't remove sq_count_check
+if(aggregate.indicator) {
+  return;
+}
+if(isAggregateWithoutGbyKeys(aggregate) || isAggWithConstantGbyKeys(aggregate, call)) {
   // join(left, join.getRight)
   RelNode newJoin = HiveJoin.getJoin(topJoin.getCluster(), join.getLeft(), topJoin.getRight(),
   topJoin.getCondition(), topJoin.getJoinType());
   call.transformTo(newJoin);
 }
   }
-
 }

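The rewritten javadoc above is the heart of the change: an aggregate with no grouping keys, or one whose grouping keys are all constant, is guaranteed to produce at most one row, so the sq_count_check guard that protects against multi-row scalar subqueries can be dropped. A hedged sketch of the zero-group-key test on Calcite's Aggregate follows; the helper class is hypothetical, and the real rule also handles the constant-key and grouping-sets cases as shown in the diff.

// Hypothetical standalone illustration; only the getGroupCount()/indicator checks
// mirror what HiveRemoveSqCountCheck does, the class itself is not part of the patch.
import org.apache.calcite.rel.core.Aggregate;

final class SingleRowAggregateCheck {

  // An Aggregate with no GROUP BY keys (e.g. "select avg(x) from t") returns exactly one row,
  // as long as it is not a grouping-sets aggregate.
  static boolean producesAtMostOneRow(Aggregate agg) {
    return agg.getGroupCount() == 0 && !agg.indicator;
  }
}
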
http://git-wip-us.apache.org/repos/asf/hive/blob/c7b5454a/ql/src/test/queries/clientpositive/subquery_scalar.q
--
diff --git 

[3/4] hive git commit: HIVE-21021: Scalar subquery with only aggregate in subquery (no group by) has unnecessary sq_count_check branch (Vineet Garg, reviewed by Ashutosh Chauhan)

2018-12-14 Thread vgarg
http://git-wip-us.apache.org/repos/asf/hive/blob/c7b5454a/ql/src/test/results/clientpositive/perf/tez/constraints/query14.q.out
--
diff --git a/ql/src/test/results/clientpositive/perf/tez/constraints/query14.q.out b/ql/src/test/results/clientpositive/perf/tez/constraints/query14.q.out
index 4df5864..1a3aefe 100644
--- a/ql/src/test/results/clientpositive/perf/tez/constraints/query14.q.out
+++ b/ql/src/test/results/clientpositive/perf/tez/constraints/query14.q.out
@@ -1,9 +1,6 @@
-Warning: Shuffle Join MERGEJOIN[1458][tables = [$hdt$_1, $hdt$_2]] in Stage 'Reducer 5' is a cross product
-Warning: Shuffle Join MERGEJOIN[1470][tables = [$hdt$_1, $hdt$_2, $hdt$_0]] in Stage 'Reducer 6' is a cross product
-Warning: Shuffle Join MERGEJOIN[1460][tables = [$hdt$_1, $hdt$_2]] in Stage 'Reducer 13' is a cross product
-Warning: Shuffle Join MERGEJOIN[1483][tables = [$hdt$_1, $hdt$_2, $hdt$_0]] in Stage 'Reducer 14' is a cross product
-Warning: Shuffle Join MERGEJOIN[1462][tables = [$hdt$_2, $hdt$_3]] in Stage 'Reducer 18' is a cross product
-Warning: Shuffle Join MERGEJOIN[1496][tables = [$hdt$_2, $hdt$_3, $hdt$_1]] in Stage 'Reducer 19' is a cross product
+Warning: Shuffle Join MERGEJOIN[1182][tables = [$hdt$_0, $hdt$_1]] in Stage 'Reducer 6' is a cross product
+Warning: Shuffle Join MERGEJOIN[1189][tables = [$hdt$_0, $hdt$_1]] in Stage 'Reducer 16' is a cross product
+Warning: Shuffle Join MERGEJOIN[1196][tables = [$hdt$_1, $hdt$_2]] in Stage 'Reducer 22' is a cross product
 PREHOOK: query: explain
 with  cross_items as
  (select i_item_sk ss_item_sk
@@ -225,1104 +222,828 @@ POSTHOOK: Output: hdfs://### HDFS PATH ###
 Plan optimized by CBO.
 
 Vertex dependency in root stage
-Map 1 <- Reducer 98 (BROADCAST_EDGE)
-Map 100 <- Reducer 91 (BROADCAST_EDGE)
-Map 101 <- Reducer 97 (BROADCAST_EDGE)
-Map 103 <- Reducer 63 (BROADCAST_EDGE)
-Map 104 <- Reducer 68 (BROADCAST_EDGE)
-Map 20 <- Reducer 25 (BROADCAST_EDGE)
-Map 36 <- Reducer 41 (BROADCAST_EDGE)
-Map 46 <- Reducer 99 (BROADCAST_EDGE)
-Map 50 <- Reducer 29 (BROADCAST_EDGE)
-Map 51 <- Reducer 43 (BROADCAST_EDGE)
-Map 52 <- Reducer 58 (BROADCAST_EDGE)
-Map 69 <- Reducer 85 (BROADCAST_EDGE)
-Reducer 10 <- Map 1 (SIMPLE_EDGE), Map 84 (SIMPLE_EDGE), Union 11 (CONTAINS)
-Reducer 12 <- Union 11 (CUSTOM_SIMPLE_EDGE)
-Reducer 13 <- Reducer 12 (CUSTOM_SIMPLE_EDGE), Reducer 32 (CUSTOM_SIMPLE_EDGE)
-Reducer 14 <- Reducer 13 (CUSTOM_SIMPLE_EDGE), Reducer 62 (CUSTOM_SIMPLE_EDGE), Union 7 (CONTAINS)
-Reducer 15 <- Map 1 (SIMPLE_EDGE), Map 84 (SIMPLE_EDGE), Union 16 (CONTAINS)
-Reducer 17 <- Union 16 (CUSTOM_SIMPLE_EDGE)
-Reducer 18 <- Reducer 17 (CUSTOM_SIMPLE_EDGE), Reducer 35 (CUSTOM_SIMPLE_EDGE)
-Reducer 19 <- Reducer 18 (CUSTOM_SIMPLE_EDGE), Reducer 67 (CUSTOM_SIMPLE_EDGE), Union 7 (CONTAINS)
-Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 84 (SIMPLE_EDGE), Union 3 (CONTAINS)
-Reducer 21 <- Map 20 (SIMPLE_EDGE), Map 24 (SIMPLE_EDGE), Union 3 (CONTAINS)
-Reducer 22 <- Map 20 (SIMPLE_EDGE), Map 24 (SIMPLE_EDGE), Union 11 (CONTAINS)
-Reducer 23 <- Map 20 (SIMPLE_EDGE), Map 24 (SIMPLE_EDGE), Union 16 (CONTAINS)
-Reducer 25 <- Map 24 (CUSTOM_SIMPLE_EDGE)
-Reducer 26 <- Map 24 (SIMPLE_EDGE), Map 50 (SIMPLE_EDGE), Union 27 (CONTAINS)
-Reducer 28 <- Union 27 (CUSTOM_SIMPLE_EDGE)
-Reducer 29 <- Map 24 (CUSTOM_SIMPLE_EDGE)
-Reducer 30 <- Map 24 (SIMPLE_EDGE), Map 50 (SIMPLE_EDGE), Union 31 (CONTAINS)
-Reducer 32 <- Union 31 (CUSTOM_SIMPLE_EDGE)
-Reducer 33 <- Map 24 (SIMPLE_EDGE), Map 50 (SIMPLE_EDGE), Union 34 (CONTAINS)
-Reducer 35 <- Union 34 (CUSTOM_SIMPLE_EDGE)
-Reducer 37 <- Map 36 (SIMPLE_EDGE), Map 40 (SIMPLE_EDGE), Union 3 (CONTAINS)
-Reducer 38 <- Map 36 (SIMPLE_EDGE), Map 40 (SIMPLE_EDGE), Union 11 (CONTAINS)
-Reducer 39 <- Map 36 (SIMPLE_EDGE), Map 40 (SIMPLE_EDGE), Union 16 (CONTAINS)
-Reducer 4 <- Union 3 (CUSTOM_SIMPLE_EDGE)
-Reducer 41 <- Map 40 (CUSTOM_SIMPLE_EDGE)
-Reducer 42 <- Map 40 (SIMPLE_EDGE), Map 51 (SIMPLE_EDGE), Union 27 (CONTAINS)
-Reducer 43 <- Map 40 (CUSTOM_SIMPLE_EDGE)
-Reducer 44 <- Map 40 (SIMPLE_EDGE), Map 51 (SIMPLE_EDGE), Union 31 (CONTAINS)
-Reducer 45 <- Map 40 (SIMPLE_EDGE), Map 51 (SIMPLE_EDGE), Union 34 (CONTAINS)
-Reducer 47 <- Map 46 (SIMPLE_EDGE), Map 84 (SIMPLE_EDGE), Union 27 (CONTAINS)
-Reducer 48 <- Map 46 (SIMPLE_EDGE), Map 84 (SIMPLE_EDGE), Union 31 (CONTAINS)
-Reducer 49 <- Map 46 (SIMPLE_EDGE), Map 84 (SIMPLE_EDGE), Union 34 (CONTAINS)
-Reducer 5 <- Reducer 28 (CUSTOM_SIMPLE_EDGE), Reducer 4 (CUSTOM_SIMPLE_EDGE)
-Reducer 53 <- Map 52 (SIMPLE_EDGE), Map 57 (SIMPLE_EDGE)
-Reducer 54 <- Reducer 53 (SIMPLE_EDGE), Reducer 75 (SIMPLE_EDGE)
-Reducer 55 <- Map 102 (SIMPLE_EDGE), Reducer 54 (ONE_TO_ONE_EDGE)
-Reducer 56 <- Reducer 55 (SIMPLE_EDGE)
-Reducer 58 <- Map 57 (CUSTOM_SIMPLE_EDGE)
-Reducer 59 <- Map 103 (SIMPLE_EDGE), Map 57 (SIMPLE_EDGE)
-Reducer 6 <- Reducer 5 (CUSTOM_SIMPLE_EDGE), Reducer 56 (CUSTOM_SIMPLE_EDGE), Union 7 (CONTAINS)
-Reducer 60 <- Reducer 59 

[2/4] hive git commit: HIVE-21021: Scalar subquery with only aggregate in subquery (no group by) has unnecessary sq_count_check branch (Vineet Garg, reviewed by Ashutosh Chauhan)

2018-12-14 Thread vgarg
http://git-wip-us.apache.org/repos/asf/hive/blob/c7b5454a/ql/src/test/results/clientpositive/perf/tez/query14.q.out
--
diff --git a/ql/src/test/results/clientpositive/perf/tez/query14.q.out b/ql/src/test/results/clientpositive/perf/tez/query14.q.out
index a65632d..fd8eb9b 100644
--- a/ql/src/test/results/clientpositive/perf/tez/query14.q.out
+++ b/ql/src/test/results/clientpositive/perf/tez/query14.q.out
@@ -1,9 +1,6 @@
-Warning: Shuffle Join MERGEJOIN[1440][tables = [$hdt$_1, $hdt$_2]] in Stage 'Reducer 5' is a cross product
-Warning: Shuffle Join MERGEJOIN[1452][tables = [$hdt$_1, $hdt$_2, $hdt$_0]] in Stage 'Reducer 6' is a cross product
-Warning: Shuffle Join MERGEJOIN[1442][tables = [$hdt$_1, $hdt$_2]] in Stage 'Reducer 13' is a cross product
-Warning: Shuffle Join MERGEJOIN[1465][tables = [$hdt$_1, $hdt$_2, $hdt$_0]] in Stage 'Reducer 14' is a cross product
-Warning: Shuffle Join MERGEJOIN[1444][tables = [$hdt$_2, $hdt$_3]] in Stage 'Reducer 18' is a cross product
-Warning: Shuffle Join MERGEJOIN[1478][tables = [$hdt$_2, $hdt$_3, $hdt$_1]] in Stage 'Reducer 19' is a cross product
+Warning: Shuffle Join MERGEJOIN[1164][tables = [$hdt$_0, $hdt$_1]] in Stage 'Reducer 6' is a cross product
+Warning: Shuffle Join MERGEJOIN[1171][tables = [$hdt$_0, $hdt$_1]] in Stage 'Reducer 16' is a cross product
+Warning: Shuffle Join MERGEJOIN[1178][tables = [$hdt$_1, $hdt$_2]] in Stage 'Reducer 22' is a cross product
 PREHOOK: query: explain
 with  cross_items as
  (select i_item_sk ss_item_sk
@@ -225,1116 +222,840 @@ POSTHOOK: Output: hdfs://### HDFS PATH ###
 Plan optimized by CBO.
 
 Vertex dependency in root stage
-Map 1 <- Reducer 99 (BROADCAST_EDGE)
-Map 101 <- Reducer 96 (BROADCAST_EDGE)
-Map 102 <- Reducer 98 (BROADCAST_EDGE)
-Map 103 <- Reducer 63 (BROADCAST_EDGE)
-Map 104 <- Reducer 68 (BROADCAST_EDGE)
-Map 20 <- Reducer 25 (BROADCAST_EDGE)
-Map 36 <- Reducer 41 (BROADCAST_EDGE)
-Map 46 <- Reducer 100 (BROADCAST_EDGE)
-Map 50 <- Reducer 29 (BROADCAST_EDGE)
-Map 51 <- Reducer 43 (BROADCAST_EDGE)
-Map 52 <- Reducer 58 (BROADCAST_EDGE)
-Map 91 <- Reducer 94 (BROADCAST_EDGE)
-Reducer 10 <- Map 1 (SIMPLE_EDGE), Map 93 (SIMPLE_EDGE), Union 11 (CONTAINS)
-Reducer 100 <- Map 93 (CUSTOM_SIMPLE_EDGE)
-Reducer 12 <- Union 11 (CUSTOM_SIMPLE_EDGE)
-Reducer 13 <- Reducer 12 (CUSTOM_SIMPLE_EDGE), Reducer 32 (CUSTOM_SIMPLE_EDGE)
-Reducer 14 <- Reducer 13 (CUSTOM_SIMPLE_EDGE), Reducer 62 (CUSTOM_SIMPLE_EDGE), Union 7 (CONTAINS)
-Reducer 15 <- Map 1 (SIMPLE_EDGE), Map 93 (SIMPLE_EDGE), Union 16 (CONTAINS)
-Reducer 17 <- Union 16 (CUSTOM_SIMPLE_EDGE)
-Reducer 18 <- Reducer 17 (CUSTOM_SIMPLE_EDGE), Reducer 35 (CUSTOM_SIMPLE_EDGE)
-Reducer 19 <- Reducer 18 (CUSTOM_SIMPLE_EDGE), Reducer 67 (CUSTOM_SIMPLE_EDGE), Union 7 (CONTAINS)
-Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 93 (SIMPLE_EDGE), Union 3 (CONTAINS)
-Reducer 21 <- Map 20 (SIMPLE_EDGE), Map 24 (SIMPLE_EDGE), Union 3 (CONTAINS)
-Reducer 22 <- Map 20 (SIMPLE_EDGE), Map 24 (SIMPLE_EDGE), Union 11 (CONTAINS)
-Reducer 23 <- Map 20 (SIMPLE_EDGE), Map 24 (SIMPLE_EDGE), Union 16 (CONTAINS)
-Reducer 25 <- Map 24 (CUSTOM_SIMPLE_EDGE)
-Reducer 26 <- Map 24 (SIMPLE_EDGE), Map 50 (SIMPLE_EDGE), Union 27 (CONTAINS)
-Reducer 28 <- Union 27 (CUSTOM_SIMPLE_EDGE)
-Reducer 29 <- Map 24 (CUSTOM_SIMPLE_EDGE)
-Reducer 30 <- Map 24 (SIMPLE_EDGE), Map 50 (SIMPLE_EDGE), Union 31 (CONTAINS)
-Reducer 32 <- Union 31 (CUSTOM_SIMPLE_EDGE)
-Reducer 33 <- Map 24 (SIMPLE_EDGE), Map 50 (SIMPLE_EDGE), Union 34 (CONTAINS)
-Reducer 35 <- Union 34 (CUSTOM_SIMPLE_EDGE)
-Reducer 37 <- Map 36 (SIMPLE_EDGE), Map 40 (SIMPLE_EDGE), Union 3 (CONTAINS)
-Reducer 38 <- Map 36 (SIMPLE_EDGE), Map 40 (SIMPLE_EDGE), Union 11 (CONTAINS)
-Reducer 39 <- Map 36 (SIMPLE_EDGE), Map 40 (SIMPLE_EDGE), Union 16 (CONTAINS)
-Reducer 4 <- Union 3 (CUSTOM_SIMPLE_EDGE)
-Reducer 41 <- Map 40 (CUSTOM_SIMPLE_EDGE)
-Reducer 42 <- Map 40 (SIMPLE_EDGE), Map 51 (SIMPLE_EDGE), Union 27 (CONTAINS)
-Reducer 43 <- Map 40 (CUSTOM_SIMPLE_EDGE)
-Reducer 44 <- Map 40 (SIMPLE_EDGE), Map 51 (SIMPLE_EDGE), Union 31 (CONTAINS)
-Reducer 45 <- Map 40 (SIMPLE_EDGE), Map 51 (SIMPLE_EDGE), Union 34 (CONTAINS)
-Reducer 47 <- Map 46 (SIMPLE_EDGE), Map 93 (SIMPLE_EDGE), Union 27 (CONTAINS)
-Reducer 48 <- Map 46 (SIMPLE_EDGE), Map 93 (SIMPLE_EDGE), Union 31 (CONTAINS)
-Reducer 49 <- Map 46 (SIMPLE_EDGE), Map 93 (SIMPLE_EDGE), Union 34 (CONTAINS)
-Reducer 5 <- Reducer 28 (CUSTOM_SIMPLE_EDGE), Reducer 4 (CUSTOM_SIMPLE_EDGE)
-Reducer 53 <- Map 52 (SIMPLE_EDGE), Map 57 (SIMPLE_EDGE)
-Reducer 54 <- Map 69 (SIMPLE_EDGE), Reducer 53 (SIMPLE_EDGE)
-Reducer 55 <- Reducer 54 (ONE_TO_ONE_EDGE), Reducer 70 (SIMPLE_EDGE)
-Reducer 56 <- Reducer 55 (SIMPLE_EDGE)
-Reducer 58 <- Map 57 (CUSTOM_SIMPLE_EDGE)
-Reducer 59 <- Map 103 (SIMPLE_EDGE), Map 57 (SIMPLE_EDGE)
+Map 1 <- Reducer 11 (BROADCAST_EDGE)
+Map 46 <- Reducer 49 (BROADCAST_EDGE)
+Map 64 <- Reducer 51 (BROADCAST_EDGE)
+Map 65 <- Reducer 53 

[1/4] hive git commit: HIVE-21021: Scalar subquery with only aggregate in subquery (no group by) has unnecessary sq_count_check branch (Vineet Garg, reviewed by Ashutosh Chauhan)

2018-12-14 Thread vgarg
Repository: hive
Updated Branches:
  refs/heads/master 64930f8ac -> c7b5454aa


http://git-wip-us.apache.org/repos/asf/hive/blob/c7b5454a/ql/src/test/results/clientpositive/spark/subquery_scalar.q.out
--
diff --git a/ql/src/test/results/clientpositive/spark/subquery_scalar.q.out b/ql/src/test/results/clientpositive/spark/subquery_scalar.q.out
index e204a01..6c25b58 100644
--- a/ql/src/test/results/clientpositive/spark/subquery_scalar.q.out
+++ b/ql/src/test/results/clientpositive/spark/subquery_scalar.q.out
@@ -7052,3 +7052,141 @@ STAGE PLANS:
   Processor Tree:
 ListSink
 
+PREHOOK: query: CREATE TABLE `store_sales`(
+  `ss_sold_date_sk` int,
+  `ss_quantity` int,
+  `ss_list_price` decimal(7,2))
+PREHOOK: type: CREATETABLE
+PREHOOK: Output: database:default
+PREHOOK: Output: default@store_sales
+POSTHOOK: query: CREATE TABLE `store_sales`(
+  `ss_sold_date_sk` int,
+  `ss_quantity` int,
+  `ss_list_price` decimal(7,2))
+POSTHOOK: type: CREATETABLE
+POSTHOOK: Output: database:default
+POSTHOOK: Output: default@store_sales
+PREHOOK: query: CREATE TABLE `date_dim`(
+  `d_date_sk` int,
+  `d_year` int)
+PREHOOK: type: CREATETABLE
+PREHOOK: Output: database:default
+PREHOOK: Output: default@date_dim
+POSTHOOK: query: CREATE TABLE `date_dim`(
+  `d_date_sk` int,
+  `d_year` int)
+POSTHOOK: type: CREATETABLE
+POSTHOOK: Output: database:default
+POSTHOOK: Output: default@date_dim
+Warning: Shuffle Join JOIN[19][tables = [$hdt$_0, $hdt$_1]] in Work 'Reducer 2' is a cross product
+PREHOOK: query: explain cbo with avg_sales as
+ (select avg(quantity*list_price) average_sales
+  from (select ss_quantity quantity
+ ,ss_list_price list_price
+   from store_sales
+   ,date_dim
+   where ss_sold_date_sk = d_date_sk
+ and d_year between 1999 and 2001 ) x)
+select * from store_sales where ss_list_price > (select average_sales from avg_sales)
+PREHOOK: type: QUERY
+PREHOOK: Input: default@date_dim
+PREHOOK: Input: default@store_sales
+ A masked pattern was here 
+POSTHOOK: query: explain cbo with avg_sales as
+ (select avg(quantity*list_price) average_sales
+  from (select ss_quantity quantity
+ ,ss_list_price list_price
+   from store_sales
+   ,date_dim
+   where ss_sold_date_sk = d_date_sk
+ and d_year between 1999 and 2001 ) x)
+select * from store_sales where ss_list_price > (select average_sales from avg_sales)
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@date_dim
+POSTHOOK: Input: default@store_sales
+ A masked pattern was here 
+CBO PLAN:
+HiveProject(ss_sold_date_sk=[$0], ss_quantity=[$1], ss_list_price=[$2])
+  HiveJoin(condition=[>($2, $3)], joinType=[inner], algorithm=[none], cost=[not available])
+    HiveProject(ss_sold_date_sk=[$0], ss_quantity=[$1], ss_list_price=[$2])
+      HiveTableScan(table=[[default, store_sales]], table:alias=[store_sales])
+    HiveProject($f0=[/($0, $1)])
+      HiveAggregate(group=[{}], agg#0=[sum($0)], agg#1=[count($0)])
+        HiveProject($f0=[*(CAST($1):DECIMAL(10, 0), $2)])
+          HiveJoin(condition=[=($0, $3)], joinType=[inner], algorithm=[none], cost=[not available])
+            HiveProject(ss_sold_date_sk=[$0], ss_quantity=[$1], ss_list_price=[$2])
+              HiveFilter(condition=[IS NOT NULL($0)])
+                HiveTableScan(table=[[default, store_sales]], table:alias=[store_sales])
+            HiveProject(d_date_sk=[$0])
+              HiveFilter(condition=[AND(BETWEEN(false, $1, 1999, 2001), IS NOT NULL($0))])
+                HiveTableScan(table=[[default, date_dim]], table:alias=[date_dim])
+
+Warning: Shuffle Join JOIN[35][tables = [$hdt$_0, $hdt$_1, $hdt$_2]] in Work 'Reducer 2' is a cross product
+PREHOOK: query: explain cbo with avg_sales as
+ (select avg(quantity*list_price) over( partition by list_price) average_sales
+  from (select ss_quantity quantity
+ ,ss_list_price list_price
+   from store_sales
+   ,date_dim
+   where ss_sold_date_sk = d_date_sk
+ and d_year between 1999 and 2001 ) x)
+select * from store_sales where ss_list_price > (select average_sales from avg_sales)
+PREHOOK: type: QUERY
+PREHOOK: Input: default@date_dim
+PREHOOK: Input: default@store_sales
+ A masked pattern was here 
+POSTHOOK: query: explain cbo with avg_sales as
+ (select avg(quantity*list_price) over( partition by list_price) average_sales
+  from (select ss_quantity quantity
+ ,ss_list_price list_price
+   from store_sales
+   ,date_dim
+   where ss_sold_date_sk = d_date_sk
+ and d_year between 1999 and 2001 ) x)
+select * from store_sales where ss_list_price > (select average_sales from avg_sales)
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@date_dim
+POSTHOOK: Input: default@store_sales
+ A masked pattern was here 
+CBO PLAN:
+HiveProject(ss_sold_date_sk=[$0], ss_quantity=[$1], 

hive git commit: HIVE-21028: Adding a JDO fetch plan for getTableMeta get_table_meta to avoid race condition (Karthik Manamcheri, reviewed by Adam Holley, Vihang K and Naveen G)

2018-12-14 Thread ngangam
Repository: hive
Updated Branches:
  refs/heads/master 687aeef53 -> 64930f8ac


HIVE-21028: Adding a JDO fetch plan for getTableMeta get_table_meta to avoid race condition (Karthik Manamcheri, reviewed by Adam Holley, Vihang K and Naveen G)


Project: http://git-wip-us.apache.org/repos/asf/hive/repo
Commit: http://git-wip-us.apache.org/repos/asf/hive/commit/64930f8a
Tree: http://git-wip-us.apache.org/repos/asf/hive/tree/64930f8a
Diff: http://git-wip-us.apache.org/repos/asf/hive/diff/64930f8a

Branch: refs/heads/master
Commit: 64930f8ac23c958391dff5405e51d148b039c079
Parents: 687aeef
Author: Naveen Gangam 
Authored: Fri Dec 14 10:28:58 2018 -0500
Committer: Naveen Gangam 
Committed: Fri Dec 14 10:28:58 2018 -0500

--
 .../hadoop/hive/metastore/ObjectStore.java  |   8 +
 .../hive/metastore/model/FetchGroups.java   |  26 ++
 .../src/main/resources/package.jdo  |   5 +
 .../hive/metastore/StatementVerifyingDerby.java | 345 +++
 .../TestObjectStoreStatementVerify.java | 159 +
 5 files changed, 543 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/hive/blob/64930f8a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
--
diff --git a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
index e598a43..3fa21b7 100644
--- a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
+++ b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
@@ -1461,6 +1461,13 @@ public class ObjectStore implements RawStore, Configurable {
 LOG.debug("getTableMeta with filter " + filterBuilder.toString() + " params: " +
 StringUtils.join(parameterVals, ", "));
   }
+  // Add the fetch group here which retrieves the database object along with the MTable
+  // objects. If we don't prefetch the database object, we could end up in a situation where
+  // the database gets dropped while we are looping through the tables throwing a
+  // JDOObjectNotFoundException. This causes HMS to go into a retry loop which greatly degrades
+  // performance of this function when called with dbNames="*" and tableNames="*" (fetch all
+  // tables in all databases, essentially a full dump)
+  pm.getFetchPlan().addGroup(FetchGroups.FETCH_DATABASE_ON_MTABLE);
   query = pm.newQuery(MTable.class, filterBuilder.toString());
   Collection<MTable> tables = (Collection<MTable>) query.executeWithArray(parameterVals.toArray(new String[parameterVals.size()]));
   for (MTable table : tables) {
@@ -1472,6 +1479,7 @@ public class ObjectStore implements RawStore, Configurable {
   }
   commited = commitTransaction();
 } finally {
+  pm.getFetchPlan().removeGroup(FetchGroups.FETCH_DATABASE_ON_MTABLE);
   rollbackAndCleanup(commited, query);
 }
 return metas;

http://git-wip-us.apache.org/repos/asf/hive/blob/64930f8a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/FetchGroups.java
--
diff --git a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/FetchGroups.java b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/FetchGroups.java
new file mode 100644
index 000..b822993
--- /dev/null
+++ b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/FetchGroups.java
@@ -0,0 +1,26 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.metastore.model;
+
+// These fetch groups are defined in package.jdo. Make sure that the names of fetch groups match
+// between here and in the package.jdo file.
+public class 

hive git commit: HIVE-21035: Race condition in SparkUtilities#getSparkSession (Antal Sinkovits, reviewed by Adam Szita, Denys Kuzmenko)

2018-12-14 Thread szita
Repository: hive
Updated Branches:
  refs/heads/master 7da8f3d36 -> 687aeef53


HIVE-21035: Race condition in SparkUtilities#getSparkSession (Antal Sinkovits, reviewed by Adam Szita, Denys Kuzmenko)


Project: http://git-wip-us.apache.org/repos/asf/hive/repo
Commit: http://git-wip-us.apache.org/repos/asf/hive/commit/687aeef5
Tree: http://git-wip-us.apache.org/repos/asf/hive/tree/687aeef5
Diff: http://git-wip-us.apache.org/repos/asf/hive/diff/687aeef5

Branch: refs/heads/master
Commit: 687aeef5347b9ee28f41aacb3085cf513e01afe1
Parents: 7da8f3d
Author: Antal Sinkovits 
Authored: Thu Dec 13 11:13:18 2018 +0100
Committer: Adam Szita 
Committed: Fri Dec 14 14:13:28 2018 +0100

--
 .../hive/ql/exec/spark/SparkUtilities.java  |  32 ++---
 .../hive/ql/exec/spark/TestSparkUtilities.java  | 117 +++
 2 files changed, 135 insertions(+), 14 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/hive/blob/687aeef5/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java
--
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java b/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java
index d384ed6..fafae31 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java
@@ -7,7 +7,7 @@
  * "License"); you may not use this file except in compliance
  * with the License.  You may obtain a copy of the License at
  *
- *http://www.apache.org/licenses/LICENSE-2.0
+ * http://www.apache.org/licenses/LICENSE-2.0
  *
  * Unless required by applicable law or agreed to in writing, software
  * distributed under the License is distributed on an "AS IS" BASIS,
@@ -122,21 +122,25 @@ public class SparkUtilities {
 
   public static SparkSession getSparkSession(HiveConf conf,
   SparkSessionManager sparkSessionManager) throws HiveException {
-SparkSession sparkSession = SessionState.get().getSparkSession();
-HiveConf sessionConf = SessionState.get().getConf();
 
-// Spark configurations are updated close the existing session
-// In case of async queries or confOverlay is not empty,
-// sessionConf and conf are different objects
-if (sessionConf.getSparkConfigUpdated() || conf.getSparkConfigUpdated()) {
-  sparkSessionManager.closeSession(sparkSession);
-  sparkSession = null;
-  conf.setSparkConfigUpdated(false);
-  sessionConf.setSparkConfigUpdated(false);
+SessionState sessionState = SessionState.get();
+synchronized (sessionState) {
+  SparkSession sparkSession = sessionState.getSparkSession();
+  HiveConf sessionConf = sessionState.getConf();
+
+  // Spark configurations are updated close the existing session
+  // In case of async queries or confOverlay is not empty,
+  // sessionConf and conf are different objects
+  if (sessionConf.getSparkConfigUpdated() || conf.getSparkConfigUpdated()) {
+sparkSessionManager.closeSession(sparkSession);
+sparkSession = null;
+conf.setSparkConfigUpdated(false);
+sessionConf.setSparkConfigUpdated(false);
+  }
+  sparkSession = sparkSessionManager.getSession(sparkSession, conf, true);
+  sessionState.setSparkSession(sparkSession);
+  return sparkSession;
 }
-sparkSession = sparkSessionManager.getSession(sparkSession, conf, true);
-SessionState.get().setSparkSession(sparkSession);
-return sparkSession;
   }
 
   /**

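The fix above makes the read-close-recreate sequence atomic by synchronizing on the per-query SessionState, so two threads sharing a session can no longer both observe a stale SparkSession and race to replace it. A stripped-down sketch of that get-or-recreate-under-a-lock pattern follows; the SessionHolder class is hypothetical and only the locking structure mirrors the patch.

// Illustrative only: a generic "get or rebuild under the owner's lock" pattern,
// mirroring the synchronized (sessionState) block added to SparkUtilities above.
// SessionHolder and its methods are hypothetical, not Hive classes.
public final class SessionHolder {

  private Object session;          // the cached, shared resource (a SparkSession in the patch)
  private boolean configUpdated;   // set when the cached resource has become stale

  public Object getOrCreate() {
    synchronized (this) {          // one lock per holder, like SessionState in the patch
      if (configUpdated || session == null) {
        close(session);            // close the stale instance before replacing it
        session = create();
        configUpdated = false;
      }
      return session;              // returned while still holding the lock, so no torn updates
    }
  }

  public synchronized void markStale() {
    configUpdated = true;          // same monitor as getOrCreate(), so the flag is seen safely
  }

  private void close(Object s) {
    // release resources of the old session; no-op when s is null
  }

  private Object create() {
    return new Object();
  }
}
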
http://git-wip-us.apache.org/repos/asf/hive/blob/687aeef5/ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestSparkUtilities.java
--
diff --git a/ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestSparkUtilities.java b/ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestSparkUtilities.java
new file mode 100644
index 000..f797f30
--- /dev/null
+++ b/ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestSparkUtilities.java
@@ -0,0 +1,117 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing 

hive git commit: HIVE-21023 : Add test for replication to a target with hive.strict.managed.tables enabled. (Mahesh Kumar Behera, reviewed by Sankar Hariappan)

2018-12-14 Thread mahesh
Repository: hive
Updated Branches:
  refs/heads/master e8e0396c1 -> 7da8f3d36


HIVE-21023 : Add test for replication to a target with hive.strict.managed.tables enabled. (Mahesh Kumar Behera, reviewed by Sankar Hariappan)


Project: http://git-wip-us.apache.org/repos/asf/hive/repo
Commit: http://git-wip-us.apache.org/repos/asf/hive/commit/7da8f3d3
Tree: http://git-wip-us.apache.org/repos/asf/hive/tree/7da8f3d3
Diff: http://git-wip-us.apache.org/repos/asf/hive/diff/7da8f3d3

Branch: refs/heads/master
Commit: 7da8f3d36ee2b6c508ea2ab8c241df52107ac74e
Parents: e8e0396
Author: Mahesh Kumar Behera 
Authored: Fri Dec 14 18:26:09 2018 +0530
Committer: Mahesh Kumar Behera 
Committed: Fri Dec 14 18:26:09 2018 +0530

--
 ...ationScenariosIncrementalLoadAcidTables.java | 128 +---
 .../TestReplicationScenariosMigration.java  |  33 ++
 .../TestReplicationWithTableMigration.java  | 328 +++
 3 files changed, 363 insertions(+), 126 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/hive/blob/7da8f3d3/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosIncrementalLoadAcidTables.java
--
diff --git a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosIncrementalLoadAcidTables.java b/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosIncrementalLoadAcidTables.java
index b71cfa4..97775b3 100644
--- a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosIncrementalLoadAcidTables.java
+++ b/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosIncrementalLoadAcidTables.java
@@ -17,20 +17,13 @@
  */
 package org.apache.hadoop.hive.ql.parse;
 
-import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.hdfs.MiniDFSCluster;
 import org.apache.hadoop.hive.conf.HiveConf;
-import org.apache.hadoop.hive.metastore.api.Table;
 import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
 import org.apache.hadoop.hive.metastore.messaging.json.gzip.GzipJSONMessageEncoder;
-import org.apache.hadoop.hive.metastore.utils.MetaStoreUtils;
-import org.apache.hadoop.hive.ql.parse.repl.PathBuilder;
 import org.apache.hadoop.hive.shims.Utils;
 import org.apache.hadoop.hive.ql.parse.WarehouseInstance;
 import static org.apache.hadoop.hive.metastore.ReplChangeManager.SOURCE_OF_REPLICATION;
-import static org.apache.hadoop.hive.ql.io.AcidUtils.isFullAcidTable;
-import static org.apache.hadoop.hive.ql.io.AcidUtils.isTransactionalTable;
-
 import org.apache.hadoop.hive.ql.parse.ReplicationTestUtils;
 
 import org.junit.rules.TestName;
@@ -48,9 +41,7 @@ import java.util.ArrayList;
 import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertFalse;
-import static org.junit.Assert.assertTrue;
+
 import com.google.common.collect.Lists;
 
 /**
@@ -62,7 +53,7 @@ public class TestReplicationScenariosIncrementalLoadAcidTables {
 
   protected static final Logger LOG = 
LoggerFactory.getLogger(TestReplicationScenariosIncrementalLoadAcidTables.class);
   static WarehouseInstance primary;
-  private static WarehouseInstance replica, replicaNonAcid, replicaMigration, primaryMigration;
+  private static WarehouseInstance replica, replicaNonAcid;
   private static HiveConf conf;
   private String primaryDbName, replicatedDbName, primaryDbNameExtra;
 
@@ -105,36 +96,6 @@ public class TestReplicationScenariosIncrementalLoadAcidTables {
 put("hive.metastore.client.capability.check", "false");
 }};
 replicaNonAcid = new WarehouseInstance(LOG, miniDFSCluster, overridesForHiveConf1);
-
-HashMap overridesForHiveConfReplicaMigration = new HashMap() {{
-  put("fs.defaultFS", miniDFSCluster.getFileSystem().getUri().toString());
-  put("hive.support.concurrency", "true");
-  put("hive.txn.manager", "org.apache.hadoop.hive.ql.lockmgr.DbTxnManager");
-  put("hive.metastore.client.capability.check", "false");
-  put("hive.repl.bootstrap.dump.open.txn.timeout", "1s");
-  put("hive.exec.dynamic.partition.mode", "nonstrict");
-  put("hive.strict.checks.bucketing", "false");
-  put("hive.mapred.mode", "nonstrict");
-  put("mapred.input.dir.recursive", "true");
-  put("hive.metastore.disallow.incompatible.col.type.changes", "false");
-  put("hive.strict.managed.tables", "true");
-}};
-replicaMigration = new WarehouseInstance(LOG, miniDFSCluster, overridesForHiveConfReplicaMigration);
-
-HashMap overridesForHiveConfPrimaryMigration = new HashMap() {{
-  put("fs.defaultFS", miniDFSCluster.getFileSystem().getUri().toString());
-  put("hive.metastore.client.capability.check", "false");
-