[GitHub] [hive] dlavati opened a new pull request #1207: HIVE-23483: Remove DynamicSerDe

2020-07-06 Thread GitBox


dlavati opened a new pull request #1207:
URL: https://github.com/apache/hive/pull/1207


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org
For additional commands, e-mail: gitbox-h...@hive.apache.org



[GitHub] [hive] dgzdot opened a new pull request #1206: HIVE-23802: “merge files” job was submitted to default queue when set h…

2020-07-06 Thread GitBox


dgzdot opened a new pull request #1206:
URL: https://github.com/apache/hive/pull/1206


   …ive.merge.tezfiles to true
   
   ## NOTICE
   
   Please create an issue in ASF JIRA before opening a pull request,
   and you need to set the title of the pull request which starts with
   the corresponding JIRA issue number. (e.g. HIVE-X: Fix a typo in YYY)
   For more details, please see 
https://cwiki.apache.org/confluence/display/Hive/HowToContribute
   






[GitHub] [hive] kgyrtkirk commented on pull request #1196: HIVE-23791: Optimize ACID stats generation

2020-07-06 Thread GitBox


kgyrtkirk commented on pull request #1196:
URL: https://github.com/apache/hive/pull/1196#issuecomment-654113119


   http://ci.hive.apache.org/job/hive-precommit/job/PR-1196/2/ is valid for 
this PR; branch indexing had started ~80 ptest runs, which I had to cancel 
manually...






[GitHub] [hive] nareshpr opened a new pull request #1214: HIVE-23811: deleteReader SARG rowId is not getting validated properly

2020-07-06 Thread GitBox


nareshpr opened a new pull request #1214:
URL: https://github.com/apache/hive/pull/1214


   






[GitHub] [hive] jcamachor commented on a change in pull request #1212: HIVE-23807 Wrong results with vectorization enabled

2020-07-06 Thread GitBox


jcamachor commented on a change in pull request #1212:
URL: https://github.com/apache/hive/pull/1212#discussion_r450617751



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToDate.java
##
@@ -116,14 +118,16 @@ public void evaluate(VectorizedRowBatch batch) {
 
   private void evaluate(LongColumnVector outV, BytesColumnVector inV, int i) {
 String dateString = new String(inV.vector[i], inV.start[i], inV.length[i], 
StandardCharsets.UTF_8);
-if (dateParser.parseDate(dateString, sqlDate)) {
+try {
+  Date utilDate = Date.valueOf(dateString);

Review comment:
   Is this vectorized expression generated for `GenericUDFDate`? It seems 
that the UDF still relies on DateParser in this branch: 
https://github.com/apache/hive/blob/b7f3e8ef399f510c3a6780209ebc688ef0acee8f/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDate.java#L112
   
   Since the change may have an effect on parsing of other dates, it may be 
good to have a test run. It seems we have a jenkins active for branch-2 
patches: http://130.211.9.232/job/hive-precommit/job/branch-2/ . I am wondering 
whether a PR against branch-2 will trigger it or whether we can trigger it 
manually. Could you verify?
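
   For context on the behavioral difference under discussion, here is a small 
illustration, assuming `java.sql.Date` semantics (the `DateCastDemo` class and 
its `parses` helper are hypothetical, not part of the patch): `Date.valueOf` 
accepts only the strict `yyyy-[m]m-[d]d` form and throws 
`IllegalArgumentException` for anything else, whereas a lenient `DateParser` 
may accept other shapes — which is why a test run matters here.

```java
import java.sql.Date;

// Hypothetical probe: java.sql.Date.valueOf only accepts the strict JDBC
// date escape format yyyy-[m]m-[d]d and throws IllegalArgumentException
// for any other input, unlike a more lenient date parser.
public class DateCastDemo {
    static boolean parses(String s) {
        try {
            Date.valueOf(s);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```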








[GitHub] [hive] StefanXiepj closed pull request #1208: Hive 22412: StatsUtils throw NPE when explain

2020-07-06 Thread GitBox


StefanXiepj closed pull request #1208:
URL: https://github.com/apache/hive/pull/1208


   






[GitHub] [hive] StefanXiepj opened a new pull request #1209: HIVE-22412: StatsUtils throw NPE when explain

2020-07-06 Thread GitBox


StefanXiepj opened a new pull request #1209:
URL: https://github.com/apache/hive/pull/1209


   ## NOTICE
   
   Please create an issue in ASF JIRA before opening a pull request,
   and you need to set the title of the pull request which starts with
   the corresponding JIRA issue number. (e.g. HIVE-X: Fix a typo in YYY)
   For more details, please see 
https://cwiki.apache.org/confluence/display/Hive/HowToContribute
   






[GitHub] [hive] kgyrtkirk opened a new pull request #1210: HIVE-22301: Hive lineage is not generated for insert overwrite queries on partitioned tables

2020-07-06 Thread GitBox


kgyrtkirk opened a new pull request #1210:
URL: https://github.com/apache/hive/pull/1210


   






[GitHub] [hive] belugabehr closed pull request #1153: HIVE-23731: Review of AvroInstance Cache

2020-07-06 Thread GitBox


belugabehr closed pull request #1153:
URL: https://github.com/apache/hive/pull/1153


   






[GitHub] [hive] belugabehr commented on a change in pull request #1211: HIVE-23790: The error message length of 2000 is exceeded for scheduled query

2020-07-06 Thread GitBox


belugabehr commented on a change in pull request #1211:
URL: https://github.com/apache/hive/pull/1211#discussion_r450245316



##
File path: 
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestMetastoreScheduledQueries.java
##
@@ -357,6 +360,17 @@ public void testPoll() throws Exception {
 assertFalse(pollResult.isSetQuery());
   }
 
+  private String generateLongErrorMessage() {
+StringBuffer sb=new StringBuffer();

Review comment:
   This is a test, so not super important, but this should be a 
`StringBuilder`, and please fix the checkstyle issue (no space around the `=`).  
Thanks.
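
   A sketch of the requested change (the 2000-character threshold and the 
method shape are assumptions based on the surrounding discussion, not the 
actual patch):

```java
// Hypothetical version of the test helper using StringBuilder (no
// synchronization is needed in a single-threaded test) with
// checkstyle-friendly spacing around '='.
public class LongMessageDemo {
    static String generateLongErrorMessage() {
        StringBuilder sb = new StringBuilder();
        // Grow the message past the 2000-character column limit under test.
        while (sb.length() <= 2000) {
            sb.append("error-segment ");
        }
        return sb.toString();
    }
}
```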








[GitHub] [hive] belugabehr commented on pull request #1209: HIVE-22412: StatsUtils throw NPE when explain

2020-07-06 Thread GitBox


belugabehr commented on pull request #1209:
URL: https://github.com/apache/hive/pull/1209#issuecomment-654250436


   Thanks for the contribution.
   
   I need to look at it more, but if `null` is not a valid option to pass into 
the constructors, then it should be prohibited: `Objects#requireNonNull`.  Please 
check where the null value is being passed in and instead have the calling 
class pass in a `Collections#emptyList` if possible, or a new `ArrayList` 
otherwise.
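
   The advice above can be sketched like this (the `StatsHolder` class is 
hypothetical, purely for illustration of the pattern):

```java
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Hypothetical constructor that fails fast on null instead of deferring an
// NPE to some later use; callers pass Collections.emptyList() rather than null.
public class StatsHolder {
    private final List<String> columnStats;

    StatsHolder(List<String> columnStats) {
        this.columnStats = Objects.requireNonNull(columnStats, "columnStats must not be null");
    }

    int size() {
        return columnStats.size();
    }
}
```

   With this in place, `new StatsHolder(Collections.emptyList())` works, while 
`new StatsHolder(null)` fails immediately with a descriptive 
NullPointerException at the construction site rather than somewhere downstream.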






[GitHub] [hive] kgyrtkirk commented on pull request #1211: HIVE-23790: The error message length of 2000 is exceeded for scheduled query

2020-07-06 Thread GitBox


kgyrtkirk commented on pull request #1211:
URL: https://github.com/apache/hive/pull/1211#issuecomment-654263274


   I didn't want to get into trouble because of UTF handling, so I clip the 
message at 1K. The purpose of this error message field is to give an idea of 
what the problem was, not to provide a fully detailed backtrace; I think 1K 
might already be too much, in case some exception puts too much info 
into the "message" part of the exception.
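
   A minimal sketch of the clipping described above (the `clipMessage` helper 
and the surrogate-pair guard are assumptions, not the actual patch; the 
1000-character budget mirrors the 1K mentioned):

```java
// Hypothetical clipping helper: bounds the stored error message and avoids
// cutting a UTF-16 surrogate pair in half (the "UTF handling" concern).
public class ClipDemo {
    static String clipMessage(String message, int maxChars) {
        if (message == null || message.length() <= maxChars) {
            return message;
        }
        int end = maxChars;
        if (Character.isHighSurrogate(message.charAt(end - 1))) {
            end--;  // don't split a supplementary character at the boundary
        }
        return message.substring(0, end);
    }
}
```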
   






[GitHub] [hive] kgyrtkirk commented on pull request #1210: HIVE-22301: Hive lineage is not generated for insert overwrite queries on partitioned tables

2020-07-06 Thread GitBox


kgyrtkirk commented on pull request #1210:
URL: https://github.com/apache/hive/pull/1210#issuecomment-654235092


   I feel that there is a strong correlation between something being 
`!complete` and `dummy_partition`






[GitHub] [hive] belugabehr closed pull request #1203: HIVE-22674: Replace Base64 in serde Package

2020-07-06 Thread GitBox


belugabehr closed pull request #1203:
URL: https://github.com/apache/hive/pull/1203


   






[GitHub] [hive] jfsii commented on pull request #1119: HIVE-23699: Cleanup HIVEQUERYRESULTFILEFORMAT handling

2020-07-06 Thread GitBox


jfsii commented on pull request #1119:
URL: https://github.com/apache/hive/pull/1119#issuecomment-654280518


   Rebased, so it should kick off tests again






[GitHub] [hive] belugabehr closed pull request #1088: HIVE-23673: Maven Standard Directories for accumulo-handler

2020-07-06 Thread GitBox


belugabehr closed pull request #1088:
URL: https://github.com/apache/hive/pull/1088


   






[GitHub] [hive] kgyrtkirk opened a new pull request #1211: HIVE-23790: The error message length of 2000 is exceeded for scheduled query

2020-07-06 Thread GitBox


kgyrtkirk opened a new pull request #1211:
URL: https://github.com/apache/hive/pull/1211


   






[GitHub] [hive] belugabehr commented on a change in pull request #1201: HIVE-23797: Throw exception when no metastore found in zookeeper

2020-07-06 Thread GitBox


belugabehr commented on a change in pull request #1201:
URL: https://github.com/apache/hive/pull/1201#discussion_r450251647



##
File path: 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
##
@@ -327,6 +327,13 @@ private void resolveUris() throws MetaException {
   MetaStoreUtils.logAndThrowMetaException(e);
 }
 
+if (metastoreUrisString.isEmpty() && 
"zookeeper".equalsIgnoreCase(serviceDiscoveryMode)) {
+  throw new MetaException("No metastore server available. "
+  + "Please ensure that at least one metastore server is online");
+}
+
+LOG.info("Resolved metastore uris: " + 
Arrays.toString(metastoreUrisString.toArray()));

Review comment:
   Nit:
   
   Do not need to convert it to an array first.  This will work:
   
   `LOG.info("Resolved metastore uris: {}", metastoreUrisString);`
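
   On why the array conversion is unnecessary: a `List` renders its elements 
through its own `toString()`, producing the same text `Arrays.toString` would, 
so the collection can be passed straight to the parameterized log call. A small 
illustration (`render` is a hypothetical stand-in for the log statement):

```java
import java.util.Arrays;
import java.util.List;

// A List's own toString() already yields "[elem1, elem2, ...]", identical to
// Arrays.toString(list.toArray()), so no defensive array copy is needed
// before handing the collection to a logger.
public class UriLogDemo {
    static String render(List<String> uris) {
        return "Resolved metastore uris: " + uris;
    }
}
```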

##
File path: 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
##
@@ -327,6 +327,13 @@ private void resolveUris() throws MetaException {
   MetaStoreUtils.logAndThrowMetaException(e);
 }
 
+if (metastoreUrisString.isEmpty() && 
"zookeeper".equalsIgnoreCase(serviceDiscoveryMode)) {
+  throw new MetaException("No metastore server available. "

Review comment:
   Please add some additional context here regarding ZooKeeper.  Something 
like:
   
   `No metastore service discovered in ZooKeeper`
   or
   `No metastore service currently registered in ZooKeeper`








[GitHub] [hive] pvary merged pull request #1204: HIVE-23799: Fix AcidUtils.parseBaseOrDeltaBucketFilename handling of data loaded by LOAD DATA

2020-07-06 Thread GitBox


pvary merged pull request #1204:
URL: https://github.com/apache/hive/pull/1204


   






[GitHub] [hive] dengzhhu653 commented on a change in pull request #1201: HIVE-23797: Throw exception when no metastore found in zookeeper

2020-07-06 Thread GitBox


dengzhhu653 commented on a change in pull request #1201:
URL: https://github.com/apache/hive/pull/1201#discussion_r450287586



##
File path: 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
##
@@ -327,6 +327,13 @@ private void resolveUris() throws MetaException {
   MetaStoreUtils.logAndThrowMetaException(e);
 }
 
+if (metastoreUrisString.isEmpty() && 
"zookeeper".equalsIgnoreCase(serviceDiscoveryMode)) {
+  throw new MetaException("No metastore server available. "
+  + "Please ensure that at least one metastore server is online");
+}
+
+LOG.info("Resolved metastore uris: " + 
Arrays.toString(metastoreUrisString.toArray()));

Review comment:
   done








[GitHub] [hive] kgyrtkirk commented on a change in pull request #1161: HIVE-23638: Spotbugs issues in hive-common

2020-07-06 Thread GitBox


kgyrtkirk commented on a change in pull request #1161:
URL: https://github.com/apache/hive/pull/1161#discussion_r450299299



##
File path: common/src/java/org/apache/hadoop/hive/conf/Validator.java
##
@@ -357,14 +357,15 @@ public String validate(String value) {
   final Path path = FileSystems.getDefault().getPath(value);
   if (path == null && value != null) {
 return String.format("Path '%s' provided could not be located.", 
value);
-  }
-  final boolean isDir = Files.isDirectory(path);
-  final boolean isWritable = Files.isWritable(path);
-  if (!isDir) {
-return String.format("Path '%s' provided is not a directory.", value);
-  }
-  if (!isWritable) {
-return String.format("Path '%s' provided is not writable.", value);
+  } else if (path != null) {
+final boolean isDir = Files.isDirectory(path);
+final boolean isWritable = Files.isWritable(path);
+if (!isDir) {
+  return String.format("Path '%s' provided is not a directory.", 
value);
+}
+if (!isWritable) {
+  return String.format("Path '%s' provided is not writable.", value);
+}
   }
   return null;

Review comment:
   I think this should return something other than `null` - did the old 
behaviour accept `null` as a valid path? I believe `Files.isDirectory(null)` 
returned false
   
   Looking from the other end: I don't think `null` should be accepted as a 
"writeable directory"
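
   The concern can be sketched as follows (a hypothetical standalone version of 
the validator; the null-rejection message is an assumption):

```java
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical validator that rejects null explicitly instead of silently
// treating it as a valid "writable directory"; returns null only on success.
public class PathValidatorDemo {
    static String validate(String value) {
        if (value == null) {
            return "No path provided.";
        }
        Path path = FileSystems.getDefault().getPath(value);
        if (!Files.isDirectory(path)) {
            return String.format("Path '%s' provided is not a directory.", value);
        }
        if (!Files.isWritable(path)) {
            return String.format("Path '%s' provided is not writable.", value);
        }
        return null;
    }
}
```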

##
File path: common/src/java/org/apache/hadoop/hive/common/FileUtils.java
##
@@ -926,8 +925,7 @@ public static File createLocalDirsTempFile(Configuration 
conf, String prefix, St
* delete a temporary file and remove it from delete-on-exit hook.
*/
   public static boolean deleteTmpFile(File tempFile) {
-if (tempFile != null) {
-  tempFile.delete();
+if (tempFile != null && tempFile.delete()) {

Review comment:
   this change will leave the file registered in the shutdownmanager in 
case the file was already deleted
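
   The concern, sketched with a hypothetical in-memory stand-in for the 
delete-on-exit registry (the real code uses Hive's shutdown-hook manager):

```java
import java.io.File;
import java.util.HashSet;
import java.util.Set;

// If delete() short-circuits the unregistration, a file that was already
// removed by someone else stays registered forever. Unregister unconditionally.
public class TmpFileDemo {
    static final Set<File> deleteOnExitRegistry = new HashSet<>();

    static boolean deleteTmpFile(File tempFile) {
        if (tempFile != null) {
            boolean deleted = tempFile.delete();
            // Drop the registration regardless of whether delete() succeeded.
            deleteOnExitRegistry.remove(tempFile);
            return deleted;
        }
        return false;
    }
}
```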

##
File path: common/src/java/org/apache/hadoop/hive/common/type/HiveVarchar.java
##
@@ -62,6 +62,9 @@ public boolean equals(Object rhs) {
 if (rhs == this) {
   return true;
 }
-return this.getValue().equals(((HiveVarchar)rhs).getValue());
+if (rhs instanceof HiveVarchar) {
+  return this.getValue().equals(((HiveVarchar) rhs).getValue());

Review comment:
   nice catch! :D

##
File path: common/src/java/org/apache/hive/common/util/SuppressFBWarnings.java
##
@@ -0,0 +1,37 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hive.common.util;
+
+import java.lang.annotation.Retention;
+import java.lang.annotation.RetentionPolicy;
+
+@Retention(RetentionPolicy.CLASS)
+public @interface SuppressFBWarnings {
+/**
+ * The set of FindBugs warnings that are to be suppressed in
+ * annotated element. The value can be a bug category, kind or pattern.
+ *
+ */
+String[] value() default {};
+
+/**
+ * Optional documentation of the reason why the warning is suppressed
+ */
+String justification() default "";
+}

Review comment:
   I think at some point we should probably introduce some rule to ensure 
that every file ends with a newline... but for now: could you add one here? :D 

##
File path: common/src/java/org/apache/hadoop/hive/common/StringInternUtils.java
##
@@ -135,10 +135,10 @@ public static Path internUriStringsInPath(Path path) {
 
   public static  Map internValuesInMap(Map map) {
 if (map != null) {
-  for (K key : map.keySet()) {
-String value = map.get(key);
+  for (Map.Entry entry : map.entrySet()) {
+String value = entry.getValue();
 if (value != null) {
-  map.put(key, value.intern());
+  map.put(entry.getKey(), value.intern());

Review comment:
   I know it's probably out of scope in this patch, but in case the value 
is already interned, it will not change; at the cost of a conditional, we 
can skip the hash lookup (the put) as well
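
   The suggested micro-optimization could look like this sketch (a hypothetical 
standalone version; note that `Map.Entry#setValue` also avoids the extra hash 
lookup of `map.put`):

```java
import java.util.Map;

// When value.intern() returns the same instance, the value was already
// interned and the write-back can be skipped with a cheap identity check.
public class InternDemo {
    static <K> void internValuesInMap(Map<K, String> map) {
        if (map != null) {
            for (Map.Entry<K, String> entry : map.entrySet()) {
                String value = entry.getValue();
                if (value != null) {
                    String interned = value.intern();
                    if (interned != value) {
                        entry.setValue(interned);  // no per-key hash lookup
                    }
                }
            }
        }
    }
}
```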






[GitHub] [hive] pvargacl commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables

2020-07-06 Thread GitBox


pvargacl commented on a change in pull request #1087:
URL: https://github.com/apache/hive/pull/1087#discussion_r450317451



##
File path: 
ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveMetaStoreChecker.java
##
@@ -74,21 +76,21 @@
   @Before
   public void setUp() throws Exception {
 hive = Hive.get();
-
hive.getConf().setIntVar(HiveConf.ConfVars.METASTORE_FS_HANDLER_THREADS_COUNT, 
15);
-hive.getConf().set(HiveConf.ConfVars.HIVE_MSCK_PATH_VALIDATION.varname, 
"throw");
+
hive.getConf().set(MetastoreConf.ConfVars.FS_HANDLER_THREADS_COUNT.getVarname(),
 "15");

Review comment:
   It does not work with MetastoreConf.ConfVars








[GitHub] [hive] pvargacl commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables

2020-07-06 Thread GitBox


pvargacl commented on a change in pull request #1087:
URL: https://github.com/apache/hive/pull/1087#discussion_r450302033



##
File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
##
@@ -2392,33 +2392,29 @@ public static TableSnapshot 
getTableSnapshot(Configuration conf,
 long writeId = -1;
 ValidWriteIdList validWriteIdList = null;
 
-HiveTxnManager sessionTxnMgr = SessionState.get().getTxnMgr();
-String fullTableName = getFullTableName(dbName, tblName);
-if (sessionTxnMgr != null && sessionTxnMgr.getCurrentTxnId() > 0) {
-  validWriteIdList = getTableValidWriteIdList(conf, fullTableName);
-  if (isStatsUpdater) {
-writeId = SessionState.get().getTxnMgr() != null ?
-SessionState.get().getTxnMgr().getAllocatedTableWriteId(
-  dbName, tblName) : -1;
-if (writeId < 1) {
-  // TODO: this is not ideal... stats updater that doesn't have write 
ID is currently
-  //   "create table"; writeId would be 0/-1 here. No need to call 
this w/true.
-  LOG.debug("Stats updater for {}.{} doesn't have a write ID ({})",
-  dbName, tblName, writeId);
+if (SessionState.get() != null) {
+  HiveTxnManager sessionTxnMgr = SessionState.get().getTxnMgr();
+  String fullTableName = getFullTableName(dbName, tblName);
+  if (sessionTxnMgr != null && sessionTxnMgr.getCurrentTxnId() > 0) {
+validWriteIdList = getTableValidWriteIdList(conf, fullTableName);
+if (isStatsUpdater) {
+  writeId = sessionTxnMgr != null ? 
sessionTxnMgr.getAllocatedTableWriteId(dbName, tblName) : -1;
+  if (writeId < 1) {

Review comment:
   The comment said so: "stats updater that doesn't have write ID is 
currently "create table"; writeId would be 0/-1 here."








[GitHub] [hive] pvargacl commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables

2020-07-06 Thread GitBox


pvargacl commented on a change in pull request #1087:
URL: https://github.com/apache/hive/pull/1087#discussion_r450326885



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
##
@@ -8322,6 +8322,22 @@ public AllocateTableWriteIdsResponse 
allocate_table_write_ids(
   return response;
 }
 
+@Override
+public MaxAllocatedTableWriteIdResponse 
get_max_allocated_table_write_id(MaxAllocatedTableWriteIdRequest rqst)

Review comment:
   All the functions in HMS look like this; I don't want to break the 
pattern. On second glance, I had to change the seedWriteId and seedTxnId to 
look like this...








[GitHub] [hive] ramesh0201 commented on a change in pull request #1177: HIVE-23665 Rewrite last_value to first_value to enable streaming results

2020-07-06 Thread GitBox


ramesh0201 commented on a change in pull request #1177:
URL: https://github.com/apache/hive/pull/1177#discussion_r450347761



##
File path: ql/src/test/results/clientpositive/llap/vector_ptf_part_simple.q.out
##
@@ -314,46 +386,46 @@ POSTHOOK: type: QUERY
 POSTHOOK: Input: default@vector_ptf_part_simple_orc
  A masked pattern was here 
 p_mfgr p_name  p_retailprice   rn  r   dr  fv  lv  c   
cs
-Manufacturer#2 almond aquamarine rose maroon antique   900.66  1   1   
1   900.66  2031.98 8   8
-Manufacturer#2 almond aquamarine rose maroon antique   1698.66 2   1   
1   900.66  2031.98 8   8
-Manufacturer#2 almond antique violet turquoise frosted 1800.7  3   1   
1   900.66  2031.98 8   8
-Manufacturer#2 almond antique violet chocolate turquoise   1690.68 4   
1   1   900.66  2031.98 8   8
-Manufacturer#2 almond antique violet turquoise frosted 1800.7  5   1   
1   900.66  2031.98 8   8
-Manufacturer#2 almond antique violet turquoise frosted 1800.7  6   1   
1   900.66  2031.98 8   8
-Manufacturer#2 almond aquamarine sandy cyan gainsboro  1000.6  7   1   
1   900.66  2031.98 8   8
-Manufacturer#2 almond aquamarine midnight light salmon 2031.98 8   1   
1   900.66  2031.98 8   8
 Manufacturer#3 almond antique forest lavender goldenrod1190.27 1   
1   1   1190.27 1190.27 7   8
-Manufacturer#3 almond antique chartreuse khaki white   99.68   2   1   
1   1190.27 1190.27 7   8
-Manufacturer#3 almond antique forest lavender goldenrodNULL3   
1   1   1190.27 1190.27 7   8
-Manufacturer#3 almond antique metallic orange dim  55.39   4   1   
1   1190.27 1190.27 7   8
-Manufacturer#3 almond antique misty red olive  1922.98 5   1   1   
1190.27 1190.27 7   8
-Manufacturer#3 almond antique forest lavender goldenrod590.27  6   
1   1   1190.27 1190.27 7   8
-Manufacturer#3 almond antique olive coral navajo   1337.29 7   1   
1   1190.27 1190.27 7   8
 Manufacturer#3 almond antique forest lavender goldenrod1190.27 8   
1   1   1190.27 1190.27 7   8
-Manufacturer#4 almond antique gainsboro frosted violet NULL1   1   
1   NULL1290.35 4   6
-Manufacturer#4 almond aquamarine floral ivory bisque   NULL2   1   
1   NULL1290.35 4   6
-Manufacturer#4 almond antique violet mint lemon1375.42 3   1   
1   NULL1290.35 4   6
-Manufacturer#4 almond aquamarine yellow dodger mint1844.92 4   1   
1   NULL1290.35 4   6
-Manufacturer#4 almond aquamarine floral ivory bisque   1206.26 5   1   
1   NULL1290.35 4   6
-Manufacturer#4 almond azure aquamarine papaya violet   1290.35 6   1   
1   NULL1290.35 4   6
+Manufacturer#3 almond antique olive coral navajo   1337.29 7   1   
1   1190.27 1190.27 7   8
+Manufacturer#3 almond antique forest lavender goldenrod590.27  6   
1   1   1190.27 1190.27 7   8
+Manufacturer#3 almond antique misty red olive  1922.98 5   1   1   
1190.27 1190.27 7   8
+Manufacturer#3 almond antique metallic orange dim  55.39   4   1   
1   1190.27 1190.27 7   8
+Manufacturer#3 almond antique forest lavender goldenrodNULL3   
1   1   1190.27 1190.27 7   8
+Manufacturer#3 almond antique chartreuse khaki white   99.68   2   1   
1   1190.27 1190.27 7   8
+Manufacturer#1 almond aquamarine pink moccasin thistle 1632.66 1   1   
1   1632.66 1632.66 11  12
+Manufacturer#1 almond antique chartreuse lavender yellow   1753.76 2   
1   1   1632.66 1632.66 11  12
+Manufacturer#1 almond aquamarine pink moccasin thistle 1632.66 3   1   
1   1632.66 1632.66 11  12
+Manufacturer#1 almond antique chartreuse lavender yellow   1753.76 4   
1   1   1632.66 1632.66 11  12
+Manufacturer#1 almond aquamarine pink moccasin thistle 1632.66 5   1   
1   1632.66 1632.66 11  12
+Manufacturer#1 almond antique chartreuse lavender yellow   1753.76 6   
1   1   1632.66 1632.66 11  12
+Manufacturer#1 almond antique salmon chartreuse burlywood  1602.59 7   
1   1   1632.66 1632.66 11  12
+Manufacturer#1 almond antique burnished rose metallic  1173.15 8   1   
1   1632.66 1632.66 11  12
+Manufacturer#1 almond aquamarine burnished black steel 1414.42 9   1   
1   1632.66 1632.66 11  12
+Manufacturer#1 almond aquamarine pink moccasin thistle NULL10  1   
1   1632.66 1632.66 11  12
+Manufacturer#1 almond antique chartreuse lavender yellow   1753.76 11  
1   1   

[GitHub] [hive] pvargacl commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables

2020-07-06 Thread GitBox


pvargacl commented on a change in pull request #1087:
URL: https://github.com/apache/hive/pull/1087#discussion_r450361883



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java
##
@@ -229,102 +239,168 @@ public int repair(MsckInfo msckInfo) {
 throw new MetastoreException(e);
   }
 }
+if (transactionalTable && !MetaStoreServerUtils.isPartitioned(table)) {
+  if (result.getMaxWriteId() > 0) {
+if (txnId < 0) {
+  // We need the txnId to check against even if we didn't do the 
locking
+  txnId = getMsc().openTxn(getUserName());
+}
+
+validateAndAddMaxTxnIdAndWriteId(result.getMaxWriteId(), 
result.getMaxTxnId(),
+table.getDbName(), table.getTableName(), txnId);
+  }
+}
   }
   success = true;
 } catch (Exception e) {
   LOG.warn("Failed to run metacheck: ", e);
   success = false;
-  ret = 1;
 } finally {
-  if (msckInfo.getResFile() != null) {
-BufferedWriter resultOut = null;
-try {
-  Path resFile = new Path(msckInfo.getResFile());
-  FileSystem fs = resFile.getFileSystem(getConf());
-  resultOut = new BufferedWriter(new 
OutputStreamWriter(fs.create(resFile)));
-
-  boolean firstWritten = false;
-  firstWritten |= writeMsckResult(result.getTablesNotInMs(),
-"Tables not in metastore:", resultOut, firstWritten);
-  firstWritten |= writeMsckResult(result.getTablesNotOnFs(),
-"Tables missing on filesystem:", resultOut, firstWritten);
-  firstWritten |= writeMsckResult(result.getPartitionsNotInMs(),
-"Partitions not in metastore:", resultOut, firstWritten);
-  firstWritten |= writeMsckResult(result.getPartitionsNotOnFs(),
-"Partitions missing from filesystem:", resultOut, firstWritten);
-  firstWritten |= writeMsckResult(result.getExpiredPartitions(),
-"Expired partitions (retention period: " + partitionExpirySeconds 
+ "s) :", resultOut, firstWritten);
-  // sorting to stabilize qfile output (msck_repair_drop.q)
-  Collections.sort(repairOutput);
-  for (String rout : repairOutput) {
-if (firstWritten) {
-  resultOut.write(terminator);
-} else {
-  firstWritten = true;
-}
-resultOut.write(rout);
-  }
-} catch (IOException e) {
-  LOG.warn("Failed to save metacheck output: ", e);
-  ret = 1;
-} finally {
-  if (resultOut != null) {
-try {
-  resultOut.close();
-} catch (IOException e) {
-  LOG.warn("Failed to close output file: ", e);
-  ret = 1;
-}
-  }
+  if (result!=null) {
+logResult(result);
+if (msckInfo.getResFile() != null) {
+  success = writeResultToFile(msckInfo, result, repairOutput, 
partitionExpirySeconds) && success;
 }
   }
 
-  LOG.info("Tables not in metastore: {}", result.getTablesNotInMs());
-  LOG.info("Tables missing on filesystem: {}", result.getTablesNotOnFs());
-  LOG.info("Partitions not in metastore: {}", 
result.getPartitionsNotInMs());
-  LOG.info("Partitions missing from filesystem: {}", 
result.getPartitionsNotOnFs());
-  LOG.info("Expired partitions: {}", result.getExpiredPartitions());
-  if (acquireLock && txnId > 0) {
-  if (success) {
-try {
-  LOG.info("txnId: {} succeeded. Committing..", txnId);
-  getMsc().commitTxn(txnId);
-} catch (Exception e) {
-  LOG.warn("Error while committing txnId: {} for table: {}", 
txnId, qualifiedTableName, e);
-  ret = 1;
-}
-  } else {
-try {
-  LOG.info("txnId: {} failed. Aborting..", txnId);
-  getMsc().abortTxns(Lists.newArrayList(txnId));
-} catch (Exception e) {
-  LOG.warn("Error while aborting txnId: {} for table: {}", txnId, 
qualifiedTableName, e);
-  ret = 1;
-}
-  }
+  if (txnId > 0) {
+success = closeTxn(qualifiedTableName, success, txnId) && success;

Review comment:
   fixed
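The new branch in the quoted hunk opens a transaction lazily: only when the filesystem scan found a positive max writeId to validate, and no transaction was already opened for locking. A minimal sketch of that control flow, assuming a hypothetical `MetaClient` stand-in for the real metastore client:

```java
// Hypothetical sketch of the lazy transaction-open branch quoted above.
// Only opens a transaction when the scan found ACID writeIds that must
// be validated against the metastore and none was opened for locking.
public class RepairTxnSketch {

  /** Minimal stand-in for the metastore client's openTxn call. */
  interface MetaClient {
    long openTxn(String user);
  }

  static long ensureTxnForValidation(MetaClient msc, long txnId, long maxWriteId, String user) {
    if (maxWriteId > 0 && txnId < 0) {
      // We need the txnId to check against even if we didn't do the locking
      txnId = msc.openTxn(user);
    }
    return txnId;
  }

  public static void main(String[] args) {
    MetaClient client = user -> 42L;
    System.out.println(ensureTxnForValidation(client, -1L, 5L, "hive"));
  }
}
```

If a transaction was already opened for locking (`txnId > 0`), it is reused unchanged.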





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org
For additional commands, e-mail: gitbox-h...@hive.apache.org



[GitHub] [hive] pvargacl commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables

2020-07-06 Thread GitBox


pvargacl commented on a change in pull request #1087:
URL: https://github.com/apache/hive/pull/1087#discussion_r450315397



##
File path: ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java
##
@@ -2209,20 +2209,7 @@ public void testAcidOrcWritePreservesFieldNames() throws 
Exception {
* sorts rows in dictionary order
*/
   static List<String> stringifyValues(int[][] rowsIn) {
-assert rowsIn.length > 0;
-int[][] rows = rowsIn.clone();
-Arrays.sort(rows, new RowComp());
-List<String> rs = new ArrayList<String>();
-for(int[] row : rows) {
-  assert row.length > 0;
-  StringBuilder sb = new StringBuilder();
-  for(int value : row) {
-sb.append(value).append("\t");
-  }
-  sb.setLength(sb.length() - 1);
-  rs.add(sb.toString());
-}
-return rs;
+return TxnCommandsBaseForTests.stringifyValues(rowsIn);

Review comment:
   I will do this in a separate Jira; I started it, but it requires more changes.
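The helper being moved builds sorted, tab-separated strings from an int matrix, mimicking `Driver.getResults()` output. A self-contained sketch of the same logic; the class name and the inline comparator are assumptions (the real code uses a `RowComp` comparator class):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Standalone sketch of the stringifyValues helper quoted above: sorts
// rows in dictionary (lexicographic) order, then joins each row's
// values with tabs, as Driver.getResults() would render them.
public class StringifyValuesSketch {

  static List<String> stringifyValues(int[][] rowsIn) {
    assert rowsIn.length > 0;
    int[][] rows = rowsIn.clone();
    // Element-wise lexicographic order over the rows (stand-in for RowComp)
    Arrays.sort(rows, (a, b) -> {
      for (int i = 0; i < Math.min(a.length, b.length); i++) {
        if (a[i] != b[i]) {
          return Integer.compare(a[i], b[i]);
        }
      }
      return Integer.compare(a.length, b.length);
    });
    List<String> rs = new ArrayList<>();
    for (int[] row : rows) {
      assert row.length > 0;
      StringBuilder sb = new StringBuilder();
      for (int value : row) {
        sb.append(value).append('\t');
      }
      sb.setLength(sb.length() - 1); // drop the trailing tab
      rs.add(sb.toString());
    }
    return rs;
  }

  public static void main(String[] args) {
    System.out.println(stringifyValues(new int[][]{{2, 2}, {1, 3}}));
  }
}
```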








[GitHub] [hive] pvargacl commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables

2020-07-06 Thread GitBox


pvargacl commented on a change in pull request #1087:
URL: https://github.com/apache/hive/pull/1087#discussion_r450334396



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreChecker.java
##
@@ -111,24 +120,24 @@ public IMetaStoreClient getMsc() {
* @param partitions
*  List of partition name value pairs, if null or empty check all
*  partitions
-   * @param table
-   * @param result
-   *  Fill this with the results of the check
+   * @param table Table we want to run the check for.
+   * @return Results of the check
* @throws MetastoreException
*   Failed to get required information from the metastore.
* @throws IOException
*   Most likely filesystem related
*/
-  public void checkMetastore(String catName, String dbName, String tableName,
-  List> partitions, Table table, CheckResult 
result)
+  public CheckResult checkMetastore(String catName, String dbName, String 
tableName,
+  List> partitions, Table table)
   throws MetastoreException, IOException {
-
+CheckResult result = new CheckResult();
 if (dbName == null || "".equalsIgnoreCase(dbName)) {
   dbName = Warehouse.DEFAULT_DATABASE_NAME;
 }
 
 try {
   if (tableName == null || "".equals(tableName)) {
+// TODO: I do not think this is used by anything other than tests

Review comment:
   I do not know. If I understand correctly, there is currently no way to call 
MSCK repair without a table specified, but it seems like someone made an effort 
to create that feature and tests for it. I don't know if we would ever want 
that in production (calling MSCK repair for every table seems like a quick way 
to overwhelm the system). I left this comment here for anybody who tries to 
make sense of this code.








[GitHub] [hive] pvargacl commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables

2020-07-06 Thread GitBox


pvargacl commented on a change in pull request #1087:
URL: https://github.com/apache/hive/pull/1087#discussion_r450360016



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreChecker.java
##
@@ -429,6 +451,75 @@ void findUnknownPartitions(Table table, Set 
partPaths,
 LOG.debug("Number of partitions not in metastore : " + 
result.getPartitionsNotInMs().size());
   }
 
+  /**
+   * Calculate the maximum seen writeId from the acid directory structure
+   * @param partPath Path of the partition directory
+   * @param res Partition result to write the max ids
+   * @throws IOException ex
+   */
+  private void setMaxTxnAndWriteIdFromPartition(Path partPath, 
CheckResult.PartitionResult res) throws IOException {
+FileSystem fs = partPath.getFileSystem(conf);
+FileStatus[] deltaOrBaseFiles = fs.listStatus(partPath, 
HIDDEN_FILES_PATH_FILTER);
+
+// Read the writeIds from every base and delta directory and find the max
+long maxWriteId = 0L;
+long maxVisibilityId = 0L;
+for(FileStatus fileStatus : deltaOrBaseFiles) {
+  if (!fileStatus.isDirectory()) {
+continue;
+  }
+  long writeId = 0L;
+  long visibilityId = 0L;
+  String folder = fileStatus.getPath().getName();
+  if (folder.startsWith(BASE_PREFIX)) {
+visibilityId = getVisibilityTxnId(folder);
+if (visibilityId > 0) {
+  folder = removeVisibilityTxnId(folder);
+}
+writeId = Long.parseLong(folder.substring(BASE_PREFIX.length()));
+  } else if (folder.startsWith(DELTA_PREFIX) || 
folder.startsWith(DELETE_DELTA_PREFIX)) {
+// See AcidUtils.parseDelta
+visibilityId = getVisibilityTxnId(folder);
+if (visibilityId > 0) {
+  folder = removeVisibilityTxnId(folder);
+}
+boolean isDeleteDelta = folder.startsWith(DELETE_DELTA_PREFIX);
+String rest = folder.substring((isDeleteDelta ? DELETE_DELTA_PREFIX : 
DELTA_PREFIX).length());
+int split = rest.indexOf('_');
+//split2 may be -1 if no statementId
+int split2 = rest.indexOf('_', split + 1);
+// We always want the second part (it is either the same or greater if 
it is a compacted delta)
+writeId = split2 == -1 ? Long.parseLong(rest.substring(split + 1)) : 
Long
+.parseLong(rest.substring(split + 1, split2));
+  }
+  if (writeId > maxWriteId) {
+maxWriteId = writeId;
+  }
+  if (visibilityId > maxVisibilityId) {
+maxVisibilityId = visibilityId;
+  }
+}
+LOG.debug("Max writeId {}, max txnId {} found in partition {}", 
maxWriteId, maxVisibilityId,
+partPath.toUri().toString());
+res.setMaxWriteId(maxWriteId);
+res.setMaxTxnId(maxVisibilityId);
+  }
+  private long getVisibilityTxnId(String folder) {
+int idxOfVis = folder.indexOf(VISIBILITY_PREFIX);

Review comment:
   fixed
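The writeId extraction in the hunk above can be sketched in isolation; `parseWriteId` below is a hypothetical simplification of `setMaxTxnAndWriteIdFromPartition` that returns just the writeId parsed from a single base/delta folder name:

```java
// Hypothetical standalone sketch of the directory-name parsing quoted
// above: strip an optional _v<txnId> visibility suffix, then read the
// writeId from a base_/delta_/delete_delta_ folder name. Returns 0 for
// folders that are not ACID base/delta directories.
public class AcidDirNameSketch {
  private static final String BASE_PREFIX = "base_";
  private static final String DELTA_PREFIX = "delta_";
  private static final String DELETE_DELTA_PREFIX = "delete_delta_";
  private static final String VISIBILITY_PREFIX = "_v";

  static long parseWriteId(String folder) {
    int idxOfVis = folder.indexOf(VISIBILITY_PREFIX);
    if (idxOfVis >= 0) {
      folder = folder.substring(0, idxOfVis); // drop the _v<txnId> suffix
    }
    if (folder.startsWith(BASE_PREFIX)) {
      return Long.parseLong(folder.substring(BASE_PREFIX.length()));
    }
    boolean isDeleteDelta = folder.startsWith(DELETE_DELTA_PREFIX);
    if (isDeleteDelta || folder.startsWith(DELTA_PREFIX)) {
      String rest = folder.substring((isDeleteDelta ? DELETE_DELTA_PREFIX : DELTA_PREFIX).length());
      int split = rest.indexOf('_');
      // split2 is -1 when there is no statementId suffix
      int split2 = rest.indexOf('_', split + 1);
      // Always take the second (max) writeId; for a compacted delta it is the greater one
      return split2 == -1 ? Long.parseLong(rest.substring(split + 1))
                          : Long.parseLong(rest.substring(split + 1, split2));
    }
    return 0L;
  }
}
```

For example, `delta_0000010_0000015_v0000067` yields writeId 15 after the visibility suffix is stripped.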








[GitHub] [hive] pvargacl commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables

2020-07-06 Thread GitBox


pvargacl commented on a change in pull request #1087:
URL: https://github.com/apache/hive/pull/1087#discussion_r450362382



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java
##
@@ -229,102 +239,168 @@ public int repair(MsckInfo msckInfo) {
 throw new MetastoreException(e);
   }
 }
+if (transactionalTable && !MetaStoreServerUtils.isPartitioned(table)) {
+  if (result.getMaxWriteId() > 0) {
+if (txnId < 0) {
+  // We need the txnId to check against even if we didn't do the 
locking
+  txnId = getMsc().openTxn(getUserName());
+}
+
+validateAndAddMaxTxnIdAndWriteId(result.getMaxWriteId(), 
result.getMaxTxnId(),
+table.getDbName(), table.getTableName(), txnId);
+  }
+}
   }
   success = true;
 } catch (Exception e) {
   LOG.warn("Failed to run metacheck: ", e);
   success = false;
-  ret = 1;
 } finally {
-  if (msckInfo.getResFile() != null) {
-BufferedWriter resultOut = null;
-try {
-  Path resFile = new Path(msckInfo.getResFile());
-  FileSystem fs = resFile.getFileSystem(getConf());
-  resultOut = new BufferedWriter(new 
OutputStreamWriter(fs.create(resFile)));
-
-  boolean firstWritten = false;
-  firstWritten |= writeMsckResult(result.getTablesNotInMs(),
-"Tables not in metastore:", resultOut, firstWritten);
-  firstWritten |= writeMsckResult(result.getTablesNotOnFs(),
-"Tables missing on filesystem:", resultOut, firstWritten);
-  firstWritten |= writeMsckResult(result.getPartitionsNotInMs(),
-"Partitions not in metastore:", resultOut, firstWritten);
-  firstWritten |= writeMsckResult(result.getPartitionsNotOnFs(),
-"Partitions missing from filesystem:", resultOut, firstWritten);
-  firstWritten |= writeMsckResult(result.getExpiredPartitions(),
-"Expired partitions (retention period: " + partitionExpirySeconds 
+ "s) :", resultOut, firstWritten);
-  // sorting to stabilize qfile output (msck_repair_drop.q)
-  Collections.sort(repairOutput);
-  for (String rout : repairOutput) {
-if (firstWritten) {
-  resultOut.write(terminator);
-} else {
-  firstWritten = true;
-}
-resultOut.write(rout);
-  }
-} catch (IOException e) {
-  LOG.warn("Failed to save metacheck output: ", e);
-  ret = 1;
-} finally {
-  if (resultOut != null) {
-try {
-  resultOut.close();
-} catch (IOException e) {
-  LOG.warn("Failed to close output file: ", e);
-  ret = 1;
-}
-  }
+  if (result!=null) {
+logResult(result);
+if (msckInfo.getResFile() != null) {
+  success = writeResultToFile(msckInfo, result, repairOutput, 
partitionExpirySeconds) && success;
 }
   }
 
-  LOG.info("Tables not in metastore: {}", result.getTablesNotInMs());
-  LOG.info("Tables missing on filesystem: {}", result.getTablesNotOnFs());
-  LOG.info("Partitions not in metastore: {}", 
result.getPartitionsNotInMs());
-  LOG.info("Partitions missing from filesystem: {}", 
result.getPartitionsNotOnFs());
-  LOG.info("Expired partitions: {}", result.getExpiredPartitions());
-  if (acquireLock && txnId > 0) {
-  if (success) {
-try {
-  LOG.info("txnId: {} succeeded. Committing..", txnId);
-  getMsc().commitTxn(txnId);
-} catch (Exception e) {
-  LOG.warn("Error while committing txnId: {} for table: {}", 
txnId, qualifiedTableName, e);
-  ret = 1;
-}
-  } else {
-try {
-  LOG.info("txnId: {} failed. Aborting..", txnId);
-  getMsc().abortTxns(Lists.newArrayList(txnId));
-} catch (Exception e) {
-  LOG.warn("Error while aborting txnId: {} for table: {}", txnId, 
qualifiedTableName, e);
-  ret = 1;
-}
-  }
+  if (txnId > 0) {
+success = closeTxn(qualifiedTableName, success, txnId) && success;
   }
   if (getMsc() != null) {
 getMsc().close();
 msc = null;
   }
 }
+return success ? 0 : 1;
+  }
 
+  private boolean closeTxn(String qualifiedTableName, boolean success, long 
txnId) {
+boolean ret = true;
+if (success) {
+  try {
+LOG.info("txnId: {} succeeded. Committing..", txnId);
+getMsc().commitTxn(txnId);
+  } catch (Exception e) {
+LOG.warn("Error while committing txnId: {} for table: {}", txnId, 
qualifiedTableName, e);

Review comment:
   fixed
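The extracted `closeTxn` follows a commit-on-success, abort-on-failure pattern. A generic sketch of that pattern; the nested `TxnClient` interface is a hypothetical stand-in for the metastore client, and the logging from the quoted code is elided:

```java
// Hypothetical sketch of the commit-or-abort pattern in closeTxn above:
// commit the transaction when the repair succeeded, abort it otherwise,
// and report whether closing itself worked so the caller can fold the
// result into the exit code.
public class CloseTxnSketch {

  /** Minimal stand-in for the metastore client calls used by closeTxn. */
  interface TxnClient {
    void commitTxn(long txnId) throws Exception;
    void abortTxn(long txnId) throws Exception;
  }

  static boolean closeTxn(TxnClient msc, boolean success, long txnId) {
    try {
      if (success) {
        msc.commitTxn(txnId);  // repair succeeded: make the changes durable
      } else {
        msc.abortTxn(txnId);   // repair failed: roll the transaction back
      }
      return true;
    } catch (Exception e) {
      // Mirrors the LOG.warn + error return of the quoted code
      return false;
    }
  }
}
```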


[GitHub] [hive] dengzhhu653 commented on a change in pull request #1201: HIVE-23797: Throw exception when no metastore found in zookeeper

2020-07-06 Thread GitBox


dengzhhu653 commented on a change in pull request #1201:
URL: https://github.com/apache/hive/pull/1201#discussion_r450288096



##
File path: 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
##
@@ -327,6 +327,13 @@ private void resolveUris() throws MetaException {
   MetaStoreUtils.logAndThrowMetaException(e);
 }
 
+if (metastoreUrisString.isEmpty() && 
"zookeeper".equalsIgnoreCase(serviceDiscoveryMode)) {
+  throw new MetaException("No metastore server available. "

Review comment:
   done, @belugabehr, thank you very much for the review.
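The guard discussed in this thread can be sketched in isolation; the class name and exception type below are assumptions (the actual patch throws `MetaException`):

```java
// Sketch of the fail-fast guard added in HIVE-23797: refuse to continue
// when service discovery is ZooKeeper-based but URI resolution produced
// no metastore servers.
public class DiscoveryGuardSketch {

  static void checkResolvedUris(String metastoreUrisString, String serviceDiscoveryMode) {
    boolean empty = metastoreUrisString == null || metastoreUrisString.isEmpty();
    if (empty && "zookeeper".equalsIgnoreCase(serviceDiscoveryMode)) {
      // Stand-in for MetaException in the real patch
      throw new IllegalStateException(
          "No metastore server available under the configured ZooKeeper namespace");
    }
  }
}
```

Failing here, instead of later on the first RPC, gives the client a clear error at connect time.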








[GitHub] [hive] pvargacl commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables

2020-07-06 Thread GitBox


pvargacl commented on a change in pull request #1087:
URL: https://github.com/apache/hive/pull/1087#discussion_r450301348



##
File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
##
@@ -2392,33 +2392,29 @@ public static TableSnapshot 
getTableSnapshot(Configuration conf,
 long writeId = -1;
 ValidWriteIdList validWriteIdList = null;
 
-HiveTxnManager sessionTxnMgr = SessionState.get().getTxnMgr();
-String fullTableName = getFullTableName(dbName, tblName);
-if (sessionTxnMgr != null && sessionTxnMgr.getCurrentTxnId() > 0) {
-  validWriteIdList = getTableValidWriteIdList(conf, fullTableName);
-  if (isStatsUpdater) {
-writeId = SessionState.get().getTxnMgr() != null ?
-SessionState.get().getTxnMgr().getAllocatedTableWriteId(
-  dbName, tblName) : -1;
-if (writeId < 1) {
-  // TODO: this is not ideal... stats updater that doesn't have write 
ID is currently
-  //   "create table"; writeId would be 0/-1 here. No need to call 
this w/true.
-  LOG.debug("Stats updater for {}.{} doesn't have a write ID ({})",
-  dbName, tblName, writeId);
+if (SessionState.get() != null) {

Review comment:
   It can be null if the SessionState was not set properly; it was failing one 
of the tests, I can't remember which.








[GitHub] [hive] pvargacl commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables

2020-07-06 Thread GitBox


pvargacl commented on a change in pull request #1087:
URL: https://github.com/apache/hive/pull/1087#discussion_r450360969



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java
##
@@ -229,102 +239,168 @@ public int repair(MsckInfo msckInfo) {
 throw new MetastoreException(e);
   }
 }
+if (transactionalTable && !MetaStoreServerUtils.isPartitioned(table)) {
+  if (result.getMaxWriteId() > 0) {

Review comment:
   fixed








[GitHub] [hive] pvargacl commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables

2020-07-06 Thread GitBox


pvargacl commented on a change in pull request #1087:
URL: https://github.com/apache/hive/pull/1087#discussion_r450318861



##
File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
##
@@ -2392,33 +2392,29 @@ public static TableSnapshot 
getTableSnapshot(Configuration conf,
 long writeId = -1;
 ValidWriteIdList validWriteIdList = null;
 
-HiveTxnManager sessionTxnMgr = SessionState.get().getTxnMgr();
-String fullTableName = getFullTableName(dbName, tblName);
-if (sessionTxnMgr != null && sessionTxnMgr.getCurrentTxnId() > 0) {
-  validWriteIdList = getTableValidWriteIdList(conf, fullTableName);
-  if (isStatsUpdater) {
-writeId = SessionState.get().getTxnMgr() != null ?
-SessionState.get().getTxnMgr().getAllocatedTableWriteId(
-  dbName, tblName) : -1;
-if (writeId < 1) {
-  // TODO: this is not ideal... stats updater that doesn't have write 
ID is currently
-  //   "create table"; writeId would be 0/-1 here. No need to call 
this w/true.
-  LOG.debug("Stats updater for {}.{} doesn't have a write ID ({})",
-  dbName, tblName, writeId);
+if (SessionState.get() != null) {
+  HiveTxnManager sessionTxnMgr = SessionState.get().getTxnMgr();
+  String fullTableName = getFullTableName(dbName, tblName);
+  if (sessionTxnMgr != null && sessionTxnMgr.getCurrentTxnId() > 0) {
+validWriteIdList = getTableValidWriteIdList(conf, fullTableName);
+if (isStatsUpdater) {
+  writeId = sessionTxnMgr != null ? 
sessionTxnMgr.getAllocatedTableWriteId(dbName, tblName) : -1;

Review comment:
   fixed








[GitHub] [hive] pvargacl commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables

2020-07-06 Thread GitBox


pvargacl commented on a change in pull request #1087:
URL: https://github.com/apache/hive/pull/1087#discussion_r450318970



##
File path: ql/src/test/org/apache/hadoop/hive/ql/TxnCommandsBaseForTests.java
##
@@ -162,9 +163,23 @@ protected String getWarehouseDir() {
* takes raw data and turns it into a string as if from Driver.getResults()
* sorts rows in dictionary order
*/
-  List<String> stringifyValues(int[][] rowsIn) {
-return TestTxnCommands2.stringifyValues(rowsIn);
+  public static List<String> stringifyValues(int[][] rowsIn) {
+assert rowsIn.length > 0;
+int[][] rows = rowsIn.clone();
+Arrays.sort(rows, new TestTxnCommands2.RowComp());

Review comment:
   fixed

##
File path: 
ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveMetaStoreChecker.java
##
@@ -252,37 +241,165 @@ public void testInvalidPartitionKeyName()
   @Test
   public void testSkipInvalidPartitionKeyName()
 throws HiveException, AlreadyExistsException, IOException, 
MetastoreException {
-hive.getConf().set(HiveConf.ConfVars.HIVE_MSCK_PATH_VALIDATION.varname, 
"skip");
+
hive.getConf().set(MetastoreConf.ConfVars.MSCK_PATH_VALIDATION.getVarname(), 
"skip");
 checker = new HiveMetaStoreChecker(msc, hive.getConf());
-Table table = createTestTable();
+Table table = createTestTable(false);
 List partitions = hive.getPartitions(table);
 assertEquals(2, partitions.size());
 // add a fake partition dir on fs
 fs = partitions.get(0).getDataLocation().getFileSystem(hive.getConf());
-Path fakePart =
-new Path(table.getDataLocation().toString(), 
"fakedate=2009-01-01/fakecity=sanjose");
-fs.mkdirs(fakePart);
-fs.deleteOnExit(fakePart);
+addFolderToPath(fs, 
table.getDataLocation().toString(),"fakedate=2009-01-01/fakecity=sanjose");
 createPartitionsDirectoriesOnFS(table, 2);
-CheckResult result = new CheckResult();
-checker.checkMetastore(catName, dbName, tableName, null, null, result);
+CheckResult result = checker.checkMetastore(catName, dbName, tableName, 
null, null);
 assertEquals(Collections. emptySet(), result.getTablesNotInMs());
 assertEquals(Collections. emptySet(), result.getTablesNotOnFs());
 assertEquals(Collections. emptySet(), 
result.getPartitionsNotOnFs());
 // only 2 valid partitions should be added
 assertEquals(2, result.getPartitionsNotInMs().size());
   }
 
-  private Table createTestTable() throws HiveException, AlreadyExistsException 
{
+  /*
+   * Tests the case when we have normal delta_dirs in the partition folder
+   * does not throw HiveException
+   */
+  @Test
+  public void testAddPartitionNormalDeltas() throws Exception {
+Table table = createTestTable(true);
+List partitions = hive.getPartitions(table);
+assertEquals(2, partitions.size());
+// add a partition dir on fs
+fs = partitions.get(0).getDataLocation().getFileSystem(hive.getConf());
+Path newPart = addFolderToPath(fs, table.getDataLocation().toString(),
+partDateName + "=2017-01-01/" + partCityName + "=paloalto");
+
+// Add a few deltas
+addFolderToPath(fs, newPart.toString(), "delta_001_001_");
+addFolderToPath(fs, newPart.toString(), "delta_010_010_");
+addFolderToPath(fs, newPart.toString(), "delta_101_101_");
+CheckResult result = checker.checkMetastore(catName, dbName, tableName, 
null, null);
+assertEquals(Collections. emptySet(), 
result.getPartitionsNotOnFs());
+assertEquals(1, result.getPartitionsNotInMs().size());
+// Found the highest writeId
+assertEquals(101, 
result.getPartitionsNotInMs().iterator().next().getMaxWriteId());
+assertEquals(0, 
result.getPartitionsNotInMs().iterator().next().getMaxTxnId());
+  }
+  /*
+   * Tests the case when we have normal delta_dirs in the partition folder
+   * does not throw HiveException
+   */
+  @Test
+  public void testAddPartitionCompactedDeltas() throws Exception {
+Table table = createTestTable(true);
+List partitions = hive.getPartitions(table);
+assertEquals(2, partitions.size());
+// add a partition dir on fs
+fs = partitions.get(0).getDataLocation().getFileSystem(hive.getConf());
+Path newPart = addFolderToPath(fs, table.getDataLocation().toString(),
+partDateName + "=2017-01-01/" + partCityName + "=paloalto");
+
+// Add a few deltas
+addFolderToPath(fs, newPart.toString(), "delta_001_001_");
+addFolderToPath(fs, newPart.toString(), "delta_010_015_v067");
+addFolderToPath(fs, newPart.toString(), "delta_101_120_v087");
+CheckResult result = checker.checkMetastore(catName, dbName, tableName, 
null, null);
+assertEquals(Collections. emptySet(), 
result.getPartitionsNotOnFs());
+assertEquals(1, result.getPartitionsNotInMs().size());
+// Found the highest writeId
+assertEquals(120, 

[GitHub] [hive] pvargacl commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables

2020-07-06 Thread GitBox


pvargacl commented on a change in pull request #1087:
URL: https://github.com/apache/hive/pull/1087#discussion_r450338348



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreChecker.java
##
@@ -429,6 +451,75 @@ void findUnknownPartitions(Table table, Set 
partPaths,
 LOG.debug("Number of partitions not in metastore : " + 
result.getPartitionsNotInMs().size());
   }
 
+  /**
+   * Calculate the maximum seen writeId from the acid directory structure
+   * @param partPath Path of the partition directory
+   * @param res Partition result to write the max ids
+   * @throws IOException ex
+   */
+  private void setMaxTxnAndWriteIdFromPartition(Path partPath, 
CheckResult.PartitionResult res) throws IOException {
+FileSystem fs = partPath.getFileSystem(conf);
+FileStatus[] deltaOrBaseFiles = fs.listStatus(partPath, 
HIDDEN_FILES_PATH_FILTER);
+
+// Read the writeIds from every base and delta directory and find the max
+long maxWriteId = 0L;
+long maxVisibilityId = 0L;
+for(FileStatus fileStatus : deltaOrBaseFiles) {
+  if (!fileStatus.isDirectory()) {
+continue;
+  }
+  long writeId = 0L;
+  long visibilityId = 0L;
+  String folder = fileStatus.getPath().getName();
+  if (folder.startsWith(BASE_PREFIX)) {
+visibilityId = getVisibilityTxnId(folder);
+if (visibilityId > 0) {
+  folder = removeVisibilityTxnId(folder);
+}
+writeId = Long.parseLong(folder.substring(BASE_PREFIX.length()));
+  } else if (folder.startsWith(DELTA_PREFIX) || 
folder.startsWith(DELETE_DELTA_PREFIX)) {
+// See AcidUtils.parseDelta
+visibilityId = getVisibilityTxnId(folder);
+if (visibilityId > 0) {
+  folder = removeVisibilityTxnId(folder);
+}
+boolean isDeleteDelta = folder.startsWith(DELETE_DELTA_PREFIX);
+String rest = folder.substring((isDeleteDelta ? DELETE_DELTA_PREFIX : 
DELTA_PREFIX).length());
+int split = rest.indexOf('_');

Review comment:
   fixed








[GitHub] [hive] pvargacl commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables

2020-07-06 Thread GitBox


pvargacl commented on a change in pull request #1087:
URL: https://github.com/apache/hive/pull/1087#discussion_r450361266



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java
##
@@ -229,102 +239,168 @@ public int repair(MsckInfo msckInfo) {
 throw new MetastoreException(e);
   }
 }
+if (transactionalTable && !MetaStoreServerUtils.isPartitioned(table)) {
+  if (result.getMaxWriteId() > 0) {
+if (txnId < 0) {
+  // We need the txnId to check against even if we didn't do the 
locking
+  txnId = getMsc().openTxn(getUserName());
+}
+
+validateAndAddMaxTxnIdAndWriteId(result.getMaxWriteId(), 
result.getMaxTxnId(),
+table.getDbName(), table.getTableName(), txnId);
+  }
+}
   }
   success = true;
 } catch (Exception e) {
   LOG.warn("Failed to run metacheck: ", e);
   success = false;
-  ret = 1;
 } finally {
-  if (msckInfo.getResFile() != null) {
-BufferedWriter resultOut = null;
-try {
-  Path resFile = new Path(msckInfo.getResFile());
-  FileSystem fs = resFile.getFileSystem(getConf());
-  resultOut = new BufferedWriter(new 
OutputStreamWriter(fs.create(resFile)));
-
-  boolean firstWritten = false;
-  firstWritten |= writeMsckResult(result.getTablesNotInMs(),
-"Tables not in metastore:", resultOut, firstWritten);
-  firstWritten |= writeMsckResult(result.getTablesNotOnFs(),
-"Tables missing on filesystem:", resultOut, firstWritten);
-  firstWritten |= writeMsckResult(result.getPartitionsNotInMs(),
-"Partitions not in metastore:", resultOut, firstWritten);
-  firstWritten |= writeMsckResult(result.getPartitionsNotOnFs(),
-"Partitions missing from filesystem:", resultOut, firstWritten);
-  firstWritten |= writeMsckResult(result.getExpiredPartitions(),
-"Expired partitions (retention period: " + partitionExpirySeconds 
+ "s) :", resultOut, firstWritten);
-  // sorting to stabilize qfile output (msck_repair_drop.q)
-  Collections.sort(repairOutput);
-  for (String rout : repairOutput) {
-if (firstWritten) {
-  resultOut.write(terminator);
-} else {
-  firstWritten = true;
-}
-resultOut.write(rout);
-  }
-} catch (IOException e) {
-  LOG.warn("Failed to save metacheck output: ", e);
-  ret = 1;
-} finally {
-  if (resultOut != null) {
-try {
-  resultOut.close();
-} catch (IOException e) {
-  LOG.warn("Failed to close output file: ", e);
-  ret = 1;
-}
-  }
+  if (result!=null) {

Review comment:
   fixed








[GitHub] [hive] pvargacl commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables

2020-07-06 Thread GitBox


pvargacl commented on a change in pull request #1087:
URL: https://github.com/apache/hive/pull/1087#discussion_r450361792



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java
##
@@ -229,102 +239,168 @@ public int repair(MsckInfo msckInfo) {
 throw new MetastoreException(e);
   }
 }
+if (transactionalTable && !MetaStoreServerUtils.isPartitioned(table)) {
+  if (result.getMaxWriteId() > 0) {
+if (txnId < 0) {
+  // We need the txnId to check against even if we didn't do the 
locking
+  txnId = getMsc().openTxn(getUserName());
+}
+
+validateAndAddMaxTxnIdAndWriteId(result.getMaxWriteId(), 
result.getMaxTxnId(),
+table.getDbName(), table.getTableName(), txnId);
+  }
+}
   }
   success = true;
 } catch (Exception e) {
   LOG.warn("Failed to run metacheck: ", e);
   success = false;
-  ret = 1;
 } finally {
-  if (msckInfo.getResFile() != null) {
-BufferedWriter resultOut = null;
-try {
-  Path resFile = new Path(msckInfo.getResFile());
-  FileSystem fs = resFile.getFileSystem(getConf());
-  resultOut = new BufferedWriter(new 
OutputStreamWriter(fs.create(resFile)));
-
-  boolean firstWritten = false;
-  firstWritten |= writeMsckResult(result.getTablesNotInMs(),
-"Tables not in metastore:", resultOut, firstWritten);
-  firstWritten |= writeMsckResult(result.getTablesNotOnFs(),
-"Tables missing on filesystem:", resultOut, firstWritten);
-  firstWritten |= writeMsckResult(result.getPartitionsNotInMs(),
-"Partitions not in metastore:", resultOut, firstWritten);
-  firstWritten |= writeMsckResult(result.getPartitionsNotOnFs(),
-"Partitions missing from filesystem:", resultOut, firstWritten);
-  firstWritten |= writeMsckResult(result.getExpiredPartitions(),
-"Expired partitions (retention period: " + partitionExpirySeconds 
+ "s) :", resultOut, firstWritten);
-  // sorting to stabilize qfile output (msck_repair_drop.q)
-  Collections.sort(repairOutput);
-  for (String rout : repairOutput) {
-if (firstWritten) {
-  resultOut.write(terminator);
-} else {
-  firstWritten = true;
-}
-resultOut.write(rout);
-  }
-} catch (IOException e) {
-  LOG.warn("Failed to save metacheck output: ", e);
-  ret = 1;
-} finally {
-  if (resultOut != null) {
-try {
-  resultOut.close();
-} catch (IOException e) {
-  LOG.warn("Failed to close output file: ", e);
-  ret = 1;
-}
-  }
+  if (result!=null) {
+logResult(result);
+if (msckInfo.getResFile() != null) {
+  success = writeResultToFile(msckInfo, result, repairOutput, 
partitionExpirySeconds) && success;

Review comment:
   fixed








[GitHub] [hive] StefanXiepj opened a new pull request #1208: Hive 22412: StatsUtils throw NPE when explain

2020-07-06 Thread GitBox


StefanXiepj opened a new pull request #1208:
URL: https://github.com/apache/hive/pull/1208


   ## NOTICE
   
   Please create an issue in ASF JIRA before opening a pull request,
   and you need to set the title of the pull request which starts with
   the corresponding JIRA issue number. (e.g. HIVE-X: Fix a typo in YYY)
   For more details, please see 
https://cwiki.apache.org/confluence/display/Hive/HowToContribute
   






[GitHub] [hive] pvary merged pull request #1196: HIVE-23791: Optimize ACID stats generation

2020-07-06 Thread GitBox


pvary merged pull request #1196:
URL: https://github.com/apache/hive/pull/1196


   






[GitHub] [hive] xiaomengzhang commented on pull request #1213: HIVE-23765: Use ORC file format by default when creating transactiona…

2020-07-06 Thread GitBox


xiaomengzhang commented on pull request #1213:
URL: https://github.com/apache/hive/pull/1213#issuecomment-654462812


   @nrg4878 






[GitHub] [hive] xiaomengzhang opened a new pull request #1213: HIVE-23765: Use ORC file format by default when creating transactiona…

2020-07-06 Thread GitBox


xiaomengzhang opened a new pull request #1213:
URL: https://github.com/apache/hive/pull/1213


   …l table
   
   When the table property transactional is 'true' and transactional_properties
   is not 'insert_only', make the file format ORC.
   
   ## NOTICE
   
   Please create an issue in ASF JIRA before opening a pull request,
   and you need to set the title of the pull request which starts with
   the corresponding JIRA issue number. (e.g. HIVE-X: Fix a typo in YYY)
   For more details, please see 
https://cwiki.apache.org/confluence/display/Hive/HowToContribute
   






[GitHub] [hive] vineetgarg02 opened a new pull request #1212: HIVE-23807 Wrong results with vectorization enabled

2020-07-06 Thread GitBox


vineetgarg02 opened a new pull request #1212:
URL: https://github.com/apache/hive/pull/1212


   ## NOTICE
   
   Please create an issue in ASF JIRA before opening a pull request,
   and you need to set the title of the pull request which starts with
   the corresponding JIRA issue number. (e.g. HIVE-X: Fix a typo in YYY)
   For more details, please see 
https://cwiki.apache.org/confluence/display/Hive/HowToContribute
   






[GitHub] [hive] StefanXiepj commented on pull request #1209: HIVE-22412: StatsUtils throw NPE when explain

2020-07-06 Thread GitBox


StefanXiepj commented on pull request #1209:
URL: https://github.com/apache/hive/pull/1209#issuecomment-654557578


   > Thanks for the contribution.
   > 
   > I need to look at it more, but if 'null' is not a valid option to pass into the constructors, then it should be prohibited: `Objects#requireNonNull`. Please check where the null value is being passed in and instead have the calling class pass in a `Collections#emptyList` if possible, or a new `ArrayList` otherwise.
   
   Thanks for your review. The details of this exception stack are here:
   For Map, you can find it at `org.apache.hadoop.hive.ql.stats.StatsUtils#getSizeOfMap`:
   ```
  public static long getSizeOfMap(StandardConstantMapObjectInspector scmoi) {
    // This map can be null
    Map<?, ?> map = scmoi.getWritableConstantValue();
    ObjectInspector koi = scmoi.getMapKeyObjectInspector();
    ObjectInspector voi = scmoi.getMapValueObjectInspector();
    long result = 0;
    // NPE will be thrown here
    for (Map.Entry<?, ?> entry : map.entrySet()) {
      result += getWritableSize(koi, entry.getKey());
      result += getWritableSize(voi, entry.getValue());
    }
    ...
  }
   ```
   Maybe we can fix it like this:
   ```
  public static long getSizeOfMap(StandardConstantMapObjectInspector scmoi) {
    Map<?, ?> map = scmoi.getWritableConstantValue();
    // return zero when the map is null, so no NPE is thrown below
    if (null == map) {
      return 0;
    }
    ObjectInspector koi = scmoi.getMapKeyObjectInspector();
    ObjectInspector voi = scmoi.getMapValueObjectInspector();
    long result = 0;
    for (Map.Entry<?, ?> entry : map.entrySet()) {
      result += getWritableSize(koi, entry.getKey());
      result += getWritableSize(voi, entry.getValue());
    }
    ...
  }
   ```
   We might fix this in StatsUtils, but I don't think that is a good idea; it is better to fix it by initializing the value in StandardConstantMapObjectInspector, StandardConstantListObjectInspector and StandardConstantStructObjectInspector.
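
   For reference, a minimal, self-contained sketch of the null guard being discussed. The class and method names here are hypothetical stand-ins, not the actual StatsUtils code, and the per-entry writable-size accounting is replaced by a simple counter:

```java
import java.util.Map;

public class MapSizeSketch {
    // Null-safe variant of the loop above: a null constant map contributes
    // size 0 instead of throwing a NullPointerException.
    public static long sizeOfEntries(Map<?, ?> map) {
        if (map == null) {
            return 0L; // the guard proposed above
        }
        long result = 0;
        for (Map.Entry<?, ?> entry : map.entrySet()) {
            // Stand-in for getWritableSize(koi/voi, ...): count one per entry.
            result += 1;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sizeOfEntries(null));           // prints 0
        System.out.println(sizeOfEntries(Map.of("a", 1))); // prints 1
    }
}
```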






[GitHub] [hive] adesh-rao commented on a change in pull request #1109: HIVE-22015: Add table constraints in CachedStore

2020-07-06 Thread GitBox


adesh-rao commented on a change in pull request #1109:
URL: https://github.com/apache/hive/pull/1109#discussion_r450598395



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
##
@@ -543,10 +557,30 @@ static void prewarm(RawStore rawStore) {
 tableColStats = rawStore.getTableColumnStatistics(catName, 
dbName, tblName, colNames, CacheUtils.HIVE_ENGINE);
 Deadline.stopTimer();
   }
+  Deadline.startTimer("getPrimaryKeys");
+  primaryKeys = rawStore.getPrimaryKeys(catName, dbName, tblName);
+  Deadline.stopTimer();
+  cacheObjects.setPrimaryKeys(primaryKeys);
+
+  Deadline.startTimer("getForeignKeys");
+  foreignKeys = rawStore.getForeignKeys(catName, null, null, 
dbName, tblName);

Review comment:
   Current usages of `getForeignKeys` in the code pass null only for parentDb/table; foreignDb/table are always populated. So let's skip it as you mentioned.








[GitHub] [hive] adesh-rao commented on a change in pull request #1109: HIVE-22015: Add table constraints in CachedStore

2020-07-06 Thread GitBox


adesh-rao commented on a change in pull request #1109:
URL: https://github.com/apache/hive/pull/1109#discussion_r450599566



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
##
@@ -2497,26 +2610,87 @@ long getPartsFound() {
 
   @Override public List<SQLPrimaryKey> getPrimaryKeys(String catName, String dbName, String tblName)
   throws MetaException {
-// TODO constraintCache
-return rawStore.getPrimaryKeys(catName, dbName, tblName);
+catName = normalizeIdentifier(catName);
+dbName = StringUtils.normalizeIdentifier(dbName);
+tblName = StringUtils.normalizeIdentifier(tblName);
+if (!shouldCacheTable(catName, dbName, tblName) || (canUseEvents && 
rawStore.isActiveTransaction())) {
+  return rawStore.getPrimaryKeys(catName, dbName, tblName);
+}
+
+Table tbl = sharedCache.getTableFromCache(catName, dbName, tblName);
+if (tbl == null) {
+  // The table containing the primary keys is not yet loaded in cache
+  return rawStore.getPrimaryKeys(catName, dbName, tblName);
+}
+List<SQLPrimaryKey> keys = sharedCache.listCachedPrimaryKeys(catName, dbName, tblName);
+
+return keys;
   }
 
   @Override public List<SQLForeignKey> getForeignKeys(String catName, String parentDbName, String parentTblName,
   String foreignDbName, String foreignTblName) throws MetaException {
-// TODO constraintCache
-return rawStore.getForeignKeys(catName, parentDbName, parentTblName, 
foreignDbName, foreignTblName);
+ // Get correct ForeignDBName and TableName
+if (foreignDbName == null || foreignTblName == null) {
+  return rawStore.getForeignKeys(catName, parentDbName, parentTblName, 
foreignDbName, foreignTblName);

Review comment:
   Created https://issues.apache.org/jira/browse/HIVE-23810 for followup.
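
   The diff above follows a read-through pattern: serve from the cache when the table is cached, otherwise fall back to the raw store. A minimal sketch of that pattern, with all names hypothetical (the real CachedStore also normalizes identifiers and checks for active transactions):

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

public class ReadThroughSketch {
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();

    // Serve cached constraints when present; otherwise fall back to the
    // backing store (modeled here as a Supplier).
    public List<String> primaryKeys(String tableKey, Supplier<List<String>> rawStore) {
        List<String> cached = cache.get(tableKey);
        if (cached == null) {
            // Table not yet loaded in cache: delegate to the raw store.
            return rawStore.get();
        }
        return cached;
    }

    public void load(String tableKey, List<String> keys) {
        cache.put(tableKey, keys);
    }

    public static void main(String[] args) {
        ReadThroughSketch store = new ReadThroughSketch();
        System.out.println(store.primaryKeys("db.t1", () -> List.of("raw_pk")));
        store.load("db.t1", List.of("cached_pk"));
        System.out.println(store.primaryKeys("db.t1", () -> List.of("raw_pk")));
    }
}
```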




