[GitHub] [hive] hmangla98 commented on a change in pull request #2240: HIVE-25086: Create Ranger Deny Policy for replication db in all cases.

2021-05-23 Thread GitBox


hmangla98 commented on a change in pull request #2240:
URL: https://github.com/apache/hive/pull/2240#discussion_r637698171



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ranger/RangerRestClientImpl.java
##
@@ -420,8 +458,8 @@ boolean checkConnectionPlain(String url, HiveConf hiveConf) 
{
   }
 
   @Override
-  public List addDenyPolicies(List rangerPolicies, 
String rangerServiceName,
-String sourceDb, String targetDb) 
throws SemanticException {
+  public RangerPolicy getDenyPolicyForReplicatedDb(String rangerServiceName,

Review comment:
   Yes, it is used while defining the Ranger policy name.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org
For additional commands, e-mail: gitbox-h...@hive.apache.org



[GitHub] [hive] hmangla98 commented on a change in pull request #2240: HIVE-25086: Create Ranger Deny Policy for replication db in all cases.

2021-05-23 Thread GitBox


hmangla98 commented on a change in pull request #2240:
URL: https://github.com/apache/hive/pull/2240#discussion_r637697872



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ranger/RangerRestClientImpl.java
##
@@ -444,7 +482,7 @@ boolean checkConnectionPlain(String url, HiveConf hiveConf) 
{
 List denyExceptionsPolicyItemAccesses 
= new ArrayList();
 
-resourceNameList.add(sourceDb);
+resourceNameList.add(targetDb);
 resourceNameList.add("dummy");

Review comment:
   It is required to prevent the new deny policy from overriding the existing 
policy. Ref. HIVE-24371.







[GitHub] [hive] maheshk114 commented on a change in pull request #2266: HIVE-24663 : Batch process in ColStatsProcessor for partitions

2021-05-23 Thread GitBox


maheshk114 commented on a change in pull request #2266:
URL: https://github.com/apache/hive/pull/2266#discussion_r637691746



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/info/desc/DescTableOperation.java
##
@@ -213,11 +214,14 @@ private void getColumnDataColPathSpecified(Table table, 
Partition part, List partitions = new ArrayList();
-  partitions.add(part.getName());
+  // The partition name is converted to lowercase before generating the 
stats. So we should use the same

Review comment:
   The lowercase conversion applies only to the column name; the value part is 
not converted.
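
   To make the distinction concrete, here is a minimal sketch of lowercasing 
only the column-name side of each key=value component in a partition name. The 
class and method names (PartitionNameSketch, lowerCaseKeys) are hypothetical 
and are not Hive APIs; the sketch also assumes components are separated by "/" 
and does not handle escaped characters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Hive.
final class PartitionNameSketch {

    // Lowercases the column name of each "key=value" component and
    // leaves the value part exactly as it was supplied.
    static String lowerCaseKeys(String partName) {
        List<String> parts = new ArrayList<>();
        for (String component : partName.split("/")) {
            int eq = component.indexOf('=');
            if (eq < 0) {
                // Not a key=value component; keep it untouched.
                parts.add(component);
                continue;
            }
            String key = component.substring(0, eq).toLowerCase(Locale.ROOT);
            // substring(eq) keeps "=" and the value with its original case.
            parts.add(key + component.substring(eq));
        }
        return String.join("/", parts);
    }

    public static void main(String[] args) {
        System.out.println(lowerCaseKeys("Year=2021/Country=US"));
    }
}
```

   Under these assumptions, a lookup key built from part.getName() would need 
the same key-only normalization to match the name used when the stats were 
generated.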







[GitHub] [hive] maheshk114 commented on a change in pull request #2266: HIVE-24663 : Batch process in ColStatsProcessor for partitions

2021-05-23 Thread GitBox


maheshk114 commented on a change in pull request #2266:
URL: https://github.com/apache/hive/pull/2266#discussion_r637691386



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##
@@ -5390,6 +5406,493 @@ public void countOpenTxns() throws MetaException {
 }
   }
 
+  private void cleanOldStatsFromPartColStatTable(Map 
statsPartInfoMap,
+ Map 
newStatsMap,
+ Connection dbConn) throws 
SQLException {
+PreparedStatement statementDelete = null;
+int numRows = 0;
+int maxNumRows = MetastoreConf.getIntVar(conf, 
ConfVars.DIRECT_SQL_MAX_ELEMENTS_VALUES_CLAUSE);
+String delete = "DELETE FROM \"PART_COL_STATS\" where \"PART_ID\" = ? AND 
\"COLUMN_NAME\" = ?";
+
+try {
+  statementDelete = dbConn.prepareStatement(delete);
+  for (Map.Entry entry : newStatsMap.entrySet()) {
+// If the partition does not exist (deleted/removed by some other 
task), no need to update the stats.
+if (!statsPartInfoMap.containsKey(entry.getKey())) {
+  continue;
+}
+
+ColumnStatistics colStats = (ColumnStatistics) entry.getValue();
+for (ColumnStatisticsObj statisticsObj : colStats.getStatsObj()) {
+  statementDelete.setLong(1, 
statsPartInfoMap.get(entry.getKey()).partitionId);
+  statementDelete.setString(2, statisticsObj.getColName());
+  numRows++;
+  statementDelete.addBatch();
+  if (numRows == maxNumRows) {
+statementDelete.executeBatch();
+numRows = 0;
+LOG.info("Executed delete " + delete + " for numRows " + numRows);
+  }
+}
+  }
+
+  if (numRows != 0) {
+statementDelete.executeBatch();
+  }
+} finally {
+  closeStmt(statementDelete);
+}
+  }
+
+  private long getMaxCSId(Connection dbConn) throws SQLException {
+Statement stmtInt = null;
+ResultSet rsInt = null;
+long maxCsId = 0;
+try {
+  stmtInt = dbConn.createStatement();
+  while (maxCsId == 0) {
+String query = "SELECT \"NEXT_VAL\" FROM \"SEQUENCE_TABLE\" WHERE 
\"SEQUENCE_NAME\"= "

Review comment:
   done
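
   The batching pattern in the hunk above (add to a batch, flush when the 
count reaches a maximum, flush the remainder at the end) can be sketched 
without a database. Note that in the quoted hunk the log statement runs after 
numRows is reset to 0, so it would report 0 rows; the sketch below records the 
count before resetting. BatchFlushSketch and processInBatches are hypothetical 
names, and the Consumer stands in for executeBatch().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration of flush-at-threshold batching; not Hive code.
final class BatchFlushSketch {

    // Hands items to `flush` in chunks of at most maxBatch elements and
    // returns the size of every chunk that was flushed, in order.
    static <T> List<Integer> processInBatches(List<T> items, int maxBatch,
                                              Consumer<List<T>> flush) {
        List<Integer> flushedSizes = new ArrayList<>();
        List<T> pending = new ArrayList<>();
        for (T item : items) {
            pending.add(item);
            if (pending.size() == maxBatch) {
                flush.accept(new ArrayList<>(pending));
                // Record (or log) the count before it is reset.
                flushedSizes.add(pending.size());
                pending.clear();
            }
        }
        // Flush whatever is left over after the loop.
        if (!pending.isEmpty()) {
            flush.accept(new ArrayList<>(pending));
            flushedSizes.add(pending.size());
        }
        return flushedSizes;
    }

    public static void main(String[] args) {
        List<Integer> sizes = processInBatches(
            Arrays.asList(1, 2, 3, 4, 5), 2,
            batch -> System.out.println("flushed " + batch));
        System.out.println("batch sizes: " + sizes);
    }
}
```

   With a JDBC PreparedStatement the same shape applies: addBatch() per row, 
executeBatch() at the threshold, and one final executeBatch() if any rows 
remain.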







[GitHub] [hive] ramesh0201 commented on a change in pull request #2286: HIVE-25117 Vector PTF ClassCastException with Decimal64

2021-05-23 Thread GitBox


ramesh0201 commented on a change in pull request #2286:
URL: https://github.com/apache/hive/pull/2286#discussion_r637686832



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java
##
@@ -4962,9 +4969,8 @@ private static void createVectorPTFDesc(Operator ptfOp,
 evaluatorWindowFrameDefs,
 evaluatorInputExprNodeDescLists);
 
-TypeInfo[] reducerBatchTypeInfos = vContext.getAllTypeInfos();
-
 vectorPTFDesc.setReducerBatchTypeInfos(reducerBatchTypeInfos);
+
vectorPTFDesc.setReducerBatchDataTypePhysicalVariations(reducerBatchDataTypePhysicalVariations);

Review comment:
   Resolved thanks







[GitHub] [hive] ramesh0201 commented on pull request #2286: HIVE-25117 Vector PTF ClassCastException with Decimal64

2021-05-23 Thread GitBox


ramesh0201 commented on pull request #2286:
URL: https://github.com/apache/hive/pull/2286#issuecomment-846728695


   None of the vector PTF decimal operators support decimal64. For example, we 
need a corresponding decimal64 version of VectorPTFEvaluatorDecimalSum 
and similar classes. I will create a Jira to handle this.





[GitHub] [hive] aasha commented on a change in pull request #2240: HIVE-25086: Create Ranger Deny Policy for replication db in all cases.

2021-05-23 Thread GitBox


aasha commented on a change in pull request #2240:
URL: https://github.com/apache/hive/pull/2240#discussion_r637674090



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ranger/RangerRestClientImpl.java
##
@@ -79,6 +74,7 @@
   private static final String RANGER_REST_URL_EXPORTJSONFILE = 
"service/plugins/policies/exportJson";
   private static final String RANGER_REST_URL_IMPORTJSONFILE =
   "service/plugins/policies/importPoliciesFromFile";
+  private static final String RANGER_REST_URL_DELETEPOLICY = 
"service/public/v2/api/policy";

Review comment:
   is this not available as part of service/plugins/policies?

##
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplLoadTask.java
##
@@ -127,6 +127,7 @@ public int execute() {
   if (shouldLoadAtlasMetadata()) {
 addAtlasLoadTask();
   }
+  initiateRangerDenytask();

Review comment:
   You can do this only if the deny config is enabled.

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ranger/RangerRestClientImpl.java
##
@@ -420,8 +458,8 @@ boolean checkConnectionPlain(String url, HiveConf hiveConf) 
{
   }
 
   @Override
-  public List addDenyPolicies(List rangerPolicies, 
String rangerServiceName,
-String sourceDb, String targetDb) 
throws SemanticException {
+  public RangerPolicy getDenyPolicyForReplicatedDb(String rangerServiceName,

Review comment:
   Are you using the source db param?

##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationOnHDFSEncryptedZones.java
##
@@ -97,7 +97,6 @@ public void targetAndSourceHaveDifferentEncryptionZoneKeys() 
throws Throwable {
 
 WarehouseInstance replica = new WarehouseInstance(LOG, miniDFSCluster,
 new HashMap() {{
-  put(HiveConf.ConfVars.HIVE_IN_TEST.varname, "false");

Review comment:
   why is this removed?

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ranger/RangerRestClientImpl.java
##
@@ -444,7 +482,7 @@ boolean checkConnectionPlain(String url, HiveConf hiveConf) 
{
 List denyExceptionsPolicyItemAccesses 
= new ArrayList();
 
-resourceNameList.add(sourceDb);
+resourceNameList.add(targetDb);
 resourceNameList.add("dummy");

Review comment:
   is this dummy needed?

##
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/RangerDenyTask.java
##
@@ -0,0 +1,155 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.exec.repl;
+
+import com.google.common.annotations.VisibleForTesting;
+import org.apache.commons.collections.CollectionUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.utils.SecurityUtils;
+import org.apache.hadoop.hive.ql.ErrorMsg;
+import org.apache.hadoop.hive.ql.exec.Task;
+import org.apache.hadoop.hive.ql.exec.repl.ranger.RangerRestClient;
+import org.apache.hadoop.hive.ql.exec.repl.ranger.RangerRestClientImpl;
+import org.apache.hadoop.hive.ql.exec.repl.ranger.NoOpRangerRestClient;
+import org.apache.hadoop.hive.ql.exec.repl.ranger.RangerPolicy;
+import org.apache.hadoop.hive.ql.exec.repl.ranger.RangerExportPolicyList;
+import org.apache.hadoop.hive.ql.exec.repl.util.ReplUtils;
+import org.apache.hadoop.hive.ql.parse.SemanticException;
+import org.apache.hadoop.hive.ql.parse.repl.metric.event.Status;
+import org.apache.hadoop.hive.ql.plan.api.StageType;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.Serializable;
+import java.net.URL;
+import java.util.ArrayList;
+/**
+ * RangerDenyTask.
+ *
+ * Task to add Ranger Deny Policy
+ **/
+public class RangerDenyTask extends Task implements 
Serializable {
+private static final long serialVersionUID = 1L;
+
+private static final Logger LOG = 
LoggerFactory.getLogger(RangerDenyTask.class);
+
+private transient RangerRestClient rangerRestClient;
+
+public RangerDenyTask() {
+super();
+}
+
+@VisibleForTesting
+RangerDenyTask(final RangerRestClient 

[GitHub] [hive] ramesh0201 commented on a change in pull request #2286: HIVE-25117 Vector PTF ClassCastException with Decimal64

2021-05-23 Thread GitBox


ramesh0201 commented on a change in pull request #2286:
URL: https://github.com/apache/hive/pull/2286#discussion_r637668639



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/ptf/VectorPTFOperator.java
##
@@ -250,13 +253,16 @@ protected VectorizedRowBatch setupOverflowBatch() throws 
HiveException {
 for (int i = 0; i < outputProjectionColumnMap.length; i++) {
   int outputColumn = outputProjectionColumnMap[i];
   String typeName = outputTypeInfos[i].getTypeName();
-  allocateOverflowBatchColumnVector(overflowBatch, outputColumn, typeName);
+  allocateOverflowBatchColumnVector(overflowBatch, outputColumn, typeName, 
outputDataTypePhysicalVariations[i]);
 }
 
 // Now, add any scratch columns needed for children operators.
 int outputColumn = initialColumnCount;
+DataTypePhysicalVariation[] dataTypePhysicalVariations = 
vOutContext.getScratchDataTypePhysicalVariations();
 for (String typeName : vOutContext.getScratchColumnTypeNames()) {
-  allocateOverflowBatchColumnVector(overflowBatch, outputColumn++, 
typeName);
+  allocateOverflowBatchColumnVector(overflowBatch, outputColumn, typeName,
+  dataTypePhysicalVariations[outputColumn-initialColumnCount]);

Review comment:
   Actually based on the code here, I think we index the scratch column 
based on the outputColumnNum
   
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java#L525
   
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java#L800
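
   One reading of the index arithmetic in the hunk above, as a minimal sketch: 
the scratch arrays are zero-based, while the batch column number starts at 
initialColumnCount, so the array slot is the column number minus that offset. 
ScratchIndexSketch and its methods are hypothetical, plain strings stand in 
for the Hive types, and the sketch assumes the column number advances once per 
scratch column.

```java
// Hypothetical illustration of offset indexing; not Hive code.
final class ScratchIndexSketch {

    // Maps an absolute batch column number to its zero-based scratch slot.
    static int scratchSlot(int outputColumn, int initialColumnCount) {
        if (outputColumn < initialColumnCount) {
            throw new IllegalArgumentException("not a scratch column: " + outputColumn);
        }
        return outputColumn - initialColumnCount;
    }

    // Pairs each scratch type name with the variation stored in the same slot
    // and reports it as "column:type:variation".
    static String[] describe(int initialColumnCount, String[] typeNames,
                             String[] variations) {
        String[] out = new String[typeNames.length];
        int outputColumn = initialColumnCount;
        for (String typeName : typeNames) {
            int slot = scratchSlot(outputColumn, initialColumnCount);
            out[slot] = outputColumn + ":" + typeName + ":" + variations[slot];
            outputColumn++; // advance to the next scratch column
        }
        return out;
    }

    public static void main(String[] args) {
        String[] rows = describe(3,
            new String[] {"bigint", "decimal(10,2)"},
            new String[] {"NONE", "DECIMAL_64"});
        for (String row : rows) {
            System.out.println(row);
        }
    }
}
```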
   







[GitHub] [hive] belugabehr commented on pull request #2002: HIVE-24810: Use JDK 8 String Switch in TruncDateFromTimestamp

2021-05-23 Thread GitBox


belugabehr commented on pull request #2002:
URL: https://github.com/apache/hive/pull/2002#issuecomment-846656268


   @pgaref If you are available for a quick review :)





[GitHub] [hive] belugabehr merged pull request #2309: HIVE-25151: Remove Unused Interner from HiveMetastoreChecker

2021-05-23 Thread GitBox


belugabehr merged pull request #2309:
URL: https://github.com/apache/hive/pull/2309


   





[GitHub] [hive] belugabehr merged pull request #2310: HIVE-25152: Remove Superfluous Logging Code

2021-05-23 Thread GitBox


belugabehr merged pull request #2310:
URL: https://github.com/apache/hive/pull/2310


   





[GitHub] [hive] belugabehr commented on pull request #2002: HIVE-24810: Use JDK 8 String Switch in TruncDateFromTimestamp

2021-05-23 Thread GitBox


belugabehr commented on pull request #2002:
URL: https://github.com/apache/hive/pull/2002#issuecomment-846606241


   @miklosgergely  :)





[GitHub] [hive] belugabehr commented on pull request #2299: HIVE-25141: Review Error Level Logging in HMS Module

2021-05-23 Thread GitBox


belugabehr commented on pull request #2299:
URL: https://github.com/apache/hive/pull/2299#issuecomment-846605840


   @miklosgergely I made a couple of changes to get tests to pass.  Can you 
please take a quick look to validate the change?  Thanks!





[GitHub] [hive] belugabehr closed pull request #2309: HIVE-25151: Remove Unused Interner from HiveMetastoreChecker

2021-05-23 Thread GitBox


belugabehr closed pull request #2309:
URL: https://github.com/apache/hive/pull/2309


   





[GitHub] [hive] ashish-kumar-sharma edited a comment on pull request #2211: HIVE-23931: Send ValidWriteIdList and tableId to get_*_constraints HMS APIs

2021-05-23 Thread GitBox


ashish-kumar-sharma edited a comment on pull request #2211:
URL: https://github.com/apache/hive/pull/2211#issuecomment-846585770


   @kgyrtkirk 
   
   struct UniqueConstraintsRequest {
   1: TableReference table;
   }
   
   The above change will not be compatible with older HMS clients. Should I go 
ahead and implement it as part of this PR, or should I create an 
epic for cleaning up the deprecated APIs in the metastore and moving all APIs to a 
request model where we can have common models for multiple APIs?





[GitHub] [hive] ashish-kumar-sharma commented on pull request #2211: HIVE-23931: Send ValidWriteIdList and tableId to get_*_constraints HMS APIs

2021-05-23 Thread GitBox


ashish-kumar-sharma commented on pull request #2211:
URL: https://github.com/apache/hive/pull/2211#issuecomment-846585770


   @kgyrtkirk Adding
   
   struct UniqueConstraintsRequest {
   1: TableReference table;
   }
   
   The above change will not be compatible with older HMS clients. Should I go 
ahead and implement it as part of this PR, or should I create an 
epic for cleaning up the deprecated APIs in the metastore and moving all APIs to a 
request model where we can have common models for multiple APIs?





[GitHub] [hive] pgaref commented on pull request #2310: HIVE-25152: Remove Superfluous Logging Code

2021-05-23 Thread GitBox


pgaref commented on pull request #2310:
URL: https://github.com/apache/hive/pull/2310#issuecomment-846543650


   > ... continued...
   > 
   > That is to say:
   > 
   > ```java
   > LOG.info("New Final Path: FS " + fsp.finalPaths[filesIdx]);
   > LOG.info("New Final Path: FS {}", fsp.finalPaths[filesIdx]);
   > ```
   > 
   > These two statements, will always produce the same output (as INFO is on 
by default in every production environment under the sun). However, the second 
one will always have the overhead of having to find the anchor in the format 
string and then replace it with the value. The first example, there is simply a 
string concatenation that occurs. This will be faster and I don't find this log 
particularly hard to read.
   
   Hey @belugabehr thanks for the details! I agree adding anchors on all INFO 
statements won't add any perf value here considering that INFO is the default 
log level -- I was mostly thinking of consistency when posting the comments, 
eventually using a single logging style, but maybe that's out of the scope of 
this PR.
   
   Happy to +1 as is and maybe discuss this as part of a new ticket
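
   For reference, a minimal sketch of the point made in the quoted comment: 
when the level is enabled, concatenation and "{}" substitution yield the same 
message, and the parameterized form additionally has to scan the pattern for 
its anchor. The format method below is a hypothetical stand-in written for 
this example; it is not SLF4J's implementation, and the path value is made up.

```java
// Hypothetical stand-in for "{}" substitution; NOT the SLF4J implementation.
final class LogStyleSketch {

    // Replaces each "{}" anchor, left to right, with the next argument.
    static String format(String pattern, Object... args) {
        StringBuilder sb = new StringBuilder();
        int from = 0;
        for (Object arg : args) {
            int at = pattern.indexOf("{}", from); // the anchor search
            if (at < 0) {
                break; // more arguments than anchors; ignore the rest
            }
            sb.append(pattern, from, at).append(arg);
            from = at + 2;
        }
        return sb.append(pattern.substring(from)).toString();
    }

    public static void main(String[] args) {
        String path = "/tmp/example"; // made-up value for the demo
        String concatenated = "New Final Path: FS " + path;
        String parameterized = format("New Final Path: FS {}", path);
        System.out.println(concatenated.equals(parameterized));
    }
}
```

   The usual argument for the parameterized form is that the message is never 
built when the level is disabled, which is the case the quoted comment sets 
aside for INFO.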

