[jira] [Work logged] (HIVE-23619) HiveServer2 should retry query if the TezAM running it gets killed

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23619?focusedWorklogId=450246&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450246
 ]

ASF GitHub Bot logged work on HIVE-23619:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 05:57
Start Date: 24/Jun/20 05:57
Worklog Time Spent: 10m 
  Work Description: sankarh commented on a change in pull request #1146:
URL: https://github.com/apache/hive/pull/1146#discussion_r444660896



##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/reexec/TestReExecuteKilledTezAMQueryPlugin.java
##
@@ -0,0 +1,195 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.reexec;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.yarn.api.records.ApplicationReport;
+import org.apache.hadoop.yarn.api.records.YarnApplicationState;
+import org.apache.hadoop.yarn.client.api.YarnClient;
+import org.apache.hive.jdbc.HiveStatement;
+import org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow;
+import org.apache.hive.jdbc.miniHS2.MiniHS2;
+import org.junit.*;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+import java.net.URL;
+import java.sql.Connection;
+import java.sql.DriverManager;
+import java.sql.SQLException;
+import java.sql.Statement;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+public class TestReExecuteKilledTezAMQueryPlugin {
+  protected static final Logger LOG = LoggerFactory.getLogger(TestReExecuteKilledTezAMQueryPlugin.class);
+
+  private static MiniHS2 miniHS2 = null;
+  private static final String tableName = "testKillTezAmTbl";
+  private static String dataFileDir;
+  private static final String testDbName = "testKillTezAmDb";
+  protected static Connection hs2Conn = null;
+  private static HiveConf conf;
+
+  private static class ExceptionHolder {
+Throwable throwable;
+  }
+
+  static HiveConf defaultConf() throws Exception {
+String confDir = "../../data/conf/llap/";
+if (confDir != null && !confDir.isEmpty()) {
+  HiveConf.setHiveSiteLocation(new URL("file://"+ new File(confDir).toURI().getPath() + "/hive-site.xml"));
+  System.out.println("Setting hive-site: " + HiveConf.getHiveSiteLocation());
+}
+HiveConf defaultConf = new HiveConf();
+defaultConf.setBoolVar(HiveConf.ConfVars.HIVE_SUPPORT_CONCURRENCY, false);
+defaultConf.setBoolVar(HiveConf.ConfVars.HIVE_SERVER2_ENABLE_DOAS, false);
+defaultConf.addResource(new URL("file://" + new File(confDir).toURI().getPath() + "/tez-site.xml"));
+return defaultConf;
+  }
+
+  @BeforeClass
+public static void beforeTest() throws Exception {
+  conf = defaultConf();
+  conf.setVar(HiveConf.ConfVars.USERS_IN_ADMIN_ROLE, System.getProperty("user.name"));
+  conf.set(HiveConf.ConfVars.HIVE_QUERY_REEXECUTION_STRATEGIES.varname, "reexecute_lost_am");
+  MiniHS2.cleanupLocalDir();
+  Class.forName(MiniHS2.getJdbcDriverName());
+  miniHS2 = new MiniHS2(conf, MiniHS2.MiniClusterType.LLAP);
+  dataFileDir = conf.get("test.data.files").replace('\\', '/').replace("c:", "");
+  Map<String, String> confOverlay = new HashMap<String, String>();
+  miniHS2.start(confOverlay);
+  miniHS2.getDFS().getFileSystem().mkdirs(new Path("/apps_staging_dir/anonymous"));
+
+  Connection conDefault = getConnection(miniHS2.getJdbcURL(),
+  System.getProperty("user.name"), "bar");
+  Statement stmt = conDefault.createStatement();
+  String tblName = testDbName + "." + tableName;
+  Path dataFilePath = new Path(dataFileDir, "kv1.txt");
+  String udfName = TestJdbcWithMiniLlapArrow.SleepMsUDF.class.getName();
+  stmt.execute("drop database if exists " + testDbName + " cascade");
+  stmt.execute("create database " + testDbName);
+  stmt.execute("set role admin");
+  stmt.execute("dfs -put " + dataFilePath.toString() + " " + "kv1.txt");
+  stmt.execute("use " + testDbName);
+  stmt.execute("create table " + tblName + " (int_col int, value string) ");
+  stmt.execute("load data inpath 

[jira] [Work logged] (HIVE-23619) HiveServer2 should retry query if the TezAM running it gets killed

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23619?focusedWorklogId=450245&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450245
 ]

ASF GitHub Bot logged work on HIVE-23619:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 05:56
Start Date: 24/Jun/20 05:56
Worklog Time Spent: 10m 
  Work Description: sankarh commented on a change in pull request #1146:
URL: https://github.com/apache/hive/pull/1146#discussion_r444660568



##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/reexec/TestReExecuteKilledTezAMQueryPlugin.java
##
@@ -0,0 +1,195 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.reexec;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.yarn.api.records.ApplicationReport;
+import org.apache.hadoop.yarn.api.records.YarnApplicationState;
+import org.apache.hadoop.yarn.client.api.YarnClient;
+import org.apache.hive.jdbc.HiveStatement;
+import org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow;
+import org.apache.hive.jdbc.miniHS2.MiniHS2;
+import org.junit.*;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+import java.net.URL;
+import java.sql.Connection;
+import java.sql.DriverManager;
+import java.sql.SQLException;
+import java.sql.Statement;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+public class TestReExecuteKilledTezAMQueryPlugin {
+  protected static final Logger LOG = LoggerFactory.getLogger(TestReExecuteKilledTezAMQueryPlugin.class);
+
+  private static MiniHS2 miniHS2 = null;
+  private static final String tableName = "testKillTezAmTbl";
+  private static String dataFileDir;
+  private static final String testDbName = "testKillTezAmDb";
+  protected static Connection hs2Conn = null;

Review comment:
   hs2Conn is unused. Shall we remove it?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450245)
Time Spent: 2h 10m  (was: 2h)

> HiveServer2 should retry query if the TezAM running it gets killed
> --
>
> Key: HIVE-23619
> URL: https://issues.apache.org/jira/browse/HIVE-23619
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> If the TezAM running a query gets killed because of external
> factors like the node going down, HS2 should retry the query in a different TezAM.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23619) HiveServer2 should retry query if the TezAM running it gets killed

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23619?focusedWorklogId=450244&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450244
 ]

ASF GitHub Bot logged work on HIVE-23619:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 05:54
Start Date: 24/Jun/20 05:54
Worklog Time Spent: 10m 
  Work Description: sankarh commented on a change in pull request #1146:
URL: https://github.com/apache/hive/pull/1146#discussion_r444660173



##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/reexec/TestReExecuteKilledTezAMQueryPlugin.java
##
@@ -0,0 +1,195 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.reexec;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.yarn.api.records.ApplicationReport;
+import org.apache.hadoop.yarn.api.records.YarnApplicationState;
+import org.apache.hadoop.yarn.client.api.YarnClient;
+import org.apache.hive.jdbc.HiveStatement;
+import org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow;
+import org.apache.hive.jdbc.miniHS2.MiniHS2;
+import org.junit.*;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+import java.net.URL;
+import java.sql.Connection;
+import java.sql.DriverManager;
+import java.sql.SQLException;
+import java.sql.Statement;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+public class TestReExecuteKilledTezAMQueryPlugin {
+  protected static final Logger LOG = LoggerFactory.getLogger(TestReExecuteKilledTezAMQueryPlugin.class);
+
+  private static MiniHS2 miniHS2 = null;
+  private static final String tableName = "testKillTezAmTbl";
+  private static String dataFileDir;
+  private static final String testDbName = "testKillTezAmDb";
+  protected static Connection hs2Conn = null;
+  private static HiveConf conf;
+
+  private static class ExceptionHolder {
+Throwable throwable;
+  }
+
+  static HiveConf defaultConf() throws Exception {
+String confDir = "../../data/conf/llap/";
+if (confDir != null && !confDir.isEmpty()) {
+  HiveConf.setHiveSiteLocation(new URL("file://"+ new File(confDir).toURI().getPath() + "/hive-site.xml"));
+  System.out.println("Setting hive-site: " + HiveConf.getHiveSiteLocation());
+}
+HiveConf defaultConf = new HiveConf();
+defaultConf.setBoolVar(HiveConf.ConfVars.HIVE_SUPPORT_CONCURRENCY, false);
+defaultConf.setBoolVar(HiveConf.ConfVars.HIVE_SERVER2_ENABLE_DOAS, false);
+defaultConf.addResource(new URL("file://" + new File(confDir).toURI().getPath() + "/tez-site.xml"));
+return defaultConf;
+  }
+
+  @BeforeClass
+public static void beforeTest() throws Exception {

Review comment:
   Not aligned to the 2-space indentation. Check the other methods too; it seems only the annotation is correctly indented.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450244)
Time Spent: 2h  (was: 1h 50m)

> HiveServer2 should retry query if the TezAM running it gets killed
> --
>
> Key: HIVE-23619
> URL: https://issues.apache.org/jira/browse/HIVE-23619
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> If the TezAM running a query gets killed because of external
> factors like the node going down, HS2 should retry the query in a different TezAM.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23619) HiveServer2 should retry query if the TezAM running it gets killed

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23619?focusedWorklogId=450243&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450243
 ]

ASF GitHub Bot logged work on HIVE-23619:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 05:52
Start Date: 24/Jun/20 05:52
Worklog Time Spent: 10m 
  Work Description: sankarh commented on a change in pull request #1146:
URL: https://github.com/apache/hive/pull/1146#discussion_r444659453



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/reexec/ReExecuteLostAMQueryPlugin.java
##
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.reexec;
+
+import org.apache.hadoop.hive.ql.Driver;
+import org.apache.hadoop.hive.ql.hooks.ExecuteWithHookContext;
+import org.apache.hadoop.hive.ql.hooks.HookContext;
+import org.apache.hadoop.hive.ql.plan.mapper.PlanMapper;
+
+import java.util.regex.Pattern;
+
+public class ReExecuteLostAMQueryPlugin implements IReExecutionPlugin {
+private boolean retryPossible;
+
+// Lost am container have exit code -100, due to node failures.
+private Pattern lostAMContainerErrorPattern = Pattern.compile(".*AM Container for .* exited .* exitCode: -100.*");

Review comment:
   Then no worries.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450243)
Time Spent: 1h 50m  (was: 1h 40m)

> HiveServer2 should retry query if the TezAM running it gets killed
> --
>
> Key: HIVE-23619
> URL: https://issues.apache.org/jira/browse/HIVE-23619
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> If the TezAM running a query gets killed because of external
> factors like the node going down, HS2 should retry the query in a different TezAM.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23619) HiveServer2 should retry query if the TezAM running it gets killed

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23619?focusedWorklogId=450242&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450242
 ]

ASF GitHub Bot logged work on HIVE-23619:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 05:51
Start Date: 24/Jun/20 05:51
Worklog Time Spent: 10m 
  Work Description: sankarh commented on a change in pull request #1146:
URL: https://github.com/apache/hive/pull/1146#discussion_r444659267



##
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
##
@@ -4979,10 +4979,11 @@ private static void populateLlapDaemonVarsSet(Set<String> llapDaemonVarsSetLocal
 
 HIVE_QUERY_REEXECUTION_ENABLED("hive.query.reexecution.enabled", true,
 "Enable query reexecutions"),
-HIVE_QUERY_REEXECUTION_STRATEGIES("hive.query.reexecution.strategies", "overlay,reoptimize",
+HIVE_QUERY_REEXECUTION_STRATEGIES("hive.query.reexecution.strategies", "overlay,reoptimize,reexecute_lost_am",
 "comma separated list of plugin can be used:\n"
 + "  overlay: hiveconf subtree 'reexec.overlay' is used as an overlay in case of an execution errors out\n"
-+ "  reoptimize: collects operator statistics during execution and recompile the query after a failure"),
++ "  reoptimize: collects operator statistics during execution and recompile the query after a failure\n"
++ "  reexecute_lost_am: reexecutes query if it failed due to tez am node gets decommissioned"),

Review comment:
   Makes sense.
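For context, the strategies value in the hunk above is just a comma-separated list of plugin names. A minimal sketch of splitting such a value (illustrative only; this is not Hive's actual plugin-loading code):

```java
public class StrategyListDemo {
    public static void main(String[] args) {
        // Default value from the patched HiveConf entry quoted above.
        String strategies = "overlay,reoptimize,reexecute_lost_am";
        for (String plugin : strategies.split(",")) {
            // Each entry names one re-execution plugin.
            System.out.println(plugin.trim());
        }
    }
}
```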





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450242)
Time Spent: 1h 40m  (was: 1.5h)

> HiveServer2 should retry query if the TezAM running it gets killed
> --
>
> Key: HIVE-23619
> URL: https://issues.apache.org/jira/browse/HIVE-23619
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> If the TezAM running a query gets killed because of external
> factors like the node going down, HS2 should retry the query in a different TezAM.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23619) HiveServer2 should retry query if the TezAM running it gets killed

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23619?focusedWorklogId=450237&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450237
 ]

ASF GitHub Bot logged work on HIVE-23619:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 05:37
Start Date: 24/Jun/20 05:37
Worklog Time Spent: 10m 
  Work Description: adesh-rao commented on a change in pull request #1146:
URL: https://github.com/apache/hive/pull/1146#discussion_r444654835



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/reexec/ReExecuteLostAMQueryPlugin.java
##
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.reexec;
+
+import org.apache.hadoop.hive.ql.Driver;
+import org.apache.hadoop.hive.ql.hooks.ExecuteWithHookContext;
+import org.apache.hadoop.hive.ql.hooks.HookContext;
+import org.apache.hadoop.hive.ql.plan.mapper.PlanMapper;
+
+import java.util.regex.Pattern;
+
+public class ReExecuteLostAMQueryPlugin implements IReExecutionPlugin {
+private boolean retryPossible;
+
+// Lost am container have exit code -100, due to node failures.
+private Pattern lostAMContainerErrorPattern = Pattern.compile(".*AM 
Container for .* exited .* exitCode: -100.*");

Review comment:
   Kill Query will kill the running AM, and the error message will be different (like "application killed by user"). Anyway, the plugin does not retry when the Tez AM is killed manually.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450237)
Time Spent: 1.5h  (was: 1h 20m)

> HiveServer2 should retry query if the TezAM running it gets killed
> --
>
> Key: HIVE-23619
> URL: https://issues.apache.org/jira/browse/HIVE-23619
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> If the TezAM running a query gets killed because of external
> factors like the node going down, HS2 should retry the query in a different TezAM.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23619) HiveServer2 should retry query if the TezAM running it gets killed

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23619?focusedWorklogId=450234&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450234
 ]

ASF GitHub Bot logged work on HIVE-23619:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 05:34
Start Date: 24/Jun/20 05:34
Worklog Time Spent: 10m 
  Work Description: adesh-rao commented on a change in pull request #1146:
URL: https://github.com/apache/hive/pull/1146#discussion_r444653982



##
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
##
@@ -4979,10 +4979,11 @@ private static void populateLlapDaemonVarsSet(Set<String> llapDaemonVarsSetLocal
 
 HIVE_QUERY_REEXECUTION_ENABLED("hive.query.reexecution.enabled", true,
 "Enable query reexecutions"),
-HIVE_QUERY_REEXECUTION_STRATEGIES("hive.query.reexecution.strategies", "overlay,reoptimize",
+HIVE_QUERY_REEXECUTION_STRATEGIES("hive.query.reexecution.strategies", "overlay,reoptimize,reexecute_lost_am",
 "comma separated list of plugin can be used:\n"
 + "  overlay: hiveconf subtree 'reexec.overlay' is used as an overlay in case of an execution errors out\n"
-+ "  reoptimize: collects operator statistics during execution and recompile the query after a failure"),
++ "  reoptimize: collects operator statistics during execution and recompile the query after a failure\n"
++ "  reexecute_lost_am: reexecutes query if it failed due to tez am node gets decommissioned"),

Review comment:
   No, the plugin does not retry if the AM was killed manually, because of the error message that we are grepping for.
   
   In case the AM is killed, the error message is something like "*killed by user". It does not carry the -100 exit code.
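The distinction described above can be checked directly against the pattern quoted from ReExecuteLostAMQueryPlugin. The sample diagnostic strings below are illustrative, not real YARN output:

```java
import java.util.regex.Pattern;

public class LostAMPatternDemo {
    // Same regex as in the quoted ReExecuteLostAMQueryPlugin hunk.
    static final Pattern LOST_AM_PATTERN =
        Pattern.compile(".*AM Container for .* exited .* exitCode: -100.*");

    static boolean isLostAM(String diagnostics) {
        return LOST_AM_PATTERN.matcher(diagnostics).matches();
    }

    public static void main(String[] args) {
        // Node loss carries exit code -100, so this matches and the plugin retries.
        System.out.println(isLostAM(
            "AM Container for appattempt_1_000001 exited with exitCode: -100"));
        // A manual kill produces a different message, so no retry.
        System.out.println(isLostAM(
            "Application application_1 was killed by user hive"));
    }
}
```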





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450234)
Time Spent: 1h 10m  (was: 1h)

> HiveServer2 should retry query if the TezAM running it gets killed
> --
>
> Key: HIVE-23619
> URL: https://issues.apache.org/jira/browse/HIVE-23619
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> If the TezAM running a query gets killed because of external
> factors like the node going down, HS2 should retry the query in a different TezAM.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23619) HiveServer2 should retry query if the TezAM running it gets killed

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23619?focusedWorklogId=450236&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450236
 ]

ASF GitHub Bot logged work on HIVE-23619:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 05:34
Start Date: 24/Jun/20 05:34
Worklog Time Spent: 10m 
  Work Description: adesh-rao commented on a change in pull request #1146:
URL: https://github.com/apache/hive/pull/1146#discussion_r444654203



##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/reexec/TestReExecuteKilledTezAMQueryPlugin.java
##
@@ -0,0 +1,207 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.reexec;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.llap.LlapBaseInputFormat;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.yarn.api.records.ApplicationReport;
+import org.apache.hadoop.yarn.api.records.YarnApplicationState;
+import org.apache.hadoop.yarn.client.api.YarnClient;
+import org.apache.hive.jdbc.BaseJdbcWithMiniLlap;
+import org.apache.hive.jdbc.HiveStatement;
+import org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow;
+import org.apache.hive.jdbc.miniHS2.MiniHS2;
+import org.junit.*;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+import java.net.URL;
+import java.sql.Connection;
+import java.sql.DriverManager;
+import java.sql.SQLException;
+import java.sql.Statement;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+
+public class TestReExecuteKilledTezAMQueryPlugin {
+protected static final Logger LOG = LoggerFactory.getLogger(TestJdbcWithMiniLlapArrow.class);
+
+private static MiniHS2 miniHS2 = null;
+private static final String tableName = "testKillTezAmTbl";
+private static String dataFileDir;
+private static final String testDbName = "testKillTezAmDb";
+protected static Connection hs2Conn = null;
+private static HiveConf conf;
+
+private static class ExceptionHolder {
+Throwable throwable;
+}
+
+static HiveConf defaultConf() throws Exception {
+String confDir = "../../data/conf/llap/";
+if (confDir != null && !confDir.isEmpty()) {
+HiveConf.setHiveSiteLocation(new URL("file://"+ new File(confDir).toURI().getPath() + "/hive-site.xml"));
+System.out.println("Setting hive-site: " + HiveConf.getHiveSiteLocation());
+}
+HiveConf defaultConf = new HiveConf();
+defaultConf.setBoolVar(HiveConf.ConfVars.HIVE_SUPPORT_CONCURRENCY, false);
+defaultConf.setBoolVar(HiveConf.ConfVars.HIVE_SERVER2_ENABLE_DOAS, false);
+defaultConf.addResource(new URL("file://" + new File(confDir).toURI().getPath() + "/tez-site.xml"));
+return defaultConf;
+}
+
+@BeforeClass
+public static void beforeTest() throws Exception {
+conf = defaultConf();
+conf.setVar(HiveConf.ConfVars.USERS_IN_ADMIN_ROLE, System.getProperty("user.name"));
+conf.set(HiveConf.ConfVars.HIVE_QUERY_REEXECUTION_STRATEGIES.varname, "reexecute_lost_am");
+MiniHS2.cleanupLocalDir();
+Class.forName(MiniHS2.getJdbcDriverName());
+miniHS2 = new MiniHS2(conf, MiniHS2.MiniClusterType.LLAP);
+dataFileDir = conf.get("test.data.files").replace('\\', '/').replace("c:", "");
+Map<String, String> confOverlay = new HashMap<String, String>();
+miniHS2.start(confOverlay);
+miniHS2.getDFS().getFileSystem().mkdirs(new Path("/apps_staging_dir/anonymous"));
+
+Connection conDefault = getConnection(miniHS2.getJdbcURL(),
+System.getProperty("user.name"), "bar");
+Statement stmt = conDefault.createStatement();
+String tblName = testDbName + "." + tableName;
+Path dataFilePath = new Path(dataFileDir, "kv1.txt");
+String udfName = TestJdbcWithMiniLlapArrow.SleepMsUDF.class.getName();
+stmt.execute("drop database if exists " + testDbName + " cascade");
+stmt.execute("create database " + testDbName);
+  

[jira] [Updated] (HIVE-23754) LLAP: Add LoggingHandler in ShuffleHandler pipeline for better debuggability

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-23754:
--
Labels: pull-request-available  (was: )

> LLAP: Add LoggingHandler in ShuffleHandler pipeline for better debuggability
> 
>
> Key: HIVE-23754
> URL: https://issues.apache.org/jira/browse/HIVE-23754
> Project: Hive
>  Issue Type: Improvement
> Environment:  
>  
>Reporter: Rajesh Balamohan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/shufflehandler/ShuffleHandler.java#L616]
>  
> For corner case debugging, it would be helpful to understand when netty 
> processed OPEN/BOUND/CLOSE/RECEIVED/CONNECTED events along with payload 
> details.
> Adding "LoggingHandler" in ChannelPipeline mode can help in debugging.
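The effect of a pipeline LoggingHandler can be sketched in plain Java as a pass-through stage that records each event before delegating to the next stage. The Stage/LoggingStage names below are invented for illustration and are not the Netty or ShuffleHandler API:

```java
import java.util.ArrayList;
import java.util.List;

// Invented single-method pipeline stage, standing in for a channel handler.
interface Stage {
    void onEvent(String event, Object payload);
}

// Pass-through stage: record the event, then hand it to the next stage unchanged.
class LoggingStage implements Stage {
    final List<String> log = new ArrayList<>();
    private final Stage next;

    LoggingStage(Stage next) { this.next = next; }

    @Override
    public void onEvent(String event, Object payload) {
        log.add(event + ": " + payload);  // e.g. OPEN/BOUND/RECEIVED/CLOSE
        next.onEvent(event, payload);
    }
}

public class LoggingPipelineSketch {
    public static void main(String[] args) {
        Stage shuffleWork = (event, payload) -> { /* real handler work here */ };
        LoggingStage logging = new LoggingStage(shuffleWork);
        logging.onEvent("OPEN", "channel-1");
        logging.onEvent("RECEIVED", "42 bytes");
        System.out.println(logging.log);
    }
}
```

Because the stage only observes and forwards, it can sit first in the pipeline without changing behavior, which is what makes it attractive for corner-case debugging.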



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23754) LLAP: Add LoggingHandler in ShuffleHandler pipeline for better debuggability

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23754?focusedWorklogId=450215&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450215
 ]

ASF GitHub Bot logged work on HIVE-23754:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 05:02
Start Date: 24/Jun/20 05:02
Worklog Time Spent: 10m 
  Work Description: rbalamohan opened a new pull request #1172:
URL: https://github.com/apache/hive/pull/1172


   (HIVE-23754) LLAP: Add LoggingHandler in ShuffleHandler pipeline for better debuggability.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450215)
Remaining Estimate: 0h
Time Spent: 10m

> LLAP: Add LoggingHandler in ShuffleHandler pipeline for better debuggability
> 
>
> Key: HIVE-23754
> URL: https://issues.apache.org/jira/browse/HIVE-23754
> Project: Hive
>  Issue Type: Improvement
> Environment:  
>  
>Reporter: Rajesh Balamohan
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/shufflehandler/ShuffleHandler.java#L616]
>  
> For corner case debugging, it would be helpful to understand when netty 
> processed OPEN/BOUND/CLOSE/RECEIVED/CONNECTED events along with payload 
> details.
> Adding a "LoggingHandler" to the ChannelPipeline can help in debugging.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23754) LLAP: Add LoggingHandler in ShuffleHandler pipeline for better debuggability

2020-06-23 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-23754:

Description: 
[https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/shufflehandler/ShuffleHandler.java#L616]

 

For corner case debugging, it would be helpful to understand when netty 
processed OPEN/BOUND/CLOSE/RECEIVED/CONNECTED events along with payload details.

Adding a "LoggingHandler" to the ChannelPipeline can help in debugging.

> LLAP: Add LoggingHandler in ShuffleHandler pipeline for better debuggability
> 
>
> Key: HIVE-23754
> URL: https://issues.apache.org/jira/browse/HIVE-23754
> Project: Hive
>  Issue Type: Improvement
> Environment:  
>  
>Reporter: Rajesh Balamohan
>Priority: Major
>
> [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/shufflehandler/ShuffleHandler.java#L616]
>  
> For corner case debugging, it would be helpful to understand when netty 
> processed OPEN/BOUND/CLOSE/RECEIVED/CONNECTED events along with payload 
> details.
> Adding a "LoggingHandler" to the ChannelPipeline can help in debugging.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23754) LLAP: Add LoggingHandler in ShuffleHandler pipeline for better debuggability

2020-06-23 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-23754:

Environment: 
 

 

  was:
[https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/shufflehandler/ShuffleHandler.java#L616]

 

For corner case debugging, it would be helpful to understand when netty 
processed OPEN/BOUND/CLOSE/RECEIVED/CONNECTED events along with payload details.

Adding a "LoggingHandler" to the ChannelPipeline can help in debugging.

 


> LLAP: Add LoggingHandler in ShuffleHandler pipeline for better debuggability
> 
>
> Key: HIVE-23754
> URL: https://issues.apache.org/jira/browse/HIVE-23754
> Project: Hive
>  Issue Type: Improvement
> Environment:  
>  
>Reporter: Rajesh Balamohan
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work started] (HIVE-22826) ALTER TABLE RENAME COLUMN doesn't update list of bucketed column names

2020-06-23 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-22826 started by Ashish Sharma.

>  ALTER TABLE RENAME COLUMN doesn't update list of bucketed column names
> ---
>
> Key: HIVE-22826
> URL: https://issues.apache.org/jira/browse/HIVE-22826
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Ashish Sharma
>Priority: Major
> Attachments: unitTest.patch
>
>
> Compaction for tables where a bucketed column has been renamed fails, since 
> the list of bucketed columns in the StorageDescriptor doesn't get updated 
> when the column is renamed; therefore we can't recreate the table correctly 
> during compaction.
> Attached is a unit test that fails.
> NO PRECOMMIT TESTS



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work started] (HIVE-23545) Insert table partition(column) occasionally occurs when the target partition is not created

2020-06-23 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-23545 started by Ashish Sharma.

> Insert table partition(column) occasionally occurs when the target partition 
> is not created
> ---
>
> Key: HIVE-23545
> URL: https://issues.apache.org/jira/browse/HIVE-23545
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: LuGuangMing
>Assignee: Ashish Sharma
>Priority: Major
> Attachments: test.sql
>
>
> Insert data into the static partition of an external table; this static 
> partition is created in advance. When hive.exec.parallel is turned on, it 
> {color:#FF}occasionally{color} happens that the partition does not exist 
> after execution, and there are no apparent error logs from the run.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work started] (HIVE-22949) Handle empty insert overwrites without inserting an empty file in case of acid/mm tables

2020-06-23 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-22949 started by Ashish Sharma.

> Handle empty insert overwrites without inserting an empty file in case of 
> acid/mm tables
> 
>
> Key: HIVE-22949
> URL: https://issues.apache.org/jira/browse/HIVE-22949
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: Ashish Sharma
>Priority: Major
>
> HIVE-22941 was a quick workaround for empty files in case of external tables. 
> An optimal solution would be to completely prevent writing empty files just for 
> having the table contents cleared on an empty INSERT OVERWRITE.
> There are other tickets about a similar topic, for example, if we need empty 
> files at all: HIVE-22918, HIVE-22938



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23727) Improve SQLOperation log handling when cleanup

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23727?focusedWorklogId=450178&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450178
 ]

ASF GitHub Bot logged work on HIVE-23727:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 02:04
Start Date: 24/Jun/20 02:04
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 edited a comment on pull request #1149:
URL: https://github.com/apache/hive/pull/1149#issuecomment-648507858


   @belugabehr can you take a look? Thanks



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450178)
Time Spent: 50m  (was: 40m)

> Improve SQLOperation log handling when cleanup
> --
>
> Key: HIVE-23727
> URL: https://issues.apache.org/jira/browse/HIVE-23727
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The SQLOperation checks _if (shouldRunAsync() && state != 
> OperationState.CANCELED && state != OperationState.TIMEDOUT)_ to cancel the 
> background task. If that condition is true, the state cannot be 
> OperationState.CANCELED, so the logging branch guarded by state == 
> OperationState.CANCELED can never execute.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23546) Skip authorization when user is a superuser

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23546?focusedWorklogId=450172&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450172
 ]

ASF GitHub Bot logged work on HIVE-23546:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:59
Start Date: 24/Jun/20 00:59
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 closed pull request #1033:
URL: https://github.com/apache/hive/pull/1033


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450172)
Time Spent: 20m  (was: 10m)

> Skip authorization when user is a superuser
> ---
>
> Key: HIVE-23546
> URL: https://issues.apache.org/jira/browse/HIVE-23546
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23546.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> If the current user is a superuser, there is no need to do authorization. 
> This can speed up queries, especially ddl queries. For example, the 
> superuser adds partitions when the external data is ready, or shows partitions 
> to check whether it is OK to take the workflow one step further in a busy Hive 
> cluster.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-14759) GenericUDF.getFuncName breaks with UDF Classnames less than 10 characters

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-14759?focusedWorklogId=450147&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450147
 ]

ASF GitHub Bot logged work on HIVE-14759:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:28
Start Date: 24/Jun/20 00:28
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #101:
URL: https://github.com/apache/hive/pull/101


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450147)
Remaining Estimate: 40m  (was: 50m)
Time Spent: 20m  (was: 10m)

> GenericUDF.getFuncName breaks with UDF Classnames less than 10 characters
> -
>
> Key: HIVE-14759
> URL: https://issues.apache.org/jira/browse/HIVE-14759
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.1.0
>Reporter: Clemens Valiente
>Assignee: Clemens Valiente
>Priority: Trivial
>  Labels: pull-request-available
> Attachments: HIVE-14759.1.patch, HIVE-14759.2.patch, HIVE-14759.patch
>
>   Original Estimate: 1h
>  Time Spent: 20m
>  Remaining Estimate: 40m
>
> {code}
> return getClass().getSimpleName().substring(10).toLowerCase();
> {code}
> causes
> {code}
> java.lang.StringIndexOutOfBoundsException: String index out of range: -2
> at java.lang.String.substring(String.java:1875)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDF.getFuncName(GenericUDF.java:258)
> {code}
> if the Classname of my UDF is less than 10 characters.
> This was probably done to strip "GenericUDF" from the classname, but it causes 
> issues if the class name doesn't start with that prefix.
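A minimal, self-contained reproduction of the bug, with one possible defensive variant (the class and method names here are illustrative, not Hive's actual fix):

```java
// Demonstrates why getClass().getSimpleName().substring(10) fails for short
// class names, and a defensive variant that only strips the "GenericUDF"
// prefix when it is actually present.
public class FuncNameDemo {

    // Buggy version: assumes every class name starts with "GenericUDF" (10 chars).
    static String buggyFuncName(String simpleClassName) {
        return simpleClassName.substring(10).toLowerCase();
    }

    // Defensive version: strip the prefix only if present.
    static String safeFuncName(String simpleClassName) {
        String prefix = "GenericUDF";
        if (simpleClassName.startsWith(prefix) && simpleClassName.length() > prefix.length()) {
            return simpleClassName.substring(prefix.length()).toLowerCase();
        }
        return simpleClassName.toLowerCase();
    }

    public static void main(String[] args) {
        System.out.println(safeFuncName("GenericUDFUpper")); // upper
        System.out.println(safeFuncName("MyUdf"));           // myudf
        try {
            buggyFuncName("MyUdf"); // only 5 chars -> StringIndexOutOfBoundsException
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("buggy version threw for short class name");
        }
    }
}
```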



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-13879) add HiveAuthzContext to grant/revoke methods in HiveAuthorizer api

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-13879?focusedWorklogId=450156&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450156
 ]

ASF GitHub Bot logged work on HIVE-13879:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:28
Start Date: 24/Jun/20 00:28
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #88:
URL: https://github.com/apache/hive/pull/88


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450156)
Time Spent: 20m  (was: 10m)

> add HiveAuthzContext to grant/revoke methods in HiveAuthorizer api
> --
>
> Key: HIVE-13879
> URL: https://issues.apache.org/jira/browse/HIVE-13879
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Thejas Nair
>Assignee: Thejas Nair
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-13879.1.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> HiveAuthzContext provides useful information about the context of the 
> commands, such as the command string and ip address information. However, 
> this is available only to the checkPrivileges and filterListCmdObjects api calls.
> This should be made available for other api calls such as grant/revoke 
> methods and role management methods.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-9660) store end offset of compressed data for RG in RowIndex in ORC

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-9660?focusedWorklogId=450154&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450154
 ]

ASF GitHub Bot logged work on HIVE-9660:


Author: ASF GitHub Bot
Created on: 24/Jun/20 00:28
Start Date: 24/Jun/20 00:28
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #77:
URL: https://github.com/apache/hive/pull/77


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450154)
Time Spent: 20m  (was: 10m)

> store end offset of compressed data for RG in RowIndex in ORC
> -
>
> Key: HIVE-9660
> URL: https://issues.apache.org/jira/browse/HIVE-9660
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-9660.01.patch, HIVE-9660.02.patch, 
> HIVE-9660.03.patch, HIVE-9660.04.patch, HIVE-9660.05.patch, 
> HIVE-9660.06.patch, HIVE-9660.07.patch, HIVE-9660.07.patch, 
> HIVE-9660.08.patch, HIVE-9660.09.patch, HIVE-9660.10.patch, 
> HIVE-9660.10.patch, HIVE-9660.11.patch, HIVE-9660.patch, HIVE-9660.patch, 
> HIVE-9660.patch, owen-hive-9660.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Right now the end offset is estimated, which in some cases results in tons of 
> extra data being read.
> We can add a separate array to RowIndex (positions_v2?) that stores the number 
> of compressed buffers for each RG, or the end offset, or something similar, to 
> remove this estimation magic.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-15690) Speed up WebHCat DDL Response Time by Using JDBC

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-15690?focusedWorklogId=450148&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450148
 ]

ASF GitHub Bot logged work on HIVE-15690:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:28
Start Date: 24/Jun/20 00:28
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #133:
URL: https://github.com/apache/hive/pull/133


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450148)
Remaining Estimate: 23h 40m  (was: 23h 50m)
Time Spent: 20m  (was: 10m)

> Speed up WebHCat DDL Response Time by Using JDBC
> 
>
> Key: HIVE-15690
> URL: https://issues.apache.org/jira/browse/HIVE-15690
> Project: Hive
>  Issue Type: Improvement
>  Components: WebHCat
>Reporter: Amin Abbaspour
>Assignee: Amin Abbaspour
>Priority: Minor
>  Labels: easyfix, patch, performance, pull-request-available, 
> security
>   Original Estimate: 24h
>  Time Spent: 20m
>  Remaining Estimate: 23h 40m
>
> WebHCat launches a new hcat script for each DDL call, which makes it unsuitable 
> for interactive REST environments.
> This change speeds up /ddl query calls by running them over a JDBC connection 
> to the Hive Thrift server.
> Also, being a JDBC connection, this is secure and compatible with all access 
> policies defined in HiveServer2. The user does not have metadata visibility over 
> other databases (which is the case in hcat mode).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-15497) Unthrown SerDeException in ThriftJDBCBinarySerDe.java

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-15497?focusedWorklogId=450144&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450144
 ]

ASF GitHub Bot logged work on HIVE-15497:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:28
Start Date: 24/Jun/20 00:28
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #126:
URL: https://github.com/apache/hive/pull/126


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450144)
Time Spent: 20m  (was: 10m)

> Unthrown SerDeException in ThriftJDBCBinarySerDe.java
> -
>
> Key: HIVE-15497
> URL: https://issues.apache.org/jira/browse/HIVE-15497
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: JC
>Priority: Trivial
>  Labels: pull-request-available
> Attachments: HIVE-15497.txt
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> There is an unthrown SerDeException in 
> serde/src/java/org/apache/hadoop/hive/serde2/thrift/ThriftJDBCBinarySerDe.java
>  (found in the currenet github snapshot, 
> 4ba713ccd85c3706d195aeef9476e6e6363f1c21)
> {code}
>  91 initializeRowAndColumns();
>  92 try {
>  93   thriftFormatter.initialize(conf, tbl);
>  94 } catch (Exception e) {
>  95   new SerDeException(e);
>  96 }
> {code}
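The fix is simply to throw the constructed exception instead of discarding it. A self-contained illustration of the pattern, using a stand-in exception class since SerDeException lives in Hive's serde2 package:

```java
// Stand-in for org.apache.hadoop.hive.serde2.SerDeException.
class DemoSerDeException extends Exception {
    DemoSerDeException(Throwable cause) { super(cause); }
}

public class UnthrownExceptionDemo {

    // Buggy pattern: the exception object is created and silently dropped,
    // so callers never learn that initialization failed.
    static void buggyInit(boolean fail) throws DemoSerDeException {
        try {
            if (fail) throw new RuntimeException("init failed");
        } catch (Exception e) {
            new DemoSerDeException(e); // bug: missing 'throw'
        }
    }

    // Fixed pattern: propagate the wrapped exception.
    static void fixedInit(boolean fail) throws DemoSerDeException {
        try {
            if (fail) throw new RuntimeException("init failed");
        } catch (Exception e) {
            throw new DemoSerDeException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        buggyInit(true); // returns "successfully" despite the failure
        try {
            fixedInit(true);
        } catch (DemoSerDeException e) {
            System.out.println("fixed version propagated: " + e.getCause().getMessage());
        }
    }
}
```

Modern compilers and static analyzers (e.g. ErrorProne's DeadException check) flag exactly this pattern.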



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-15746) Fix default delimiter2 in str_to_map UDF or in method description

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-15746?focusedWorklogId=450145&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450145
 ]

ASF GitHub Bot logged work on HIVE-15746:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:28
Start Date: 24/Jun/20 00:28
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #140:
URL: https://github.com/apache/hive/pull/140


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450145)
Time Spent: 20m  (was: 10m)

> Fix default delimiter2 in str_to_map UDF or in method description
> -
>
> Key: HIVE-15746
> URL: https://issues.apache.org/jira/browse/HIVE-15746
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.1.1
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 2.3.0
>
> Attachments: HIVE-15746.1.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> According to UDF wiki and to GenericUDFStringToMap.java class comments 
> default delimiter 2 should be '='.
> But in the code default_del2 = ":"
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFStringToMap.java#L53
> We need to fix the code, or fix the method description and the UDF wiki.
> Let me know what you think.
> {code}
> str_to_map("a=1,b=2")
> vs
> str_to_map("a:1,b:2")
> {code}
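For reference, a minimal Java sketch of str_to_map-style splitting, showing how the choice of delimiter2 determines which input form parses. This is an illustration only, not Hive's implementation (the real UDF splits via regex on both delimiters):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class StrToMapDemo {

    // Splits 'input' into entries on delim1, then each entry into key/value
    // on the first occurrence of delim2.
    static Map<String, String> strToMap(String input, String delim1, String delim2) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String entry : input.split(Pattern.quote(delim1))) {
            int idx = entry.indexOf(delim2);
            if (idx >= 0) {
                result.put(entry.substring(0, idx), entry.substring(idx + delim2.length()));
            } else {
                result.put(entry, null); // no key/value separator found
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // With the documented default '=' the first form parses correctly;
        // with the code's actual default ':' only the second form does.
        System.out.println(strToMap("a=1,b=2", ",", "=")); // {a=1, b=2}
        System.out.println(strToMap("a:1,b:2", ",", ":")); // {a=1, b=2}
    }
}
```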



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-15423) Allowing Hive to reverse map IP from hostname for partition info

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-15423?focusedWorklogId=450152&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450152
 ]

ASF GitHub Bot logged work on HIVE-15423:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:28
Start Date: 24/Jun/20 00:28
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #122:
URL: https://github.com/apache/hive/pull/122


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450152)
Time Spent: 20m  (was: 10m)

> Allowing Hive to reverse map IP from hostname for partition info
> 
>
> Key: HIVE-15423
> URL: https://issues.apache.org/jira/browse/HIVE-15423
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Suresh Bahuguna
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Hive - Namenode hostname mismatch when running queries with 2 MR jobs.
> Hive tries to find Partition info using hdfs://:, 
> whereas the info has been hashed using hdfs://:.
> Exception raised in HiveFileFormatUtils.java:
> -
> java.io.IOException: cannot find dir = 
> hdfs://hd-nn-24:9000/tmp/hive-admin/hive_2013-08-30_06-11-52_007_1545561832334194535/-mr-10002/00_0
>  in pathToPartitionInfo: 
> [hdfs://192.168.156.24:9000/tmp/hive-admin/hive_2013-08-30_06-11-52_007_1545561832334194535/-mr-10002]
> at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java
> -



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23727) Improve SQLOperation log handling when cleanup

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23727?focusedWorklogId=450158&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450158
 ]

ASF GitHub Bot logged work on HIVE-23727:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:28
Start Date: 24/Jun/20 00:28
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 commented on pull request #1149:
URL: https://github.com/apache/hive/pull/1149#issuecomment-648507858


   @belugabehr can you take a look?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450158)
Time Spent: 40m  (was: 0.5h)

> Improve SQLOperation log handling when cleanup
> --
>
> Key: HIVE-23727
> URL: https://issues.apache.org/jira/browse/HIVE-23727
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The SQLOperation checks _if (shouldRunAsync() && state != 
> OperationState.CANCELED && state != OperationState.TIMEDOUT)_ to cancel the 
> background task. If that condition is true, the state cannot be 
> OperationState.CANCELED, so the logging branch guarded by state == 
> OperationState.CANCELED can never execute.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-14483) java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-14483?focusedWorklogId=450149&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450149
 ]

ASF GitHub Bot logged work on HIVE-14483:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:28
Start Date: 24/Jun/20 00:28
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #96:
URL: https://github.com/apache/hive/pull/96


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450149)
Time Spent: 20m  (was: 10m)

>  java.lang.ArrayIndexOutOfBoundsException 
> org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
> --
>
> Key: HIVE-14483
> URL: https://issues.apache.org/jira/browse/HIVE-14483
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.1.0
>Reporter: Sergey Zadoroshnyak
>Assignee: Sergey Zadoroshnyak
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.3.0, 2.0.2, 2.1.1, 2.2.0
>
> Attachments: HIVE-14483.01.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Error message:
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024
> at 
> org.apache.orc.impl.RunLengthIntegerReaderV2.nextVector(RunLengthIntegerReaderV2.java:369)
> at 
> org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays(TreeReaderFactory.java:1231)
> at 
> org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.readOrcByteArrays(TreeReaderFactory.java:1268)
> at 
> org.apache.orc.impl.TreeReaderFactory$StringDirectTreeReader.nextVector(TreeReaderFactory.java:1368)
> at 
> org.apache.orc.impl.TreeReaderFactory$StringTreeReader.nextVector(TreeReaderFactory.java:1212)
> at 
> org.apache.orc.impl.TreeReaderFactory$ListTreeReader.nextVector(TreeReaderFactory.java:1902)
> at 
> org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextBatch(TreeReaderFactory.java:1737)
> at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1045)
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.ensureBatch(RecordReaderImpl.java:77)
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.hasNext(RecordReaderImpl.java:89)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:230)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:205)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350)
> ... 22 more
> How to reproduce?
> Configure StringTreeReader  which contains StringDirectTreeReader as 
> TreeReader (DIRECT or DIRECT_V2 column encoding)
> batchSize = 1026;
> invoke method nextVector(ColumnVector previousVector,boolean[] isNull, final 
> int batchSize)
> scratchlcv is LongColumnVector with long[] vector  (length 1024)
>  which execute BytesColumnVectorUtil.readOrcByteArrays(stream, lengths, 
> scratchlcv,result, batchSize);
> As a result, in method commonReadByteArrays(stream, lengths, scratchlcv,
> result, (int) batchSize) we receive an 
> ArrayIndexOutOfBoundsException.
> If we use StringDictionaryTreeReader, then there is no exception, as we have 
> a verification  scratchlcv.ensureSize((int) batchSize, false) before 
> reader.nextVector(scratchlcv, scratchlcv.vector, batchSize);
> These changes were made for Hive 2.1.0 by corresponding commit 
> https://github.com/apache/hive/commit/0ac424f0a17b341efe299da167791112e4a953e9#diff-a1cec556fb2db4b69a1a4127a6908177R1467
>  for task  https://issues.apache.org/jira/browse/HIVE-12159 by Owen O'Malley
> How to fix?
> add only one line:
> scratchlcv.ensureSize((int) batchSize, false);
> in method 
> org.apache.orc.impl.TreeReaderFactory#BytesColumnVectorUtil#commonReadByteArrays(InStream
>  stream, IntegerReader lengths,
> LongColumnVector scratchlcv,
> BytesColumnVector result, final int batchSize) before invocation 
> lengths.nextVector(scratchlcv, scratchlcv.vector, batchSize);



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-13170) HiveAccumuloTableOutputFormat should implement HiveOutputFormat to ensure compatibility

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-13170?focusedWorklogId=450155&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450155
 ]

ASF GitHub Bot logged work on HIVE-13170:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:28
Start Date: 24/Jun/20 00:28
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #66:
URL: https://github.com/apache/hive/pull/66


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450155)
Time Spent: 20m  (was: 10m)

> HiveAccumuloTableOutputFormat should implement HiveOutputFormat to ensure 
> compatibility
> ---
>
> Key: HIVE-13170
> URL: https://issues.apache.org/jira/browse/HIVE-13170
> Project: Hive
>  Issue Type: Bug
>  Components: Accumulo Storage Handler
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Teng Qiu
>Assignee: Teng Qiu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This issue was caused by the same reason described in 
> https://issues.apache.org/jira/browse/HIVE-11166
> Neither HiveAccumuloTableOutputFormat nor HiveHBaseTableOutputFormat 
> implements HiveOutputFormat, which may break compatibility with other 
> APIs that use Hive, such as Spark's API.
> Spark expects the OutputFormat called by a Hive storage handler to be some kind 
> of HiveOutputFormat, which is totally reasonable.
> And since they are OutputFormats for Hive storage handlers, they should not 
> only extend the 3rd-party OutputFormat (AccumuloOutputFormat or 
> hbase.TableOutputFormat) but also implement the HiveOutputFormat interface.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-13877) Hplsql UDF doesn't work in Hive Cli

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-13877?focusedWorklogId=450153&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450153
 ]

ASF GitHub Bot logged work on HIVE-13877:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:28
Start Date: 24/Jun/20 00:28
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #78:
URL: https://github.com/apache/hive/pull/78


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450153)
Time Spent: 20m  (was: 10m)

> Hplsql UDF doesn't work in Hive Cli
> ---
>
> Key: HIVE-13877
> URL: https://issues.apache.org/jira/browse/HIVE-13877
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Affects Versions: 2.2.0
>Reporter: jiangxintong
>Assignee: Dmitry Tolpeko
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-13877.2.patch, HIVE-13877.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The Hive CLI will throw an "Error evaluating hplsql" exception when I use the 
> hplsql UDF like "SELECT hplsql('hello[:1]', columnName) FROM tableName".



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-11741) Add a new hook to run before query parse/compile

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-11741?focusedWorklogId=450157&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450157
 ]

ASF GitHub Bot logged work on HIVE-11741:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:28
Start Date: 24/Jun/20 00:28
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #63:
URL: https://github.com/apache/hive/pull/63


   





Issue Time Tracking
---

Worklog Id: (was: 450157)
Time Spent: 20m  (was: 10m)

> Add a new hook to run before query parse/compile
> 
>
> Key: HIVE-11741
> URL: https://issues.apache.org/jira/browse/HIVE-11741
> Project: Hive
>  Issue Type: New Feature
>  Components: Parser, SQL
>Affects Versions: 1.2.1
>Reporter: Guilherme Braccialli
>Assignee: Guilherme Braccialli
>Priority: Minor
>  Labels: patch, pull-request-available
> Attachments: HIVE-11741.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> It would be nice to allow developers to extend the Hive query language, making
> it possible to use custom wildcards in queries.
> People use Python or R to iterate over vectors or lists and build SQL
> commands; this could be supported directly in the SQL syntax.
> For example this python script:
> >>> sql = "SELECT state, "
> >>> for i in range(10):
> ...   sql += "   sum(case when type = " + str(i) + " then value end) as sum_of_" + str(i) + " ,"
> ...
> >>> sql += " count(1) as  total FROM table"
> >>> print(sql)
> Could be written directly in extended sql like this:
> SELECT state,
> %for id = 1 to 10%
>sum(case when type = %id% then value end) as sum_of_%id%,
> %end%
> , count(1) as total
> FROM table
> GROUP BY state
> This kind of extensibility could easily be added with a new hook after the
> VariableSubstitution call in the Driver.compile method.
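As a rough illustration of what such a pre-parse hook could do, here is a minimal sketch that expands the hypothetical %for ... %end% syntax from the example above via string substitution. The syntax and function name are assumptions taken from this ticket, not an existing Hive feature:

```python
import re

def expand_for_loops(sql: str) -> str:
    """Expand hypothetical %for var = a to b% ... %end% blocks in a SQL string."""
    pattern = re.compile(r"%for (\w+) = (\d+) to (\d+)%(.*?)%end%", re.DOTALL)

    def repl(match):
        var, lo, hi, body = (match.group(1), int(match.group(2)),
                             int(match.group(3)), match.group(4))
        # Repeat the body once per loop value, substituting %var% each time.
        return "".join(body.replace(f"%{var}%", str(i))
                       for i in range(lo, hi + 1))

    return pattern.sub(repl, sql)

sql = """SELECT state,
%for id = 1 to 3%
  sum(case when type = %id% then value end) as sum_of_%id%,
%end%
  count(1) as total
FROM table GROUP BY state"""

expanded = expand_for_loops(sql)
assert "sum_of_1" in expanded and "sum_of_3" in expanded
assert "%for" not in expanded
```

A real hook would run this expansion on the query text right after variable substitution and before parsing, leaving ordinary queries untouched.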





[jira] [Work logged] (HIVE-14585) Add travis.yml and update README to show build status

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-14585?focusedWorklogId=450146&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450146
 ]

ASF GitHub Bot logged work on HIVE-14585:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:28
Start Date: 24/Jun/20 00:28
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #97:
URL: https://github.com/apache/hive/pull/97


   





Issue Time Tracking
---

Worklog Id: (was: 450146)
Time Spent: 20m  (was: 10m)

> Add travis.yml and update README to show build status
> -
>
> Key: HIVE-14585
> URL: https://issues.apache.org/jira/browse/HIVE-14585
> Project: Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 2.3.0
>
> Attachments: HIVE-14585.1.patch, HIVE-14585.2.patch, 
> HIVE-14585.3.patch, HIVE-14585.4.patch, HIVE-14585.5.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Travis CI is free for all open source projects. To start off, we can just run
> the builds and show the status on the GitHub page. In the future, we can
> leverage the tests and explore parallel testing.
> NO PRECOMMIT TESTS





[jira] [Work logged] (HIVE-15424) Hive dropped table during table creation if table already exists

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-15424?focusedWorklogId=450151&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450151
 ]

ASF GitHub Bot logged work on HIVE-15424:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:28
Start Date: 24/Jun/20 00:28
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #123:
URL: https://github.com/apache/hive/pull/123


   





Issue Time Tracking
---

Worklog Id: (was: 450151)
Time Spent: 20m  (was: 10m)

> Hive dropped table during table creation if table already exists
> 
>
> Key: HIVE-15424
> URL: https://issues.apache.org/jira/browse/HIVE-15424
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Suresh Bahuguna
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> While creating a table, rollbackCreateTable() shouldn't be called if the
> table already exists.





[jira] [Work logged] (HIVE-16480) ORC file with empty array and array fails to read

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-16480?focusedWorklogId=450134&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450134
 ]

ASF GitHub Bot logged work on HIVE-16480:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:27
Start Date: 24/Jun/20 00:27
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #285:
URL: https://github.com/apache/hive/pull/285


   





Issue Time Tracking
---

Worklog Id: (was: 450134)
Time Spent: 20m  (was: 10m)

> ORC file with empty array and array fails to read
> 
>
> Key: HIVE-16480
> URL: https://issues.apache.org/jira/browse/HIVE-16480
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1, 2.2.0
>Reporter: David Capwell
>Assignee: Owen O'Malley
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.1.2, 2.2.1
>
> Attachments: HIVE-16480.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We have a schema that has an array in it.  We were unable to read this
> file, and digging into ORC it seems the issue occurs when the array is empty.
> Here is the stack trace
> {code:title=EmptyList.log|borderStyle=solid}
> ERROR 2017-04-19 09:29:17,075 [main] [EmptyList] [line 56] Failed to work 
> with type float 
> java.io.IOException: Error reading file: 
> /var/folders/t8/t5x1031d7mn17f6xpwnkkv_4gn/T/1492619355819-0/file-float.orc
>   at 
> org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1052) 
> ~[hive-orc-2.1.1.jar:2.1.1]
>   at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.nextBatch(RecordReaderImpl.java:135)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>   at EmptyList.emptyList(EmptyList.java:49) ~[test-classes/:na]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[na:1.8.0_121]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[na:1.8.0_121]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[na:1.8.0_121]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_121]
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>  [junit-4.12.jar:4.12]
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  [junit-4.12.jar:4.12]
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>  [junit-4.12.jar:4.12]
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  [junit-4.12.jar:4.12]
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) 
> [junit-4.12.jar:4.12]
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>  [junit-4.12.jar:4.12]
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>  [junit-4.12.jar:4.12]
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) 
> [junit-4.12.jar:4.12]
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) 
> [junit-4.12.jar:4.12]
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) 
> [junit-4.12.jar:4.12]
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) 
> [junit-4.12.jar:4.12]
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) 
> [junit-4.12.jar:4.12]
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363) 
> [junit-4.12.jar:4.12]
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:137) [junit-4.12.jar:4.12]
>   at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
>  [junit-rt.jar:na]
>   at 
> com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51)
>  [junit-rt.jar:na]
>   at 
> com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:237)
>  [junit-rt.jar:na]
>   at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70) 
> [junit-rt.jar:na]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[na:1.8.0_121]
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[na:1.8.0_121]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[na:1.8.0_121]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_121]
> 

[jira] [Work logged] (HIVE-19584) Dictionary encoding for string types

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19584?focusedWorklogId=450132&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450132
 ]

ASF GitHub Bot logged work on HIVE-19584:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:27
Start Date: 24/Jun/20 00:27
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #355:
URL: https://github.com/apache/hive/pull/355


   





Issue Time Tracking
---

Worklog Id: (was: 450132)
Time Spent: 20m  (was: 10m)

> Dictionary encoding for string types
> 
>
> Key: HIVE-19584
> URL: https://issues.apache.org/jira/browse/HIVE-19584
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-19584.1.patch, HIVE-19584.10.patch, 
> HIVE-19584.2.patch, HIVE-19584.3.patch, HIVE-19584.4.patch, 
> HIVE-19584.5.patch, HIVE-19584.6.patch, HIVE-19584.7.patch, 
> HIVE-19584.8.patch, HIVE-19584.9.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Apache Arrow supports dictionary encoding for some data types, so implement
> dictionary encoding for string types in the Arrow SerDe.
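For illustration, the idea behind dictionary encoding can be sketched in plain Python. This shows the general scheme (each string stored once, values replaced by integer indices), not Arrow's actual implementation:

```python
def dictionary_encode(values):
    """Encode a list of strings as (indices, dictionary), Arrow-style."""
    dictionary = []      # unique strings, in first-seen order
    index_of = {}        # string -> its position in the dictionary
    indices = []
    for v in values:
        if v not in index_of:
            index_of[v] = len(dictionary)
            dictionary.append(v)
        indices.append(index_of[v])
    return indices, dictionary

indices, dictionary = dictionary_encode(["aa", "bb", "aa", "cc", "bb"])
assert indices == [0, 1, 0, 2, 1]
assert dictionary == ["aa", "bb", "cc"]
# Decoding is a simple index lookup:
assert [dictionary[i] for i in indices] == ["aa", "bb", "aa", "cc", "bb"]
```

The win is that repeated strings are stored once and the column itself becomes small fixed-width integers.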





[jira] [Work logged] (HIVE-13745) UDF current_date, current_timestamp, unix_timestamp NPE

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-13745?focusedWorklogId=450141&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450141
 ]

ASF GitHub Bot logged work on HIVE-13745:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:27
Start Date: 24/Jun/20 00:27
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #76:
URL: https://github.com/apache/hive/pull/76


   





Issue Time Tracking
---

Worklog Id: (was: 450141)
Time Spent: 20m  (was: 10m)

> UDF current_date, current_timestamp, unix_timestamp NPE
> -
>
> Key: HIVE-13745
> URL: https://issues.apache.org/jira/browse/HIVE-13745
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Biao Wu
>Assignee: Biao Wu
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-13745.1.patch, HIVE-13745.2-branch-2.patch, 
> HIVE-13745.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> NullPointerException when current_date is used in mapreduce





[jira] [Work logged] (HIVE-12698) Remove exposure to internal privilege and principal classes in HiveAuthorizer

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-12698?focusedWorklogId=450142&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450142
 ]

ASF GitHub Bot logged work on HIVE-12698:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:27
Start Date: 24/Jun/20 00:27
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #58:
URL: https://github.com/apache/hive/pull/58


   





Issue Time Tracking
---

Worklog Id: (was: 450142)
Time Spent: 20m  (was: 10m)

> Remove exposure to internal privilege and principal classes in HiveAuthorizer
> -
>
> Key: HIVE-12698
> URL: https://issues.apache.org/jira/browse/HIVE-12698
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Thejas Nair
>Assignee: Thejas Nair
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12698.1.patch, HIVE-12698.2.patch, 
> HIVE-12698.3.patch, HIVE-12698.4.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The changes in HIVE-11179 expose several internal classes to 
> HiveAuthorization implementations. These include PrivilegeObjectDesc, 
> PrivilegeDesc, PrincipalDesc and AuthorizationUtils.
> We should avoid exposing these to all authorization implementations, while
> still making it possible for Apache Sentry (incubating) to customize the
> mapping of internal classes to the public API classes.





[jira] [Work logged] (HIVE-13545) Add GLOBAL Type to Entity

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-13545?focusedWorklogId=450140&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450140
 ]

ASF GitHub Bot logged work on HIVE-13545:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:27
Start Date: 24/Jun/20 00:27
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #73:
URL: https://github.com/apache/hive/pull/73


   





Issue Time Tracking
---

Worklog Id: (was: 450140)
Time Spent: 20m  (was: 10m)

> Add GLOBAL Type to Entity
> -
>
> Key: HIVE-13545
> URL: https://issues.apache.org/jira/browse/HIVE-13545
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-13545.001.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {{ql/src/java/org/apache/hadoop/hive/ql/hooks/Entity.java}} doesn't have the
> {{GLOBAL}} type; it should match
> {{org.apache.hadoop.hive.ql.security.authorization.plugin.HivePrivilegeObject.HivePrivilegeObjectType}}.
> At the same time, we should enable custom conversion from Entity to
> HivePrivilegeObject.





[jira] [Work logged] (HIVE-15705) Event replication for constraints

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-15705?focusedWorklogId=450138&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450138
 ]

ASF GitHub Bot logged work on HIVE-15705:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:27
Start Date: 24/Jun/20 00:27
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #219:
URL: https://github.com/apache/hive/pull/219


   





Issue Time Tracking
---

Worklog Id: (was: 450138)
Time Spent: 20m  (was: 10m)

> Event replication for constraints
> -
>
> Key: HIVE-15705
> URL: https://issues.apache.org/jira/browse/HIVE-15705
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Daniel Dai
>Assignee: Daniel Dai
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-15705.1.patch, HIVE-15705.2.patch, 
> HIVE-15705.3.patch, HIVE-15705.4.patch, HIVE-15705.5.patch, 
> HIVE-15705.6.patch, HIVE-15705.7.patch, HIVE-15705.8.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Make event replication for primary key and foreign key work.





[jira] [Work logged] (HIVE-17331) Path must be used as key type of the pathToAlises

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17331?focusedWorklogId=450135&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450135
 ]

ASF GitHub Bot logged work on HIVE-17331:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:27
Start Date: 24/Jun/20 00:27
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #292:
URL: https://github.com/apache/hive/pull/292


   





Issue Time Tracking
---

Worklog Id: (was: 450135)
Time Spent: 20m  (was: 10m)

> Path must be used as key type of the pathToAlises
> -
>
> Key: HIVE-17331
> URL: https://issues.apache.org/jira/browse/HIVE-17331
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleg Danilov
>Assignee: Oleg Danilov
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-17331.2.patch, HIVE-17331.3.patch, 
> HIVE-17331.4.patch, HIVE-17331.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This code uses String instead of Path as the key type of the pathToAliases
> map, so it seems get(String) always returns null.
> +*GenMapRedUtils.java*+
> {code:java}
> for (int pos = 0; pos < size; pos++) {
>   String taskTmpDir = taskTmpDirLst.get(pos);
>   TableDesc tt_desc = tt_descLst.get(pos);
>   MapWork mWork = plan.getMapWork();
>   if (mWork.getPathToAliases().get(taskTmpDir) == null) {
>     taskTmpDir = taskTmpDir.intern();
>     Path taskTmpDirPath = StringInternUtils.internUriStringsInPath(new Path(taskTmpDir));
>     mWork.removePathToAlias(taskTmpDirPath);
>     mWork.addPathToAlias(taskTmpDirPath, taskTmpDir);
>     mWork.addPathToPartitionInfo(taskTmpDirPath, new PartitionDesc(tt_desc, null));
>     mWork.getAliasToWork().put(taskTmpDir, topOperators.get(pos));
> {code}
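The failure mode — looking up a Path-keyed map with a String — can be mimicked in Python with a toy Path class. This is an analogy only, not Hive code:

```python
class Path:
    """Minimal stand-in for Hadoop's Path: equal only to other Path objects."""
    def __init__(self, s):
        self.s = s
    def __eq__(self, other):
        return isinstance(other, Path) and self.s == other.s
    def __hash__(self):
        return hash(("Path", self.s))

path_to_aliases = {Path("/tmp/task1"): ["a"]}

# Looking up with a plain string never matches a Path key, so the
# `get(...) == null` branch in the Java snippet is always taken:
assert path_to_aliases.get("/tmp/task1") is None
# Looking up with the correct key type works:
assert path_to_aliases.get(Path("/tmp/task1")) == ["a"]
```

In Java the symptom is the same: `Map<Path, ...>.get(String)` can never find an entry, which is exactly what the ticket reports.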





[jira] [Work logged] (HIVE-18777) Add Authorization interface to support information_schema integration with external authorization

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-18777?focusedWorklogId=450137&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450137
 ]

ASF GitHub Bot logged work on HIVE-18777:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:27
Start Date: 24/Jun/20 00:27
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #312:
URL: https://github.com/apache/hive/pull/312


   





Issue Time Tracking
---

Worklog Id: (was: 450137)
Time Spent: 20m  (was: 10m)

> Add Authorization interface to support information_schema integration with 
> external authorization
> -
>
> Key: HIVE-18777
> URL: https://issues.apache.org/jira/browse/HIVE-18777
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas Nair
>Assignee: Thejas Nair
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-18777.1.patch, HIVE-18777.2.patch, 
> HIVE-18777.3.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> HIVE-1010 added support for information_schema. However, authorization
> information is not integrated when another project such as Ranger is used for
> authorization.
> We need to add an API which Ranger/Sentry can implement, so that it is
> possible to retrieve authorization policy information from them.
> The existing API only supports checking whether a user has a permission on an
> object and can't be used to retrieve policy details.





[jira] [Work logged] (HIVE-19423) REPL LOAD creates staging directory in source dump directory instead of table data location

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19423?focusedWorklogId=450131&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450131
 ]

ASF GitHub Bot logged work on HIVE-19423:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:27
Start Date: 24/Jun/20 00:27
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #340:
URL: https://github.com/apache/hive/pull/340


   





Issue Time Tracking
---

Worklog Id: (was: 450131)
Time Spent: 20m  (was: 10m)

> REPL LOAD creates staging directory in source dump directory instead of table 
> data location
> ---
>
> Key: HIVE-19423
> URL: https://issues.apache.org/jira/browse/HIVE-19423
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2, repl
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: Hive, Repl, pull-request-available
> Fix For: 3.0.0, 3.1.0
>
> Attachments: HIVE-19423.01.patch, HIVE-19423.02.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> REPL LOAD creates the staging directory in the source dump directory instead
> of the table data location. In case of replication from on-prem to cloud,
> this can cause problems.





[jira] [Work logged] (HIVE-17366) Constraint replication in bootstrap

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17366?focusedWorklogId=450136&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450136
 ]

ASF GitHub Bot logged work on HIVE-17366:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:27
Start Date: 24/Jun/20 00:27
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #236:
URL: https://github.com/apache/hive/pull/236


   





Issue Time Tracking
---

Worklog Id: (was: 450136)
Time Spent: 20m  (was: 10m)

> Constraint replication in bootstrap
> ---
>
> Key: HIVE-17366
> URL: https://issues.apache.org/jira/browse/HIVE-17366
> Project: Hive
>  Issue Type: New Feature
>  Components: repl
>Reporter: Daniel Dai
>Assignee: Daniel Dai
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-17366.1.patch, HIVE-17366.2.patch, 
> HIVE-17366.3.patch, HIVE-17366.4.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Incremental constraint replication is tracked in HIVE-15705. This is to track 
> the bootstrap replication.





[jira] [Work logged] (HIVE-15848) count or sum distinct incorrect when hive.optimize.reducededuplication set to true

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-15848?focusedWorklogId=450133&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450133
 ]

ASF GitHub Bot logged work on HIVE-15848:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:27
Start Date: 24/Jun/20 00:27
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #150:
URL: https://github.com/apache/hive/pull/150


   





Issue Time Tracking
---

Worklog Id: (was: 450133)
Time Spent: 20m  (was: 10m)

> count or sum distinct incorrect when hive.optimize.reducededuplication set to 
> true
> --
>
> Key: HIVE-15848
> URL: https://issues.apache.org/jira/browse/HIVE-15848
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0
>Reporter: Biao Wu
>Assignee: Zoltan Haindrich
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 2.3.0
>
> Attachments: HIVE-15848.1.patch, HIVE-15848.2.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Test Table:
> {code:sql}
> create table test(id int,key int,name int);
> {code}
> Data:
> ||id||key||name||
> |1|1|2|
> |1|2|3|
> |1|3|2|
> |1|4|2|
> |1|5|3|
> Test SQL1:
> {code:sql}
> select id,count(Distinct key),count(Distinct name)
> from (select id,key,name from count_distinct_test group by id,key,name)m
> group by id;
> {code}
> result:
> |1|5|4|
> expect:
> |1|5|2|
> Test SQL2:
> {code:sql}
> select id,count(Distinct name),count(Distinct key)
> from (select id,key,name from count_distinct_test group by id,name,key)m
> group by id;
> {code}
> result:
> |1|2|5|
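A quick sanity check of the expected counts against the ticket's test data, recomputed in plain Python (independent of Hive, just to confirm what the correct answer should be):

```python
# (id, key, name) rows from the ticket's test data
rows = [
    (1, 1, 2),
    (1, 2, 3),
    (1, 3, 2),
    (1, 4, 2),
    (1, 5, 3),
]

distinct_keys = {key for _id, key, _name in rows}
distinct_names = {name for _id, _key, name in rows}

assert len(distinct_keys) == 5   # count(Distinct key)  should be 5
assert len(distinct_names) == 2  # count(Distinct name) should be 2, not 4
```

This confirms the expected result |1|5|2| and shows the reported |1|5|4| is wrong.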





[jira] [Work logged] (HIVE-18130) Update table path to storage description parameter when alter a table

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-18130?focusedWorklogId=450130&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450130
 ]

ASF GitHub Bot logged work on HIVE-18130:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:27
Start Date: 24/Jun/20 00:27
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #273:
URL: https://github.com/apache/hive/pull/273


   





Issue Time Tracking
---

Worklog Id: (was: 450130)
Time Spent: 20m  (was: 10m)

> Update table path to storage description parameter when alter a table
> -
>
> Key: HIVE-18130
> URL: https://issues.apache.org/jira/browse/HIVE-18130
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Zhe Sun
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-18130.02.patch, HIVE-18130.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When a managed table is created by Spark, the table path is stored not only
> in the `location` field (first figure) but also in the `parameters.path`
> field (second figure).
> The `parameters.path` field should also be checked and updated when altering
> a table.
> An example explaining this issue can be found at
> https://github.com/apache/hive/pull/273
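A minimal sketch of the proposed fix, using a plain dict as a stand-in for the metastore table object. The field names follow the ticket's description; this is not actual metastore code:

```python
def alter_table_location(table: dict, new_location: str) -> dict:
    """Update the top-level location AND the Spark-written parameters['path']."""
    table = dict(table)                      # work on a copy
    table["location"] = new_location
    params = dict(table.get("parameters", {}))
    if "path" in params:                     # Spark stores the path here too
        params["path"] = new_location
    table["parameters"] = params
    return table

t = {
    "location": "hdfs://nn/warehouse/db/t",
    "parameters": {"path": "hdfs://nn/warehouse/db/t"},
}
t2 = alter_table_location(t, "hdfs://nn/warehouse/db/t_renamed")
# Both copies of the path stay in sync after the alter:
assert t2["location"] == t2["parameters"]["path"] == "hdfs://nn/warehouse/db/t_renamed"
```

Without the `parameters["path"]` update, Spark would keep reading the stale location even though the metastore `location` field was changed — which is the bug described above.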





[jira] [Work logged] (HIVE-13539) HiveHFileOutputFormat searching the wrong directory for HFiles

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-13539?focusedWorklogId=450143&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450143
 ]

ASF GitHub Bot logged work on HIVE-13539:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:27
Start Date: 24/Jun/20 00:27
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #74:
URL: https://github.com/apache/hive/pull/74


   





Issue Time Tracking
---

Worklog Id: (was: 450143)
Time Spent: 20m  (was: 10m)

> HiveHFileOutputFormat searching the wrong directory for HFiles
> --
>
> Key: HIVE-13539
> URL: https://issues.apache.org/jira/browse/HIVE-13539
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Affects Versions: 1.1.0
> Environment: Built into CDH 5.4.7
>Reporter: Tim Robertson
>Assignee: Chaoyu Tang
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 2.1.1, 2.2.0
>
> Attachments: HIVE-13539.1.patch, HIVE-13539.patch, 
> hive_hfile_output_format.q, hive_hfile_output_format.q.out
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When creating HFiles for a bulkload in HBase I believe it is looking in the 
> wrong directory to find the HFiles, resulting in the following exception:
> {code}
> Error: java.lang.RuntimeException: Hive Runtime Error while closing 
> operators: java.io.IOException: Multiple family directories found in 
> hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:295)
>   at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:453)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.IOException: Multiple family directories found in 
> hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:188)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:958)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:287)
>   ... 7 more
> Caused by: java.io.IOException: Multiple family directories found in 
> hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary
>   at 
> org.apache.hadoop.hive.hbase.HiveHFileOutputFormat$1.close(HiveHFileOutputFormat.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:185)
>   ... 11 more
> {code}
> The issue is that it looks for the HFiles in 
> {{hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary}}
>  when I believe it should be looking in the task attempt subfolder, such as 
> {{hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary/attempt_1461004169450_0002_r_00_1000}}.
> This can be reproduced in any HFile creation such as:
> {code:sql}
> CREATE TABLE coords_hbase(id INT, x DOUBLE, y DOUBLE)
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES (
>   'hbase.columns.mapping' = ':key,o:x,o:y',
>   'hbase.table.default.storage.type' = 'binary');
> SET hfile.family.path=/tmp/coords_hfiles/o; 
> SET hive.hbase.generatehfiles=true;
> INSERT OVERWRITE TABLE coords_hbase 
> SELECT id, decimalLongitude, decimalLatitude
> FROM source
> CLUSTER BY id; 
> {code}
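The directory mix-up above can be illustrated with a hedged, stand-alone sketch of the single-family-directory check (`FamilyDirCheck` and its method are illustrative names; Hive's real code lives in `HiveHFileOutputFormat` and raises `IOException`). Scanning the parent `_temporary` directory sees one subdirectory per task attempt and trips the check, while scanning inside a single `attempt_*` directory sees only the family directory (e.g. `o`):

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the "exactly one column-family directory" check (not Hive's code).
public class FamilyDirCheck {
  // Returns the single family directory name, given the subdirectories of the
  // path being scanned; fails if zero or multiple candidates are present.
  public static String singleFamilyDir(List<String> subdirs, String scannedPath) {
    if (subdirs.size() != 1) {
      // HiveHFileOutputFormat throws IOException here; unchecked for brevity
      throw new IllegalStateException(
          "Multiple family directories found in " + scannedPath);
    }
    return subdirs.get(0);
  }

  public static void main(String[] args) {
    // Scanning inside one task attempt: only the family dir is visible.
    System.out.println(singleFamilyDir(
        Arrays.asList("o"), "/warehouse/coords_hbase/.../attempt_1"));
  }
}
```

Scanning the attempt subfolder succeeds; scanning its parent, which holds one directory per attempt, reproduces the reported failure.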
> Any advice greatly appreciated



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-15900) Beeline prints tez job progress in stdout instead of stderr

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-15900?focusedWorklogId=450139&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450139
 ]

ASF GitHub Bot logged work on HIVE-15900:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:27
Start Date: 24/Jun/20 00:27
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #148:
URL: https://github.com/apache/hive/pull/148


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450139)
Time Spent: 20m  (was: 10m)

> Beeline prints tez job progress in stdout instead of stderr
> ---
>
> Key: HIVE-15900
> URL: https://issues.apache.org/jira/browse/HIVE-15900
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.2.0
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Thejas Nair
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.2.0
>
> Attachments: HIVE-15900.1.patch, HIVE-15900.2.patch, 
> HIVE-15900.3.patch, HIVE-15900.3.patch, std_out
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Tez job progress messages are being written to stdout instead of stderr.
> Attaching the output file for the below command, with the Tez job status 
> printed:
> $HIVE_HOME/bin/beeline -n <user> -p <password> -u "<jdbc-url>" 
> --outputformat=tsv -e "analyze table studenttab10k compute statistics;" > 
> stdout
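The intended behavior can be sketched as follows (`ProgressStreams` and its methods are illustrative stand-ins, not Beeline's actual classes): progress and status lines belong on stderr, so that redirecting stdout captures only query results.

```java
// Sketch: keep operational chatter off stdout so shell redirection of
// results (e.g. "beeline ... > stdout") stays clean.
public class ProgressStreams {
  public static void reportProgress(String line) {
    System.err.println(line);   // progress/status -> stderr
  }

  public static void emitResultRow(String row) {
    System.out.println(row);    // query results -> stdout
  }

  public static void main(String[] args) {
    reportProgress("Map 1: 0/1  Reducer 2: 0/1");
    emitResultRow("studenttab10k\t10000");
  }
}
```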



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-14660) ArrayIndexOutOfBoundsException on delete

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-14660?focusedWorklogId=450129&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450129
 ]

ASF GitHub Bot logged work on HIVE-14660:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:27
Start Date: 24/Jun/20 00:27
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #299:
URL: https://github.com/apache/hive/pull/299


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450129)
Time Spent: 20m  (was: 10m)

> ArrayIndexOutOfBoundsException on delete
> 
>
> Key: HIVE-14660
> URL: https://issues.apache.org/jira/browse/HIVE-14660
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor, Transactions
>Affects Versions: 1.2.1
>Reporter: Benjamin BONNET
>Assignee: Benjamin BONNET
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-14660.1-banch-1.2.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Hi,
> DELETE on an ACID table may fail on an ArrayIndexOutOfBoundsException.
> That bug occurs in the Reduce phase when there are fewer reducers than the 
> number of table buckets.
> In order to reproduce, create a simple ACID table :
> {code:sql}
> CREATE TABLE test (`cle` bigint,`valeur` string)
>  PARTITIONED BY (`annee` string)
>  CLUSTERED BY (cle) INTO 5 BUCKETS
>  TBLPROPERTIES ('transactional'='true');
> {code}
> Populate it with lines distributed among all buckets, with random values and 
> a few partitions.
> Force the number of reducers to be lower than the number of buckets:
> {code:sql}
> set mapred.reduce.tasks=1;
> {code}
> Then execute a delete that will remove many lines from all the buckets.
> {code:sql}
> DELETE FROM test WHERE valeur<'some_value';
> {code}
> Then you will get an ArrayIndexOutOfBoundsException :
> {code}
> 2016-08-22 21:21:02,500 [FATAL] [TezChild] |tez.ReduceRecordSource|: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row (tag=0) 
> {"key":{"reducesinkkey0":{"transactionid":119,"bucketid":0,"rowid":0}},"value":{"_col0":"4"}}
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:352)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:274)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:252)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:181)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 5
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:769)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:343)
> ... 17 more
> {code}
> Adding logs into FileSinkOperator, one sees the operator deal with buckets 
> 0, 1, 2, 3, 4, then 0 again, and it fails at line 769: 
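A self-contained sketch of why the reducer/bucket mismatch overflows (hypothetical `BucketIds` helper, not Hive's actual hashing code): bucket ids are derived from the clustering key, so a single reducer processing all rows still observes ids 0 through 4 for a 5-bucket table, while an array sized by the reducer count (here 1) is indexed past its end.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative bucket-id derivation: the id depends on the CLUSTERED BY key,
// not on how many reducers the job happens to run.
public class BucketIds {
  public static int bucketIdFor(long key, int numBuckets) {
    // mask keeps the hash non-negative before the modulo
    return (int) ((key & Long.MAX_VALUE) % numBuckets);
  }

  public static void main(String[] args) {
    Set<Integer> seen = new HashSet<>();
    for (long key = 0; key < 100; key++) {
      seen.add(bucketIdFor(key, 5));
    }
    // One reducer consuming all keys still sees every bucket id 0..4,
    // so writer state sized by reducer count (1) cannot hold them.
    System.out.println(seen);
  }
}
```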

[jira] [Work logged] (HIVE-17077) Hive should raise StringIndexOutOfBoundsException when LPAD/RPAD len character's value is negative number

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17077?focusedWorklogId=450119&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450119
 ]

ASF GitHub Bot logged work on HIVE-17077:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:26
Start Date: 24/Jun/20 00:26
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #203:
URL: https://github.com/apache/hive/pull/203


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450119)
Time Spent: 20m  (was: 10m)

> Hive should raise StringIndexOutOfBoundsException when LPAD/RPAD len 
> character's value is negative number
> -
>
> Key: HIVE-17077
> URL: https://issues.apache.org/jira/browse/HIVE-17077
> Project: Hive
>  Issue Type: Bug
>Reporter: Lingang Deng
>Assignee: Lingang Deng
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> lpad (and rpad) throw an exception when the second argument is a negative 
> number, as follows:
> {code:java}
> hive> select lpad("hello", -1 ,"h");
> FAILED: StringIndexOutOfBoundsException String index out of range: -1
> hive> select rpad("hello", -1 ,"h");
> FAILED: StringIndexOutOfBoundsException String index out of range: -1
> {code}
> Maybe we should return a friendly result, as MySQL does.
> {code:java}
> mysql> select lpad("hello", -1 ,"h");
> +------------------------+
> | lpad("hello", -1 ,"h") |
> +------------------------+
> | NULL                   |
> +------------------------+
> 1 row in set (0.00 sec)
> mysql> select rpad("hello", -1 ,"h");
> +------------------------+
> | rpad("hello", -1 ,"h") |
> +------------------------+
> | NULL                   |
> +------------------------+
> 1 row in set (0.00 sec)
> {code}
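A minimal sketch of the MySQL-style behavior suggested above, returning null instead of throwing on a negative target length. `Pad` and its method are plain-Java stand-ins for illustration, not Hive's actual UDF implementation:

```java
// Illustrative lpad: NULL (Java null) on negative length, truncate when the
// input is already long enough, otherwise pad on the left.
public class Pad {
  public static String lpad(String s, int len, String pad) {
    if (len < 0) {
      return null;                    // MySQL returns NULL here
    }
    if (len <= s.length()) {
      return s.substring(0, len);     // truncate, as LPAD does
    }
    StringBuilder sb = new StringBuilder();
    while (sb.length() < len - s.length()) {
      sb.append(pad);                 // repeat the pad string as needed
    }
    return sb.substring(0, len - s.length()) + s;
  }

  public static void main(String[] args) {
    System.out.println(Pad.lpad("hello", -1, "h"));  // null, not an exception
  }
}
```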



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-19103) Nested structure Projection Push Down in Hive with ORC

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19103?focusedWorklogId=450124&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450124
 ]

ASF GitHub Bot logged work on HIVE-19103:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:26
Start Date: 24/Jun/20 00:26
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #330:
URL: https://github.com/apache/hive/pull/330


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450124)
Time Spent: 20m  (was: 10m)

> Nested structure Projection Push Down in Hive with ORC
> --
>
> Key: HIVE-19103
> URL: https://issues.apache.org/jira/browse/HIVE-19103
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive, ORC
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Critical
>  Labels: pull-request-available
> Attachments: HIVE-19103.2.patch, HIVE-19103.3.patch, HIVE-19103.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Reading only the required columns in a nested structure schema.
> Example - 
> *Current state* - 
> Schema  -  struct<a:int,b:string,c:struct<d:int,e:struct<f:int>,g:string>>
> Query - select c.e.f from t where c.e.f > 10;
> Current state - reads the entire c struct from the file and then filters, 
> because only "hive.io.file.readcolumn.ids" is consulted, which causes all 
> children of the column to be selected for reading from the file.
> Conf -
>  _hive.io.file.readcolumn.ids  = "2"
>  hive.io.file.readNestedColumn.paths = "c.e.f"_
> Result -   
> boolean[] include  = [true,false,false,true,true,true,true,true]
> *Expected state* -
> Schema  -  struct<a:int,b:string,c:struct<d:int,e:struct<f:int>,g:string>>
> Query - select c.e.f from t where c.e.f > 10;
> Expected state - instead of reading the entire c struct from the file, read 
> only the f column by also consulting "hive.io.file.readNestedColumn.paths".
> Conf -
>  _hive.io.file.readcolumn.ids  = "2"
>  hive.io.file.readNestedColumn.paths = "c.e.f"_
> Result -   
> boolean[] include  = [true,false,false,true,false,true,true,false]
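The expected include-vector computation can be sketched as below. This is illustrative code, not Hive's reader: it assumes ORC-style pre-order column ids (0:root, 1:a, 2:b, 3:c, 4:c.d, 5:c.e, 6:c.e.f, 7:c.g) matching the example, and marks a column for reading only if it is the requested leaf or one of its ancestors.

```java
import java.util.List;

// Sketch: build an ORC-style include[] vector from one nested column path.
public class NestedInclude {
  public static boolean[] include(List<String> preorderPaths, String wanted) {
    boolean[] inc = new boolean[preorderPaths.size()];
    for (int i = 0; i < preorderPaths.size(); i++) {
      String p = preorderPaths.get(i);
      // the root (""), every ancestor prefix of wanted, and the leaf itself
      inc[i] = p.isEmpty() || wanted.equals(p) || wanted.startsWith(p + ".");
    }
    return inc;
  }
}
```

For the path `c.e.f` this yields `[true,false,false,true,false,true,true,false]`, the expected-state vector from the description.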



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-17990) Add Thrift and DB storage for Schema Registry objects

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17990?focusedWorklogId=450125&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450125
 ]

ASF GitHub Bot logged work on HIVE-17990:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:26
Start Date: 24/Jun/20 00:26
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #308:
URL: https://github.com/apache/hive/pull/308


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450125)
Time Spent: 20m  (was: 10m)

> Add Thrift and DB storage for Schema Registry objects
> -
>
> Key: HIVE-17990
> URL: https://issues.apache.org/jira/browse/HIVE-17990
> Project: Hive
>  Issue Type: Sub-task
>  Components: Standalone Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: Adding-Schema-Registry-to-Metastore.pdf, 
> HIVE-17990.2.patch, HIVE-17990.3.patch, HIVE-17990.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This JIRA tracks changes to Thrift, RawStore, and DB scripts to support 
> objects in the Schema Registry.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-12898) Hive should support ORC block skipping on nested fields

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-12898?focusedWorklogId=450117&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450117
 ]

ASF GitHub Bot logged work on HIVE-12898:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:26
Start Date: 24/Jun/20 00:26
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #346:
URL: https://github.com/apache/hive/pull/346


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450117)
Time Spent: 20m  (was: 10m)

> Hive should support ORC block skipping on nested fields
> ---
>
> Key: HIVE-12898
> URL: https://issues.apache.org/jira/browse/HIVE-12898
> Project: Hive
>  Issue Type: Improvement
>  Components: ORC
>Affects Versions: 0.14.0, 1.2.1
>Reporter: Michael Haeusler
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Hive supports predicate pushdown (block skipping) for ORC tables only on 
> top-level fields. Hive should also support block skipping on nested fields 
> (within structs).
> Example top-level: the following query selects 0 rows, using a predicate on 
> top-level column foo. We also see 0 INPUT_RECORDS in the summary:
> {code:sql}
> SET hive.tez.exec.print.summary=true;
> CREATE TABLE t_toplevel STORED AS ORC AS SELECT 23 AS foo;
> SELECT * FROM t_toplevel WHERE foo=42 ORDER BY foo;
> [...]
> VERTICES  TOTAL_TASKS  FAILED_ATTEMPTS  KILLED_TASKS  DURATION_SECONDS  CPU_TIME_MILLIS  GC_TIME_MILLIS  INPUT_RECORDS  OUTPUT_RECORDS
> Map 1     1            0                0             1.22              2,640            102             0              0
> {code}
> Example nested: the following query also selects 0 rows, but using a 
> predicate on nested column foo.bar. Unfortunately we see 1 INPUT_RECORDS in 
> the summary:
> {code:sql}
> SET hive.tez.exec.print.summary=true;
> CREATE TABLE t_nested STORED AS ORC AS SELECT NAMED_STRUCT('bar', 23) AS foo;
> SELECT * FROM t_nested WHERE foo.bar=42 ORDER BY foo;
> [...]
> VERTICES  TOTAL_TASKS  FAILED_ATTEMPTS  KILLED_TASKS  DURATION_SECONDS  CPU_TIME_MILLIS  GC_TIME_MILLIS  INPUT_RECORDS  OUTPUT_RECORDS
> Map 1     1            0                0             3.66              5,210            68               1              0
> {code}
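Block skipping rests on per-stripe min/max column statistics. A hedged sketch of the decision (illustrative names, not ORC's actual SearchArgument machinery): with statistics for the nested `foo.bar`, the equality predicate `foo.bar=42` lets the reader skip any row group whose `[min, max]` range cannot contain 42, which is exactly what fails to happen for nested fields today.

```java
// Sketch: decide whether a row group can be skipped for an equality
// predicate, using only its recorded min/max for the predicate column.
public class RowGroupSkip {
  public static boolean canSkip(long min, long max, long wanted) {
    return wanted < min || wanted > max;  // value cannot occur in this group
  }

  public static void main(String[] args) {
    // The nested example: foo.bar has min = max = 23, predicate is bar = 42,
    // so the single row group could be skipped and INPUT_RECORDS would be 0.
    System.out.println(canSkip(23, 23, 42));
  }
}
```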



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-18338) [Client, JDBC] Expose async interface through hive JDBC.

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-18338?focusedWorklogId=450116&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450116
 ]

ASF GitHub Bot logged work on HIVE-18338:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:26
Start Date: 24/Jun/20 00:26
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #284:
URL: https://github.com/apache/hive/pull/284


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450116)
Time Spent: 20m  (was: 10m)

> [Client, JDBC] Expose async interface through hive JDBC.
> 
>
> Key: HIVE-18338
> URL: https://issues.apache.org/jira/browse/HIVE-18338
> Project: Hive
>  Issue Type: Improvement
>  Components: Clients, JDBC
>Affects Versions: 2.3.2
>Reporter: Amruth Sampath
>Assignee: Amruth Sampath
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-18338.patch, HIVE-18338.patch.1, 
> HIVE-18338.patch.2, HIVE-18338.patch.3, HIVE-18338.patch.4
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This exposes an async API in HiveStatement (jdbc module).
> The JDBC interface has always had strictly synchronous APIs, so the Hive 
> JDBC implementation also had to follow suit even though the Hive server is 
> fully asynchronous.
> Developers trying to build proxies on top of Hive servers end up writing a 
> Thrift client from scratch to make it asynchronous and robust to server 
> restarts.
> The common pattern is
>  # Submit query, get operation handle and store in a persistent store
>  # Poll and wait for completion
>  # Stream results
>  # In the event of restarts, restore OperationHandle from persistent store 
> and continue execution.
> The patch does two things:
>  * exposes the operation handle (once a query is submitted) via 
> {{getOperationhandle()}}. 
> Developers can persist this along with the actual hive server URL 
> ({{getJdbcUrl}}).
>  * latch APIs. 
> Developers can create a statement and latch on to an operation handle that 
> was persisted earlier. To latch, the statement should be created from a 
> connection to the actual hive server URI to which the query was submitted.
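The submit / persist-handle / poll pattern described above can be sketched as follows. `QueryService` and its methods are hypothetical stand-ins for illustration, not the real HiveStatement API (the patch's actual additions are `getOperationhandle()`, `getJdbcUrl`, and the latch APIs):

```java
import java.util.concurrent.TimeUnit;

// Sketch of the common proxy pattern: submit, persist the handle, then poll.
public class AsyncQueryClient {
  public interface QueryService {
    String submit(String sql);     // returns an operation handle
    String status(String handle);  // e.g. "RUNNING" / "FINISHED" / "ERROR"
  }

  // Poll until the operation leaves RUNNING or attempts run out. After a
  // restart, the persisted handle can be passed back in to resume polling.
  public static String waitForCompletion(QueryService svc, String handle,
                                         int maxAttempts) {
    for (int i = 0; i < maxAttempts; i++) {
      String state = svc.status(handle);
      if (!"RUNNING".equals(state)) {
        return state;
      }
      try {
        TimeUnit.MILLISECONDS.sleep(10);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();  // preserve interrupt status
        return "INTERRUPTED";
      }
    }
    return "TIMED_OUT";
  }
}
```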



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-12274) Increase width of columns used for general configuration in the metastore.

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-12274?focusedWorklogId=450120&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450120
 ]

ASF GitHub Bot logged work on HIVE-12274:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:26
Start Date: 24/Jun/20 00:26
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #232:
URL: https://github.com/apache/hive/pull/232


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450120)
Time Spent: 20m  (was: 10m)

> Increase width of columns used for general configuration in the metastore.
> --
>
> Key: HIVE-12274
> URL: https://issues.apache.org/jira/browse/HIVE-12274
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.0.0
>Reporter: Elliot West
>Assignee: Naveen Gangam
>Priority: Major
>  Labels: metastore, pull-request-available
> Fix For: 2.3.0, 3.0.0
>
> Attachments: HIVE-12274.2.patch, HIVE-12274.3.patch, 
> HIVE-12274.4.patch, HIVE-12274.5.patch, HIVE-12274.example.ddl.hql, 
> HIVE-12274.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> h2. Overview
> This issue is very similar in principle to HIVE-1364. We are hitting a limit 
> when processing JSON data that has a large nested schema. The struct 
> definition is truncated when inserted into the metastore database column 
> {{COLUMNS_V2.TYPE_NAME}} as it is greater than 4000 characters in length.
> Given that the purpose of these columns is to hold very loosely defined 
> configuration values it seems rather limiting to impose such a relatively low 
> length bound. One can imagine that valid use cases will arise where 
> reasonable parameter/property values exceed the current limit. 
> h2. Context
> These limitations were put in place by the [patch 
> attributed|https://github.com/apache/hive/commit/c21a526b0a752df2a51d20a2729cc8493c228799]
>  to HIVE-1364, which mentions the _"max length on Oracle 9i/10g/11g"_ as the 
> reason. However, nowadays the limit can be increased because:
> * Oracle DB's {{varchar2}} supports 32767 bytes now, by setting the 
> configuration parameter {{MAX_STRING_SIZE}} to {{EXTENDED}}. 
> ([source|http://docs.oracle.com/database/121/SQLRF/sql_elements001.htm#SQLRF55623])
> * Postgres supports a max of 1GB for {{character}} datatype. 
> ([source|http://www.postgresql.org/docs/8.3/static/datatype-character.html])
> * MySQL can support up to 65535 bytes for the entire row. So long as the 
> {{PARAM_KEY}} value + {{PARAM_VALUE}} is less than 65535, we should be good. 
> ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html])
> * SQL Server's {{varchar}} max length is 8000 and can go beyond using 
> "varchar(max)" with the same limitation as MySQL being 65535 bytes for the 
> entire row. ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html])
> * Derby's {{varchar}} can be up to 32672 bytes. 
> ([source|https://db.apache.org/derby/docs/10.7/ref/rrefsqlj41207.html])
> h2. Proposal
> Can these columns not use CLOB-like types as for example as used by 
> {{TBLS.VIEW_EXPANDED_TEXT}}? It would seem that suitable type equivalents 
> exist for all targeted database platforms:
> * MySQL: {{mediumtext}}
> * Postgres: {{text}}
> * Oracle: {{CLOB}}
> * Derby: {{LONG VARCHAR}}
> I'd suggest that the candidates for type change are:
> * {{COLUMNS_V2.TYPE_NAME}}
> * {{TABLE_PARAMS.PARAM_VALUE}}
> * {{SERDE_PARAMS.PARAM_VALUE}}
> * {{SD_PARAMS.PARAM_VALUE}}
> After updating the maximum length the metastore database needs to be 
> configured and restarted with the new settings. Altering {{MAX_STRING_SIZE}} 
> will update database objects and possibly invalidate them, as follows:
> * Tables with virtual columns will be updated with new data type metadata for 
> virtual columns of {{VARCHAR2(4000)}}, 4000-byte {{NVARCHAR2}}, or 
> {{RAW(2000)}} type.
> * Functional indexes will become unusable if a change to their associated 
> virtual columns causes the index key to exceed index key length limits. 
> Attempts to rebuild such indexes will fail with {{ORA-01450: maximum key 
> length exceeded}}.
> * Views will be invalidated if they contain {{VARCHAR2(4000)}}, 4000-byte 
> {{NVARCHAR2}}, or {{RAW(2000)}} typed expression columns.
> * Materialized views will be updated with new metadata {{VARCHAR2(4000)}}, 
> 4000-byte {{NVARCHAR2}}, and {{RAW(2000)}} typed expression columns
> 

[jira] [Work logged] (HIVE-16645) Commands.java has missed the catch statement and has some code format errors

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-16645?focusedWorklogId=450126&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450126
 ]

ASF GitHub Bot logged work on HIVE-16645:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:26
Start Date: 24/Jun/20 00:26
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #183:
URL: https://github.com/apache/hive/pull/183


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450126)
Time Spent: 20m  (was: 10m)

> Commands.java has missed the catch statement and has some code format errors
> 
>
> Key: HIVE-16645
> URL: https://issues.apache.org/jira/browse/HIVE-16645
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Saijin Huang
>Assignee: Saijin Huang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-16645.1.patch, HIVE-16645.2.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In Commands.java, a catch statement is missing and the ResultSet is not 
> closed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-17260) Typo: exception has been created and lost in the ThriftJDBCBinarySerDe

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17260?focusedWorklogId=450114&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450114
 ]

ASF GitHub Bot logged work on HIVE-17260:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:26
Start Date: 24/Jun/20 00:26
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #224:
URL: https://github.com/apache/hive/pull/224


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450114)
Time Spent: 20m  (was: 10m)

> Typo: exception has been created and lost in the ThriftJDBCBinarySerDe
> --
>
> Key: HIVE-17260
> URL: https://issues.apache.org/jira/browse/HIVE-17260
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleg Danilov
>Assignee: Oleg Danilov
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-17260.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Line 100:
> {code:java}
> } catch (Exception e) {
>   new SerDeException(e);
> }
> {code}
> Seems like it should be thrown there :-)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-19038) LLAP: Service loader throws "Provider not found" exception if hive-llap-server is in class path while loading tokens

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19038?focusedWorklogId=450128&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450128
 ]

ASF GitHub Bot logged work on HIVE-19038:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:26
Start Date: 24/Jun/20 00:26
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #327:
URL: https://github.com/apache/hive/pull/327


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450128)
Time Spent: 20m  (was: 10m)

> LLAP: Service loader throws "Provider not found" exception if 
> hive-llap-server is in class path while loading tokens
> 
>
> Key: HIVE-19038
> URL: https://issues.apache.org/jira/browse/HIVE-19038
> Project: Hive
>  Issue Type: Bug
>Reporter: Arun Mahadevan
>Assignee: Arun Mahadevan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> While testing storm in secure mode, the hive-llap-server jar file was 
> included in the class path and resulted in the below exception while trying 
> to renew credentials when invoking 
> "org.apache.hadoop.security.token.Token.getRenewer"
>  
>  
> {noformat}
> java.util.ServiceConfigurationError: 
> org.apache.hadoop.security.token.TokenRenewer: Provider 
> org.apache.hadoop.hive.llap.security.LlapTokenIdentifier.Renewer not found at 
> java.util.ServiceLoader.fail(ServiceLoader.java:239) ~[?:1.8.0_161] at 
> java.util.ServiceLoader.access$300(ServiceLoader.java:185) ~[?:1.8.0_161] at 
> java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:372) 
> ~[?:1.8.0_161] at 
> java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404) 
> ~[?:1.8.0_161] at 
> java.util.ServiceLoader$1.next(ServiceLoader.java:480) ~[?:1.8.0_161] at 
> org.apache.hadoop.security.token.Token.getRenewer(Token.java:463) 
> ~[hadoop-common-3.0.0.3.0.0.0-1064.jar:?] at 
> org.apache.hadoop.security.token.Token.renew(Token.java:490) 
> ~[hadoop-common-3.0.0.3.0.0.0-1064.jar:?] at 
> org.apache.storm.hdfs.security.AutoHDFS.doRenew(AutoHDFS.java:159) 
> ~[storm-autocreds-1.2.1.3.0.0.0-1064.jar:1.2.1.3.0.0.0-1064] at 
> org.apache.storm.common.AbstractAutoCreds.renew(AbstractAutoCreds.java:104) 
> ~[storm-autocreds-1.2.1.3.0.0.0-1064.jar:1.2.1.3.0.0.0-1064] at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_161] at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_161] at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_161] at java.lang.reflect.Method.invoke(Method.java:498) 
> ~[?:1.8.0_161] at 
> clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93) 
> ~[clojure-1.7.0.jar:?] at 
> clojure.lang.Reflector.invokeInstanceMethod(Reflector.java:28) 
> ~[clojure-1.7.0.jar:?] at 
> org.apache.storm.daemon.nimbus$renew_credentials$fn__9121$fn__9126.invoke(nimbus.clj:1450)
>  ~[storm-core-1.2.1.3.0.0.0-1064.jar:1.2.1.3.0.0.0-1064] at 
> org.apache.storm.daemon.nimbus$renew_credentials$fn__9121.invoke(nimbus.clj:1449)
>  ~[storm-core-1.2.1.3.0.0.0-1064.jar:1.2.1.3.0.0.0-1064] at 
> org.apache.storm.daemon.nimbus$renew_credentials.invoke(nimbus.clj:1439) 
> ~[storm-core-1.2.1.3.0.0.0-1064.jar:1.2.1.3.0.0.0-1064] at 
> org.apache.storm.daemon.nimbus$fn__9547$exec_fn__3301__auto9548$fn__9567.invoke(nimbus.clj:2521)
>  ~[storm-core-1.2.1.3.0.0.0-1064.jar:1.2.1.3.0.0.0-1064] at 
> org.apache.storm.timer$schedule_recurring$this__1656.invoke(timer.clj:105) 
> ~[storm-core-1.2.1.3.0.0.0-1064.jar:1.2.1.3.0.0.0-1064] at 
> org.apache.storm.timer$mk_timer$fn__1639$fn__1640.invoke(timer.clj:50) 
> ~[storm-core-1.2.1.3.0.0.0-1064.jar:1.2.1.3.0.0.0-1064] at 
> org.apache.storm.timer$mk_timer$fn__1639.invoke(timer.clj:42) 
> ~[storm-core-1.2.1.3.0.0.0-1064.jar:1.2.1.3.0.0.0-1064] at 
> clojure.lang.AFn.run(AFn.java:22) ~[clojure-1.7.0.jar:?] at 
> java.lang.Thread.run(Thread.java:748) [?:1.8.0_161] 2018-03-22 22:08:59.088 
> o.a.s.util timer [ERROR] Halting process: ("Error when processing an event") 
> java.lang.RuntimeException: ("Error when processing an event") at 
> org.apache.storm.util$exit_process_BANG_.doInvoke(util.clj:341) 
> ~[storm-core-1.2.1.3.0.0.0-1064.jar:1.2.1.3.0.0.0-1064] at 
> clojure.lang.RestFn.invoke(RestFn.java:423) 

[jira] [Work logged] (HIVE-19432) HIVE-7575: GetTablesOperation is too slow if the hive has too many databases and tables

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19432?focusedWorklogId=450118&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450118
 ]

ASF GitHub Bot logged work on HIVE-19432:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:26
Start Date: 24/Jun/20 00:26
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #341:
URL: https://github.com/apache/hive/pull/341


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 450118)
Time Spent: 20m  (was: 10m)

> HIVE-7575: GetTablesOperation is too slow if the hive has too many databases 
> and tables
> ---
>
> Key: HIVE-19432
> URL: https://issues.apache.org/jira/browse/HIVE-19432
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive, HiveServer2
>Affects Versions: 2.2.0
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-19432.01.patch, HIVE-19432.01.patch, 
> HIVE-19432.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> GetTablesOperation is too slow since it does not check authorization for 
> databases and tries pulling all tables from all databases using 
> getTableMeta. An operation like the following
> {code}
> con.getMetaData().getTables("", "", "%", new String[] \{ "TABLE", "VIEW" });
> {code}
> builds the getTableMeta call with a wildcard:
> {code}
>  metastore.HiveMetaStore: 8: get_table_metas : db=* tbl=*
> {code}
>  
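A minimal sketch of the improvement described above (illustrative only, not Hive's actual authorizer API): instead of issuing a single `get_table_metas` call with `db=*`, first filter the database list through an authorization check and query only the permitted databases. The predicate `canReadDb` stands in for whatever authorizer HiveServer2 is configured with.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class AuthorizedTableMeta {
    // Return only the databases the caller is authorized to read;
    // getTableMeta would then be called once per permitted db instead of db=*.
    public static List<String> dbsToQuery(List<String> allDbs,
                                          Predicate<String> canReadDb) {
        List<String> permitted = new ArrayList<>();
        for (String db : allDbs) {
            if (canReadDb.test(db)) {
                permitted.add(db);
            }
        }
        return permitted;
    }
}
```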



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-19340) Disable timeout of transactions opened by replication task at target cluster

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19340?focusedWorklogId=450127=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450127
 ]

ASF GitHub Bot logged work on HIVE-19340:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:26
Start Date: 24/Jun/20 00:26
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #337:
URL: https://github.com/apache/hive/pull/337


   





Issue Time Tracking
---

Worklog Id: (was: 450127)
Time Spent: 20m  (was: 10m)

> Disable timeout of transactions opened by replication task at target cluster
> 
>
> Key: HIVE-19340
> URL: https://issues.apache.org/jira/browse/HIVE-19340
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl, Transactions
>Affects Versions: 3.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0, 4.0.0
>
> Attachments: HIVE-19340.01.patch, HIVE-19340.02.patch, 
> HIVE-19340.03-branch-3.patch, HIVE-19340.03.patch, 
> HIVE-19340.04-branch-3.patch, HIVE-19340.06-branch-3.patch, 
> HIVE-19340.06.patch, HIVE-19340.07-branch-3.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The transactions opened by applying EVENT_OPEN_TXN should never be aborted 
> automatically due to time-out. Aborting a transaction started by a replication 
> task may lead to an inconsistent state at the target, which needs additional 
> overhead to clean up. So, it is proposed to mark the transactions opened by a 
> replication task as special ones that shouldn't be aborted if the heartbeat is 
> lost. This helps ensure all ABORT and COMMIT events will always find the 
> corresponding txn at the target to operate on.
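The proposed behavior can be sketched as follows (field names are illustrative, not Hive's actual transaction schema): a txn opened by a replication task carries a marker, and the heartbeat-timeout sweep skips marked txns.

```java
import java.util.ArrayList;
import java.util.List;

public class TxnTimeoutSweep {
    public static class Txn {
        final long id;
        final boolean fromReplication;  // the proposed "special" marker
        final long lastHeartbeatMs;
        public Txn(long id, boolean fromReplication, long lastHeartbeatMs) {
            this.id = id;
            this.fromReplication = fromReplication;
            this.lastHeartbeatMs = lastHeartbeatMs;
        }
    }

    // Return the ids the sweep would abort: stale txns, except replication ones.
    public static List<Long> txnsToAbort(List<Txn> open, long nowMs, long timeoutMs) {
        List<Long> abort = new ArrayList<>();
        for (Txn t : open) {
            if (t.fromReplication) continue;               // never time out repl txns
            if (nowMs - t.lastHeartbeatMs > timeoutMs) abort.add(t.id);
        }
        return abort;
    }
}
```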



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-16497) FileUtils. isActionPermittedForFileHierarchy, isOwnerOfFileHierarchy file system operations should be impersonated

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-16497?focusedWorklogId=450113=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450113
 ]

ASF GitHub Bot logged work on HIVE-16497:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:26
Start Date: 24/Jun/20 00:26
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #171:
URL: https://github.com/apache/hive/pull/171


   





Issue Time Tracking
---

Worklog Id: (was: 450113)
Time Spent: 20m  (was: 10m)

> FileUtils. isActionPermittedForFileHierarchy, isOwnerOfFileHierarchy file 
> system operations should be impersonated
> --
>
> Key: HIVE-16497
> URL: https://issues.apache.org/jira/browse/HIVE-16497
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Thejas Nair
>Assignee: Thejas Nair
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-16497.1.patch, HIVE-16497.2.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> FileUtils.isActionPermittedForFileHierarchy checks if a user has permissions 
> for a given action. The checks are made by impersonating the user.
> However, the listing of child dirs is done as the hiveserver2 user. If the 
> hive user doesn't have permissions on the filesystem, it gives an incorrect 
> error that the user doesn't have permissions to perform the action.
> Impersonating the end user for all file operations in that function is also 
> the logically correct thing to do.
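A toy model of the fix (illustrative only; the real code performs these operations under Hadoop's UserGroupInformation.doAs): the recursive walk does both the permission check and the child listing as the same end user, so a missing permission surfaces as "action not permitted" for that user rather than a misleading failure as the service user.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

public class HierarchyCheck {
    // children: dir -> child paths; canRead: (user, path) -> allowed?
    public static boolean actionPermitted(String user, String root,
            Map<String, List<String>> children,
            BiPredicate<String, String> canRead) {
        Deque<String> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            String p = stack.pop();
            if (!canRead.test(user, p)) return false;   // checked as the end user
            // listing children happens under the SAME user, not the service user
            for (String c : children.getOrDefault(p, Collections.emptyList())) {
                stack.push(c);
            }
        }
        return true;
    }
}
```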



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-18295) Add ability to ignore invalid values in JSON SerDe

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-18295?focusedWorklogId=450115=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450115
 ]

ASF GitHub Bot logged work on HIVE-18295:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:26
Start Date: 24/Jun/20 00:26
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #278:
URL: https://github.com/apache/hive/pull/278


   





Issue Time Tracking
---

Worklog Id: (was: 450115)
Time Spent: 20m  (was: 10m)

> Add ability to ignore invalid values in JSON SerDe
> --
>
> Key: HIVE-18295
> URL: https://issues.apache.org/jira/browse/HIVE-18295
> Project: Hive
>  Issue Type: Improvement
>  Components: HCatalog
>Reporter: Matthew Knox
>Assignee: Matthew Knox
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> It would be nice to be able to configure the JSON SerDe to ignore invalid 
> values while parsing JSON.
> In our case, raw JSON data is ingested from multiple sources, some of 
> which unreliably sanitize the data. Our current practice is to cleanse the 
> data after ingestion, but that can lead to other issues as well. Having the 
> ability to simply default to NULL if a value cannot be parsed would be 
> immensely helpful to us.
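The requested lenient behavior can be sketched in isolation (this is not the SerDe's actual code): when a field value cannot be coerced to the column type, the column gets NULL instead of the whole record failing.

```java
public class LenientCoerce {
    // Coerce a raw JSON scalar to an int column; NULL on any invalid value.
    public static Integer toIntOrNull(String raw) {
        if (raw == null) return null;
        try {
            return Integer.valueOf(raw.trim());
        } catch (NumberFormatException e) {
            return null;   // invalid value -> NULL, record remains usable
        }
    }
}
```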



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-17038) invalid result when CAST-ing to DATE

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17038?focusedWorklogId=450123=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450123
 ]

ASF GitHub Bot logged work on HIVE-17038:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:26
Start Date: 24/Jun/20 00:26
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #204:
URL: https://github.com/apache/hive/pull/204


   





Issue Time Tracking
---

Worklog Id: (was: 450123)
Time Spent: 20m  (was: 10m)

> invalid result when CAST-ing to DATE
> 
>
> Key: HIVE-17038
> URL: https://issues.apache.org/jira/browse/HIVE-17038
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, Hive
>Affects Versions: 1.2.1
>Reporter: Jim Hopper
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When casting incorrect date literals to the DATE data type, Hive returns wrong 
> values instead of NULL.
> {code}
> SELECT CAST('2017-02-31' AS DATE);
> SELECT CAST('2017-04-31' AS DATE);
> {code}
> Some examples where this can produce weird results:
> {code}
> select *
>   from (
> select cast('2017-07-01' as date) as dt
> ) as t
> where t.dt = '2017-06-31';
> select *
>   from (
> select cast('2017-07-01' as date) as dt
> ) as t
> where t.dt = cast('2017-06-31' as date);
> {code}
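The expected semantics can be illustrated with java.time, whose ISO date parser uses a strict resolver and rejects impossible calendar dates, so an invalid literal yields NULL rather than a silently rolled-over date (a sketch of the desired behavior, not Hive's implementation):

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

public class StrictDateCast {
    // Return the parsed date, or null for literals like '2017-02-31'.
    public static LocalDate castToDateOrNull(String literal) {
        try {
            return LocalDate.parse(literal);   // ISO_LOCAL_DATE, strict resolver
        } catch (DateTimeParseException e) {
            return null;                       // invalid calendar date -> NULL
        }
    }
}
```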



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-18755) Modifications to the metastore for catalogs

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-18755?focusedWorklogId=450121=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450121
 ]

ASF GitHub Bot logged work on HIVE-18755:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:26
Start Date: 24/Jun/20 00:26
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #320:
URL: https://github.com/apache/hive/pull/320


   





Issue Time Tracking
---

Worklog Id: (was: 450121)
Time Spent: 20m  (was: 10m)

> Modifications to the metastore for catalogs
> ---
>
> Key: HIVE-18755
> URL: https://issues.apache.org/jira/browse/HIVE-18755
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-18755.2.patch, HIVE-18755.3.patch, 
> HIVE-18755.4.patch, HIVE-18755.final.patch, HIVE-18755.nothrift, 
> HIVE-18755.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Step 1 of adding catalogs is to add support in the metastore.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-17983) Make the standalone metastore generate tarballs etc.

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17983?focusedWorklogId=450122=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450122
 ]

ASF GitHub Bot logged work on HIVE-17983:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:26
Start Date: 24/Jun/20 00:26
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #291:
URL: https://github.com/apache/hive/pull/291


   





Issue Time Tracking
---

Worklog Id: (was: 450122)
Time Spent: 20m  (was: 10m)

> Make the standalone metastore generate tarballs etc.
> 
>
> Key: HIVE-17983
> URL: https://issues.apache.org/jira/browse/HIVE-17983
> Project: Hive
>  Issue Type: Sub-task
>  Components: Standalone Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-17983.2.patch, HIVE-17983.3.patch, 
> HIVE-17983.4.patch, HIVE-17983.5.patch, HIVE-17983.6.patch, HIVE-17983.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In order to be separately installable the standalone metastore needs its own 
> tarballs, startup scripts, etc.  All of the SQL installation and upgrade 
> scripts also need to move from metastore to standalone-metastore.
> I also plan to create Dockerfiles for different database types so that 
> developers can test the SQL installation and upgrade scripts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-19488) Enable CM root based on db parameter, identifying a db as source of replication.

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19488?focusedWorklogId=450112=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450112
 ]

ASF GitHub Bot logged work on HIVE-19488:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:26
Start Date: 24/Jun/20 00:26
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #345:
URL: https://github.com/apache/hive/pull/345


   





Issue Time Tracking
---

Worklog Id: (was: 450112)
Time Spent: 20m  (was: 10m)

> Enable CM root based on db parameter, identifying a db as source of 
> replication.
> 
>
> Key: HIVE-19488
> URL: https://issues.apache.org/jira/browse/HIVE-19488
> Project: Hive
>  Issue Type: Task
>  Components: Hive, HiveServer2, repl
>Affects Versions: 3.1.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.0, 4.0.0
>
> Attachments: HIVE-19488.01.patch, HIVE-19488.02.patch, 
> HIVE-19488.03.patch, HIVE-19488.04.patch, HIVE-19488.05.patch, 
> HIVE-19488.06.patch, HIVE-19488.07.patch, HIVE-19488.08-branch-3.patch, 
> HIVE-19488.08.patch, HIVE-19488.09-branch-3.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> * Add a parameter at the db level to identify whether it is a source of 
> replication; the user should set this.
>  * Enable CM root only for databases that are a source of a replication 
> policy; for other dbs, skip the CM root functionality.
>  * Prevent a database drop if the parameter indicating it is the source of a 
> replication policy is set.
>  * As an upgrade to this version, the user should set the property on all 
> existing database policies in effect.
>  * The parameter should be of the form: repl.source.for : List<policy ids>
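Since the marker is a db-level property, setting it would presumably look like the following (the property name repl.source.for comes from this issue; the database and policy names are placeholders):

```sql
-- Mark database 'sales_db' as the source of replication policy 'policy_1'.
ALTER DATABASE sales_db SET DBPROPERTIES ('repl.source.for' = 'policy_1');
```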



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-18973) Make transaction system work with catalogs

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-18973?focusedWorklogId=450109=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450109
 ]

ASF GitHub Bot logged work on HIVE-18973:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:25
Start Date: 24/Jun/20 00:25
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #353:
URL: https://github.com/apache/hive/pull/353


   





Issue Time Tracking
---

Worklog Id: (was: 450109)
Time Spent: 20m  (was: 10m)

> Make transaction system work with catalogs
> --
>
> Key: HIVE-18973
> URL: https://issues.apache.org/jira/browse/HIVE-18973
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-18973.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The transaction tables need to understand catalogs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-17313) Potentially possible 'case fall through' in the ObjectInspectorConverters

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17313?focusedWorklogId=450106=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450106
 ]

ASF GitHub Bot logged work on HIVE-17313:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:25
Start Date: 24/Jun/20 00:25
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #230:
URL: https://github.com/apache/hive/pull/230


   





Issue Time Tracking
---

Worklog Id: (was: 450106)
Time Spent: 20m  (was: 10m)

> Potentially possible 'case fall through' in the ObjectInspectorConverters
> -
>
> Key: HIVE-17313
> URL: https://issues.apache.org/jira/browse/HIVE-17313
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleg Danilov
>Assignee: Oleg Danilov
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-17313.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Lines 103-110:
> {code:java}
> case STRING:
>   if (outputOI instanceof WritableStringObjectInspector) {
> return new PrimitiveObjectInspectorConverter.TextConverter(
> inputOI);
>   } else if (outputOI instanceof JavaStringObjectInspector) {
> return new PrimitiveObjectInspectorConverter.StringConverter(
> inputOI);
>   }
> case CHAR:
> {code}
> De facto it should work correctly, since outputOI is either an instance of 
> WritableStringObjectInspector or JavaStringObjectInspector, but it would be 
> better to rewrite this case to avoid a possible fall-through.
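One way to restructure the case (the types below are stand-ins for the real ObjectInspector hierarchy, so this is only an illustration): cover both known subtypes with explicit returns and fail fast otherwise, which makes falling through into the CHAR case impossible.

```java
public class NoFallThrough {
    public interface OI {}
    public static class WritableStringOI implements OI {}
    public static class JavaStringOI implements OI {}

    public static String converterFor(OI outputOI) {
        if (outputOI instanceof WritableStringOI) {
            return "TextConverter";
        } else if (outputOI instanceof JavaStringOI) {
            return "StringConverter";
        }
        // De facto unreachable today, but an explicit failure beats fall-through.
        throw new IllegalArgumentException("unexpected inspector: " + outputOI);
    }
}
```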



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-19267) Replicate ACID/MM tables write operations.

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19267?focusedWorklogId=450107=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450107
 ]

ASF GitHub Bot logged work on HIVE-19267:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:25
Start Date: 24/Jun/20 00:25
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #335:
URL: https://github.com/apache/hive/pull/335


   





Issue Time Tracking
---

Worklog Id: (was: 450107)
Time Spent: 40m  (was: 0.5h)

> Replicate ACID/MM tables write operations.
> --
>
> Key: HIVE-19267
> URL: https://issues.apache.org/jira/browse/HIVE-19267
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl, Transactions
>Affects Versions: 3.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-19267.01-branch-3.patch, HIVE-19267.01.patch, 
> HIVE-19267.02-branch-3.patch, HIVE-19267.02.patch, HIVE-19267.03.patch, 
> HIVE-19267.04.patch, HIVE-19267.05.patch, HIVE-19267.06.patch, 
> HIVE-19267.07.patch, HIVE-19267.08.patch, HIVE-19267.09.patch, 
> HIVE-19267.10.patch, HIVE-19267.11.patch, HIVE-19267.12.patch, 
> HIVE-19267.13.patch, HIVE-19267.14.patch, HIVE-19267.15.patch, 
> HIVE-19267.16.patch, HIVE-19267.17.patch, HIVE-19267.18.patch, 
> HIVE-19267.19.patch, HIVE-19267.20.patch, HIVE-19267.21.patch, 
> HIVE-19267.22.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
>  
> h1. Replicate ACID write Events
>  * Create a new EVENT_WRITE event with a related message format to log the write 
> operations within a txn, along with the associated data.
>  * Log this event when performing any writes (insert into, insert overwrite, 
> load table, delete, update, merge, truncate) on a table/partition.
>  * If a single MERGE/UPDATE/INSERT/DELETE statement operates on multiple 
> partitions, then one event needs to be logged per partition.
>  * DbNotificationListener should log this type of event to special metastore 
> table named "MTxnWriteNotificationLog".
>  * This table should maintain a map of txn ID against list of 
> tables/partitions written by given txn.
>  * The entry for a given txn should be removed by the cleaner thread that 
> removes the expired events from EventNotificationTable.
> h1. Replicate Commit Txn operation (with writes)
> Add new EVENT_COMMIT_TXN to log the metadata/data of all tables/partitions 
> modified within the txn.
> *Source warehouse:*
>  * This event should read the EVENT_WRITEs from "MTxnWriteNotificationLog" 
> metastore table to consolidate the list of tables/partitions modified within 
> this txn scope.
>  * Based on the list of tables/partitions modified and table Write ID, need 
> to compute the list of delta files added by this txn.
>  * Repl dump should read this message and dump the metadata and delta files 
> list.
> *Target warehouse:*
>  * Ensure snapshot isolation at target for on-going read txns which shouldn't 
> view the data replicated from committed txn. (Ensured with open and allocate 
> write ID events).
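The bookkeeping described above can be sketched as a txn-to-writes map (the structure is illustrative, not the actual MTxnWriteNotificationLog schema): each EVENT_WRITE records what a txn touched, and EVENT_COMMIT_TXN consolidates and clears that set.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class TxnWriteLog {
    // txn id -> tables/partitions written within that txn
    private final Map<Long, Set<String>> writesByTxn = new HashMap<>();

    public void logWrite(long txnId, String tableOrPartition) {
        writesByTxn.computeIfAbsent(txnId, k -> new LinkedHashSet<>())
                   .add(tableOrPartition);
    }

    // On commit: consolidate everything the txn touched, then clean up the entry
    // (mirroring the cleaner removing expired events).
    public Set<String> consolidateOnCommit(long txnId) {
        Set<String> touched = writesByTxn.remove(txnId);
        return touched == null ? Collections.emptySet() : touched;
    }
}
```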



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-1010) Implement INFORMATION_SCHEMA in Hive

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-1010?focusedWorklogId=450103=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450103
 ]

ASF GitHub Bot logged work on HIVE-1010:


Author: ASF GitHub Bot
Created on: 24/Jun/20 00:25
Start Date: 24/Jun/20 00:25
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #181:
URL: https://github.com/apache/hive/pull/181


   





Issue Time Tracking
---

Worklog Id: (was: 450103)
Time Spent: 20m  (was: 10m)

> Implement INFORMATION_SCHEMA in Hive
> 
>
> Key: HIVE-1010
> URL: https://issues.apache.org/jira/browse/HIVE-1010
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore, Query Processor, Server Infrastructure
>Reporter: Jeff Hammerbacher
>Assignee: Gunther Hagleitner
>Priority: Major
>  Labels: TODOC3.0, pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-1010.10.patch, HIVE-1010.11.patch, 
> HIVE-1010.12.patch, HIVE-1010.13.patch, HIVE-1010.14.patch, 
> HIVE-1010.15.patch, HIVE-1010.16.patch, HIVE-1010.7.patch, HIVE-1010.8.patch, 
> HIVE-1010.9.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> INFORMATION_SCHEMA is part of the SQL92 standard and would be useful to 
> implement using our metastore.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-15442) Driver.java has a redundancy code

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-15442?focusedWorklogId=450111=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450111
 ]

ASF GitHub Bot logged work on HIVE-15442:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:25
Start Date: 24/Jun/20 00:25
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #169:
URL: https://github.com/apache/hive/pull/169


   





Issue Time Tracking
---

Worklog Id: (was: 450111)
Time Spent: 20m  (was: 10m)

> Driver.java has a redundancy  code
> --
>
> Key: HIVE-15442
> URL: https://issues.apache.org/jira/browse/HIVE-15442
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Saijin Huang
>Assignee: Saijin Huang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-15442.1.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Driver.java has redundant code around "explain output"; the if 
> statement "if (conf.getBoolVar(ConfVars.HIVE_LOG_EXPLAIN_OUTPUT))" repeats the 
> judgment made by the statement above it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-19653) Incorrect predicate pushdown for groupby with grouping sets

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19653?focusedWorklogId=450098=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450098
 ]

ASF GitHub Bot logged work on HIVE-19653:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:25
Start Date: 24/Jun/20 00:25
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #354:
URL: https://github.com/apache/hive/pull/354


   





Issue Time Tracking
---

Worklog Id: (was: 450098)
Time Spent: 2h  (was: 1h 50m)

> Incorrect predicate pushdown for groupby with grouping sets
> ---
>
> Key: HIVE-19653
> URL: https://issues.apache.org/jira/browse/HIVE-19653
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Reporter: Zhang Li
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-19653.1.patch, HIVE-19653.patch
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Consider the following query:
> {code:java}
> CREATE TABLE T1(a STRING, b STRING, s BIGINT);
> INSERT OVERWRITE TABLE T1 VALUES ('', '', 123456);
> SELECT * FROM (
> SELECT a, b, sum(s)
> FROM T1
> GROUP BY a, b GROUPING SETS ((), (a), (b), (a, b))
> ) t WHERE a IS NOT NULL;
> {code}
> When hive.optimize.ppd is enabled (and hive.cbo.enable=false), the query will 
> output:
> {code:java}
> NULL  NULL123456
> NULL  123456
>   NULL123456
>   123456
> {code}
> We can see the predicate "a IS NOT NULL" has no effect, which is incorrect.
> When performing PPD optimization for a GBY operator, we should make sure all 
> grouping sets contain the processing expr before pushdown; otherwise the 
> expr value after GBY is changed and the result is wrong.
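The safety condition above can be sketched as a small check (an illustration of the rule, not the optimizer's actual code): a predicate may be pushed below a group-by with grouping sets only if every grouping set contains all of the predicate's columns, since otherwise a grouping set that NULLs the column out would make the pushed filter drop or keep the wrong rows.

```java
import java.util.List;
import java.util.Set;

public class GroupingSetPpd {
    // True iff every grouping set contains all columns used by the predicate.
    public static boolean safeToPush(Set<String> predicateCols,
                                     List<Set<String>> groupingSets) {
        return groupingSets.stream().allMatch(gs -> gs.containsAll(predicateCols));
    }
}
```

For the query in the description, the grouping sets are ((), (a), (b), (a, b)), so a predicate on `a` is not safe to push.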



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-17311) Numeric overflow in the HiveConf

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17311?focusedWorklogId=450105=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450105
 ]

ASF GitHub Bot logged work on HIVE-17311:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:25
Start Date: 24/Jun/20 00:25
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #229:
URL: https://github.com/apache/hive/pull/229


   





Issue Time Tracking
---

Worklog Id: (was: 450105)
Time Spent: 20m  (was: 10m)

> Numeric overflow in the HiveConf
> 
>
> Key: HIVE-17311
> URL: https://issues.apache.org/jira/browse/HIVE-17311
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleg Danilov
>Assignee: Oleg Danilov
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-17311.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The multiplierFor() method contains a typo, which causes wrong parsing of the 
> rare suffixes ('tb' & 'pb').
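A correct version of such suffix handling might look like the following (a stand-in sketch, not HiveConf's actual code): each suffix maps to a power of 1024 computed in long arithmetic, so 'tb' and 'pb' cannot overflow an int expression.

```java
import java.util.Locale;

public class SizeSuffix {
    // Multiplier for a size suffix; long literals avoid int overflow for tb/pb.
    public static long multiplierFor(String unit) {
        switch (unit.trim().toLowerCase(Locale.ROOT)) {
            case "b":  return 1L;
            case "kb": return 1L << 10;
            case "mb": return 1L << 20;
            case "gb": return 1L << 30;
            case "tb": return 1L << 40;   // would overflow as an int expression
            case "pb": return 1L << 50;
            default: throw new IllegalArgumentException("unknown unit: " + unit);
        }
    }
}
```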



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-16925) isSlowStart lost during refactoring

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-16925?focusedWorklogId=450110=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450110
 ]

ASF GitHub Bot logged work on HIVE-16925:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:25
Start Date: 24/Jun/20 00:25
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #195:
URL: https://github.com/apache/hive/pull/195


   





Issue Time Tracking
---

Worklog Id: (was: 450110)
Time Spent: 20m  (was: 10m)

> isSlowStart lost during refactoring
> ---
>
> Key: HIVE-16925
> URL: https://issues.apache.org/jira/browse/HIVE-16925
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleg Danilov
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-16925.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> TezEdgeProperty.setAutoReduce() should have isSlowStart as a parameter



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-19708) Repl copy retrying with cm path even if the failure is due to network issue

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19708?focusedWorklogId=450102=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450102
 ]

ASF GitHub Bot logged work on HIVE-19708:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:25
Start Date: 24/Jun/20 00:25
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #359:
URL: https://github.com/apache/hive/pull/359


   





Issue Time Tracking
---

Worklog Id: (was: 450102)
Time Spent: 20m  (was: 10m)

> Repl copy retrying with cm path even if the failure is due to network issue
> ---
>
> Key: HIVE-19708
> URL: https://issues.apache.org/jira/browse/HIVE-19708
> Project: Hive
>  Issue Type: Task
>  Components: Hive, HiveServer2, repl
>Affects Versions: 3.1.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.0, 4.0.0
>
> Attachments: HIVE-19708.01.patch, HIVE-19708.02.patch, 
> HIVE-19708.04.patch, HIVE-19708.05.patch, HIVE-19708.06-branch-3.patch, 
> HIVE-19708.06.patch, HIVE-19708.07-branch-3.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> * During repl load:
>  ** For filesystem-based copying of a file, if the copy fails due to a 
> connection error to the source NameNode, we should recreate the filesystem 
> object.
>  ** The retry logic for local file copy should be triggered using the 
> original source file path (and not the CM root path), since the failure can be 
> due to network issues between the DFSClient and the NN.
>  * When listing files in tables/partitions to include them in _files, we 
> should add retry logic on failure. The FileSystem object here should also 
> be recreated, since the existing one might be in an inconsistent state.
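The retry shape described above can be sketched generically (names are illustrative): on failure, recreate the resource (e.g. a FileSystem object) and retry the operation against the original source, rather than assuming the file was deleted and falling back to the CM path.

```java
import java.util.function.Function;
import java.util.function.Supplier;

public class RetryWithRecreate {
    // Retry op up to maxAttempts times, recreating the resource each attempt.
    public static <R, T> T run(Supplier<R> createResource,
                               Function<R, T> op, int maxAttempts) {
        RuntimeException last = null;
        for (int i = 0; i < maxAttempts; i++) {
            R resource = createResource.get();   // fresh object, not a stale one
            try {
                return op.apply(resource);
            } catch (RuntimeException e) {
                last = e;                        // e.g. a transient network error
            }
        }
        if (last == null) throw new IllegalArgumentException("maxAttempts must be > 0");
        throw last;
    }
}
```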



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-19569) alter table db1.t1 rename db2.t2 generates MetaStoreEventListener.onDropTable()

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19569?focusedWorklogId=450100=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450100
 ]

ASF GitHub Bot logged work on HIVE-19569:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:25
Start Date: 24/Jun/20 00:25
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #368:
URL: https://github.com/apache/hive/pull/368


   





Issue Time Tracking
---

Worklog Id: (was: 450100)
Time Spent: 20m  (was: 10m)

> alter table db1.t1 rename db2.t2 generates 
> MetaStoreEventListener.onDropTable()
> ---
>
> Key: HIVE-19569
> URL: https://issues.apache.org/jira/browse/HIVE-19569
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Standalone Metastore, Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.0, 4.0.0
>
> Attachments: HIVE-19569.01-branch-3.patch, HIVE-19569.01.patch, 
> HIVE-19569.02.patch, HIVE-19569.03.patch, HIVE-19569.04.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When renaming a table within the same DB, the operation causes 
> {{MetaStoreEventListener.onAlterTable()}} to fire, but when changing the DB 
> of a table it causes {{MetaStoreEventListener.onDropTable()}} + 
> {{MetaStoreEventListener.onCreateTable()}}.
> The files from the original table are moved to the new table location.
> This creates confusing semantics, since any logic in {{onDropTable()}} doesn't 
> know about the larger context, i.e. that there will be a matching 
> {{onCreateTable()}}.
> In particular, this causes a problem for Acid tables, since files moved from 
> the old table use WriteIDs that are not meaningful within the context of the 
> new table.
> The current implementation is due to replication. This should ideally be changed 
> to raise a "not supported" error for tables that are marked for replication.
> cc [~sankarh]
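The confusing event flow can be modeled with a toy sketch (all names are illustrative; MetaStoreEventListener's real signatures differ): a cross-DB rename implemented as drop + create gives a drop handler no way to tell a real drop from half of a rename.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the event semantics described above, not Hive's actual API.
public final class RenameEventDemo {
    final List<String> firedEvents = new ArrayList<>();

    void renameTable(String fromDb, String toDb, String table) {
        if (fromDb.equals(toDb)) {
            firedEvents.add("onAlterTable");   // same-DB rename: one alter event
        } else {
            // cross-DB rename: the listener sees an unrelated-looking pair,
            // and onDropTable alone cannot know a matching create follows
            firedEvents.add("onDropTable");
            firedEvents.add("onCreateTable");
        }
    }
}
```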





[jira] [Work logged] (HIVE-14836) Test the predicate pushing down support for Parquet vectorization read path

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-14836?focusedWorklogId=450108&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450108
 ]

ASF GitHub Bot logged work on HIVE-14836:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:25
Start Date: 24/Jun/20 00:25
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #251:
URL: https://github.com/apache/hive/pull/251


   





Issue Time Tracking
---

Worklog Id: (was: 450108)
Time Spent: 20m  (was: 10m)

> Test the predicate pushing down support for Parquet vectorization read path
> ---
>
> Key: HIVE-14836
> URL: https://issues.apache.org/jira/browse/HIVE-14836
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-14836.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We should add more unit tests for the predicate pushdown support in the Parquet 
> vectorized read path.





[jira] [Work logged] (HIVE-19829) Incremental replication load should create tasks in execution phase rather than semantic phase

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19829?focusedWorklogId=450099&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450099
 ]

ASF GitHub Bot logged work on HIVE-19829:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:25
Start Date: 24/Jun/20 00:25
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #370:
URL: https://github.com/apache/hive/pull/370


   





Issue Time Tracking
---

Worklog Id: (was: 450099)
Time Spent: 20m  (was: 10m)

> Incremental replication load should create tasks in execution phase rather 
> than semantic phase
> --
>
> Key: HIVE-19829
> URL: https://issues.apache.org/jira/browse/HIVE-19829
> Project: Hive
>  Issue Type: Task
>  Components: repl
>Affects Versions: 3.1.0, 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-19829.01.patch, HIVE-19829.02.patch, 
> HIVE-19829.03.patch, HIVE-19829.04.patch, HIVE-19829.06.patch, 
> HIVE-19829.07.patch, HIVE-19829.07.patch, HIVE-19829.08-branch-3.patch, 
> HIVE-19829.08.patch, HIVE-19829.09.patch, HIVE-19829.10-branch-3.patch, 
> HIVE-19829.10.patch, HIVE-19829.11-branch-3.patch, 
> HIVE-19829.12-branch-3.patch, HIVE-19829.13-branch-3.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Split the incremental load into multiple iterations. In each iteration, create 
> a number of tasks equal to the configured value.
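The batching idea can be sketched with a stdlib-only helper. Names are hypothetical (the actual patch builds Hive Task objects during execution rather than during semantic analysis, bounded by a config such as hive.repl.approx.max.load.tasks — treat that name as an assumption here):

```java
import java.util.ArrayList;
import java.util.List;

public final class IterativeLoadDemo {
    // Splits the pending replication events into iterations of at most
    // 'maxTasksPerIteration' events each, so tasks can be created lazily
    // one iteration at a time instead of all at once up front.
    static <T> List<List<T>> iterations(List<T> events, int maxTasksPerIteration) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < events.size(); i += maxTasksPerIteration) {
            int end = Math.min(events.size(), i + maxTasksPerIteration);
            out.add(new ArrayList<>(events.subList(i, end)));  // copy: subList is a view
        }
        return out;
    }
}
```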





[jira] [Work logged] (HIVE-23235) Checkpointing in repl dump failing for orc format

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23235?focusedWorklogId=450101&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450101
 ]

ASF GitHub Bot logged work on HIVE-23235:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:25
Start Date: 24/Jun/20 00:25
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #987:
URL: https://github.com/apache/hive/pull/987#issuecomment-648503920


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 450101)
Time Spent: 40m  (was: 0.5h)

> Checkpointing in repl dump failing for orc format
> -
>
> Key: HIVE-23235
> URL: https://issues.apache.org/jira/browse/HIVE-23235
> Project: Hive
>  Issue Type: Bug
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
> Attachments: HIVE-23235.01.patch, HIVE-23235.02.patch, 
> HIVE-23235.03.patch, HIVE-23235.04.patch, HIVE-23235.05.patch, 
> HIVE-23235.06.patch, HIVE-23235.07.patch, HIVE-23235.08.patch, 
> HIVE-23235.09.patch, HIVE-23235.10.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-23235) Checkpointing in repl dump failing for orc format

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-23235:
--
Labels: pull-request-available  (was: )

> Checkpointing in repl dump failing for orc format
> -
>
> Key: HIVE-23235
> URL: https://issues.apache.org/jira/browse/HIVE-23235
> Project: Hive
>  Issue Type: Bug
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23235.01.patch, HIVE-23235.02.patch, 
> HIVE-23235.03.patch, HIVE-23235.04.patch, HIVE-23235.05.patch, 
> HIVE-23235.06.patch, HIVE-23235.07.patch, HIVE-23235.08.patch, 
> HIVE-23235.09.patch, HIVE-23235.10.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-17314) LazySimpleSerializeWrite.writeString() contains if with an empty body

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17314?focusedWorklogId=450104&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450104
 ]

ASF GitHub Bot logged work on HIVE-17314:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:25
Start Date: 24/Jun/20 00:25
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #234:
URL: https://github.com/apache/hive/pull/234


   





Issue Time Tracking
---

Worklog Id: (was: 450104)
Time Spent: 20m  (was: 10m)

> LazySimpleSerializeWrite.writeString() contains if with an empty body
> -
>
> Key: HIVE-17314
> URL: https://issues.apache.org/jira/browse/HIVE-17314
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleg Danilov
>Assignee: Oleg Danilov
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-17314.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Looking at LazySimpleSerializeWrite.java, I found an odd 'if':
> Lines 234-235:
> {code:java}
> if (v.equals(nullSequenceBytes)) {
> }
> {code}
> It seems that either something is missing there or this 'if' could simply be dropped.





[jira] [Work logged] (HIVE-23040) Checkpointing for repl dump incremental phase

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23040?focusedWorklogId=450095&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450095
 ]

ASF GitHub Bot logged work on HIVE-23040:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:24
Start Date: 24/Jun/20 00:24
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #977:
URL: https://github.com/apache/hive/pull/977#issuecomment-648503943


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 450095)
Time Spent: 2h 10m  (was: 2h)

> Checkpointing for repl dump incremental phase
> -
>
> Key: HIVE-23040
> URL: https://issues.apache.org/jira/browse/HIVE-23040
> Project: Hive
>  Issue Type: Improvement
>Reporter: Aasha Medhi
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23040.01.patch, HIVE-23040.02.patch, 
> HIVE-23040.03.patch, HIVE-23040.04.patch, HIVE-23040.05.patch, 
> HIVE-23040.06.patch, HIVE-23040.06.patch, HIVE-23040.07.patch
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-19725) Add ability to dump non-native tables in replication metadata dump

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19725?focusedWorklogId=450094&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450094
 ]

ASF GitHub Bot logged work on HIVE-19725:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:24
Start Date: 24/Jun/20 00:24
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #361:
URL: https://github.com/apache/hive/pull/361


   





Issue Time Tracking
---

Worklog Id: (was: 450094)
Time Spent: 20m  (was: 10m)

> Add ability to dump non-native tables in replication metadata dump
> --
>
> Key: HIVE-19725
> URL: https://issues.apache.org/jira/browse/HIVE-19725
> Project: Hive
>  Issue Type: Task
>  Components: repl
>Affects Versions: 3.1.0, 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: Repl, pull-request-available
> Fix For: 3.1.0, 4.0.0
>
> Attachments: HIVE-19725.01.patch, HIVE-19725.02.patch, 
> HIVE-19725.03.patch, HIVE-19725.04.patch, HIVE-19725.05.patch, 
> HIVE-19725.06-branch-3.patch, HIVE-19725.07-branch-3.patch, 
> HIVE-19725.07.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> If hive.repl.dump.metadata.only is set to true, allow dumping non-native 
> tables as well.
> A data dump for non-native tables should never be allowed.





[jira] [Work logged] (HIVE-16391) Publish proper Hive 1.2 jars (without including all dependencies in uber jar)

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-16391?focusedWorklogId=450093&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450093
 ]

ASF GitHub Bot logged work on HIVE-16391:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:24
Start Date: 24/Jun/20 00:24
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #364:
URL: https://github.com/apache/hive/pull/364


   





Issue Time Tracking
---

Worklog Id: (was: 450093)
Time Spent: 20m  (was: 10m)

> Publish proper Hive 1.2 jars (without including all dependencies in uber jar)
> -
>
> Key: HIVE-16391
> URL: https://issues.apache.org/jira/browse/HIVE-16391
> Project: Hive
>  Issue Type: Task
>  Components: Build Infrastructure
>Affects Versions: 1.2.2
>Reporter: Reynold Xin
>Assignee: Saisai Shao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.2.3
>
> Attachments: HIVE-16391.1.patch, HIVE-16391.2.patch, HIVE-16391.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Apache Spark currently depends on a forked version of Apache Hive. AFAIK, the 
> only change in the fork is to work around the issue that Hive publishes only 
> two sets of jars: one set with no dependency declared, and another with all 
> the dependencies included in the published uber jar. That is to say, Hive 
> doesn't publish a set of jars with the proper dependencies declared.
> There is general consensus on both sides that we should remove the forked 
> Hive.
> The change in the forked version is recorded here 
> https://github.com/JoshRosen/hive/tree/release-1.2.1-spark2
> Note that the fork in the past included other fixes but those have all become 
> unnecessary.





[jira] [Work logged] (HIVE-19661) switch Hive UDFs to use Re2J regex engine

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19661?focusedWorklogId=450097&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450097
 ]

ASF GitHub Bot logged work on HIVE-19661:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:24
Start Date: 24/Jun/20 00:24
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #362:
URL: https://github.com/apache/hive/pull/362


   





Issue Time Tracking
---

Worklog Id: (was: 450097)
Time Spent: 20m  (was: 10m)

> switch Hive UDFs to use Re2J regex engine
> -
>
> Key: HIVE-19661
> URL: https://issues.apache.org/jira/browse/HIVE-19661
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-19661.01.patch, HIVE-19661.02.patch, 
> HIVE-19661.03.patch, HIVE-19661.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The Java regex engine can be very slow in some cases, e.g. 
> https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8203458





[jira] [Work logged] (HIVE-19812) Disable external table replication by default via a configuration property

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19812?focusedWorklogId=450096&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450096
 ]

ASF GitHub Bot logged work on HIVE-19812:
-

Author: ASF GitHub Bot
Created on: 24/Jun/20 00:24
Start Date: 24/Jun/20 00:24
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #365:
URL: https://github.com/apache/hive/pull/365


   





Issue Time Tracking
---

Worklog Id: (was: 450096)
Time Spent: 20m  (was: 10m)

> Disable external table replication by default via a configuration property
> --
>
> Key: HIVE-19812
> URL: https://issues.apache.org/jira/browse/HIVE-19812
> Project: Hive
>  Issue Type: Task
>  Components: repl
>Affects Versions: 3.1.0, 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-19812.01.patch, HIVE-19812.02.patch, 
> HIVE-19812.03.patch, HIVE-19812.04.patch, HIVE-19812.05.patch, 
> HIVE-19812.06-branch-3.patch, HIVE-19812.06.patch, HIVE-19812.07.patch, 
> HIVE-19812.08.patch, HIVE-19812.09.patch, HIVE-19812.10-branch-3.patch, 
> HIVE-19812.10.patch, HIVE-19812.11.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Add a Hive config property to control external table replication, and set it 
> by default so that external table replication is disabled.
> For metadata-only replication, Hive always exports metadata for external tables.
>  
> REPL_DUMP_EXTERNAL_TABLES("hive.repl.dump.include.external.tables", false,
> "Indicates if repl dump should include information about external tables. It 
> should be \n"
> + "used in conjunction with 'hive.repl.dump.metadata.only' set to false. if 
> 'hive.repl.dump.metadata.only' \n"
> + " is set to true then this config parameter has no effect as external table 
> meta data is flushed \n"
> + " always by default.")
> This should be done only for replication dump, not for export.





[jira] [Assigned] (HIVE-23753) Make LLAP Secretmanager token path configurable

2020-06-23 Thread Rajkumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajkumar Singh reassigned HIVE-23753:
-


> Make LLAP Secretmanager token path configurable
> ---
>
> Key: HIVE-23753
> URL: https://issues.apache.org/jira/browse/HIVE-23753
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
>
> In a very busy LLAP cluster, if for some reason the tokens under the 
> zkdtsm_hive_llap0 ZK path are not cleaned up, LLAP daemon startup takes a 
> very long time. This may lead to a service outage if the LLAP daemons do not 
> start before the number of retries for checking LLAP app status is exceeded. 
> Looking at the jstack of an LLAP daemon, it appears to traverse the 
> zkdtsm_hive_llap0 ZK path before starting the secret manager.
> {code:java}
>java.lang.Thread.State: WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:502)
>   at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1386)
>   - locked <0x7fef36cdd338> (a org.apache.zookeeper.ClientCnxn$Packet)
>   at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1153)
>   at 
> org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:302)
>   at 
> org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:291)
>   at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
>   at 
> org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:288)
>   at 
> org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:279)
>   at 
> org.apache.curator.framework.imps.GetDataBuilderImpl$2.forPath(GetDataBuilderImpl.java:142)
>   at 
> org.apache.curator.framework.imps.GetDataBuilderImpl$2.forPath(GetDataBuilderImpl.java:138)
>   at 
> org.apache.curator.framework.recipes.cache.PathChildrenCache.internalRebuildNode(PathChildrenCache.java:591)
>   at 
> org.apache.curator.framework.recipes.cache.PathChildrenCache.rebuild(PathChildrenCache.java:331)
>   at 
> org.apache.curator.framework.recipes.cache.PathChildrenCache.start(PathChildrenCache.java:300)
>   at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.startThreads(ZKDelegationTokenSecretManager.java:370)
>   at 
> org.apache.hadoop.hive.llap.security.SecretManager.startThreads(SecretManager.java:82)
>   at 
> org.apache.hadoop.hive.llap.security.SecretManager$1.run(SecretManager.java:223)
>   at 
> org.apache.hadoop.hive.llap.security.SecretManager$1.run(SecretManager.java:218)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:360)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1846)
>   at 
> org.apache.hadoop.hive.llap.security.SecretManager.createSecretManager(SecretManager.java:218)
>   at 
> org.apache.hadoop.hive.llap.security.SecretManager.createSecretManager(SecretManager.java:212)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon.(LlapDaemon.java:279)
> {code}





[jira] [Updated] (HIVE-23752) Cast as Date for invalid date produce the valid output

2020-06-23 Thread Rajkumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajkumar Singh updated HIVE-23752:
--
Labels: hive  (was: )

> Cast as Date for invalid date produce the valid output
> --
>
> Key: HIVE-23752
> URL: https://issues.apache.org/jira/browse/HIVE-23752
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: Rajkumar Singh
>Priority: Major
>  Labels: hive
>
> Hive-3:
> {code:java}
> select cast("-00-00" as date) 
> 0002-11-30 
> select cast("2010-27-54" as date)
>  2012-04-23
> select cast("1992-00-74" as date) ;
> 1992-02-12
> {code}
> The reason Hive allows this is that the parser's resolver style is set to LENIENT 
> (https://github.com/apache/hive/blob/ae008b79b5d52ed6a38875b73025a505725828eb/common/src/java/org/apache/hadoop/hive/common/type/Date.java#L50).
> This seems to be an intentional choice, as changing the ResolverStyle to 
> STRICT starts failing the tests.
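The lenient behaviour can be reproduced with java.time alone. The builder below approximates the parse formatter in Date.java (the exact field widths are an assumption; the ResolverStyle.LENIENT part is the point): with lenient resolution, out-of-range months and days roll forward instead of being rejected.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.ResolverStyle;
import java.time.format.SignStyle;
import java.time.temporal.ChronoField;

public final class LenientDateDemo {
    // Approximation of Hive's date parse formatter. LENIENT resolution maps
    // year/month/day (y, m, d) to LocalDate.of(y, 1, 1).plusMonths(m - 1)
    // .plusDays(d - 1), so "invalid" dates still resolve.
    static final DateTimeFormatter LENIENT = new DateTimeFormatterBuilder()
            .appendValue(ChronoField.YEAR, 1, 10, SignStyle.NORMAL)
            .appendLiteral('-')
            .appendValue(ChronoField.MONTH_OF_YEAR, 1, 2, SignStyle.NORMAL)
            .appendLiteral('-')
            .appendValue(ChronoField.DAY_OF_MONTH, 1, 2, SignStyle.NORMAL)
            .toFormatter().withResolverStyle(ResolverStyle.LENIENT);

    public static void main(String[] args) {
        // month 27 rolls ~2 years forward, day 54 rolls into the next month,
        // matching the ticket's observed result of 2012-04-23
        System.out.println(LocalDate.parse("2010-27-54", LENIENT));
    }
}
```

Switching the last line of the builder to `withResolverStyle(ResolverStyle.STRICT)` makes the same parse throw DateTimeParseException, which is the behaviour change the ticket says breaks existing tests.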





[jira] [Updated] (HIVE-23752) Cast as Date for invalid date produce the valid output

2020-06-23 Thread Rajkumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajkumar Singh updated HIVE-23752:
--
Affects Version/s: 4.0.0

> Cast as Date for invalid date produce the valid output
> --
>
> Key: HIVE-23752
> URL: https://issues.apache.org/jira/browse/HIVE-23752
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: Rajkumar Singh
>Priority: Major
>
> Hive-3:
> {code:java}
> select cast("-00-00" as date) 
> 0002-11-30 
> select cast("2010-27-54" as date)
>  2012-04-23
> select cast("1992-00-74" as date) ;
> 1992-02-12
> {code}
> The reason Hive allows this is that the parser's resolver style is set to LENIENT 
> (https://github.com/apache/hive/blob/ae008b79b5d52ed6a38875b73025a505725828eb/common/src/java/org/apache/hadoop/hive/common/type/Date.java#L50).
> This seems to be an intentional choice, as changing the ResolverStyle to 
> STRICT starts failing the tests.





[jira] [Commented] (HIVE-23726) Create table may throw MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from a null string)

2020-06-23 Thread Naveen Gangam (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17143208#comment-17143208
 ] 

Naveen Gangam commented on HIVE-23726:
--

[~sankarh] Apologies for pulling this off Ashish's pile. Given this was assigned 
to him only a few minutes ago, I assume he hadn't spent much time on it.
I have a tested fix, made as part of another change, that should address this, 
but I can separate that fix out for this issue.

> Create table may throw 
> MetaException(message:java.lang.IllegalArgumentException: Can not create a 
> Path from a null string)
> --
>
> Key: HIVE-23726
> URL: https://issues.apache.org/jira/browse/HIVE-23726
> Project: Hive
>  Issue Type: Bug
>Reporter: Istvan Fajth
>Assignee: Naveen Gangam
>Priority: Major
>
> - Given:
>  metastore.warehouse.tenant.colocation is set to true
>  a test database was created as {{create database test location '/data'}}
>  - When:
>  I try to create a table as {{create table t1 (a int) location '/data/t1'}}
>  - Then:
> The create table fails with the following exception:
> {code}
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> MetaException(message:java.lang.IllegalArgumentException: Can not create a 
> Path from a null string)
>   at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:1138)
>   at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:1143)
>   at 
> org.apache.hadoop.hive.ql.ddl.table.create.CreateTableOperation.createTableNonReplaceMode(CreateTableOperation.java:148)
>   at 
> org.apache.hadoop.hive.ql.ddl.table.create.CreateTableOperation.execute(CreateTableOperation.java:98)
>   at org.apache.hadoop.hive.ql.ddl.DDLTask.execute(DDLTask.java:80)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) 
>   at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:359)
>   at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330)
>   at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246)
>   at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:488)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:482)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:225)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:322)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:340)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.metastore.api.MetaException: 
> java.lang.IllegalArgumentException: Can not create a Path from a null string
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_req_result$create_table_req_resultStandardScheme.read(ThriftHiveMetastore.java:63325)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_req_result$create_table_req_resultStandardScheme.read(ThriftHiveMetastore.java:63293)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_req_result.read(ThriftHiveMetastore.java:63219)
>   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_table_req(ThriftHiveMetastore.java:1780)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_table_req(ThriftHiveMetastore.java:1767)
>   at 
> 

[jira] [Assigned] (HIVE-23726) Create table may throw MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from a null string)

2020-06-23 Thread Naveen Gangam (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam reassigned HIVE-23726:


Assignee: Naveen Gangam  (was: Ashish Sharma)

> Create table may throw 
> MetaException(message:java.lang.IllegalArgumentException: Can not create a 
> Path from a null string)
> --
>
> Key: HIVE-23726
> URL: https://issues.apache.org/jira/browse/HIVE-23726
> Project: Hive
>  Issue Type: Bug
>Reporter: Istvan Fajth
>Assignee: Naveen Gangam
>Priority: Major
>
> - Given:
>  metastore.warehouse.tenant.colocation is set to true
>  a test database was created as {{create database test location '/data'}}
>  - When:
>  I try to create a table as {{create table t1 (a int) location '/data/t1'}}
>  - Then:
> The create table fails with the following exception:
> {code}
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> MetaException(message:java.lang.IllegalArgumentException: Can not create a 
> Path from a null string)
>   at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:1138)
>   at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:1143)
>   at 
> org.apache.hadoop.hive.ql.ddl.table.create.CreateTableOperation.createTableNonReplaceMode(CreateTableOperation.java:148)
>   at 
> org.apache.hadoop.hive.ql.ddl.table.create.CreateTableOperation.execute(CreateTableOperation.java:98)
>   at org.apache.hadoop.hive.ql.ddl.DDLTask.execute(DDLTask.java:80)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) 
>   at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:359)
>   at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330)
>   at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246)
>   at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:488)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:482)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:225)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:322)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:340)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.metastore.api.MetaException: 
> java.lang.IllegalArgumentException: Can not create a Path from a null string
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_req_result$create_table_req_resultStandardScheme.read(ThriftHiveMetastore.java:63325)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_req_result$create_table_req_resultStandardScheme.read(ThriftHiveMetastore.java:63293)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_req_result.read(ThriftHiveMetastore.java:63219)
>   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_table_req(ThriftHiveMetastore.java:1780)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_table_req(ThriftHiveMetastore.java:1767)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.create_table_with_environment_context(HiveMetaStoreClient.java:3518)
>   at 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.create_table_with_environment_context(SessionHiveMetaStoreClient.java:145)
>   at 
> 

[jira] [Work logged] (HIVE-23619) HiveServer2 should retry query if the TezAM running it gets killed

2020-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23619?focusedWorklogId=449958&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-449958
 ]

ASF GitHub Bot logged work on HIVE-23619:
-

Author: ASF GitHub Bot
Created on: 23/Jun/20 18:18
Start Date: 23/Jun/20 18:18
Worklog Time Spent: 10m 
  Work Description: sankarh commented on a change in pull request #1146:
URL: https://github.com/apache/hive/pull/1146#discussion_r444398139



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/reexec/ReExecuteLostAMQueryPlugin.java
##
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.reexec;
+
+import org.apache.hadoop.hive.ql.Driver;
+import org.apache.hadoop.hive.ql.hooks.ExecuteWithHookContext;
+import org.apache.hadoop.hive.ql.hooks.HookContext;
+import org.apache.hadoop.hive.ql.plan.mapper.PlanMapper;
+
+import java.util.regex.Pattern;
+
+public class ReExecuteLostAMQueryPlugin implements IReExecutionPlugin {
+    private boolean retryPossible;

Review comment:
   In Hive code, we follow 2 spaced tabs. Pls update it in newly created 
files in this patch.
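For readers following along, the core of such a re-execution plugin is deciding, from the failure message, whether the Tez AM was lost and the query is worth retrying. A minimal, self-contained sketch of that check (the class name, regex, and error strings below are illustrative assumptions, not the patch's actual code):

```java
// Illustrative sketch only; the real plugin in the patch may use
// different names and error strings.
import java.util.regex.Pattern;

public class LostAMRetryCheck {
  // Hypothetical patterns standing in for "Tez AM went away" failures.
  private static final Pattern LOST_AM = Pattern.compile(
      "(AM record not found|No running DAG|Application .* failed)");

  // Returns true when the error message indicates the AM was lost,
  // i.e. the query failed for a reason unrelated to the query itself.
  public static boolean shouldRetry(String errorMessage) {
    return errorMessage != null && LOST_AM.matcher(errorMessage).find();
  }

  public static void main(String[] args) {
    System.out.println(shouldRetry("dag submit failed: AM record not found")); // true
    System.out.println(shouldRetry("SemanticException: table not found"));     // false
  }
}
```

A real plugin would additionally cap the number of retries, since repeated AM losses usually indicate a cluster problem rather than a transient one.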

##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/reexec/TestReExecuteKilledTezAMQueryPlugin.java
##
@@ -0,0 +1,207 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.reexec;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.llap.LlapBaseInputFormat;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.yarn.api.records.ApplicationReport;
+import org.apache.hadoop.yarn.api.records.YarnApplicationState;
+import org.apache.hadoop.yarn.client.api.YarnClient;
+import org.apache.hive.jdbc.BaseJdbcWithMiniLlap;
+import org.apache.hive.jdbc.HiveStatement;
+import org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow;
+import org.apache.hive.jdbc.miniHS2.MiniHS2;
+import org.junit.*;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+import java.net.URL;
+import java.sql.Connection;
+import java.sql.DriverManager;
+import java.sql.SQLException;
+import java.sql.Statement;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+
+public class TestReExecuteKilledTezAMQueryPlugin {
+    protected static final Logger LOG = LoggerFactory.getLogger(TestJdbcWithMiniLlapArrow.class);
+
+    private static MiniHS2 miniHS2 = null;
+    private static final String tableName = "testKillTezAmTbl";
+    private static String dataFileDir;
+    private static final String testDbName = "testKillTezAmDb";
+    protected static Connection hs2Conn = null;
+    private static HiveConf conf;
+
+    private static class ExceptionHolder {
+        Throwable throwable;
+    }
+
+    static HiveConf defaultConf() throws Exception {
+        String confDir = "../../data/conf/llap/";
+        if (confDir != null && !confDir.isEmpty()) {
+            HiveConf.setHiveSiteLocation(new URL("file://" + new File(confDir).toURI().getPath() + "/hive-site.xml"));
+            System.out.println("Setting hive-site: " + HiveConf.getHiveSiteLocation());
+        }
+        HiveConf defaultConf = new HiveConf();
+        defaultConf.setBoolVar(HiveConf.ConfVars.HIVE_SUPPORT_CONCURRENCY, 

[jira] [Assigned] (HIVE-22949) Handle empty insert overwrites without inserting an empty file in case of acid/mm tables

2020-06-23 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan reassigned HIVE-22949:
---

Assignee: Ashish Sharma

> Handle empty insert overwrites without inserting an empty file in case of 
> acid/mm tables
> 
>
> Key: HIVE-22949
> URL: https://issues.apache.org/jira/browse/HIVE-22949
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: Ashish Sharma
>Priority: Major
>
> HIVE-22941 was a quick workaround for empty files in case of external tables. 
> An optimal solution would be to completely prevent writing empty files just for 
> having the table contents cleared on an empty INSERT OVERWRITE.
> There are other tickets about a similar topic, for example, if we need empty 
> files at all: HIVE-22918, HIVE-22938



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-23545) Insert table partition(column) occasionally occurs when the target partition is not created

2020-06-23 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan reassigned HIVE-23545:
---

Assignee: Ashish Sharma

> Insert table partition(column) occasionally occurs when the target partition 
> is not created
> ---
>
> Key: HIVE-23545
> URL: https://issues.apache.org/jira/browse/HIVE-23545
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: LuGuangMing
>Assignee: Ashish Sharma
>Priority: Major
> Attachments: test.sql
>
>
> Insert data into a static partition of an external table, where the static 
> partition is created in advance. When hive.exec.parallel is turned on, it 
> {color:#FF0000}occasionally{color} happens after execution that the partition 
> does not exist, and there are no apparent error logs during execution.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HIVE-23723) Limit operator pushdown through LOJ

2020-06-23 Thread Attila Magyar (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17142863#comment-17142863
 ] 

Attila Magyar edited comment on HIVE-23723 at 6/23/20, 5:14 PM:


[~jcamachorodriguez],

??Concerning your patch, it seems you are removing the original limit on top of 
the left outer join? Note that you cannot remove it : If you have 5 input rows 
on the left side, you know the LOJ will produce at least 5 rows, however you 
cannot guarantee the join will produce 5 rows at most.??

 

Got it, that should be kept indeed. However, the reason why additional reducers 
are introduced by the limittranspose implementation is not fully clear to me.

Do you think we should drop this patch, as it's already implemented by 
limittranspose, and focus on tweaking the existing implementation?

 

cc: [~ashutoshc]


was (Author: amagyar):
[~jcamachorodriguez],

??Concerning your patch, it seems you are removing the original limit on top of 
the left outer join? Note that you cannot remove it : If you have 5 input rows 
on the left side, you know the LOJ will produce at least 5 rows, however you 
cannot guarantee the join will produce 5 rows at most.??

 

Got it, that should be kept indeed. However reason why additional reducers are 
introduced by the limittranspose implementation is not fully clear to me.

Do you think we should drop this patch as it's already implemented by the 
limittranspose, and focus on tweaking the existing implementation?

> Limit operator pushdown through LOJ
> ---
>
> Key: HIVE-23723
> URL: https://issues.apache.org/jira/browse/HIVE-23723
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-23723.1.patch
>
>
> Limit operator (without an order by) can be pushed through SELECTS and LEFT 
> OUTER JOINs.
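To see why the limit can be pushed to the left input of a LOJ yet must also be kept on top, here is a small self-contained simulation (illustrative Java, not Hive operator code): each left row yields at least one output row, so the pushed-down limit bounds the left scan, but duplicate join keys can still multiply rows above it.

```java
import java.util.ArrayList;
import java.util.List;

public class LimitThroughLojDemo {
  // Left-outer-join only the first `limit` rows of `left` with `right`;
  // unmatched left rows produce a "NULL" right side.
  static List<String> lojWithPushedLimit(List<Integer> left, List<Integer> right, int limit) {
    List<String> out = new ArrayList<>();
    for (int l : left.subList(0, Math.min(limit, left.size()))) {
      boolean matched = false;
      for (int r : right) {
        if (l == r) { out.add(l + "," + r); matched = true; }
      }
      if (!matched) out.add(l + ",NULL");
    }
    return out;
  }

  public static void main(String[] args) {
    List<String> rows = lojWithPushedLimit(List.of(1, 2, 3, 4, 5), List.of(2, 2, 3), 3);
    // 4 rows even though limit = 3: key 2 matched twice, so the
    // pushed-down limit alone does not bound the output...
    System.out.println(rows); // [1,NULL, 2,2, 2,2, 3,3]
    // ...hence the original Limit on top of the LOJ must be kept.
    System.out.println(rows.subList(0, 3)); // [1,NULL, 2,2, 2,2]
  }
}
```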



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-22826) ALTER TABLE RENAME COLUMN doesn't update list of bucketed column names

2020-06-23 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan reassigned HIVE-22826:
---

Assignee: Ashish Sharma  (was: Nishant Goel)

>  ALTER TABLE RENAME COLUMN doesn't update list of bucketed column names
> ---
>
> Key: HIVE-22826
> URL: https://issues.apache.org/jira/browse/HIVE-22826
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Ashish Sharma
>Priority: Major
> Attachments: unitTest.patch
>
>
> Compaction for tables where a bucketed column has been renamed fails since 
> the list of bucketed columns in the StorageDescriptor doesn't get updated 
> when the column is renamed, therefore we can't recreate the table correctly 
> during compaction.
> Attached a unit test that fails.
> NO PRECOMMIT TESTS



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-23726) Create table may throw MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from a null string)

2020-06-23 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan reassigned HIVE-23726:
---

Assignee: Ashish Sharma

> Create table may throw 
> MetaException(message:java.lang.IllegalArgumentException: Can not create a 
> Path from a null string)
> --
>
> Key: HIVE-23726
> URL: https://issues.apache.org/jira/browse/HIVE-23726
> Project: Hive
>  Issue Type: Bug
>Reporter: Istvan Fajth
>Assignee: Ashish Sharma
>Priority: Major
>
> - Given:
>  metastore.warehouse.tenant.colocation is set to true
>  a test database was created as {{create database test location '/data'}}
>  - When:
>  I try to create a table as {{create table t1 (a int) location '/data/t1'}}
>  - Then:
> The create table fails with the following exception:
> {code}
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> MetaException(message:java.lang.IllegalArgumentException: Can not create a 
> Path from a null string)
>   at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:1138)
>   at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:1143)
>   at 
> org.apache.hadoop.hive.ql.ddl.table.create.CreateTableOperation.createTableNonReplaceMode(CreateTableOperation.java:148)
>   at 
> org.apache.hadoop.hive.ql.ddl.table.create.CreateTableOperation.execute(CreateTableOperation.java:98)
>   at org.apache.hadoop.hive.ql.ddl.DDLTask.execute(DDLTask.java:80)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) 
>   at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:359)
>   at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330)
>   at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246)
>   at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:488)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:482)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:225)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:322)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:340)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.metastore.api.MetaException: 
> java.lang.IllegalArgumentException: Can not create a Path from a null string
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_req_result$create_table_req_resultStandardScheme.read(ThriftHiveMetastore.java:63325)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_req_result$create_table_req_resultStandardScheme.read(ThriftHiveMetastore.java:63293)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_req_result.read(ThriftHiveMetastore.java:63219)
>   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_table_req(ThriftHiveMetastore.java:1780)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_table_req(ThriftHiveMetastore.java:1767)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.create_table_with_environment_context(HiveMetaStoreClient.java:3518)
>   at 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.create_table_with_environment_context(SessionHiveMetaStoreClient.java:145)
>   at 
> 

[jira] [Commented] (HIVE-12679) Allow users to be able to specify an implementation of IMetaStoreClient via HiveConf

2020-06-23 Thread Owen O'Malley (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-12679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17143114#comment-17143114
 ] 

Owen O'Malley commented on HIVE-12679:
--

Does the patch still apply to trunk?

> Allow users to be able to specify an implementation of IMetaStoreClient via 
> HiveConf
> 
>
> Key: HIVE-12679
> URL: https://issues.apache.org/jira/browse/HIVE-12679
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Metastore, Query Planning
>Reporter: Austin Lee
>Priority: Minor
>  Labels: metastore
> Attachments: HIVE-12679.1.patch, HIVE-12679.2.patch, 
> HIVE-12679.branch-1.2.patch, HIVE-12679.branch-2.3.patch, HIVE-12679.patch
>
>
> Hi,
> I would like to propose a change that would make it possible for users to 
> choose an implementation of IMetaStoreClient via HiveConf, i.e. 
> hive-site.xml.  Currently, in Hive the choice is hard coded to be 
> SessionHiveMetaStoreClient in org.apache.hadoop.hive.ql.metadata.Hive.  There 
> is no other direct reference to SessionHiveMetaStoreClient other than the 
> hard coded class name in Hive.java and the QL component operates only on the 
> IMetaStoreClient interface so the change would be minimal and it would be 
> quite similar to how an implementation of RawStore is specified and loaded in 
> hive-metastore.  One use case this change would serve would be one where a 
> user wishes to use an implementation of this interface without the dependency 
> on the Thrift server.
>   
> Thank you,
> Austin
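The mechanics of the proposal are plain Java reflection, much like how a RawStore implementation is loaded today. A minimal, generic sketch (the config key and factory name below are illustrative, not from the attached patches; a JDK type is used in the demo since IMetaStoreClient requires the Hive jars):

```java
import java.lang.reflect.Constructor;

public class ConfiguredClientFactory {
  // Hypothetical config key; the actual patch would define the real one.
  public static final String CLIENT_IMPL_KEY = "hive.metastore.client.class";

  // Instantiate the configured class and verify it implements the interface.
  public static <T> T newInstance(String className, Class<T> iface) throws Exception {
    Class<?> clazz = Class.forName(className);
    Constructor<?> ctor = clazz.getDeclaredConstructor();
    return iface.cast(ctor.newInstance());
  }

  public static void main(String[] args) throws Exception {
    java.util.List<?> l = newInstance("java.util.ArrayList", java.util.List.class);
    System.out.println(l.isEmpty()); // true
  }
}
```

The `iface.cast(...)` call is what turns a misconfigured class name into an early, clear ClassCastException instead of a failure deep inside query planning.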



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HIVE-12679) Allow users to be able to specify an implementation of IMetaStoreClient via HiveConf

2020-06-23 Thread Ratandeep Ratti (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-12679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17143101#comment-17143101
 ] 

Ratandeep Ratti edited comment on HIVE-12679 at 6/23/20, 4:45 PM:
--

What is missing in this patch to get it committed? Maybe we can fill in the 
gaps.


was (Author: rdsr):
What is missing in this patch? Maybe we can fill in the gaps

> Allow users to be able to specify an implementation of IMetaStoreClient via 
> HiveConf
> 
>
> Key: HIVE-12679
> URL: https://issues.apache.org/jira/browse/HIVE-12679
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Metastore, Query Planning
>Reporter: Austin Lee
>Priority: Minor
>  Labels: metastore
> Attachments: HIVE-12679.1.patch, HIVE-12679.2.patch, 
> HIVE-12679.branch-1.2.patch, HIVE-12679.branch-2.3.patch, HIVE-12679.patch
>
>
> Hi,
> I would like to propose a change that would make it possible for users to 
> choose an implementation of IMetaStoreClient via HiveConf, i.e. 
> hive-site.xml.  Currently, in Hive the choice is hard coded to be 
> SessionHiveMetaStoreClient in org.apache.hadoop.hive.ql.metadata.Hive.  There 
> is no other direct reference to SessionHiveMetaStoreClient other than the 
> hard coded class name in Hive.java and the QL component operates only on the 
> IMetaStoreClient interface so the change would be minimal and it would be 
> quite similar to how an implementation of RawStore is specified and loaded in 
> hive-metastore.  One use case this change would serve would be one where a 
> user wishes to use an implementation of this interface without the dependency 
> on the Thrift server.
>   
> Thank you,
> Austin



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-12679) Allow users to be able to specify an implementation of IMetaStoreClient via HiveConf

2020-06-23 Thread Ratandeep Ratti (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-12679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17143101#comment-17143101
 ] 

Ratandeep Ratti commented on HIVE-12679:


What is missing in this patch? Maybe we can fill in the gaps

> Allow users to be able to specify an implementation of IMetaStoreClient via 
> HiveConf
> 
>
> Key: HIVE-12679
> URL: https://issues.apache.org/jira/browse/HIVE-12679
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Metastore, Query Planning
>Reporter: Austin Lee
>Priority: Minor
>  Labels: metastore
> Attachments: HIVE-12679.1.patch, HIVE-12679.2.patch, 
> HIVE-12679.branch-1.2.patch, HIVE-12679.branch-2.3.patch, HIVE-12679.patch
>
>
> Hi,
> I would like to propose a change that would make it possible for users to 
> choose an implementation of IMetaStoreClient via HiveConf, i.e. 
> hive-site.xml.  Currently, in Hive the choice is hard coded to be 
> SessionHiveMetaStoreClient in org.apache.hadoop.hive.ql.metadata.Hive.  There 
> is no other direct reference to SessionHiveMetaStoreClient other than the 
> hard coded class name in Hive.java and the QL component operates only on the 
> IMetaStoreClient interface so the change would be minimal and it would be 
> quite similar to how an implementation of RawStore is specified and loaded in 
> hive-metastore.  One use case this change would serve would be one where a 
> user wishes to use an implementation of this interface without the dependency 
> on the Thrift server.
>   
> Thank you,
> Austin



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HIVE-23751) QTest: Override #mkdirs() method in ProxyFileSystem To Align After HADOOP-16582

2020-06-23 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17143082#comment-17143082
 ] 

Syed Shameerur Rahman edited comment on HIVE-23751 at 6/23/20, 4:24 PM:


Test failures seem unrelated - *TestPigHBaseStorageHandler#testPigHBaseSchema* 
looks unstable.


was (Author: srahman):
Test failures seems unrelated - TestPigHBaseStorageHandler#testPigHBaseSchema 
looks unstable

> QTest: Override #mkdirs() method in ProxyFileSystem To Align After 
> HADOOP-16582
> ---
>
> Key: HIVE-23751
> URL: https://issues.apache.org/jira/browse/HIVE-23751
> Project: Hive
>  Issue Type: Task
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-23751.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HADOOP-16582 has changed the way mkdirs() works:
> *Before HADOOP-16582:*
> All calls to mkdirs(p) were fast-tracked to FileSystem.mkdirs, which then 
> re-routed to the mkdirs(p, permission) method. For ProxyFileSystem the call 
> would look like
> {code:java}
> FileUtils.mkdir(p) ---> FileSystem.mkdirs(p) ---> 
> ProxyFileSystem.mkdirs(p, permission)
> {code}
> An implementation of FileSystem only needed to implement mkdirs(p, 
> permission).
> *After HADOOP-16582:*
> Since FilterFileSystem now overrides the mkdirs(p) method, the call path for 
> ProxyFileSystem would look like
> {code:java}
> FileUtils.mkdir(p) ---> FilterFileSystem.mkdirs(p) --->
> {code}
> This makes all the qtests fail with the below exception:
> {code:java}
> Caused by: java.lang.IllegalArgumentException: Wrong FS: 
> pfile:/media/ebs1/workspace/hive-3.1-qtest/group/5/label/HiveQTest/hive-1.2.0/itests/qtest/target/warehouse/dest1,
>  expected: file:///
> {code}
> Note: We will hit this issue when we bump up the Hadoop version in Hive.
> So, as per the discussion in HADOOP-16963, ProxyFileSystem would need to 
> override the mkdirs(p) method in order to solve the above problem. The new 
> flow would look like
> {code:java}
> FileUtils.mkdir(p) ---> ProxyFileSystem.mkdirs(p) ---> 
> ProxyFileSystem.mkdirs(p, permission) --->
> {code}
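The dispatch problem and the fix can be reproduced outside Hadoop. In this self-contained model (the class names are stand-ins for FileSystem, FilterFileSystem, and ProxyFileSystem, not the real Hadoop API), the one-argument override is exactly what keeps the proxy's path re-mapping on the call path:

```java
class RawFs {                    // stands in for the wrapped FileSystem
  String created;
  boolean mkdirs(String p) { created = p; return true; }
  boolean mkdirs(String p, String perm) { created = p; return true; }
}

class FilterFs {                 // stands in for FilterFileSystem after HADOOP-16582
  final RawFs fs;
  FilterFs(RawFs fs) { this.fs = fs; }
  // Delegates straight to the wrapped fs, bypassing subclass path mapping.
  boolean mkdirs(String p) { return fs.mkdirs(p); }
  boolean mkdirs(String p, String perm) { return fs.mkdirs(p, perm); }
}

class ProxyFs extends FilterFs { // stands in for ProxyFileSystem
  ProxyFs(RawFs fs) { super(fs); }
  // Proxy path re-mapping: pfile: paths become file: paths.
  String swizzle(String p) { return p.replaceFirst("^pfile:", "file:"); }
  // The fix described above: re-route the one-arg call through our overload.
  @Override boolean mkdirs(String p) { return mkdirs(p, "default"); }
  @Override boolean mkdirs(String p, String perm) { return fs.mkdirs(swizzle(p), perm); }
}

public class MkdirsDispatchDemo {
  public static void main(String[] args) {
    ProxyFs fs = new ProxyFs(new RawFs());
    fs.mkdirs("pfile:/warehouse/dest1");
    // Without the one-arg override, FilterFs would hand the raw "pfile:"
    // path to the wrapped fs -- the "Wrong FS" failure seen in qtests.
    System.out.println(fs.fs.created); // file:/warehouse/dest1
  }
}
```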



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-23751) QTest: Override #mkdirs() method in ProxyFileSystem To Align After HADOOP-16582

2020-06-23 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17143082#comment-17143082
 ] 

Syed Shameerur Rahman commented on HIVE-23751:
--

Test failures seem unrelated - TestPigHBaseStorageHandler#testPigHBaseSchema 
looks unstable.

> QTest: Override #mkdirs() method in ProxyFileSystem To Align After 
> HADOOP-16582
> ---
>
> Key: HIVE-23751
> URL: https://issues.apache.org/jira/browse/HIVE-23751
> Project: Hive
>  Issue Type: Task
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-23751.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HADOOP-16582 has changed the way mkdirs() works:
> *Before HADOOP-16582:*
> All calls to mkdirs(p) were fast-tracked to FileSystem.mkdirs, which then 
> re-routed to the mkdirs(p, permission) method. For ProxyFileSystem the call 
> would look like
> {code:java}
> FileUtils.mkdir(p) ---> FileSystem.mkdirs(p) ---> 
> ProxyFileSystem.mkdirs(p, permission)
> {code}
> An implementation of FileSystem only needed to implement mkdirs(p, 
> permission).
> *After HADOOP-16582:*
> Since FilterFileSystem now overrides the mkdirs(p) method, the call path for 
> ProxyFileSystem would look like
> {code:java}
> FileUtils.mkdir(p) ---> FilterFileSystem.mkdirs(p) --->
> {code}
> This makes all the qtests fail with the below exception:
> {code:java}
> Caused by: java.lang.IllegalArgumentException: Wrong FS: 
> pfile:/media/ebs1/workspace/hive-3.1-qtest/group/5/label/HiveQTest/hive-1.2.0/itests/qtest/target/warehouse/dest1,
>  expected: file:///
> {code}
> Note: We will hit this issue when we bump up the Hadoop version in Hive.
> So, as per the discussion in HADOOP-16963, ProxyFileSystem would need to 
> override the mkdirs(p) method in order to solve the above problem. The new 
> flow would look like
> {code:java}
> FileUtils.mkdir(p) ---> ProxyFileSystem.mkdirs(p) ---> 
> ProxyFileSystem.mkdirs(p, permission) --->
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-23725) ValidTxnManager snapshot outdating causing partial reads in merge insert

2020-06-23 Thread Jesus Camacho Rodriguez (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17143023#comment-17143023
 ] 

Jesus Camacho Rodriguez commented on HIVE-23725:


+1

> ValidTxnManager snapshot outdating causing partial reads in merge insert
> 
>
> Key: HIVE-23725
> URL: https://issues.apache.org/jira/browse/HIVE-23725
> Project: Hive
>  Issue Type: Bug
>Reporter: Peter Varga
>Assignee: Peter Varga
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> When the ValidTxnManager invalidates the snapshot during a merge insert and 
> starts to read committed transactions that were not committed when the query 
> compilation happened, it can cause partial-read problems if the committed 
> transaction created a new partition in the source or target table.
> The solution should be to not only fix the snapshot but also recompile the 
> query and acquire the locks again.
> You could construct an example like this:
> 1. Open and compile transaction 1, which merge-inserts data from a 
> partitioned source table that has a few partitions.
> 2. Open, run, and commit transaction 2, which inserts data into an old and a 
> new partition of the source table.
> 3. Open, run, and commit transaction 3, which inserts data into the target 
> table of the merge statement; this will retrigger a snapshot generation in 
> transaction 1.
> 4. Run transaction 1: the snapshot will be regenerated, and it will read 
> partial data from transaction 2, breaking the ACID properties.
> A different setup, with the transaction order switched:
> 1. Compile transaction 1, which inserts data into an old and a new partition 
> of the source table.
> 2. Compile transaction 2, which inserts data into the target table.
> 3. Compile transaction 3, which merge-inserts data from the source table into 
> the target table.
> 4. Run and commit transaction 1.
> 5. Run and commit transaction 2.
> 6. Run transaction 3: since it contains 1 and 2 in its snapshot, 
> isValidTxnListState will be triggered and we do a partial read of 
> transaction 1 for the same reasons.
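The proposed control flow can be sketched in a few lines: when the snapshot is found stale before execution, refresh it and recompile (re-acquiring locks) rather than running the stale plan. All names below are illustrative stubs, not Hive's actual Driver/TxnManager API:

```java
public class SnapshotRecompileDemo {
  // Stub for the transaction-manager interactions; not Hive's API.
  interface TxnState { boolean snapshotStillValid(); void refreshSnapshot(); }

  // Loop until the plan was compiled against a still-valid snapshot,
  // bounding the retries so a busy system cannot starve the query.
  static int compileUntilSnapshotValid(TxnState txn, int maxRetries) {
    int compilations = 1;                       // initial compilation
    while (!txn.snapshotStillValid()) {
      if (compilations > maxRetries) throw new IllegalStateException("giving up");
      txn.refreshSnapshot();
      compilations++;                           // recompile + reacquire locks here
    }
    return compilations;                        // executed plan matches snapshot
  }

  public static void main(String[] args) {
    // Simulate a concurrent commit invalidating the snapshot exactly once.
    boolean[] valid = {false};
    TxnState txn = new TxnState() {
      public boolean snapshotStillValid() { return valid[0]; }
      public void refreshSnapshot() { valid[0] = true; }
    };
    System.out.println(compileUntilSnapshotValid(txn, 2)); // 2
  }
}
```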



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

