[jira] [Updated] (HIVE-24154) Missing simplification opportunity with IN and EQUALS clauses

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24154:
--
Labels: pull-request-available  (was: )

> Missing simplification opportunity with IN and EQUALS clauses
> -
>
> Key: HIVE-24154
> URL: https://issues.apache.org/jira/browse/HIVE-24154
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For instance, in perf driver CBO query 74, there are several filters that 
> could be simplified further:
> {code}
> HiveFilter(condition=[AND(=($1, 1999), IN($1, 1998, 1999))])
> {code}
> This may lead to incorrect estimates and unnecessary execution time.
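> Since {{=($1, 1999)}} implies membership in {{IN($1, 1998, 1999)}}, the fully
> simplified filter should keep only the equality. A sketch of the expected
> shape (illustrative, not the actual patch output):
> {code}
> HiveFilter(condition=[=($1, 1999)])
> {code}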



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24154) Missing simplification opportunity with IN and EQUALS clauses

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24154?focusedWorklogId=483602&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483602
 ]

ASF GitHub Bot logged work on HIVE-24154:
-

Author: ASF GitHub Bot
Created on: 13/Sep/20 01:24
Start Date: 13/Sep/20 01:24
Worklog Time Spent: 10m 
  Work Description: jcamachor opened a new pull request #1492:
URL: https://github.com/apache/hive/pull/1492


   …uses
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483602)
Remaining Estimate: 0h
Time Spent: 10m

> Missing simplification opportunity with IN and EQUALS clauses
> -
>
> Key: HIVE-24154
> URL: https://issues.apache.org/jira/browse/HIVE-24154
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For instance, in perf driver CBO query 74, there are several filters that 
> could be simplified further:
> {code}
> HiveFilter(condition=[AND(=($1, 1999), IN($1, 1998, 1999))])
> {code}
> This may lead to incorrect estimates and unnecessary execution time.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-24154) Missing simplification opportunity with IN and EQUALS clauses

2020-09-12 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-24154:
--


> Missing simplification opportunity with IN and EQUALS clauses
> -
>
> Key: HIVE-24154
> URL: https://issues.apache.org/jira/browse/HIVE-24154
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>
> For instance, in perf driver CBO query 74, there are several filters that 
> could be simplified further:
> {code}
> HiveFilter(condition=[AND(=($1, 1999), IN($1, 1998, 1999))])
> {code}
> This may lead to incorrect estimates and unnecessary execution time.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23731) Review of AvroInstance Cache

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23731?focusedWorklogId=483598&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483598
 ]

ASF GitHub Bot logged work on HIVE-23731:
-

Author: ASF GitHub Bot
Created on: 13/Sep/20 00:48
Start Date: 13/Sep/20 00:48
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #1153:
URL: https://github.com/apache/hive/pull/1153


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483598)
Time Spent: 1h 10m  (was: 1h)

> Review of AvroInstance Cache
> 
>
> Key: HIVE-23731
> URL: https://issues.apache.org/jira/browse/HIVE-23731
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22634) Improper SemanticException when filter is optimized to false on a partitioned table

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22634?focusedWorklogId=483597&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483597
 ]

ASF GitHub Bot logged work on HIVE-22634:
-

Author: ASF GitHub Bot
Created on: 13/Sep/20 00:48
Start Date: 13/Sep/20 00:48
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #865:
URL: https://github.com/apache/hive/pull/865


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483597)
Time Spent: 1h 20m  (was: 1h 10m)

> Improper SemanticException when filter is optimized to false on a partitioned 
> table
> ---
>
> Key: HIVE-22634
> URL: https://issues.apache.org/jira/browse/HIVE-22634
> Project: Hive
>  Issue Type: Improvement
>Reporter: EdisonWang
>Assignee: EdisonWang
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-22634.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> When a filter is optimized to false on a partitioned table, Hive improperly 
> throws a SemanticException reporting that no partition predicate was found.
> The steps to reproduce are:
> {code:java}
> set hive.strict.checks.no.partition.filter=true;
> CREATE TABLE test(id int, name string)PARTITIONED BY (`date` string);
> select * from test where `date` = '20191201' and 1<>1;
> {code}
>  
> The above SQL throws a "Queries against partitioned tables without a 
> partition filter" exception, even though the query does contain a partition 
> predicate on `date`.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23779) BasicStatsTask Info is not getting printed in beeline console

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23779?focusedWorklogId=483599&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483599
 ]

ASF GitHub Bot logged work on HIVE-23779:
-

Author: ASF GitHub Bot
Created on: 13/Sep/20 00:48
Start Date: 13/Sep/20 00:48
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #1191:
URL: https://github.com/apache/hive/pull/1191


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483599)
Time Spent: 1h  (was: 50m)

> BasicStatsTask Info is not getting printed in beeline console
> -
>
> Key: HIVE-23779
> URL: https://issues.apache.org/jira/browse/HIVE-23779
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> After HIVE-16061, partition basic stats are not getting printed in beeline 
> console.
> {code:java}
> INFO : Partition {dt=2020-06-29} stats: [numFiles=21, numRows=22, 
> totalSize=14607, rawDataSize=0]{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23802) “merge files” job is submitted to the default queue when hive.merge.tezfiles is set to true

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23802?focusedWorklogId=483600&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483600
 ]

ASF GitHub Bot logged work on HIVE-23802:
-

Author: ASF GitHub Bot
Created on: 13/Sep/20 00:48
Start Date: 13/Sep/20 00:48
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #1206:
URL: https://github.com/apache/hive/pull/1206


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483600)
Time Spent: 0.5h  (was: 20m)

> “merge files” job is submitted to the default queue when hive.merge.tezfiles 
> is set to true
> 
>
> Key: HIVE-23802
> URL: https://issues.apache.org/jira/browse/HIVE-23802
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0
>Reporter: gaozhan ding
>Assignee: gaozhan ding
>Priority: Major
>  Labels: pull-request-available
> Attachments: 15940042679272.png, HIVE-23802.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> We use tez as the query engine. When hive.merge.tezfiles is set to true, the 
> merge files task, which follows the original task, is submitted to the default 
> queue rather than the same queue as the original task.
> I studied this issue for days and found that every time a container is 
> started, "tez.queue.name" is unset in the current session. The code is as 
> below:
> {code:java}
> // TezSessionState.startSessionAndContainers()
> // sessionState.getQueueName() comes from cluster wide configured queue names.
>  // sessionState.getConf().get("tez.queue.name") is explicitly set by user in 
> a session.
>  // TezSessionPoolManager sets tez.queue.name if user has specified one or 
> use the one from
>  // cluster wide queue names.
>  // There is no way to differentiate how this was set (user vs system).
>  // Unset this after opening the session so that reopening of session uses 
> the correct queue
>  // names i.e, if client has not died and if the user has explicitly set a 
> queue name
>  // then reopened session will use user specified queue name else default 
> cluster queue names.
>  conf.unset(TezConfiguration.TEZ_QUEUE_NAME);
> {code}
> So after the original task is submitted to YARN, "tez.queue.name" is unset. 
> When the merge files task starts, it tries to reuse the same session as the 
> original job, but the check below returns false because tez.queue.name was 
> unset. It seems we should not unset this property.
> {code:java}
> // TezSessionPoolManager.canWorkWithSameSession()
> if (!session.isDefault()) {
>   String queueName = session.getQueueName();
>   String confQueueName = conf.get(TezConfiguration.TEZ_QUEUE_NAME);
>   LOG.info("Current queue name is " + queueName + " incoming queue name is " 
> + confQueueName);
>   return (queueName == null) ? confQueueName == null : 
> queueName.equals(confQueueName);
> } else {
>   // this session should never be a default session unless something has 
> messed up.
>   throw new HiveException("The pool session " + session + " should have been 
> returned to the pool"); 
> }
> {code}
>    !15940042679272.png!
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23665) Rewrite last_value to first_value to enable streaming results

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23665?focusedWorklogId=483601&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483601
 ]

ASF GitHub Bot logged work on HIVE-23665:
-

Author: ASF GitHub Bot
Created on: 13/Sep/20 00:48
Start Date: 13/Sep/20 00:48
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #1177:
URL: https://github.com/apache/hive/pull/1177


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483601)
Time Spent: 1.5h  (was: 1h 20m)

> Rewrite last_value to first_value to enable streaming results
> -
>
> Key: HIVE-23665
> URL: https://issues.apache.org/jira/browse/HIVE-23665
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-23665.1.patch, HIVE-23665.2.patch, 
> HIVE-23665.3.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Rewrite last_value to first_value to enable streaming results.
> last_value cannot be streamed because the intermediate results need to be 
> buffered to determine the window result until we reach the last row in the 
> window. But if we can rewrite it to first_value, we can stream the results, 
> although the order of results will not be guaranteed (which is also not 
> important here).
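> As a sketch of the rewrite idea (hypothetical table and columns, and ignoring
> ties in the ordering column), last_value over a frame spanning the whole
> partition can be expressed as first_value over the inverted sort order:
> {code}
> -- last_value over a whole-partition frame...
> SELECT last_value(salary) OVER (PARTITION BY dept ORDER BY hire_date ASC
>                                 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
> FROM emps;
> -- ...is equivalent to first_value over the reversed order, which can stream:
> SELECT first_value(salary) OVER (PARTITION BY dept ORDER BY hire_date DESC
>                                  ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
> FROM emps;
> {code}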



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23413) Create a new config to skip all locks

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23413?focusedWorklogId=483581&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483581
 ]

ASF GitHub Bot logged work on HIVE-23413:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 22:06
Start Date: 12/Sep/20 22:06
Worklog Time Spent: 10m 
  Work Description: pvargacl commented on pull request #1220:
URL: https://github.com/apache/hive/pull/1220#issuecomment-690929320







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483581)
Time Spent: 2h 10m  (was: 2h)

> Create a new config to skip all locks
> -
>
> Key: HIVE-23413
> URL: https://issues.apache.org/jira/browse/HIVE-23413
> Project: Hive
>  Issue Type: Improvement
>Reporter: Peter Varga
>Assignee: Peter Varga
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23413.1.patch, HIVE-23413.2.patch
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> From time to time, a query is blocked on locks when it should not be.
> To have a quick workaround for this, we should add a config that the user 
> can set in the session to disable acquiring/checking locks, so we can unblock 
> the query immediately and then later investigate and fix the root cause.
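> The intended usage would be along these lines (the property name and table
> here are invented for illustration; the actual name is defined in the patch):
> {code}
> -- hypothetical property name, set per session to skip acquiring/checking locks
> set hive.txn.lock.skip=true;
> select * from blocked_table;
> {code}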



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23413) Create a new config to skip all locks

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23413?focusedWorklogId=483545&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483545
 ]

ASF GitHub Bot logged work on HIVE-23413:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 22:03
Start Date: 12/Sep/20 22:03
Worklog Time Spent: 10m 
  Work Description: pvary commented on pull request #1220:
URL: https://github.com/apache/hive/pull/1220#issuecomment-690931518







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483545)
Time Spent: 2h  (was: 1h 50m)

> Create a new config to skip all locks
> -
>
> Key: HIVE-23413
> URL: https://issues.apache.org/jira/browse/HIVE-23413
> Project: Hive
>  Issue Type: Improvement
>Reporter: Peter Varga
>Assignee: Peter Varga
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23413.1.patch, HIVE-23413.2.patch
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> From time to time, a query is blocked on locks when it should not be.
> To have a quick workaround for this, we should add a config that the user 
> can set in the session to disable acquiring/checking locks, so we can unblock 
> the query immediately and then later investigate and fix the root cause.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22782) Consolidate metastore call to fetch constraints

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22782?focusedWorklogId=483521&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483521
 ]

ASF GitHub Bot logged work on HIVE-22782:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 22:02
Start Date: 12/Sep/20 22:02
Worklog Time Spent: 10m 
  Work Description: sankarh commented on a change in pull request #1419:
URL: https://github.com/apache/hive/pull/1419#discussion_r486978810



##
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
##
@@ -1155,22 +1150,19 @@ void dumpConstraintMetadata(String dbName, String 
tblName, Path dbRoot, Hive hiv
   Path constraintsRoot = new Path(dbRoot, 
ReplUtils.CONSTRAINTS_ROOT_DIR_NAME);
   Path commonConstraintsFile = new Path(constraintsRoot, 
ConstraintFileType.COMMON.getPrefix() + tblName);
   Path fkConstraintsFile = new Path(constraintsRoot, 
ConstraintFileType.FOREIGNKEY.getPrefix() + tblName);
-  List pks = hiveDb.getPrimaryKeyList(dbName, tblName);
-  List fks = hiveDb.getForeignKeyList(dbName, tblName);
-  List uks = hiveDb.getUniqueConstraintList(dbName, 
tblName);
-  List nns = hiveDb.getNotNullConstraintList(dbName, 
tblName);
-  if ((pks != null && !pks.isEmpty()) || (uks != null && !uks.isEmpty())
-  || (nns != null && !nns.isEmpty())) {
+  SQLAllTableConstraints tableConstraints = 
hiveDb.getTableConstraints(dbName,tblName);
+  if ((tableConstraints.getPrimaryKeys() != null && 
!tableConstraints.getPrimaryKeys().isEmpty()) || 
(tableConstraints.getUniqueConstraints() != null && 
!tableConstraints.getUniqueConstraints().isEmpty())

Review comment:
   Can add a utility method to check whether a given list is null or empty; the 
pattern is used multiple times. Also use local variables to reduce the code.
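
   A sketch of the suggested utility (the name is hypothetical):

   private static boolean isNotEmpty(List<?> list) {
     return list != null && !list.isEmpty();
   }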

##
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
##
@@ -5661,184 +5663,79 @@ public void dropConstraint(String dbName, String 
tableName, String constraintNam
 }
   }
 
-  public List getDefaultConstraintList(String dbName, 
String tblName) throws HiveException, NoSuchObjectException {
+  public SQLAllTableConstraints getTableConstraints(String dbName, String 
tblName) throws HiveException, NoSuchObjectException {
 try {
-  return getMSC().getDefaultConstraints(new 
DefaultConstraintsRequest(getDefaultCatalog(conf), dbName, tblName));
+  AllTableConstraintsRequest tableConstraintsRequest = new 
AllTableConstraintsRequest();
+  tableConstraintsRequest.setDbName(dbName);
+  tableConstraintsRequest.setTblName(tblName);
+  tableConstraintsRequest.setCatName(getDefaultCatalog(conf));
+  return getMSC().getAllTableConstraints(tableConstraintsRequest);
 } catch (NoSuchObjectException e) {
   throw e;
 } catch (Exception e) {
   throw new HiveException(e);
 }
   }
-
-  public List getCheckConstraintList(String dbName, String 
tblName) throws HiveException, NoSuchObjectException {
-try {
-  return getMSC().getCheckConstraints(new 
CheckConstraintsRequest(getDefaultCatalog(conf),
-  dbName, tblName));
-} catch (NoSuchObjectException e) {
-  throw e;
-} catch (Exception e) {
-  throw new HiveException(e);
-}
+  public TableConstraintsInfo getAllTableConstraints(String dbName, String 
tblName) throws HiveException {
+return getTableConstraints(dbName, tblName, false, false);
   }
 
-  /**
-   * Get all primary key columns associated with the table.
-   *
-   * @param dbName Database Name
-   * @param tblName Table Name
-   * @return Primary Key associated with the table.
-   * @throws HiveException
-   */
-  public PrimaryKeyInfo getPrimaryKeys(String dbName, String tblName) throws 
HiveException {
-return getPrimaryKeys(dbName, tblName, false);
+  public TableConstraintsInfo getReliableAndEnableTableConstraints(String 
dbName, String tblName) throws HiveException {
+return getTableConstraints(dbName, tblName, true, true);
   }
 
-  /**
-   * Get primary key columns associated with the table that are available for 
optimization.
-   *
-   * @param dbName Database Name
-   * @param tblName Table Name
-   * @return Primary Key associated with the table.
-   * @throws HiveException
-   */
-  public PrimaryKeyInfo getReliablePrimaryKeys(String dbName, String tblName) 
throws HiveException {
-return getPrimaryKeys(dbName, tblName, true);
-  }
-
-  private PrimaryKeyInfo getPrimaryKeys(String dbName, String tblName, boolean 
onlyReliable)
+  private TableConstraintsInfo getTableConstraints(String dbName, String 
tblName, boolean reliable, boolean enable)

Review comment:
   nit: Use "fetchReliable" and "fetchEnabled" instead of "reliable" and 
"enable" as it sound like flag to enable something.

##
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java
##
@@ -116,22 

[jira] [Work logged] (HIVE-24151) MultiDelimitSerDe shifts data if strings contain non-ASCII characters

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24151?focusedWorklogId=483531&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483531
 ]

ASF GitHub Bot logged work on HIVE-24151:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 22:02
Start Date: 12/Sep/20 22:02
Worklog Time Spent: 10m 
  Work Description: szlta opened a new pull request #1490:
URL: https://github.com/apache/hive/pull/1490


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483531)
Time Spent: 1h 50m  (was: 1h 40m)

> MultiDelimitSerDe shifts data if strings contain non-ASCII characters
> -
>
> Key: HIVE-24151
> URL: https://issues.apache.org/jira/browse/HIVE-24151
> Project: Hive
>  Issue Type: Bug
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> HIVE-22360 intended to fix another MultiDelimitSerde problem (with NULL last 
> columns) but introduced a regression: the approach of the fix is pretty much 
> all wrong, as the existing logic that operated on bytes got replaced by regex 
> matcher logic which deals in character positions, rather than byte positions. 
> As some non-ASCII characters consist of more than one byte, the whole record 
> may get shifted as a result.
> With this ticket I'm going to restore the old logic and apply the proper fix 
> on top of it, while keeping (and extending) the test cases added with 
> HIVE-22360 so that we have a solution for both issues.
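> A minimal standalone illustration of why character offsets diverge from byte
> offsets (plain Java, not Hive code):
> {code:java}
> import java.nio.charset.StandardCharsets;
> 
> public class ByteVsChar {
>   public static void main(String[] args) {
>     String s = "Ádám";  // Á and á each take two bytes in UTF-8
>     byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
>     // 4 characters but 6 bytes: an offset computed on characters is wrong
>     // when used to index into the underlying byte buffer.
>     System.out.println(s.length());      // 4
>     System.out.println(bytes.length);    // 6
>   }
> }
> {code}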



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24138) Llap external client flow is broken due to netty shading

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24138?focusedWorklogId=483508&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483508
 ]

ASF GitHub Bot logged work on HIVE-24138:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 22:01
Start Date: 12/Sep/20 22:01
Worklog Time Spent: 10m 
  Work Description: ayushtkn opened a new pull request #1491:
URL: https://github.com/apache/hive/pull/1491


   https://issues.apache.org/jira/browse/HIVE-24138



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483508)
Time Spent: 0.5h  (was: 20m)

> Llap external client flow is broken due to netty shading
> 
>
> Key: HIVE-24138
> URL: https://issues.apache.org/jira/browse/HIVE-24138
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Shubham Chaurasia
>Assignee: Ayush Saxena
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> We shaded netty in hive-exec in 
> https://issues.apache.org/jira/browse/HIVE-23073.
> This breaks the LLAP external client flow on the LLAP daemon side.
> LLAP daemon stacktrace: 
> {code}
> 2020-09-09T18:22:13,413  INFO [TezTR-222977_4_0_0_0_0 
> (497418324441977_0004_0_00_00_0)] llap.LlapOutputFormat: Returning 
> writer for: attempt_497418324441977_0004_0_00_00_0
> 2020-09-09T18:22:13,419 ERROR [TezTR-222977_4_0_0_0_0 
> (497418324441977_0004_0_00_00_0)] tez.MapRecordSource: 
> java.lang.NoSuchMethodError: 
> org.apache.arrow.memory.BufferAllocator.buffer(I)Lorg/apache/hive/io/netty/buffer/ArrowBuf;
>   at 
> org.apache.hadoop.hive.llap.WritableByteChannelAdapter.write(WritableByteChannelAdapter.java:96)
>   at org.apache.arrow.vector.ipc.WriteChannel.write(WriteChannel.java:74)
>   at org.apache.arrow.vector.ipc.WriteChannel.write(WriteChannel.java:57)
>   at 
> org.apache.arrow.vector.ipc.WriteChannel.writeIntLittleEndian(WriteChannel.java:89)
>   at 
> org.apache.arrow.vector.ipc.message.MessageSerializer.serialize(MessageSerializer.java:88)
>   at 
> org.apache.arrow.vector.ipc.ArrowWriter.ensureStarted(ArrowWriter.java:130)
>   at 
> org.apache.arrow.vector.ipc.ArrowWriter.writeBatch(ArrowWriter.java:102)
>   at 
> org.apache.hadoop.hive.llap.LlapArrowRecordWriter.write(LlapArrowRecordWriter.java:85)
>   at 
> org.apache.hadoop.hive.llap.LlapArrowRecordWriter.write(LlapArrowRecordWriter.java:46)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.filesink.VectorFileSinkArrowOperator.process(VectorFileSinkArrowOperator.java:137)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:969)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:969)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:172)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.deliverVectorizedRowBatch(VectorMapOperator.java:809)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:842)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38)

[jira] [Work logged] (HIVE-24145) Fix preemption issues in reducers and file sink operators

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24145?focusedWorklogId=483513&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483513
 ]

ASF GitHub Bot logged work on HIVE-24145:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 22:01
Start Date: 12/Sep/20 22:01
Worklog Time Spent: 10m 
  Work Description: rbalamohan commented on a change in pull request #1485:
URL: https://github.com/apache/hive/pull/1485#discussion_r486786544



##
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java
##
@@ -216,29 +216,47 @@ public FSPaths(Path specPath, boolean isMmTable, boolean 
isDirectInsert, boolean
 }
 
 public void closeWriters(boolean abort) throws HiveException {
+  Exception exception = null;
   for (int idx = 0; idx < outWriters.length; idx++) {
 if (outWriters[idx] != null) {
   try {
 outWriters[idx].close(abort);
 updateProgress();
   } catch (IOException e) {
-throw new HiveException(e);
+exception = e;
+LOG.error("Error closing " + outWriters[idx].toString(), e);
+// continue closing others
   }
 }
   }
-  try {
+  for (int i = 0; i < updaters.length; i++) {
+if (updaters[i] != null) {
+  SerDeStats stats = updaters[i].getStats();
+  // Ignore 0 row files except in case of insert overwrite
+  if (isDirectInsert && (stats.getRowCount() > 0 || 
isInsertOverwrite)) {
+outPathsCommitted[i] = updaters[i].getUpdatedFilePath();
+  }
+  try {
+updaters[i].close(abort);
+  } catch (IOException e) {
+exception = e;
+LOG.error("Error closing " + updaters[i].toString(), e);
+// continue closing others
+  }
+}
+  }
+  // Made an attempt to close all writers.
+  if (exception != null) {
 for (int i = 0; i < updaters.length; i++) {
   if (updaters[i] != null) {
-SerDeStats stats = updaters[i].getStats();
-// Ignore 0 row files except in case of insert overwrite
-if (isDirectInsert && (stats.getRowCount() > 0 || 
isInsertOverwrite)) {
-  outPathsCommitted[i] = updaters[i].getUpdatedFilePath();
+try {
+  fs.delete(updaters[i].getUpdatedFilePath(), true);
+} catch (IOException e) {
+  e.printStackTrace();

Review comment:
   LOG?

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java
##
@@ -284,6 +285,11 @@ public Object process(Node nd, Stack stack, 
NodeProcessorCtx procCtx,
   // Create ReduceSink operator
   ReduceSinkOperator rsOp = getReduceSinkOp(partitionPositions, 
sortPositions, sortOrder, sortNullOrder,
   allRSCols, bucketColumns, numBuckets, fsParent, 
fsOp.getConf().getWriteType());
+  // we have to make sure not to reorder the child operators as it might 
cause weird behavior in the tasks at
+  // the same level. when there is auto stats gather at the same level as 
another operation then it might
+  // cause unnecessary preemption. Maintaining the order here to avoid 
such preemption and possible errors

Review comment:
   Plz add TEZ-3296 as ref if possible.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483513)
Time Spent: 1h 10m  (was: 1h)

> Fix preemption issues in reducers and file sink operators
> -
>
> Key: HIVE-24145
> URL: https://issues.apache.org/jira/browse/HIVE-24145
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> There are two issues because of preemption:
>  # Reducers are getting reordered as part of optimizations, which causes more 
> preemptions to happen
>  # Preemption in the middle of writing can cause the file to not be closed, 
> leading to errors when we read the file later



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24150) Refactor CommitTxnRequest field order

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24150?focusedWorklogId=483492&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483492
 ]

ASF GitHub Bot logged work on HIVE-24150:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 22:00
Start Date: 12/Sep/20 22:00
Worklog Time Spent: 10m 
  Work Description: deniskuzZ opened a new pull request #1489:
URL: https://github.com/apache/hive/pull/1489


   
   
   ### What changes were proposed in this pull request?
   
   Refactor CommitTxnRequest field order (keyValue and exclWriteEnabled).
   
   ### Why are the changes needed?
   
   HIVE-24125 introduced a backward-incompatible change.
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483492)
Time Spent: 0.5h  (was: 20m)

> Refactor CommitTxnRequest field order
> -
>
> Key: HIVE-24150
> URL: https://issues.apache.org/jira/browse/HIVE-24150
> Project: Hive
>  Issue Type: Bug
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Refactor CommitTxnRequest field order (keyValue and exclWriteEnabled). 
> HIVE-24125 introduced a backward-incompatible change.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24147) Table column names are not extracted correctly in Hive JDBC storage handler

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24147?focusedWorklogId=483459&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483459
 ]

ASF GitHub Bot logged work on HIVE-24147:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 21:57
Start Date: 12/Sep/20 21:57
Worklog Time Spent: 10m 
  Work Description: jcamachor opened a new pull request #1486:
URL: https://github.com/apache/hive/pull/1486







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483459)
Time Spent: 40m  (was: 0.5h)

> Table column names are not extracted correctly in Hive JDBC storage handler
> ---
>
> Key: HIVE-24147
> URL: https://issues.apache.org/jira/browse/HIVE-24147
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC storage handler
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> It seems the `ResultSetMetaData` for the query used to retrieve the table 
> column names contains fully qualified names, instead of properly supporting 
> the {{getTableName}} method. This ends up throwing the storage handler off 
> and leads to exceptions in both the CBO and non-CBO paths.
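> A sketch of the defensive handling this implies (hypothetical standalone JDBC
> code, not the actual patch): when the driver returns qualified names, the
> prefix has to be stripped manually:
> {code:java}
> String name = rsMetaData.getColumnName(i);  // e.g. "tbl.col" instead of "col"
> int dot = name.lastIndexOf('.');
> String columnName = (dot < 0) ? name : name.substring(dot + 1);
> {code}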



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24143) Include convention in JDBC converter operator in Calcite plan

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24143?focusedWorklogId=483441&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483441
 ]

ASF GitHub Bot logged work on HIVE-24143:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 21:56
Start Date: 12/Sep/20 21:56
Worklog Time Spent: 10m 
  Work Description: vineetgarg02 commented on pull request #1482:
URL: https://github.com/apache/hive/pull/1482#issuecomment-691326259







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483441)
Time Spent: 2h 50m  (was: 2h 40m)

> Include convention in JDBC converter operator in Calcite plan
> -
>
> Key: HIVE-24143
> URL: https://issues.apache.org/jira/browse/HIVE-24143
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Among others, it will be useful to debug the dialect being chosen for query 
> generation. For instance:
> {code}
>  HiveProject(jdbc_type_conversion_table1.ikey=[$0], 
> jdbc_type_conversion_table1.bkey=[$1], jdbc_type_conversion_table1.fkey=[$2], 
> jdbc_type_conversion_table1.dkey=[$3], 
> jdbc_type_conversion_table1.chkey=[$4], 
> jdbc_type_conversion_table1.dekey=[$5], 
> jdbc_type_conversion_table1.dtkey=[$6], jdbc_type_conversion_table1.tkey=[$7])
>   HiveProject(ikey=[$0], bkey=[$1], fkey=[$2], dkey=[$3], chkey=[$4], 
> dekey=[$5], dtkey=[$6], tkey=[$7])
> ->HiveJdbcConverter(convention=[JDBC.DERBY])
>   JdbcHiveTableScan(table=[[default, jdbc_type_conversion_table1]], 
> table:alias=[jdbc_type_conversion_table1])
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24143) Include convention in JDBC converter operator in Calcite plan

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24143?focusedWorklogId=483430&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483430
 ]

ASF GitHub Bot logged work on HIVE-24143:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 21:55
Start Date: 12/Sep/20 21:55
Worklog Time Spent: 10m 
  Work Description: jcamachor merged pull request #1482:
URL: https://github.com/apache/hive/pull/1482







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483430)
Time Spent: 2h 40m  (was: 2.5h)

> Include convention in JDBC converter operator in Calcite plan
> -
>
> Key: HIVE-24143
> URL: https://issues.apache.org/jira/browse/HIVE-24143
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Among others, it will be useful to debug the dialect being chosen for query 
> generation. For instance:
> {code}
>  HiveProject(jdbc_type_conversion_table1.ikey=[$0], 
> jdbc_type_conversion_table1.bkey=[$1], jdbc_type_conversion_table1.fkey=[$2], 
> jdbc_type_conversion_table1.dkey=[$3], 
> jdbc_type_conversion_table1.chkey=[$4], 
> jdbc_type_conversion_table1.dekey=[$5], 
> jdbc_type_conversion_table1.dtkey=[$6], jdbc_type_conversion_table1.tkey=[$7])
>   HiveProject(ikey=[$0], bkey=[$1], fkey=[$2], dkey=[$3], chkey=[$4], 
> dekey=[$5], dtkey=[$6], tkey=[$7])
> ->HiveJdbcConverter(convention=[JDBC.DERBY])
>   JdbcHiveTableScan(table=[[default, jdbc_type_conversion_table1]], 
> table:alias=[jdbc_type_conversion_table1])
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23841) Field writers is a HashSet, i.e., not thread-safe. Field writers is typically protected by synchronization on lock, but not in 1 location.

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23841?focusedWorklogId=483426&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483426
 ]

ASF GitHub Bot logged work on HIVE-23841:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 21:54
Start Date: 12/Sep/20 21:54
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1248:
URL: https://github.com/apache/hive/pull/1248#issuecomment-691367907







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483426)
Time Spent: 1h  (was: 50m)

> Field writers is a HashSet, i.e., not thread-safe.  Field writers is 
> typically protected by synchronization on lock, but not in 1 location.
> 
>
> Key: HIVE-23841
> URL: https://issues.apache.org/jira/browse/HIVE-23841
> Project: Hive
>  Issue Type: Bug
> Environment: Any environment
>Reporter: Adrian Nistor
>Priority: Major
>  Labels: patch-available, pull-request-available
> Attachments: HIVE-23841.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> I also submitted a pull request on github at:
>  
> [https://github.com/apache/hive/pull/1248]
>  
> (same patch)
> h1. Description
>  
> Field {{writers}} is a {{HashSet}} ([line 
> 70|https://github.com/apache/hive/blob/c93d7797329103d6c509bada68b6da7f907b3dee/ql/src/java/org/apache/hadoop/hive/llap/LlapOutputFormatService.java#L70]),
>  i.e., not thread-safe.
> Accesses to field {{writers}} are protected by synchronization on {{lock}}, 
> e.g., at lines: 
> [141-144|https://github.com/apache/hive/blob/c93d7797329103d6c509bada68b6da7f907b3dee/ql/src/java/org/apache/hadoop/hive/llap/LlapOutputFormatService.java#L141-L144],
>  
> [212-213|https://github.com/apache/hive/blob/c93d7797329103d6c509bada68b6da7f907b3dee/ql/src/java/org/apache/hadoop/hive/llap/LlapOutputFormatService.java#L212-L213],
>  and 
> [212-215|https://github.com/apache/hive/blob/c93d7797329103d6c509bada68b6da7f907b3dee/ql/src/java/org/apache/hadoop/hive/llap/LlapOutputFormatService.java#L212-L215].
> However, the {{writers.remove()}} at [line 
> 249|https://github.com/apache/hive/blob/c93d7797329103d6c509bada68b6da7f907b3dee/ql/src/java/org/apache/hadoop/hive/llap/LlapOutputFormatService.java#L249]
>  is protected by synchronization on {{INSTANCE}}, *not* on {{lock}}.
> Synchronizing on 2 different objects does not ensure mutual exclusion. This 
> is because 2 threads synchronizing on different objects can still execute in 
> parallel at the same time.
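> A minimal standalone sketch of the hazard (illustrative, not the Hive code
> itself): the two methods below can run concurrently because they lock
> different monitors, so the HashSet can be mutated from two threads at once:
> {code:java}
> import java.util.HashSet;
> import java.util.Set;
> 
> public class TwoLocksDemo {
>   private static final Object INSTANCE = new Object();
>   private final Object lock = new Object();
>   private final Set<String> writers = new HashSet<>();  // not thread-safe
> 
>   void register(String w) {
>     synchronized (lock) {       // guarded by lock
>       writers.add(w);
>     }
>   }
> 
>   void unregister(String w) {
>     synchronized (INSTANCE) {   // different monitor: does NOT exclude register()
>       writers.remove(w);
>     }
>   }
> }
> {code}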
> Note that lines 
> [215|https://github.com/apache/hive/blob/c93d7797329103d6c509bada68b6da7f907b3dee/ql/src/java/org/apache/hadoop/hive/llap/LlapOutputFormatService.java#L215]
>  and 
> [249|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/llap/LlapOutputFormatService.java#L249]
>  are modifying {{writers}} with {{put()}} and {{remove()}}, respectively.
> h1. The Code for This Fix
> This fix is very simple: just change {{synchronized (INSTANCE)}} to 
> {{synchronized (lock)}}, just like the methods containing the other lines 
> listed above.[]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24151) MultiDelimitSerDe shifts data if strings contain non-ASCII characters

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24151?focusedWorklogId=483384&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483384
 ]

ASF GitHub Bot logged work on HIVE-24151:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 21:51
Start Date: 12/Sep/20 21:51
Worklog Time Spent: 10m 
  Work Description: szlta commented on pull request #1490:
URL: https://github.com/apache/hive/pull/1490#issuecomment-691122063







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483384)
Time Spent: 1h 40m  (was: 1.5h)

> MultiDelimitSerDe shifts data if strings contain non-ASCII characters
> -
>
> Key: HIVE-24151
> URL: https://issues.apache.org/jira/browse/HIVE-24151
> Project: Hive
>  Issue Type: Bug
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> HIVE-22360 intended to fix another MultiDelimitSerde problem (with NULL last 
> columns) but introduced a regression: the approach of the fix is pretty much 
> all wrong, as the existing logic that operated on bytes got replaced by regex 
> matcher logic which deals in character positions, rather than byte positions. 
> As some non-ASCII characters consist of more than one byte, the whole record 
> may get shifted as a result.
> With this ticket I'm going to restore the old logic and apply the proper fix 
> on top of it, while keeping (and extending) the test cases added with 
> HIVE-22360 so that we have a solution for both issues.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22290) ObjectStore.cleanWriteNotificationEvents and ObjectStore.cleanupEvents OutOfMemory on large number of pending events

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22290?focusedWorklogId=483327&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483327
 ]

ASF GitHub Bot logged work on HIVE-22290:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 21:47
Start Date: 12/Sep/20 21:47
Worklog Time Spent: 10m 
  Work Description: nareshpr opened a new pull request #1484:
URL: https://github.com/apache/hive/pull/1484







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483327)
Time Spent: 0.5h  (was: 20m)

> ObjectStore.cleanWriteNotificationEvents and ObjectStore.cleanupEvents 
> OutOfMemory on large number of pending events
> 
>
> Key: HIVE-22290
> URL: https://issues.apache.org/jira/browse/HIVE-22290
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, repl
>Affects Versions: 4.0.0
>Reporter: Thomas Prelle
>Assignee: Naresh P R
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> As in [https://jira.apache.org/jira/browse/HIVE-19430], if there is a large 
> number of events that haven't been cleaned up for some reason, then 
> ObjectStore.cleanWriteNotificationEvents() and ObjectStore.cleanupEvents() can 
> run out of memory while they load all the events to be deleted.
> They should fetch events in batches, as sketched below.
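> A sketch of the batched approach (the helper names here are hypothetical, not
> the actual ObjectStore methods):
> {code:java}
> // Delete expired events in fixed-size batches instead of materialising
> // the entire result set in memory at once.
> void cleanupEventsInBatches(long cutoffTime, int batchSize) {
>   List<NotificationEvent> batch;
>   do {
>     batch = fetchExpiredEvents(cutoffTime, batchSize);  // hypothetical ranged query
>     deleteEvents(batch);                                // delete and commit per batch
>   } while (batch.size() == batchSize);
> }
> {code}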



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24022) Optimise HiveMetaStoreAuthorizer.createHiveMetaStoreAuthorizer

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24022?focusedWorklogId=483324&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483324
 ]

ASF GitHub Bot logged work on HIVE-24022:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 21:46
Start Date: 12/Sep/20 21:46
Worklog Time Spent: 10m 
  Work Description: nrg4878 commented on pull request #1385:
URL: https://github.com/apache/hive/pull/1385#issuecomment-691213984







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483324)
Time Spent: 1h  (was: 50m)

> Optimise HiveMetaStoreAuthorizer.createHiveMetaStoreAuthorizer
> --
>
> Key: HIVE-24022
> URL: https://issues.apache.org/jira/browse/HIVE-24022
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Sam An
>Priority: Minor
>  Labels: performance, pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> For a table with 3000+ partitions, analyze table takes a lot longer because 
> HiveMetaStoreAuthorizer creates a new HiveConf for every partition request.
>  
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/metastore/HiveMetaStoreAuthorizer.java#L319]
>  
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/metastore/HiveMetaStoreAuthorizer.java#L447]
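> A sketch of the obvious optimisation (field and method names are hypothetical):
> build the expensive HiveConf once and reuse it across requests:
> {code:java}
> private volatile HiveConf cachedConf;
> 
> private HiveConf getOrCreateConf(Configuration base) {
>   HiveConf local = cachedConf;
>   if (local == null) {
>     synchronized (this) {
>       if (cachedConf == null) {
>         // construct once instead of on every partition request
>         cachedConf = new HiveConf(base, HiveMetaStoreAuthorizer.class);
>       }
>       local = cachedConf;
>     }
>   }
>   return local;
> }
> {code}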



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24084) Push Aggregates thru joins in case it re-groups previously unique columns

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24084?focusedWorklogId=483311&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483311
 ]

ASF GitHub Bot logged work on HIVE-24084:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 21:45
Start Date: 12/Sep/20 21:45
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on a change in pull request #1439:
URL: https://github.com/apache/hive/pull/1439#discussion_r487130820



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveAggregateJoinTransposeRule.java
##
@@ -303,6 +305,90 @@ public void onMatch(RelOptRuleCall call) {
 }
   }
 
+  /**
+   * Determines whether the given grouping is unique.
+   *
+   * Consider a join which might produce non-unique rows; but later the 
results are aggregated again.
+   * This method determines if there are sufficient columns in the grouping 
which have been present previously as unique column(s).
+   */
+  private boolean isGroupingUnique(RelNode input, ImmutableBitSet groups) {
+if (groups.isEmpty()) {
+  return false;
+}
+RelMetadataQuery mq = input.getCluster().getMetadataQuery();
+Set uKeys = mq.getUniqueKeys(input);
+for (ImmutableBitSet u : uKeys) {
+  if (groups.contains(u)) {
+return true;
+  }
+}
+if (input instanceof Join) {
+  Join join = (Join) input;
+  RexBuilder rexBuilder = input.getCluster().getRexBuilder();
+  SimpleConditionInfo cond = new SimpleConditionInfo(join.getCondition(), 
rexBuilder);
+
+  if (cond.valid) {
+ImmutableBitSet newGroup = 
groups.intersect(ImmutableBitSet.fromBitSet(cond.fields));
+RelNode l = join.getLeft();
+RelNode r = join.getRight();
+
+int joinFieldCount = join.getRowType().getFieldCount();
+int lFieldCount = l.getRowType().getFieldCount();
+
+ImmutableBitSet groupL = newGroup.get(0, lFieldCount);
+ImmutableBitSet groupR = newGroup.get(lFieldCount, 
joinFieldCount).shift(-lFieldCount);
+
+if (isGroupingUnique(l, groupL)) {

Review comment:
   OK. If it turns out there are many changes and it may need some time to 
be fixed, feel free to defer to a follow-up JIRA and let's merge this one.
   

##
File path: ql/src/test/queries/clientpositive/tpch18.q
##
@@ -0,0 +1,133 @@
+--! qt:dataset:tpch_0_001.customer
+--! qt:dataset:tpch_0_001.lineitem
+--! qt:dataset:tpch_0_001.nation
+--! qt:dataset:tpch_0_001.orders
+--! qt:dataset:tpch_0_001.part
+--! qt:dataset:tpch_0_001.partsupp
+--! qt:dataset:tpch_0_001.region
+--! qt:dataset:tpch_0_001.supplier
+
+
+use tpch_0_001;
+
+set hive.transpose.aggr.join=true;
+set hive.transpose.aggr.join.unique=true;
+set hive.mapred.mode=nonstrict;
+
+create view q18_tmp_cached as
+select
+   l_orderkey,
+   sum(l_quantity) as t_sum_quantity
+from
+   lineitem
+where
+   l_orderkey is not null
+group by
+   l_orderkey;
+
+
+
+explain cbo select
+c_name,
+c_custkey,
+o_orderkey,
+o_orderdate,
+o_totalprice,
+sum(l_quantity)
+from
+   customer,
+   orders,
+   q18_tmp_cached t,
+   lineitem l
+where
+c_custkey = o_custkey
+and o_orderkey = t.l_orderkey
+and o_orderkey is not null
+and t.t_sum_quantity > 300
+and o_orderkey = l.l_orderkey
+and l.l_orderkey is not null
+group by
+c_name,
+c_custkey,
+o_orderkey,
+o_orderdate,
+o_totalprice
+order by
+o_totalprice desc,
+o_orderdate
+limit 100;
+
+
+
+select 'add constraints';
+
+alter table orders add constraint pk_o primary key (o_orderkey) disable 
novalidate rely;
+alter table customer add constraint pk_c primary key (c_custkey) disable 
novalidate rely;
+

Review comment:
   Thanks @kgyrtkirk .
   
   This seems to need further exploration; we thought 
https://issues.apache.org/jira/browse/HIVE-24087 was going to help here. 
@vineetgarg02, could you take a look at this once this patch is merged? Maybe 
the shape of the plan is slightly different from the one we anticipated.

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveAggregateJoinTransposeRule.java
##
@@ -303,6 +305,90 @@ public void onMatch(RelOptRuleCall call) {
 }
   }
 
+  /**
+   * Determines whether the given grouping is unique.
+   *
+   * Consider a join which might produce non-unique rows; but later the 
results are aggregated again.
+   * This method determines if there are sufficient columns in the grouping 
which have been present previously as unique column(s).
+   */
+  private boolean isGroupingUnique(RelNode input, ImmutableBitSet groups) {
+if (groups.isEmpty()) {
+  return false;
+}
+RelMetadataQuery mq = input.getCluster().getMetadataQuery();
+Set uKeys = mq.getUniqueKeys(input);
+for (ImmutableBitSet u : uKeys) {
+  if (groups.contains(u)) {
+return true;
+  }
+}
+if (input 

[jira] [Work logged] (HIVE-24097) correct NPE exception in HiveMetastoreAuthorizer

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24097?focusedWorklogId=483282&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483282
 ]

ASF GitHub Bot logged work on HIVE-24097:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 21:43
Start Date: 12/Sep/20 21:43
Worklog Time Spent: 10m 
  Work Description: nrg4878 commented on pull request #1448:
URL: https://github.com/apache/hive/pull/1448#issuecomment-691254659







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483282)
Time Spent: 40m  (was: 0.5h)

> correct NPE exception in HiveMetastoreAuthorizer
> 
>
> Key: HIVE-24097
> URL: https://issues.apache.org/jira/browse/HIVE-24097
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: Sam An
>Assignee: Sam An
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> In some testing, we found it's possible to hit an NPE if the preEventType does 
> not fall within the several event types the HMS currently checks. This leaves 
> the AuthzContext as a null pointer. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24144) getIdentifierQuoteString in HiveDatabaseMetaData returns incorrect value

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24144?focusedWorklogId=483280&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483280
 ]

ASF GitHub Bot logged work on HIVE-24144:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 21:40
Start Date: 12/Sep/20 21:40
Worklog Time Spent: 10m 
  Work Description: jcamachor opened a new pull request #1487:
URL: https://github.com/apache/hive/pull/1487







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483280)
Time Spent: 20m  (was: 10m)

> getIdentifierQuoteString in HiveDatabaseMetaData returns incorrect value
> 
>
> Key: HIVE-24144
> URL: https://issues.apache.org/jira/browse/HIVE-24144
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC, JDBC storage handler
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {code}
>   public String getIdentifierQuoteString() throws SQLException {
> return " ";
>   }
> {code}
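
For context, JDBC treats a single space as "identifier quoting not supported", while Hive actually quotes identifiers with backticks. A sketch of the expected behavior (assuming backtick quoting; not necessarily the committed change):

{code}
@Override
public String getIdentifierQuoteString() throws SQLException {
  // Hive quotes identifiers with backticks; returning " " would signal
  // to JDBC clients that quoting is unsupported.
  return "`";
}
{code}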



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24127) Dump events from default catalog only

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24127?focusedWorklogId=483241=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483241
 ]

ASF GitHub Bot logged work on HIVE-24127:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 21:37
Start Date: 12/Sep/20 21:37
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #1478:
URL: https://github.com/apache/hive/pull/1478#discussion_r486826515



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/messaging/event/filters/CatalogFilter.java
##
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.metastore.messaging.event.filters;
+
+import org.apache.hadoop.hive.metastore.api.NotificationEvent;
+
+/**
+ * Notification event filter that matches a given 
catalog name.
+ */
+public class CatalogFilter extends BasicFilter {
+  private final String catalogName;
+
+  public CatalogFilter(final String catalogName) {
+this.catalogName = catalogName;
+  }
+
+  @Override
+  boolean shouldAccept(final NotificationEvent event) {
+if (catalogName == null || event.getCatName() == null || 
catalogName.equalsIgnoreCase(event.getCatName())) {

Review comment:
   catalogName should never be null. Even if it is not configured, it must 
default to "hive". I think we should let it fail if the filter is initialized 
with a null catalogName.
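
A minimal sketch of the fail-fast variant suggested above (hypothetical; the actual patch may differ):

{code}
import java.util.Objects;

import org.apache.hadoop.hive.metastore.api.NotificationEvent;

public class CatalogFilter extends BasicFilter {
  private final String catalogName;

  public CatalogFilter(final String catalogName) {
    // Fail fast instead of silently accepting every event when no catalog is set.
    this.catalogName = Objects.requireNonNull(catalogName, "catalogName must not be null");
  }

  @Override
  boolean shouldAccept(final NotificationEvent event) {
    // Events without a catalog name are still accepted, as in the original patch.
    return event.getCatName() == null || catalogName.equalsIgnoreCase(event.getCatName());
  }
}
{code}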


[jira] [Work logged] (HIVE-24143) Include convention in JDBC converter operator in Calcite plan

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24143?focusedWorklogId=483227=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483227
 ]

ASF GitHub Bot logged work on HIVE-24143:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 21:35
Start Date: 12/Sep/20 21:35
Worklog Time Spent: 10m 
  Work Description: vineetgarg02 commented on a change in pull request 
#1482:
URL: https://github.com/apache/hive/pull/1482#discussion_r487292490



##
File path: 
ql/src/test/results/clientpositive/llap/external_jdbc_table_perf.q.out
##
@@ -6115,32 +6112,39 @@ WHERE "d_year" = 1999 AND "d_moy" BETWEEN 1 AND 3 AND 
"d_date_sk" IS NOT NULL) A
 Reduce Operator Tree:
   Merge Join Operator
 condition map:
- Anti Join 0 to 1
+ Left Outer Join 0 to 1

Review comment:
   Also not sure what caused this change in plan.

##
File path: ql/src/test/queries/clientpositive/external_jdbc_table2.q
##
@@ -7,43 +6,43 @@ CREATE TEMPORARY FUNCTION dboutput AS 
'org.apache.hadoop.hive.contrib.genericudf
 
 FROM src
 SELECT
-dboutput 
('jdbc:derby:;databaseName=${system:test.tmp.dir}/test_derby_auth1;create=true','user1','passwd1',
+dboutput 
('jdbc:derby:;databaseName=${system:test.tmp.dir}/test_derby_auth1_1;create=true','user1','passwd1',

Review comment:
   Curious as to why this changed?

##
File path: 
ql/src/test/results/clientpositive/llap/external_jdbc_table_perf.q.out
##
@@ -6115,32 +6112,39 @@ WHERE "d_year" = 1999 AND "d_moy" BETWEEN 1 AND 3 AND 
"d_date_sk" IS NOT NULL) A
 Reduce Operator Tree:
   Merge Join Operator
 condition map:
- Anti Join 0 to 1
+ Left Outer Join 0 to 1

Review comment:
   Got it, make sense


[jira] [Work logged] (HIVE-24035) Add Jenkinsfile for branch-2.3

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24035?focusedWorklogId=483211=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483211
 ]

ASF GitHub Bot logged work on HIVE-24035:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 21:32
Start Date: 12/Sep/20 21:32
Worklog Time Spent: 10m 
  Work Description: sunchao commented on pull request #1398:
URL: https://github.com/apache/hive/pull/1398#issuecomment-691243264







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483211)
Time Spent: 2h 50m  (was: 2h 40m)

> Add Jenkinsfile for branch-2.3
> --
>
> Key: HIVE-24035
> URL: https://issues.apache.org/jira/browse/HIVE-24035
> Project: Hive
>  Issue Type: Test
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> To enable precommit tests for GitHub PRs, we need to have a Jenkinsfile in the 
> repo. This is already done for master and branch-2. This adds the same for 
> branch-2.3.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24149) HiveStreamingConnection doesn't close HMS connection

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24149?focusedWorklogId=483197=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483197
 ]

ASF GitHub Bot logged work on HIVE-24149:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 21:31
Start Date: 12/Sep/20 21:31
Worklog Time Spent: 10m 
  Work Description: zeroflag opened a new pull request #1488:
URL: https://github.com/apache/hive/pull/1488







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483197)
Time Spent: 0.5h  (was: 20m)

> HiveStreamingConnection doesn't close HMS connection
> 
>
> Key: HIVE-24149
> URL: https://issues.apache.org/jira/browse/HIVE-24149
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> There are 3 HMS connections used by HiveStreamingConnection: one for TX, one 
> for heartbeat, and one for notifications. The close method only closes the 
> first 2, leaving the last one open, which eventually overloads HMS until it 
> becomes unresponsive.
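
A sketch of the shape of the fix (field and helper names are assumptions, not the actual patch):

{code}
// Hypothetical close() that shuts down all three metastore clients.
@Override
public void close() {
  closeQuietly(txnMsClient);           // transaction client
  closeQuietly(heartbeatMsClient);     // heartbeat client
  closeQuietly(notificationMsClient);  // the previously leaked notification client
}

private void closeQuietly(IMetaStoreClient client) {
  if (client != null) {
    client.close();
  }
}
{code}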



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24151) MultiDelimitSerDe shifts data if strings contain non-ASCII characters

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24151?focusedWorklogId=483199=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483199
 ]

ASF GitHub Bot logged work on HIVE-24151:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 21:31
Start Date: 12/Sep/20 21:31
Worklog Time Spent: 10m 
  Work Description: szlta edited a comment on pull request #1490:
URL: https://github.com/apache/hive/pull/1490#issuecomment-691122063







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483199)
Time Spent: 1.5h  (was: 1h 20m)

> MultiDelimitSerDe shifts data if strings contain non-ASCII characters
> -
>
> Key: HIVE-24151
> URL: https://issues.apache.org/jira/browse/HIVE-24151
> Project: Hive
>  Issue Type: Bug
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> HIVE-22360 intended to fix another MultiDelimitSerde problem (with NULL last 
> columns) but introduced a regression: the approach of the fix is pretty much 
> all wrong, as the existing logic that operated on bytes got replaced by regex 
> matcher logic which deals in character positions, rather than byte positions. 
> As some non-ASCII characters consist of more than 1 byte, the whole record 
> may get shifted due to this.
> With this ticket I'm going to restore the old logic, and apply the proper fix 
> on that, but keeping (and extending) the test cases added with HIVE-22360 so 
> that we have a solution for both issues.
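
The byte-versus-character discrepancy is easy to reproduce outside Hive; a minimal, self-contained illustration:

{code}
import java.nio.charset.StandardCharsets;

public class ByteVsCharDemo {
  public static void main(String[] args) {
    String s = "é|x"; // 'é' is 1 char but 2 bytes in UTF-8
    byte[] bytes = s.getBytes(StandardCharsets.UTF_8);

    int charPos = s.indexOf('|'); // 1: what a regex matcher reports
    int bytePos = -1;             // 2: where the delimiter really sits in the bytes
    for (int i = 0; i < bytes.length; i++) {
      if (bytes[i] == (byte) '|') {
        bytePos = i;
        break;
      }
    }
    // Using charPos to slice the byte array shifts every following field by one.
    System.out.println("char position: " + charPos + ", byte position: " + bytePos);
  }
}
{code}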



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24145) Fix preemption issues in reducers and file sink operators

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24145?focusedWorklogId=483190=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483190
 ]

ASF GitHub Bot logged work on HIVE-24145:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 21:31
Start Date: 12/Sep/20 21:31
Worklog Time Spent: 10m 
  Work Description: ramesh0201 opened a new pull request #1485:
URL: https://github.com/apache/hive/pull/1485







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483190)
Time Spent: 1h  (was: 50m)

> Fix preemption issues in reducers and file sink operators
> -
>
> Key: HIVE-24145
> URL: https://issues.apache.org/jira/browse/HIVE-24145
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> There are two issues because of preemption:
>  # Reducers are getting reordered as part of optimizations, because of which 
> more preemptions happen
>  # Preemption in the middle of writing can cause the file not to be closed and 
> lead to errors when we read the file later



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24143) Include convention in JDBC converter operator in Calcite plan

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24143?focusedWorklogId=483151=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483151
 ]

ASF GitHub Bot logged work on HIVE-24143:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 21:21
Start Date: 12/Sep/20 21:21
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on a change in pull request #1482:
URL: https://github.com/apache/hive/pull/1482#discussion_r487299089



##
File path: ql/src/test/queries/clientpositive/external_jdbc_table2.q
##
@@ -7,43 +6,43 @@ CREATE TEMPORARY FUNCTION dboutput AS 
'org.apache.hadoop.hive.contrib.genericudf
 
 FROM src
 SELECT
-dboutput 
('jdbc:derby:;databaseName=${system:test.tmp.dir}/test_derby_auth1;create=true','user1','passwd1',
+dboutput 
('jdbc:derby:;databaseName=${system:test.tmp.dir}/test_derby_auth1_1;create=true','user1','passwd1',

Review comment:
   I changed it to avoid a clash with another test in the temp directory, 
which I believe was causing HIVE-23910.

##
File path: 
ql/src/test/results/clientpositive/llap/external_jdbc_table_perf.q.out
##
@@ -6115,32 +6112,39 @@ WHERE "d_year" = 1999 AND "d_moy" BETWEEN 1 AND 3 AND 
"d_date_sk" IS NOT NULL) A
 Reduce Operator Tree:
   Merge Join Operator
 condition map:
- Anti Join 0 to 1
+ Left Outer Join 0 to 1

Review comment:
   This change is not actually related to this patch. Note that the test 
was disabled by default (HIVE-23910); it seems maybe a preliminary version of 
HIVE-23716 changed these q files and it should have not.


[jira] [Work logged] (HIVE-24084) Push Aggregates thru joins in case it re-groups previously unique columns

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24084?focusedWorklogId=483143=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483143
 ]

ASF GitHub Bot logged work on HIVE-24084:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 21:20
Start Date: 12/Sep/20 21:20
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on a change in pull request #1439:
URL: https://github.com/apache/hive/pull/1439#discussion_r487096103



##
File path: ql/src/test/queries/clientpositive/tpch18.q
##
@@ -0,0 +1,133 @@
+--! qt:dataset:tpch_0_001.customer
+--! qt:dataset:tpch_0_001.lineitem
+--! qt:dataset:tpch_0_001.nation
+--! qt:dataset:tpch_0_001.orders
+--! qt:dataset:tpch_0_001.part
+--! qt:dataset:tpch_0_001.partsupp
+--! qt:dataset:tpch_0_001.region
+--! qt:dataset:tpch_0_001.supplier
+
+
+use tpch_0_001;
+
+set hive.transpose.aggr.join=true;
+set hive.transpose.aggr.join.unique=true;
+set hive.mapred.mode=nonstrict;
+
+create view q18_tmp_cached as
+select
+   l_orderkey,
+   sum(l_quantity) as t_sum_quantity
+from
+   lineitem
+where
+   l_orderkey is not null
+group by
+   l_orderkey;
+
+
+
+explain cbo select
+c_name,
+c_custkey,
+o_orderkey,
+o_orderdate,
+o_totalprice,
+sum(l_quantity)
+from
+   customer,
+   orders,
+   q18_tmp_cached t,
+   lineitem l
+where
+c_custkey = o_custkey
+and o_orderkey = t.l_orderkey
+and o_orderkey is not null
+and t.t_sum_quantity > 300
+and o_orderkey = l.l_orderkey
+and l.l_orderkey is not null
+group by
+c_name,
+c_custkey,
+o_orderkey,
+o_orderdate,
+o_totalprice
+order by
+o_totalprice desc,
+o_orderdate
+limit 100;
+
+
+
+select 'add constraints';
+
+alter table orders add constraint pk_o primary key (o_orderkey) disable 
novalidate rely;
+alter table customer add constraint pk_c primary key (c_custkey) disable 
novalidate rely;
+

Review comment:
   I've added both constraints - it only removed the IS NOT NULL filter.
   It seems to me that one of the sum()s is used as an output and the other is 
being used to filter by >300 - so both of them are being "used".

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveAggregateJoinTransposeRule.java
##
@@ -303,6 +305,90 @@ public void onMatch(RelOptRuleCall call) {
 }
   }
 
+  /**
+   * Determines whether the given grouping is unique.
+   *
+   * Consider a join which might produce non-unique rows; but later the 
results are aggregated again.
+   * This method determines if there are sufficient columns in the grouping 
which have been present previously as unique column(s).
+   */
+  private boolean isGroupingUnique(RelNode input, ImmutableBitSet groups) {
+if (groups.isEmpty()) {
+  return false;
+}
+RelMetadataQuery mq = input.getCluster().getMetadataQuery();
+Set<ImmutableBitSet> uKeys = mq.getUniqueKeys(input);
+for (ImmutableBitSet u : uKeys) {
+  if (groups.contains(u)) {
+return true;
+  }
+}
+if (input instanceof Join) {
+  Join join = (Join) input;
+  RexBuilder rexBuilder = input.getCluster().getRexBuilder();
+  SimpleConditionInfo cond = new SimpleConditionInfo(join.getCondition(), 
rexBuilder);
+
+  if (cond.valid) {
+ImmutableBitSet newGroup = 
groups.intersect(ImmutableBitSet.fromBitSet(cond.fields));
+RelNode l = join.getLeft();
+RelNode r = join.getRight();
+
+int joinFieldCount = join.getRowType().getFieldCount();
+int lFieldCount = l.getRowType().getFieldCount();
+
+ImmutableBitSet groupL = newGroup.get(0, lFieldCount);
+ImmutableBitSet groupR = newGroup.get(lFieldCount, 
joinFieldCount).shift(-lFieldCount);
+
+if (isGroupingUnique(l, groupL)) {

Review comment:
   That could be done; and I'm sure it was true in this case - but this logic 
will work better if it can walk down through as many joins as possible - we might 
have an aggregate on top and a bunch of joins under it... so I feel 
that it will be beneficial to retain it.
   I felt tempted to write a RelMd handler - however I don't think I could 
just introduce a new one easily.
   RelShuttle doesn't look like a good match - I'll leave it as a set of 
`instanceof` calls for now.
   
   I'll upload a new patch to see if digging deeper in the tree can do more 
or not.
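
For intuition, the split of the grouping bits between the two join inputs in the snippet above works like this (standalone example, assuming calcite-core on the classpath):

{code}
import org.apache.calcite.util.ImmutableBitSet;

public class GroupSplitDemo {
  public static void main(String[] args) {
    // Join output with 5 fields: 3 from the left input, 2 from the right.
    int lFieldCount = 3;
    int joinFieldCount = 5;
    ImmutableBitSet group = ImmutableBitSet.of(1, 3, 4);

    // Bits below lFieldCount refer to left-input columns.
    ImmutableBitSet groupL = group.get(0, lFieldCount);
    // Bits above refer to right-input columns, re-based to start at 0.
    ImmutableBitSet groupR = group.get(lFieldCount, joinFieldCount).shift(-lFieldCount);

    System.out.println("left: " + groupL + ", right: " + groupR); // left: {1}, right: {0, 1}
  }
}
{code}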


[jira] [Work logged] (HIVE-24151) MultiDelimitSerDe shifts data if strings contain non-ASCII characters

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24151?focusedWorklogId=483125=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483125
 ]

ASF GitHub Bot logged work on HIVE-24151:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 21:15
Start Date: 12/Sep/20 21:15
Worklog Time Spent: 10m 
  Work Description: szlta opened a new pull request #1490:
URL: https://github.com/apache/hive/pull/1490







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483125)
Time Spent: 1h 20m  (was: 1h 10m)

> MultiDelimitSerDe shifts data if strings contain non-ASCII characters
> -
>
> Key: HIVE-24151
> URL: https://issues.apache.org/jira/browse/HIVE-24151
> Project: Hive
>  Issue Type: Bug
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> HIVE-22360 intended to fix another MultiDelimitSerde problem (with NULL last 
> columns) but introduced a regression: the approach of the fix is pretty much 
> all wrong, as the existing logic that operated on bytes got replaced by regex 
> matcher logic which deals in character positions, rather than byte positions. 
> As some non-ASCII characters consist of more than 1 byte, the whole record 
> may get shifted due to this.
> With this ticket I'm going to restore the old logic, and apply the proper fix 
> on that, but keeping (and extending) the test cases added with HIVE-22360 so 
> that we have a solution for both issues.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24147) Table column names are not extracted correctly in Hive JDBC storage handler

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24147?focusedWorklogId=483096=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483096
 ]

ASF GitHub Bot logged work on HIVE-24147:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:56
Start Date: 12/Sep/20 20:56
Worklog Time Spent: 10m 
  Work Description: jcamachor opened a new pull request #1486:
URL: https://github.com/apache/hive/pull/1486







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483096)
Time Spent: 0.5h  (was: 20m)

> Table column names are not extracted correctly in Hive JDBC storage handler
> ---
>
> Key: HIVE-24147
> URL: https://issues.apache.org/jira/browse/HIVE-24147
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC storage handler
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> It seems the `ResultSetMetaData` for the query used to retrieve the table 
> column names contains fully qualified names, instead of possibly supporting 
> the {{getTableName}} method. This ends up throwing the storage handler off 
> and leading to exceptions, in both the CBO and non-CBO paths.
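
One plausible workaround is to fall back to stripping the qualifier from the column label (illustrative only; not necessarily how the patch resolves it):

{code}
// Hypothetical: metadata returns "table1.ikey" instead of supporting
// getTableName(), so recover the bare column name from the label.
String label = resultSetMetaData.getColumnLabel(i);
int dot = label.lastIndexOf('.');
String columnName = (dot >= 0) ? label.substring(dot + 1) : label;
{code}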



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24143) Include convention in JDBC converter operator in Calcite plan

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24143?focusedWorklogId=483081=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483081
 ]

ASF GitHub Bot logged work on HIVE-24143:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:53
Start Date: 12/Sep/20 20:53
Worklog Time Spent: 10m 
  Work Description: vineetgarg02 commented on pull request #1482:
URL: https://github.com/apache/hive/pull/1482#issuecomment-691326259







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483081)
Time Spent: 2h 10m  (was: 2h)

> Include convention in JDBC converter operator in Calcite plan
> -
>
> Key: HIVE-24143
> URL: https://issues.apache.org/jira/browse/HIVE-24143
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Among other things, it will be useful to debug the dialect being chosen for query 
> generation. For instance:
> {code}
>  HiveProject(jdbc_type_conversion_table1.ikey=[$0], 
> jdbc_type_conversion_table1.bkey=[$1], jdbc_type_conversion_table1.fkey=[$2], 
> jdbc_type_conversion_table1.dkey=[$3], 
> jdbc_type_conversion_table1.chkey=[$4], 
> jdbc_type_conversion_table1.dekey=[$5], 
> jdbc_type_conversion_table1.dtkey=[$6], jdbc_type_conversion_table1.tkey=[$7])
>   HiveProject(ikey=[$0], bkey=[$1], fkey=[$2], dkey=[$3], chkey=[$4], 
> dekey=[$5], dtkey=[$6], tkey=[$7])
> ->HiveJdbcConverter(convention=[JDBC.DERBY])
>   JdbcHiveTableScan(table=[[default, jdbc_type_conversion_table1]], 
> table:alias=[jdbc_type_conversion_table1])
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24146) Cleanup TaskExecutionException in GenericUDTFExplode

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24146?focusedWorklogId=483075=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483075
 ]

ASF GitHub Bot logged work on HIVE-24146:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:52
Start Date: 12/Sep/20 20:52
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 opened a new pull request #1483:
URL: https://github.com/apache/hive/pull/1483







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483075)
Time Spent: 0.5h  (was: 20m)

> Cleanup TaskExecutionException in GenericUDTFExplode
> 
>
> Key: HIVE-24146
> URL: https://issues.apache.org/jira/browse/HIVE-24146
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> - Remove TaskExecutionException, which may be not used anymore;
> - Remove the default handling in GenericUDTFExplode#process, which has been 
> verified during the function initializing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24143) Include convention in JDBC converter operator in Calcite plan

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24143?focusedWorklogId=483067=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483067
 ]

ASF GitHub Bot logged work on HIVE-24143:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:52
Start Date: 12/Sep/20 20:52
Worklog Time Spent: 10m 
  Work Description: jcamachor merged pull request #1482:
URL: https://github.com/apache/hive/pull/1482







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483067)
Time Spent: 2h  (was: 1h 50m)

> Include convention in JDBC converter operator in Calcite plan
> -
>
> Key: HIVE-24143
> URL: https://issues.apache.org/jira/browse/HIVE-24143
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Among other things, it will be useful to debug the dialect being chosen for query 
> generation. For instance:
> {code}
>  HiveProject(jdbc_type_conversion_table1.ikey=[$0], 
> jdbc_type_conversion_table1.bkey=[$1], jdbc_type_conversion_table1.fkey=[$2], 
> jdbc_type_conversion_table1.dkey=[$3], 
> jdbc_type_conversion_table1.chkey=[$4], 
> jdbc_type_conversion_table1.dekey=[$5], 
> jdbc_type_conversion_table1.dtkey=[$6], jdbc_type_conversion_table1.tkey=[$7])
>   HiveProject(ikey=[$0], bkey=[$1], fkey=[$2], dkey=[$3], chkey=[$4], 
> dekey=[$5], dtkey=[$6], tkey=[$7])
> ->HiveJdbcConverter(convention=[JDBC.DERBY])
>   JdbcHiveTableScan(table=[[default, jdbc_type_conversion_table1]], 
> table:alias=[jdbc_type_conversion_table1])
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23841) Field writers is an HashSet, i.e., not thread-safe. Field writers is typically protected by synchronization on lock, but not in 1 location.

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23841?focusedWorklogId=483063=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483063
 ]

ASF GitHub Bot logged work on HIVE-23841:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:51
Start Date: 12/Sep/20 20:51
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1248:
URL: https://github.com/apache/hive/pull/1248#issuecomment-691367907







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483063)
Time Spent: 50m  (was: 40m)

> Field writers is an HashSet, i.e., not thread-safe.  Field writers is 
> typically protected by synchronization on lock, but not in 1 location.
> 
>
> Key: HIVE-23841
> URL: https://issues.apache.org/jira/browse/HIVE-23841
> Project: Hive
>  Issue Type: Bug
> Environment: Any environment
>Reporter: Adrian Nistor
>Priority: Major
>  Labels: patch-available, pull-request-available
> Attachments: HIVE-23841.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> I also submitted a pull request on github at:
>  
> [https://github.com/apache/hive/pull/1248]
>  
> (same patch)
> h1. Description
>  
> Field {{writers}} is a {{HashSet}} ([line 
> 70|https://github.com/apache/hive/blob/c93d7797329103d6c509bada68b6da7f907b3dee/ql/src/java/org/apache/hadoop/hive/llap/LlapOutputFormatService.java#L70]),
>  i.e., not thread-safe.
> Accesses to field {{writers}} are protected by synchronization on {{lock}}, 
> e.g., at lines: 
> [141-144|https://github.com/apache/hive/blob/c93d7797329103d6c509bada68b6da7f907b3dee/ql/src/java/org/apache/hadoop/hive/llap/LlapOutputFormatService.java#L141-L144],
>  
> [212-213|https://github.com/apache/hive/blob/c93d7797329103d6c509bada68b6da7f907b3dee/ql/src/java/org/apache/hadoop/hive/llap/LlapOutputFormatService.java#L212-L213],
>  and 
> [212-215|https://github.com/apache/hive/blob/c93d7797329103d6c509bada68b6da7f907b3dee/ql/src/java/org/apache/hadoop/hive/llap/LlapOutputFormatService.java#L212-L215].
> However, the {{writers.remove()}} at [line 
> 249|https://github.com/apache/hive/blob/c93d7797329103d6c509bada68b6da7f907b3dee/ql/src/java/org/apache/hadoop/hive/llap/LlapOutputFormatService.java#L249]
>  is protected by synchronization on {{INSTANCE}}, *not* on {{lock}}.
> Synchronizing on 2 different objects does not ensure mutual exclusion. This 
> is because 2 threads synchronizing on different objects can still execute in 
> parallel at the same time.
> Note that lines 
> [215|https://github.com/apache/hive/blob/c93d7797329103d6c509bada68b6da7f907b3dee/ql/src/java/org/apache/hadoop/hive/llap/LlapOutputFormatService.java#L215]
>  and 
> [249|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/llap/LlapOutputFormatService.java#L249]
>  are modifying {{writers}} with {{put()}} and {{remove()}}, respectively.
> h1. The Code for This Fix
> This fix is very simple: just change {{synchronized (INSTANCE)}} to 
> {{synchronized (lock)}}, just like the methods containing the other lines 
> listed above.
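
In miniature, the hazard and the fix look like this (a sketch, not the Hive class):

{code}
import java.util.HashSet;
import java.util.Set;

class WriterRegistry {
  private final Object lock = new Object();
  private final Set<String> writers = new HashSet<>();

  void add(String id) {
    synchronized (lock) { // all accesses use the same monitor
      writers.add(id);
    }
  }

  void remove(String id) {
    synchronized (lock) { // was: synchronized (INSTANCE) - no mutual exclusion with add()
      writers.remove(id);
    }
  }
}
{code}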



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24151) MultiDelimitSerDe shifts data if strings contain non-ASCII characters

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24151?focusedWorklogId=483017=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483017
 ]

ASF GitHub Bot logged work on HIVE-24151:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:47
Start Date: 12/Sep/20 20:47
Worklog Time Spent: 10m 
  Work Description: szlta commented on pull request #1490:
URL: https://github.com/apache/hive/pull/1490#issuecomment-691122063







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483017)
Time Spent: 1h 10m  (was: 1h)

> MultiDelimitSerDe shifts data if strings contain non-ASCII characters
> -
>
> Key: HIVE-24151
> URL: https://issues.apache.org/jira/browse/HIVE-24151
> Project: Hive
>  Issue Type: Bug
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> HIVE-22360 intended to fix another MultiDelimitSerde problem (with NULL last 
> columns) but introduced a regression: the approach of the fix is pretty much 
> all wrong, as the existing logic that operated on bytes got replaced by regex 
> matcher logic which deals in character positions, rather than byte positions. 
> As some non-ASCII characters consist of more than 1 byte, the whole record 
> may get shifted due to this.
> With this ticket I'm going to restore the old logic, and apply the proper fix 
> on that, but keeping (and extending) the test cases added with HIVE-22360 so 
> that we have a solution for both issues.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23413) Create a new config to skip all locks

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23413?focusedWorklogId=483002=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483002
 ]

ASF GitHub Bot logged work on HIVE-23413:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:45
Start Date: 12/Sep/20 20:45
Worklog Time Spent: 10m 
  Work Description: pvargacl commented on pull request #1220:
URL: https://github.com/apache/hive/pull/1220#issuecomment-690929320







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483002)
Time Spent: 1h 50m  (was: 1h 40m)

> Create a new config to skip all locks
> -
>
> Key: HIVE-23413
> URL: https://issues.apache.org/jira/browse/HIVE-23413
> Project: Hive
>  Issue Type: Improvement
>Reporter: Peter Varga
>Assignee: Peter Varga
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23413.1.patch, HIVE-23413.2.patch
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> From time to time some query is blocked on locks when it should not be.
> To have a quick workaround for this we should have a config which the user 
> can set in the session to disable acquiring/checking locks, so we can provide 
> it immediately and then later investigate and fix the root cause.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23413) Create a new config to skip all locks

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23413?focusedWorklogId=482961=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482961
 ]

ASF GitHub Bot logged work on HIVE-23413:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:38
Start Date: 12/Sep/20 20:38
Worklog Time Spent: 10m 
  Work Description: pvary commented on pull request #1220:
URL: https://github.com/apache/hive/pull/1220#issuecomment-690931518







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482961)
Time Spent: 1h 40m  (was: 1.5h)

> Create a new config to skip all locks
> -
>
> Key: HIVE-23413
> URL: https://issues.apache.org/jira/browse/HIVE-23413
> Project: Hive
>  Issue Type: Improvement
>Reporter: Peter Varga
>Assignee: Peter Varga
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23413.1.patch, HIVE-23413.2.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> From time to time some query is blocked on locks when it should not be.
> To have a quick workaround for this we should have a config which the user 
> can set in the session to disable acquiring/checking locks, so we can provide 
> it immediately and then later investigate and fix the root cause.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22290) ObjectStore.cleanWriteNotificationEvents and ObjectStore.cleanupEvents OutOfMemory on large number of pending events

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22290?focusedWorklogId=482929=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482929
 ]

ASF GitHub Bot logged work on HIVE-22290:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:34
Start Date: 12/Sep/20 20:34
Worklog Time Spent: 10m 
  Work Description: nareshpr opened a new pull request #1484:
URL: https://github.com/apache/hive/pull/1484







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482929)
Time Spent: 20m  (was: 10m)

> ObjectStore.cleanWriteNotificationEvents and ObjectStore.cleanupEvents 
> OutOfMemory on large number of pending events
> 
>
> Key: HIVE-22290
> URL: https://issues.apache.org/jira/browse/HIVE-22290
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, repl
>Affects Versions: 4.0.0
>Reporter: Thomas Prelle
>Assignee: Naresh P R
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> As in [https://jira.apache.org/jira/browse/HIVE-19430], if there is a large 
> number of events that haven't been cleaned up for some reason, then 
> ObjectStore.cleanWriteNotificationEvents() and ObjectStore.cleanupEvents can 
> run out of memory while they load all the events to be deleted.
> It should fetch events in batches.
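
A sketch of batched cleanup (the JDO query details are elided and the helper names are illustrative):

{code}
// Hypothetical: delete expired events in fixed-size batches instead of
// materializing the whole backlog in memory at once.
final int batchSize = 1000;
int fetched;
do {
  List<NotificationEvent> batch = fetchExpiredEvents(batchSize); // query with LIMIT batchSize
  fetched = batch.size();
  deleteEvents(batch);
} while (fetched == batchSize);
{code}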



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24022) Optimise HiveMetaStoreAuthorizer.createHiveMetaStoreAuthorizer

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24022?focusedWorklogId=482924=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482924
 ]

ASF GitHub Bot logged work on HIVE-24022:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:33
Start Date: 12/Sep/20 20:33
Worklog Time Spent: 10m 
  Work Description: nrg4878 commented on pull request #1385:
URL: https://github.com/apache/hive/pull/1385#issuecomment-691213984







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482924)
Time Spent: 50m  (was: 40m)

> Optimise HiveMetaStoreAuthorizer.createHiveMetaStoreAuthorizer
> --
>
> Key: HIVE-24022
> URL: https://issues.apache.org/jira/browse/HIVE-24022
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Sam An
>Priority: Minor
>  Labels: performance, pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> For a table with 3000+ partitions, analyze table takes a lot longer as 
> HiveMetaStoreAuthorizer tries to create a HiveConf for every partition request.
>  
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/metastore/HiveMetaStoreAuthorizer.java#L319]
>  
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/metastore/HiveMetaStoreAuthorizer.java#L447]
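
The straightforward optimisation is to build the HiveConf once per authorizer and reuse it (illustrative sketch, not the committed patch):

{code}
// Hypothetical: HiveConf construction is expensive, so create it once per
// authorizer instance instead of once per partition event.
private final HiveConf cachedConf = new HiveConf(HiveMetaStoreAuthorizer.class);

private HiveConf getAuthorizerConf() {
  return cachedConf; // reused across authorization calls
}
{code}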



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24084) Push Aggregates thru joins in case it re-groups previously unique columns

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24084?focusedWorklogId=482911=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482911
 ]

ASF GitHub Bot logged work on HIVE-24084:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:32
Start Date: 12/Sep/20 20:32
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on a change in pull request #1439:
URL: https://github.com/apache/hive/pull/1439#discussion_r487130820



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveAggregateJoinTransposeRule.java
##
@@ -303,6 +305,90 @@ public void onMatch(RelOptRuleCall call) {
 }
   }
 
+  /**
+   * Determines whether the given grouping is unique.
+   *
+   * Consider a join which might produce non-unique rows; but later the 
results are aggregated again.
+   * This method determines if there are sufficient columns in the grouping 
which have been present previously as unique column(s).
+   */
+  private boolean isGroupingUnique(RelNode input, ImmutableBitSet groups) {
+if (groups.isEmpty()) {
+  return false;
+}
+RelMetadataQuery mq = input.getCluster().getMetadataQuery();
+Set<ImmutableBitSet> uKeys = mq.getUniqueKeys(input);
+for (ImmutableBitSet u : uKeys) {
+  if (groups.contains(u)) {
+return true;
+  }
+}
+if (input instanceof Join) {
+  Join join = (Join) input;
+  RexBuilder rexBuilder = input.getCluster().getRexBuilder();
+  SimpleConditionInfo cond = new SimpleConditionInfo(join.getCondition(), 
rexBuilder);
+
+  if (cond.valid) {
+ImmutableBitSet newGroup = 
groups.intersect(ImmutableBitSet.fromBitSet(cond.fields));
+RelNode l = join.getLeft();
+RelNode r = join.getRight();
+
+int joinFieldCount = join.getRowType().getFieldCount();
+int lFieldCount = l.getRowType().getFieldCount();
+
+ImmutableBitSet groupL = newGroup.get(0, lFieldCount);
+ImmutableBitSet groupR = newGroup.get(lFieldCount, 
joinFieldCount).shift(-lFieldCount);
+
+if (isGroupingUnique(l, groupL)) {

Review comment:
   OK. If it turns out there are many changes and it may need some time to 
be fixed, feel free to defer to a follow-up JIRA and let's merge this one.
   

##
File path: ql/src/test/queries/clientpositive/tpch18.q
##
@@ -0,0 +1,133 @@
+--! qt:dataset:tpch_0_001.customer
+--! qt:dataset:tpch_0_001.lineitem
+--! qt:dataset:tpch_0_001.nation
+--! qt:dataset:tpch_0_001.orders
+--! qt:dataset:tpch_0_001.part
+--! qt:dataset:tpch_0_001.partsupp
+--! qt:dataset:tpch_0_001.region
+--! qt:dataset:tpch_0_001.supplier
+
+
+use tpch_0_001;
+
+set hive.transpose.aggr.join=true;
+set hive.transpose.aggr.join.unique=true;
+set hive.mapred.mode=nonstrict;
+
+create view q18_tmp_cached as
+select
+   l_orderkey,
+   sum(l_quantity) as t_sum_quantity
+from
+   lineitem
+where
+   l_orderkey is not null
+group by
+   l_orderkey;
+
+
+
+explain cbo select
+c_name,
+c_custkey,
+o_orderkey,
+o_orderdate,
+o_totalprice,
+sum(l_quantity)
+from
+   customer,
+   orders,
+   q18_tmp_cached t,
+   lineitem l
+where
+c_custkey = o_custkey
+and o_orderkey = t.l_orderkey
+and o_orderkey is not null
+and t.t_sum_quantity > 300
+and o_orderkey = l.l_orderkey
+and l.l_orderkey is not null
+group by
+c_name,
+c_custkey,
+o_orderkey,
+o_orderdate,
+o_totalprice
+order by
+o_totalprice desc,
+o_orderdate
+limit 100;
+
+
+
+select 'add constraints';
+
+alter table orders add constraint pk_o primary key (o_orderkey) disable 
novalidate rely;
+alter table customer add constraint pk_c primary key (c_custkey) disable 
novalidate rely;
+

Review comment:
   Thanks @kgyrtkirk .
   
   This seems to need further exploration; we thought 
https://issues.apache.org/jira/browse/HIVE-24087 was going to help here. 
@vineetgarg02 , could you take a look at this once this patch is merged? Maybe 
the shape of the plan is slightly different from the one we anticipated.


[jira] [Work logged] (HIVE-24097) correct NPE exception in HiveMetastoreAuthorizer

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24097?focusedWorklogId=482879=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482879
 ]

ASF GitHub Bot logged work on HIVE-24097:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:29
Start Date: 12/Sep/20 20:29
Worklog Time Spent: 10m 
  Work Description: nrg4878 commented on pull request #1448:
URL: https://github.com/apache/hive/pull/1448#issuecomment-691254659







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482879)
Time Spent: 0.5h  (was: 20m)

> correct NPE exception in HiveMetastoreAuthorizer
> 
>
> Key: HIVE-24097
> URL: https://issues.apache.org/jira/browse/HIVE-24097
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: Sam An
>Assignee: Sam An
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In some testing, we found it's possible to hit an NPE if the preEventType does 
> not fall within the several types the HMS currently checks. This leaves the 
> AuthzContext as a null pointer. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24144) getIdentifierQuoteString in HiveDatabaseMetaData returns incorrect value

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24144?focusedWorklogId=482877=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482877
 ]

ASF GitHub Bot logged work on HIVE-24144:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:28
Start Date: 12/Sep/20 20:28
Worklog Time Spent: 10m 
  Work Description: jcamachor opened a new pull request #1487:
URL: https://github.com/apache/hive/pull/1487







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482877)
Remaining Estimate: 0h
Time Spent: 10m

> getIdentifierQuoteString in HiveDatabaseMetaData returns incorrect value
> 
>
> Key: HIVE-24144
> URL: https://issues.apache.org/jira/browse/HIVE-24144
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC, JDBC storage handler
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {code}
>   public String getIdentifierQuoteString() throws SQLException {
> return " ";
>   }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24127) Dump events from default catalog only

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24127?focusedWorklogId=482836=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482836
 ]

ASF GitHub Bot logged work on HIVE-24127:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:25
Start Date: 12/Sep/20 20:25
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #1478:
URL: https://github.com/apache/hive/pull/1478#discussion_r486826515



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/messaging/event/filters/CatalogFilter.java
##
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.metastore.messaging.event.filters;
+
+import org.apache.hadoop.hive.metastore.api.NotificationEvent;
+
+/**
+ * Notification event filter that matches a given 
catalog name.
+ */
+public class CatalogFilter extends BasicFilter {
+  private final String catalogName;
+
+  public CatalogFilter(final String catalogName) {
+this.catalogName = catalogName;
+  }
+
+  @Override
+  boolean shouldAccept(final NotificationEvent event) {
+if (catalogName == null || event.getCatName() == null || 
catalogName.equalsIgnoreCase(event.getCatName())) {

Review comment:
   catalogName should never be null. Even if it is not configured, it must be 
the default, which is "hive". I think we should let it fail if the filter is 
initialized with a null catalogName.
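
A minimal sketch of the fail-fast constructor the reviewer is asking for:

{code}
import java.util.Objects;

public CatalogFilter(final String catalogName) {
  // Fail fast: a null catalog name indicates a misconfiguration upstream,
  // since at minimum the default catalog ("hive") should be supplied.
  this.catalogName = Objects.requireNonNull(catalogName, "catalogName must not be null");
}
{code}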






This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482836)
Time Spent: 0.5h  (was: 20m)

> Dump events from default catalog only
> -
>
> Key: HIVE-24127
> URL: https://issues.apache.org/jira/browse/HIVE-24127
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: 

[jira] [Work logged] (HIVE-24143) Include convention in JDBC converter operator in Calcite plan

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24143?focusedWorklogId=482822=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482822
 ]

ASF GitHub Bot logged work on HIVE-24143:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:24
Start Date: 12/Sep/20 20:24
Worklog Time Spent: 10m 
  Work Description: vineetgarg02 commented on a change in pull request 
#1482:
URL: https://github.com/apache/hive/pull/1482#discussion_r487292490



##
File path: 
ql/src/test/results/clientpositive/llap/external_jdbc_table_perf.q.out
##
@@ -6115,32 +6112,39 @@ WHERE "d_year" = 1999 AND "d_moy" BETWEEN 1 AND 3 AND 
"d_date_sk" IS NOT NULL) A
 Reduce Operator Tree:
   Merge Join Operator
 condition map:
- Anti Join 0 to 1
+ Left Outer Join 0 to 1

Review comment:
   Also not sure what caused this change in plan.

##
File path: ql/src/test/queries/clientpositive/external_jdbc_table2.q
##
@@ -7,43 +6,43 @@ CREATE TEMPORARY FUNCTION dboutput AS 
'org.apache.hadoop.hive.contrib.genericudf
 
 FROM src
 SELECT
-dboutput 
('jdbc:derby:;databaseName=${system:test.tmp.dir}/test_derby_auth1;create=true','user1','passwd1',
+dboutput 
('jdbc:derby:;databaseName=${system:test.tmp.dir}/test_derby_auth1_1;create=true','user1','passwd1',

Review comment:
   Curious as to why this changed?

##
File path: 
ql/src/test/results/clientpositive/llap/external_jdbc_table_perf.q.out
##
@@ -6115,32 +6112,39 @@ WHERE "d_year" = 1999 AND "d_moy" BETWEEN 1 AND 3 AND 
"d_date_sk" IS NOT NULL) A
 Reduce Operator Tree:
   Merge Join Operator
 condition map:
- Anti Join 0 to 1
+ Left Outer Join 0 to 1

Review comment:
   Got it, make sense






This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482822)
Time Spent: 1h 50m  (was: 1h 40m)

> Include convention in JDBC converter operator in Calcite plan
> -
>
> Key: HIVE-24143
> URL: https://issues.apache.org/jira/browse/HIVE-24143
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Among others, it will be useful to debug the dialect being chosen for query 
> generation. For instance:
> {code}
>  HiveProject(jdbc_type_conversion_table1.ikey=[$0], 
> jdbc_type_conversion_table1.bkey=[$1], jdbc_type_conversion_table1.fkey=[$2], 
> jdbc_type_conversion_table1.dkey=[$3], 
> jdbc_type_conversion_table1.chkey=[$4], 
> jdbc_type_conversion_table1.dekey=[$5], 
> jdbc_type_conversion_table1.dtkey=[$6], jdbc_type_conversion_table1.tkey=[$7])
>   HiveProject(ikey=[$0], bkey=[$1], fkey=[$2], dkey=[$3], chkey=[$4], 
> dekey=[$5], dtkey=[$6], tkey=[$7])
> ->

[jira] [Work logged] (HIVE-24035) Add Jenkinsfile for branch-2.3

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24035?focusedWorklogId=482804=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482804
 ]

ASF GitHub Bot logged work on HIVE-24035:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:23
Start Date: 12/Sep/20 20:23
Worklog Time Spent: 10m 
  Work Description: sunchao commented on pull request #1398:
URL: https://github.com/apache/hive/pull/1398#issuecomment-691243264







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482804)
Time Spent: 2h 40m  (was: 2.5h)

> Add Jenkinsfile for branch-2.3
> --
>
> Key: HIVE-24035
> URL: https://issues.apache.org/jira/browse/HIVE-24035
> Project: Hive
>  Issue Type: Test
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> To enable precommit tests for GitHub PRs, we need to have a Jenkinsfile in the 
> repo. This is already done for master and branch-2. This adds the same for 
> branch-2.3.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24149) HiveStreamingConnection doesn't close HMS connection

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24149?focusedWorklogId=482789=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482789
 ]

ASF GitHub Bot logged work on HIVE-24149:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:22
Start Date: 12/Sep/20 20:22
Worklog Time Spent: 10m 
  Work Description: zeroflag opened a new pull request #1488:
URL: https://github.com/apache/hive/pull/1488







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482789)
Time Spent: 20m  (was: 10m)

> HiveStreamingConnection doesn't close HMS connection
> 
>
> Key: HIVE-24149
> URL: https://issues.apache.org/jira/browse/HIVE-24149
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> There are 3 HMS connections used by HiveStreamingConnection: one for TX, one 
> for heartbeat, and one for notifications. The close method only closes the 
> first 2, leaving the last one open, which eventually overloads HMS and makes 
> it unresponsive.
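
A sketch of the kind of fix implied here, with hypothetical field names (the 
actual HiveStreamingConnection members differ):

{code}
@Override
public void close() {
  // Close all three metastore clients, not just the transaction and
  // heartbeat ones.
  closeQuietly(txnMetastoreClient);          // transactions
  closeQuietly(heartbeatMetastoreClient);    // heartbeats
  closeQuietly(notificationMetastoreClient); // previously leaked
}

private static void closeQuietly(IMetaStoreClient client) {
  if (client != null) {
    client.close();
  }
}
{code}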



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24151) MultiDelimitSerDe shifts data if strings contain non-ASCII characters

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24151?focusedWorklogId=482791=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482791
 ]

ASF GitHub Bot logged work on HIVE-24151:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:22
Start Date: 12/Sep/20 20:22
Worklog Time Spent: 10m 
  Work Description: szlta edited a comment on pull request #1490:
URL: https://github.com/apache/hive/pull/1490#issuecomment-691122063







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482791)
Time Spent: 1h  (was: 50m)

> MultiDelimitSerDe shifts data if strings contain non-ASCII characters
> -
>
> Key: HIVE-24151
> URL: https://issues.apache.org/jira/browse/HIVE-24151
> Project: Hive
>  Issue Type: Bug
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> HIVE-22360 intended to fix another MultiDelimitSerde problem (with NULL last 
> columns) but introduced a regression: the approach of the fix is pretty much 
> all wrong, as the existing logic that operated on bytes got replaced by regex 
> matcher logic which deals in character positions, rather than byte positions. 
> As some non-ASCII characters consist of more than 1 byte, the whole record 
> may get shifted due to this.
> With this ticket I'm going to restore the old logic and apply the proper fix 
> on that, while keeping (and extending) the test cases added with HIVE-22360 so 
> that we have a solution for both issues.
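
A small illustration of why character positions and byte positions diverge for 
multi-byte characters (plain Java, independent of the SerDe code):

{code}
import java.nio.charset.StandardCharsets;

String row = "é|x";                                  // 'é' is 2 bytes in UTF-8
int charPos = row.indexOf('|');                      // 1: the character position
byte[] bytes = row.getBytes(StandardCharsets.UTF_8); // [0xC3, 0xA9, '|', 'x']
// The delimiter sits at byte offset 2, not 1 - slicing the byte array with the
// character position shifts every subsequent field by one byte.
{code}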



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24145) Fix preemption issues in reducers and file sink operators

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24145?focusedWorklogId=482783=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482783
 ]

ASF GitHub Bot logged work on HIVE-24145:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:21
Start Date: 12/Sep/20 20:21
Worklog Time Spent: 10m 
  Work Description: ramesh0201 opened a new pull request #1485:
URL: https://github.com/apache/hive/pull/1485







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482783)
Time Spent: 50m  (was: 40m)

> Fix preemption issues in reducers and file sink operators
> -
>
> Key: HIVE-24145
> URL: https://issues.apache.org/jira/browse/HIVE-24145
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> There are two issues because of preemption:
>  # Reducers are getting reordered as part of optimizations, because of which 
> more preemptions happen
>  # Preemption in the middle of writing can cause the file to not close and 
> lead to errors when we read the file later



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22782) Consolidate metastore call to fetch constraints

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22782?focusedWorklogId=482765=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482765
 ]

ASF GitHub Bot logged work on HIVE-22782:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:18
Start Date: 12/Sep/20 20:18
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on a change in pull 
request #1419:
URL: https://github.com/apache/hive/pull/1419#discussion_r487366456



##
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
##
@@ -1155,22 +1150,19 @@ void dumpConstraintMetadata(String dbName, String 
tblName, Path dbRoot, Hive hiv
   Path constraintsRoot = new Path(dbRoot, 
ReplUtils.CONSTRAINTS_ROOT_DIR_NAME);
   Path commonConstraintsFile = new Path(constraintsRoot, 
ConstraintFileType.COMMON.getPrefix() + tblName);
   Path fkConstraintsFile = new Path(constraintsRoot, 
ConstraintFileType.FOREIGNKEY.getPrefix() + tblName);
-  List<SQLPrimaryKey> pks = hiveDb.getPrimaryKeyList(dbName, tblName);
-  List<SQLForeignKey> fks = hiveDb.getForeignKeyList(dbName, tblName);
-  List<SQLUniqueConstraint> uks = hiveDb.getUniqueConstraintList(dbName, tblName);
-  List<SQLNotNullConstraint> nns = hiveDb.getNotNullConstraintList(dbName, tblName);
-  if ((pks != null && !pks.isEmpty()) || (uks != null && !uks.isEmpty())
-  || (nns != null && !nns.isEmpty())) {
+  SQLAllTableConstraints tableConstraints = hiveDb.getTableConstraints(dbName,tblName);

Review comment:
   Done

##
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
##
@@ -1155,22 +1150,19 @@ void dumpConstraintMetadata(String dbName, String 
tblName, Path dbRoot, Hive hiv
   Path constraintsRoot = new Path(dbRoot, 
ReplUtils.CONSTRAINTS_ROOT_DIR_NAME);
   Path commonConstraintsFile = new Path(constraintsRoot, 
ConstraintFileType.COMMON.getPrefix() + tblName);
   Path fkConstraintsFile = new Path(constraintsRoot, 
ConstraintFileType.FOREIGNKEY.getPrefix() + tblName);
-  List<SQLPrimaryKey> pks = hiveDb.getPrimaryKeyList(dbName, tblName);
-  List<SQLForeignKey> fks = hiveDb.getForeignKeyList(dbName, tblName);
-  List<SQLUniqueConstraint> uks = hiveDb.getUniqueConstraintList(dbName, tblName);
-  List<SQLNotNullConstraint> nns = hiveDb.getNotNullConstraintList(dbName, tblName);
-  if ((pks != null && !pks.isEmpty()) || (uks != null && !uks.isEmpty())
-  || (nns != null && !nns.isEmpty())) {
+  SQLAllTableConstraints tableConstraints = hiveDb.getTableConstraints(dbName,tblName);
+  if ((tableConstraints.getPrimaryKeys() != null && !tableConstraints.getPrimaryKeys().isEmpty()) || (tableConstraints.getUniqueConstraints() != null && !tableConstraints.getUniqueConstraints().isEmpty())

Review comment:
   Yes, the code is very redundant. I have replaced it with 
CollectionUtils.isNotEmpty(), which does the same check, i.e. not null and 
not empty.
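
For reference, the simplified check reads roughly like this (a sketch, assuming 
commons-collections4 on the classpath and the SQLAllTableConstraints getters 
shown in the diff):

{code}
import org.apache.commons.collections4.CollectionUtils;

if (CollectionUtils.isNotEmpty(tableConstraints.getPrimaryKeys())
    || CollectionUtils.isNotEmpty(tableConstraints.getUniqueConstraints())) {
  // ... write the common constraints file; the remaining constraint lists
  // are checked the same way.
}
{code}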

##
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
##
@@ -5661,184 +5663,79 @@ public void dropConstraint(String dbName, String 
tableName, String constraintNam
 }
   }
 
-  public List<SQLDefaultConstraint> getDefaultConstraintList(String dbName, String tblName) throws HiveException, NoSuchObjectException {
+  public SQLAllTableConstraints getTableConstraints(String dbName, String tblName) throws HiveException, NoSuchObjectException {
 try {
-  return getMSC().getDefaultConstraints(new 
DefaultConstraintsRequest(getDefaultCatalog(conf), dbName, tblName));
+  AllTableConstraintsRequest tableConstraintsRequest = new 
AllTableConstraintsRequest();
+  tableConstraintsRequest.setDbName(dbName);
+  tableConstraintsRequest.setTblName(tblName);
+  tableConstraintsRequest.setCatName(getDefaultCatalog(conf));
+  return getMSC().getAllTableConstraints(tableConstraintsRequest);
 } catch (NoSuchObjectException e) {
   throw e;
 } catch (Exception e) {
   throw new HiveException(e);
 }
   }
-
-  public List<SQLCheckConstraint> getCheckConstraintList(String dbName, String tblName) throws HiveException, NoSuchObjectException {
-try {
-  return getMSC().getCheckConstraints(new 
CheckConstraintsRequest(getDefaultCatalog(conf),
-  dbName, tblName));
-} catch (NoSuchObjectException e) {
-  throw e;
-} catch (Exception e) {
-  throw new HiveException(e);
-}
+  public TableConstraintsInfo getAllTableConstraints(String dbName, String 
tblName) throws HiveException {
+return getTableConstraints(dbName, tblName, false, false);
   }
 
-  /**
-   * Get all primary key columns associated with the table.
-   *
-   * @param dbName Database Name
-   * @param tblName Table Name
-   * @return Primary Key associated with the table.
-   * @throws HiveException
-   */
-  public PrimaryKeyInfo getPrimaryKeys(String dbName, String tblName) throws 
HiveException {
-return getPrimaryKeys(dbName, tblName, false);
+  public TableConstraintsInfo 

[jira] [Work logged] (HIVE-24143) Include convention in JDBC converter operator in Calcite plan

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24143?focusedWorklogId=482740=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482740
 ]

ASF GitHub Bot logged work on HIVE-24143:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:16
Start Date: 12/Sep/20 20:16
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on a change in pull request #1482:
URL: https://github.com/apache/hive/pull/1482#discussion_r487299089



##
File path: ql/src/test/queries/clientpositive/external_jdbc_table2.q
##
@@ -7,43 +6,43 @@ CREATE TEMPORARY FUNCTION dboutput AS 
'org.apache.hadoop.hive.contrib.genericudf
 
 FROM src
 SELECT
-dboutput 
('jdbc:derby:;databaseName=${system:test.tmp.dir}/test_derby_auth1;create=true','user1','passwd1',
+dboutput 
('jdbc:derby:;databaseName=${system:test.tmp.dir}/test_derby_auth1_1;create=true','user1','passwd1',

Review comment:
   I changed it to avoid a clash with other test in the temp directory, 
which I believe was causing HIVE-23910.

##
File path: 
ql/src/test/results/clientpositive/llap/external_jdbc_table_perf.q.out
##
@@ -6115,32 +6112,39 @@ WHERE "d_year" = 1999 AND "d_moy" BETWEEN 1 AND 3 AND 
"d_date_sk" IS NOT NULL) A
 Reduce Operator Tree:
   Merge Join Operator
 condition map:
- Anti Join 0 to 1
+ Left Outer Join 0 to 1

Review comment:
   This change is not actually related to this patch. Note that the test 
was disabled by default (HIVE-23910); it seems a preliminary version of 
HIVE-23716 changed these q files when it should not have.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482740)
Time Spent: 1h 40m  (was: 1.5h)

> Include convention in JDBC converter operator in Calcite plan
> -
>
> Key: HIVE-24143
> URL: https://issues.apache.org/jira/browse/HIVE-24143
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Among others, it will be useful to debug the dialect being chosen for query 
> generation. For instance:
> {code}
>  HiveProject(jdbc_type_conversion_table1.ikey=[$0], 
> jdbc_type_conversion_table1.bkey=[$1], jdbc_type_conversion_table1.fkey=[$2], 
> jdbc_type_conversion_table1.dkey=[$3], 
> jdbc_type_conversion_table1.chkey=[$4], 
> jdbc_type_conversion_table1.dekey=[$5], 
> jdbc_type_conversion_table1.dtkey=[$6], jdbc_type_conversion_table1.tkey=[$7])
>   HiveProject(ikey=[$0], bkey=[$1], fkey=[$2], dkey=[$3], chkey=[$4], 
> dekey=[$5], dtkey=[$6], tkey=[$7])
> ->HiveJdbcConverter(convention=[JDBC.DERBY])
>   JdbcHiveTableScan(table=[[default, jdbc_type_conversion_table1]], 
> table:alias=[jdbc_type_conversion_table1])
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24084) Push Aggregates thru joins in case it re-groups previously unique columns

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24084?focusedWorklogId=482731=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482731
 ]

ASF GitHub Bot logged work on HIVE-24084:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:15
Start Date: 12/Sep/20 20:15
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on a change in pull request #1439:
URL: https://github.com/apache/hive/pull/1439#discussion_r487096103



##
File path: ql/src/test/queries/clientpositive/tpch18.q
##
@@ -0,0 +1,133 @@
+--! qt:dataset:tpch_0_001.customer
+--! qt:dataset:tpch_0_001.lineitem
+--! qt:dataset:tpch_0_001.nation
+--! qt:dataset:tpch_0_001.orders
+--! qt:dataset:tpch_0_001.part
+--! qt:dataset:tpch_0_001.partsupp
+--! qt:dataset:tpch_0_001.region
+--! qt:dataset:tpch_0_001.supplier
+
+
+use tpch_0_001;
+
+set hive.transpose.aggr.join=true;
+set hive.transpose.aggr.join.unique=true;
+set hive.mapred.mode=nonstrict;
+
+create view q18_tmp_cached as
+select
+   l_orderkey,
+   sum(l_quantity) as t_sum_quantity
+from
+   lineitem
+where
+   l_orderkey is not null
+group by
+   l_orderkey;
+
+
+
+explain cbo select
+c_name,
+c_custkey,
+o_orderkey,
+o_orderdate,
+o_totalprice,
+sum(l_quantity)
+from
+   customer,
+   orders,
+   q18_tmp_cached t,
+   lineitem l
+where
+c_custkey = o_custkey
+and o_orderkey = t.l_orderkey
+and o_orderkey is not null
+and t.t_sum_quantity > 300
+and o_orderkey = l.l_orderkey
+and l.l_orderkey is not null
+group by
+c_name,
+c_custkey,
+o_orderkey,
+o_orderdate,
+o_totalprice
+order by
+o_totalprice desc,
+o_orderdate
+limit 100;
+
+
+
+select 'add constraints';
+
+alter table orders add constraint pk_o primary key (o_orderkey) disable 
novalidate rely;
+alter table customer add constraint pk_c primary key (c_custkey) disable 
novalidate rely;
+

Review comment:
   I've added both constraints - it only removed the IS NOT NULL filter.
   It seems to me that one of the sum() calls is used as an output and the other is 
being used to filter by >300 - so both of them are being "used".

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveAggregateJoinTransposeRule.java
##
@@ -303,6 +305,90 @@ public void onMatch(RelOptRuleCall call) {
 }
   }
 
+  /**
+   * Determines whether the given grouping is unique.
+   *
+   * Consider a join which might produce non-unique rows; but later the 
results are aggregated again.
+   * This method determines if there are sufficient columns in the grouping 
which have been present previously as unique column(s).
+   */
+  private boolean isGroupingUnique(RelNode input, ImmutableBitSet groups) {
+if (groups.isEmpty()) {
+  return false;
+}
+RelMetadataQuery mq = input.getCluster().getMetadataQuery();
+Set<ImmutableBitSet> uKeys = mq.getUniqueKeys(input);
+for (ImmutableBitSet u : uKeys) {
+  if (groups.contains(u)) {
+return true;
+  }
+}
+if (input instanceof Join) {
+  Join join = (Join) input;
+  RexBuilder rexBuilder = input.getCluster().getRexBuilder();
+  SimpleConditionInfo cond = new SimpleConditionInfo(join.getCondition(), 
rexBuilder);
+
+  if (cond.valid) {
+ImmutableBitSet newGroup = 
groups.intersect(ImmutableBitSet.fromBitSet(cond.fields));
+RelNode l = join.getLeft();
+RelNode r = join.getRight();
+
+int joinFieldCount = join.getRowType().getFieldCount();
+int lFieldCount = l.getRowType().getFieldCount();
+
+ImmutableBitSet groupL = newGroup.get(0, lFieldCount);
+ImmutableBitSet groupR = newGroup.get(lFieldCount, 
joinFieldCount).shift(-lFieldCount);
+
+if (isGroupingUnique(l, groupL)) {

Review comment:
   That could be done, and I'm sure it was true in this case - but this logic 
will work better if it can walk down as many joins as possible - we might 
have an aggregate on top with a bunch of joins under it... so I feel 
it will be beneficial to retain it.
   I felt tempted to write a RelMd handler - however, I don't think I could 
just introduce a new one easily.
   RelShuttle doesn't look like a good match - I'll leave it as a set of 
`instanceof` calls for now.
   
   I'll upload a new patch to see whether digging deeper in the tree could do more 
or not.
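
To make the discussion concrete, the core test here is bit-set containment - a 
toy example with Calcite's ImmutableBitSet (column indexes invented for 
illustration):

{code}
import org.apache.calcite.util.ImmutableBitSet;

ImmutableBitSet groups = ImmutableBitSet.of(0, 3, 5); // the grouped columns
ImmutableBitSet uniqueKey = ImmutableBitSet.of(0);    // a unique key of the input
boolean unique = groups.contains(uniqueKey);          // true: the grouping
// re-groups a previously unique column, so pushing the aggregate is safe.
{code}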





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482731)
Time Spent: 3h 10m  (was: 3h)

> Push Aggregates thru joins in case it re-groups previously unique columns
> 

[jira] [Work logged] (HIVE-24151) MultiDelimitSerDe shifts data if strings contain non-ASCII characters

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24151?focusedWorklogId=482711=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482711
 ]

ASF GitHub Bot logged work on HIVE-24151:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:14
Start Date: 12/Sep/20 20:14
Worklog Time Spent: 10m 
  Work Description: szlta opened a new pull request #1490:
URL: https://github.com/apache/hive/pull/1490


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482711)
Time Spent: 50m  (was: 40m)

> MultiDelimitSerDe shifts data if strings contain non-ASCII characters
> -
>
> Key: HIVE-24151
> URL: https://issues.apache.org/jira/browse/HIVE-24151
> Project: Hive
>  Issue Type: Bug
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> HIVE-22360 intended to fix another MultiDelimitSerde problem (with NULL last 
> columns) but introduced a regression: the approach of the fix is pretty much 
> all wrong, as the existing logic that operated on bytes got replaced by regex 
> matcher logic which deals in character positions, rather than byte positions. 
> As some non-ASCII characters consist of more than 1 byte, the whole record 
> may get shifted due to this.
> With this ticket I'm going to restore the old logic and apply the proper fix 
> on that, while keeping (and extending) the test cases added with HIVE-22360 so 
> that we have a solution for both issues.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22782) Consolidate metastore call to fetch constraints

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22782?focusedWorklogId=482700=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482700
 ]

ASF GitHub Bot logged work on HIVE-22782:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:13
Start Date: 12/Sep/20 20:13
Worklog Time Spent: 10m 
  Work Description: sankarh commented on a change in pull request #1419:
URL: https://github.com/apache/hive/pull/1419#discussion_r486978810



##
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
##
@@ -1155,22 +1150,19 @@ void dumpConstraintMetadata(String dbName, String 
tblName, Path dbRoot, Hive hiv
   Path constraintsRoot = new Path(dbRoot, 
ReplUtils.CONSTRAINTS_ROOT_DIR_NAME);
   Path commonConstraintsFile = new Path(constraintsRoot, 
ConstraintFileType.COMMON.getPrefix() + tblName);
   Path fkConstraintsFile = new Path(constraintsRoot, 
ConstraintFileType.FOREIGNKEY.getPrefix() + tblName);
-  List<SQLPrimaryKey> pks = hiveDb.getPrimaryKeyList(dbName, tblName);
-  List<SQLForeignKey> fks = hiveDb.getForeignKeyList(dbName, tblName);
-  List<SQLUniqueConstraint> uks = hiveDb.getUniqueConstraintList(dbName, tblName);
-  List<SQLNotNullConstraint> nns = hiveDb.getNotNullConstraintList(dbName, tblName);
-  if ((pks != null && !pks.isEmpty()) || (uks != null && !uks.isEmpty())
-  || (nns != null && !nns.isEmpty())) {
+  SQLAllTableConstraints tableConstraints = hiveDb.getTableConstraints(dbName,tblName);
+  if ((tableConstraints.getPrimaryKeys() != null && !tableConstraints.getPrimaryKeys().isEmpty()) || (tableConstraints.getUniqueConstraints() != null && !tableConstraints.getUniqueConstraints().isEmpty())

Review comment:
   Can add a utility method to check for null and empty on a given list, since it 
is used multiple times. Also use local variables to reduce the code.

##
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
##
@@ -5661,184 +5663,79 @@ public void dropConstraint(String dbName, String 
tableName, String constraintNam
 }
   }
 
-  public List<SQLDefaultConstraint> getDefaultConstraintList(String dbName, String tblName) throws HiveException, NoSuchObjectException {
+  public SQLAllTableConstraints getTableConstraints(String dbName, String tblName) throws HiveException, NoSuchObjectException {
 try {
-  return getMSC().getDefaultConstraints(new 
DefaultConstraintsRequest(getDefaultCatalog(conf), dbName, tblName));
+  AllTableConstraintsRequest tableConstraintsRequest = new 
AllTableConstraintsRequest();
+  tableConstraintsRequest.setDbName(dbName);
+  tableConstraintsRequest.setTblName(tblName);
+  tableConstraintsRequest.setCatName(getDefaultCatalog(conf));
+  return getMSC().getAllTableConstraints(tableConstraintsRequest);
 } catch (NoSuchObjectException e) {
   throw e;
 } catch (Exception e) {
   throw new HiveException(e);
 }
   }
-
-  public List<SQLCheckConstraint> getCheckConstraintList(String dbName, String tblName) throws HiveException, NoSuchObjectException {
-try {
-  return getMSC().getCheckConstraints(new 
CheckConstraintsRequest(getDefaultCatalog(conf),
-  dbName, tblName));
-} catch (NoSuchObjectException e) {
-  throw e;
-} catch (Exception e) {
-  throw new HiveException(e);
-}
+  public TableConstraintsInfo getAllTableConstraints(String dbName, String 
tblName) throws HiveException {
+return getTableConstraints(dbName, tblName, false, false);
   }
 
-  /**
-   * Get all primary key columns associated with the table.
-   *
-   * @param dbName Database Name
-   * @param tblName Table Name
-   * @return Primary Key associated with the table.
-   * @throws HiveException
-   */
-  public PrimaryKeyInfo getPrimaryKeys(String dbName, String tblName) throws 
HiveException {
-return getPrimaryKeys(dbName, tblName, false);
+  public TableConstraintsInfo getReliableAndEnableTableConstraints(String 
dbName, String tblName) throws HiveException {
+return getTableConstraints(dbName, tblName, true, true);
   }
 
-  /**
-   * Get primary key columns associated with the table that are available for 
optimization.
-   *
-   * @param dbName Database Name
-   * @param tblName Table Name
-   * @return Primary Key associated with the table.
-   * @throws HiveException
-   */
-  public PrimaryKeyInfo getReliablePrimaryKeys(String dbName, String tblName) 
throws HiveException {
-return getPrimaryKeys(dbName, tblName, true);
-  }
-
-  private PrimaryKeyInfo getPrimaryKeys(String dbName, String tblName, boolean 
onlyReliable)
+  private TableConstraintsInfo getTableConstraints(String dbName, String 
tblName, boolean reliable, boolean enable)

Review comment:
   nit: Use "fetchReliable" and "fetchEnabled" instead of "reliable" and 
"enable", as the latter sound like flags to enable something.

##
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java
##
@@ -116,22 

[jira] [Work logged] (HIVE-24145) Fix preemption issues in reducers and file sink operators

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24145?focusedWorklogId=482691=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482691
 ]

ASF GitHub Bot logged work on HIVE-24145:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:12
Start Date: 12/Sep/20 20:12
Worklog Time Spent: 10m 
  Work Description: rbalamohan commented on a change in pull request #1485:
URL: https://github.com/apache/hive/pull/1485#discussion_r486786544



##
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java
##
@@ -216,29 +216,47 @@ public FSPaths(Path specPath, boolean isMmTable, boolean 
isDirectInsert, boolean
 }
 
 public void closeWriters(boolean abort) throws HiveException {
+  Exception exception = null;
   for (int idx = 0; idx < outWriters.length; idx++) {
 if (outWriters[idx] != null) {
   try {
 outWriters[idx].close(abort);
 updateProgress();
   } catch (IOException e) {
-throw new HiveException(e);
+exception = e;
+LOG.error("Error closing " + outWriters[idx].toString(), e);
+// continue closing others
   }
 }
   }
-  try {
+  for (int i = 0; i < updaters.length; i++) {
+if (updaters[i] != null) {
+  SerDeStats stats = updaters[i].getStats();
+  // Ignore 0 row files except in case of insert overwrite
+  if (isDirectInsert && (stats.getRowCount() > 0 || 
isInsertOverwrite)) {
+outPathsCommitted[i] = updaters[i].getUpdatedFilePath();
+  }
+  try {
+updaters[i].close(abort);
+  } catch (IOException e) {
+exception = e;
+LOG.error("Error closing " + updaters[i].toString(), e);
+// continue closing others
+  }
+}
+  }
+  // Made an attempt to close all writers.
+  if (exception != null) {
 for (int i = 0; i < updaters.length; i++) {
   if (updaters[i] != null) {
-SerDeStats stats = updaters[i].getStats();
-// Ignore 0 row files except in case of insert overwrite
-if (isDirectInsert && (stats.getRowCount() > 0 || 
isInsertOverwrite)) {
-  outPathsCommitted[i] = updaters[i].getUpdatedFilePath();
+try {
+  fs.delete(updaters[i].getUpdatedFilePath(), true);
+} catch (IOException e) {
+  e.printStackTrace();

Review comment:
   LOG?
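
   i.e., roughly this instead of printStackTrace (a sketch, assuming the 
class's slf4j LOG):

{code}
} catch (IOException e) {
  LOG.error("Error deleting {} on abort", updaters[i].getUpdatedFilePath(), e);
}
{code}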

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java
##
@@ -284,6 +285,11 @@ public Object process(Node nd, Stack stack, 
NodeProcessorCtx procCtx,
   // Create ReduceSink operator
   ReduceSinkOperator rsOp = getReduceSinkOp(partitionPositions, 
sortPositions, sortOrder, sortNullOrder,
   allRSCols, bucketColumns, numBuckets, fsParent, 
fsOp.getConf().getWriteType());
+  // we have to make sure not to reorder the child operators, as it might cause 
+  // weird behavior in the tasks at the same level. When there is auto stats 
+  // gather at the same level as another operation, it might cause unnecessary 
+  // preemption. Maintaining the order here avoids such preemption and possible errors

Review comment:
   Plz add TEZ-3296 as ref if possible.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482691)
Time Spent: 40m  (was: 0.5h)

> Fix preemption issues in reducers and file sink operators
> -
>
> Key: HIVE-24145
> URL: https://issues.apache.org/jira/browse/HIVE-24145
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> There are two issues because of preemption:
>  # Reducers are getting reordered as part of optimizations, because of which 
> more preemptions happen
>  # Preemption in the middle of writing can cause the file to not close and 
> lead to errors when we read the file later



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24138) Llap external client flow is broken due to netty shading

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24138?focusedWorklogId=482687=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482687
 ]

ASF GitHub Bot logged work on HIVE-24138:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:12
Start Date: 12/Sep/20 20:12
Worklog Time Spent: 10m 
  Work Description: ayushtkn opened a new pull request #1491:
URL: https://github.com/apache/hive/pull/1491


   https://issues.apache.org/jira/browse/HIVE-24138



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482687)
Time Spent: 20m  (was: 10m)

> Llap external client flow is broken due to netty shading
> 
>
> Key: HIVE-24138
> URL: https://issues.apache.org/jira/browse/HIVE-24138
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Shubham Chaurasia
>Assignee: Ayush Saxena
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We shaded netty in hive-exec in - 
> https://issues.apache.org/jira/browse/HIVE-23073
> This breaks LLAP external client flow on LLAP daemon side - 
> LLAP daemon stacktrace - 
> {code}
> 2020-09-09T18:22:13,413  INFO [TezTR-222977_4_0_0_0_0 
> (497418324441977_0004_0_00_00_0)] llap.LlapOutputFormat: Returning 
> writer for: attempt_497418324441977_0004_0_00_00_0
> 2020-09-09T18:22:13,419 ERROR [TezTR-222977_4_0_0_0_0 
> (497418324441977_0004_0_00_00_0)] tez.MapRecordSource: 
> java.lang.NoSuchMethodError: 
> org.apache.arrow.memory.BufferAllocator.buffer(I)Lorg/apache/hive/io/netty/buffer/ArrowBuf;
>   at 
> org.apache.hadoop.hive.llap.WritableByteChannelAdapter.write(WritableByteChannelAdapter.java:96)
>   at org.apache.arrow.vector.ipc.WriteChannel.write(WriteChannel.java:74)
>   at org.apache.arrow.vector.ipc.WriteChannel.write(WriteChannel.java:57)
>   at 
> org.apache.arrow.vector.ipc.WriteChannel.writeIntLittleEndian(WriteChannel.java:89)
>   at 
> org.apache.arrow.vector.ipc.message.MessageSerializer.serialize(MessageSerializer.java:88)
>   at 
> org.apache.arrow.vector.ipc.ArrowWriter.ensureStarted(ArrowWriter.java:130)
>   at 
> org.apache.arrow.vector.ipc.ArrowWriter.writeBatch(ArrowWriter.java:102)
>   at 
> org.apache.hadoop.hive.llap.LlapArrowRecordWriter.write(LlapArrowRecordWriter.java:85)
>   at 
> org.apache.hadoop.hive.llap.LlapArrowRecordWriter.write(LlapArrowRecordWriter.java:46)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.filesink.VectorFileSinkArrowOperator.process(VectorFileSinkArrowOperator.java:137)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:969)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:969)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:172)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.deliverVectorizedRowBatch(VectorMapOperator.java:809)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:842)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38)
> 

[jira] [Work logged] (HIVE-24150) Refactor CommitTxnRequest field order

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24150?focusedWorklogId=482672=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482672
 ]

ASF GitHub Bot logged work on HIVE-24150:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:11
Start Date: 12/Sep/20 20:11
Worklog Time Spent: 10m 
  Work Description: deniskuzZ opened a new pull request #1489:
URL: https://github.com/apache/hive/pull/1489


   
   
   ### What changes were proposed in this pull request?
   
   Refactor CommitTxnRequest field order (keyValue and exclWriteEnabled).
   
   ### Why are the changes needed?
   
   HIVE-24125 introduced a backward-incompatible change.
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482672)
Time Spent: 20m  (was: 10m)

> Refactor CommitTxnRequest field order
> -
>
> Key: HIVE-24150
> URL: https://issues.apache.org/jira/browse/HIVE-24150
> Project: Hive
>  Issue Type: Bug
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Refactor CommitTxnRequest field order (keyValue and exclWriteEnabled). 
> HIVE-24125 introduced a backward-incompatible change.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24147) Table column names are not extracted correctly in Hive JDBC storage handler

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24147?focusedWorklogId=482637=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482637
 ]

ASF GitHub Bot logged work on HIVE-24147:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:08
Start Date: 12/Sep/20 20:08
Worklog Time Spent: 10m 
  Work Description: jcamachor opened a new pull request #1486:
URL: https://github.com/apache/hive/pull/1486


   …BC storage handler
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482637)
Time Spent: 20m  (was: 10m)

> Table column names are not extracted correctly in Hive JDBC storage handler
> ---
>
> Key: HIVE-24147
> URL: https://issues.apache.org/jira/browse/HIVE-24147
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC storage handler
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> It seems the `ResultSetMetaData` for the query used to retrieve the table 
> column names contains fully qualified names, instead of possibly supporting 
> the {{getTableName}} method. This ends up throwing the storage handler off 
> and leading to exceptions, both in the CBO path and the non-CBO path.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24143) Include convention in JDBC converter operator in Calcite plan

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24143?focusedWorklogId=482621=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482621
 ]

ASF GitHub Bot logged work on HIVE-24143:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:07
Start Date: 12/Sep/20 20:07
Worklog Time Spent: 10m 
  Work Description: vineetgarg02 commented on pull request #1482:
URL: https://github.com/apache/hive/pull/1482#issuecomment-691326259


   +1, looks good to me



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482621)
Time Spent: 1.5h  (was: 1h 20m)

> Include convention in JDBC converter operator in Calcite plan
> -
>
> Key: HIVE-24143
> URL: https://issues.apache.org/jira/browse/HIVE-24143
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Among others, it will be useful to debug the dialect being chosen for query 
> generation. For instance:
> {code}
>  HiveProject(jdbc_type_conversion_table1.ikey=[$0], 
> jdbc_type_conversion_table1.bkey=[$1], jdbc_type_conversion_table1.fkey=[$2], 
> jdbc_type_conversion_table1.dkey=[$3], 
> jdbc_type_conversion_table1.chkey=[$4], 
> jdbc_type_conversion_table1.dekey=[$5], 
> jdbc_type_conversion_table1.dtkey=[$6], jdbc_type_conversion_table1.tkey=[$7])
>   HiveProject(ikey=[$0], bkey=[$1], fkey=[$2], dkey=[$3], chkey=[$4], 
> dekey=[$5], dtkey=[$6], tkey=[$7])
> ->HiveJdbcConverter(convention=[JDBC.DERBY])
>   JdbcHiveTableScan(table=[[default, jdbc_type_conversion_table1]], 
> table:alias=[jdbc_type_conversion_table1])
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24146) Cleanup TaskExecutionException in GenericUDTFExplode

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24146?focusedWorklogId=482615=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482615
 ]

ASF GitHub Bot logged work on HIVE-24146:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:06
Start Date: 12/Sep/20 20:06
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 opened a new pull request #1483:
URL: https://github.com/apache/hive/pull/1483


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482615)
Time Spent: 20m  (was: 10m)

> Cleanup TaskExecutionException in GenericUDTFExplode
> 
>
> Key: HIVE-24146
> URL: https://issues.apache.org/jira/browse/HIVE-24146
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> - Remove TaskExecutionException, which may not be used anymore;
> - Remove the default handling in GenericUDTFExplode#process, which has been 
> verified during function initialization.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24143) Include convention in JDBC converter operator in Calcite plan

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24143?focusedWorklogId=482607=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482607
 ]

ASF GitHub Bot logged work on HIVE-24143:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:06
Start Date: 12/Sep/20 20:06
Worklog Time Spent: 10m 
  Work Description: jcamachor merged pull request #1482:
URL: https://github.com/apache/hive/pull/1482


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482607)
Time Spent: 1h 20m  (was: 1h 10m)

> Include convention in JDBC converter operator in Calcite plan
> -
>
> Key: HIVE-24143
> URL: https://issues.apache.org/jira/browse/HIVE-24143
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Among others, it will be useful to debug the dialect being chosen for query 
> generation. For instance:
> {code}
>  HiveProject(jdbc_type_conversion_table1.ikey=[$0], 
> jdbc_type_conversion_table1.bkey=[$1], jdbc_type_conversion_table1.fkey=[$2], 
> jdbc_type_conversion_table1.dkey=[$3], 
> jdbc_type_conversion_table1.chkey=[$4], 
> jdbc_type_conversion_table1.dekey=[$5], 
> jdbc_type_conversion_table1.dtkey=[$6], jdbc_type_conversion_table1.tkey=[$7])
>   HiveProject(ikey=[$0], bkey=[$1], fkey=[$2], dkey=[$3], chkey=[$4], 
> dekey=[$5], dtkey=[$6], tkey=[$7])
> ->HiveJdbcConverter(convention=[JDBC.DERBY])
>   JdbcHiveTableScan(table=[[default, jdbc_type_conversion_table1]], 
> table:alias=[jdbc_type_conversion_table1])
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23841) Field writers is a HashSet, i.e., not thread-safe. Field writers is typically protected by synchronization on lock, but not in 1 location.

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23841?focusedWorklogId=482603=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482603
 ]

ASF GitHub Bot logged work on HIVE-23841:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:05
Start Date: 12/Sep/20 20:05
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1248:
URL: https://github.com/apache/hive/pull/1248#issuecomment-691367907


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482603)
Time Spent: 40m  (was: 0.5h)

> Field writers is a HashSet, i.e., not thread-safe.  Field writers is 
> typically protected by synchronization on lock, but not in 1 location.
> 
>
> Key: HIVE-23841
> URL: https://issues.apache.org/jira/browse/HIVE-23841
> Project: Hive
>  Issue Type: Bug
> Environment: Any environment
>Reporter: Adrian Nistor
>Priority: Major
>  Labels: patch-available, pull-request-available
> Attachments: HIVE-23841.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I also submitted a pull request on github at:
>  
> [https://github.com/apache/hive/pull/1248]
>  
> (same patch)
> h1. Description
>  
> Field {{writers}} is a {{HashSet}} ([line 
> 70|https://github.com/apache/hive/blob/c93d7797329103d6c509bada68b6da7f907b3dee/ql/src/java/org/apache/hadoop/hive/llap/LlapOutputFormatService.java#L70]),
>  i.e., not thread-safe.
> Accesses to field {{writers}} are protected by synchronization on {{lock}}, 
> e.g., at lines: 
> [141-144|https://github.com/apache/hive/blob/c93d7797329103d6c509bada68b6da7f907b3dee/ql/src/java/org/apache/hadoop/hive/llap/LlapOutputFormatService.java#L141-L144],
>  
> [212-213|https://github.com/apache/hive/blob/c93d7797329103d6c509bada68b6da7f907b3dee/ql/src/java/org/apache/hadoop/hive/llap/LlapOutputFormatService.java#L212-L213],
>  and 
> [212-215|https://github.com/apache/hive/blob/c93d7797329103d6c509bada68b6da7f907b3dee/ql/src/java/org/apache/hadoop/hive/llap/LlapOutputFormatService.java#L212-L215].
> However, the {{writers.remove()}} at [line 
> 249|https://github.com/apache/hive/blob/c93d7797329103d6c509bada68b6da7f907b3dee/ql/src/java/org/apache/hadoop/hive/llap/LlapOutputFormatService.java#L249]
>  is protected by synchronization on {{INSTANCE}}, *not* on {{lock}}.
> Synchronizing on 2 different objects does not ensure mutual exclusion. This 
> is because 2 threads synchronizing on different objects can still execute 
> at the same time.
> Note that lines 
> [215|https://github.com/apache/hive/blob/c93d7797329103d6c509bada68b6da7f907b3dee/ql/src/java/org/apache/hadoop/hive/llap/LlapOutputFormatService.java#L215]
>  and 
> [249|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/llap/LlapOutputFormatService.java#L249]
>  are modifying {{writers}} with {{put()}} and {{remove()}}, respectively.
> h1. The Code for This Fix
> This fix is very simple: just change {{synchronized (INSTANCE)}} to 
> {{synchronized (lock)}}, just like the methods containing the other lines 
> listed above.
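
A toy illustration of why two different monitors give no mutual exclusion 
(the 'lock' and 'INSTANCE' names come from the report; 'attemptId' is invented 
for illustration):

{code}
// Thread A: guarded by 'lock'
synchronized (lock) {
  writers.put(attemptId, writer);
}

// Thread B: guarded by 'INSTANCE' - a different monitor, so this block can
// run concurrently with thread A and corrupt the non-thread-safe collection.
synchronized (INSTANCE) {
  writers.remove(attemptId);
}
{code}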



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24151) MultiDelimitSerDe shifts data if strings contain non-ASCII characters

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24151?focusedWorklogId=482552=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482552
 ]

ASF GitHub Bot logged work on HIVE-24151:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 20:01
Start Date: 12/Sep/20 20:01
Worklog Time Spent: 10m 
  Work Description: szlta commented on pull request #1490:
URL: https://github.com/apache/hive/pull/1490#issuecomment-691122063


   Since this is a partial revert I'm placing the diff of LazyStruct.java for: 
before HIVE-22360 vs my current commit:
   
   szita@szita-MBP16:~/shadow/CDH/hive$ git diff 463dae9ee8f694002af492e7d05924423aeaed09:serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyStruct.java 5de36f990d89fcd5c3d7d2344a28e16e4c1f8c24:serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyStruct.java
   diff --git a/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyStruct.java b/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyStruct.java
   index f066aaa3bf5..66b15374dda 100644
   --- a/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyStruct.java
   +++ b/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyStruct.java
   @@ -22,6 +22,8 @@
    import java.util.List;
    
    import com.google.common.primitives.Bytes;
   +
   +import org.apache.hadoop.hive.serde2.MultiDelimitSerDe;
    import org.apache.hadoop.hive.serde2.SerDeException;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
   @@ -294,10 +296,10 @@ public void parseMultiDelimit(byte[] rawRow, byte[] fieldDelimit) {
        }
        // the indexes of the delimiters
        int[] delimitIndexes = findIndexes(rawRow, fieldDelimit);
   -    int diff = fieldDelimit.length - 1;
   +    int diff = fieldDelimit.length - MultiDelimitSerDe.REPLACEMENT_DELIM_LENGTH;
        // first field always starts from 0, even when missing
        startPosition[0] = 0;
   -    for (int i = 1; i < fields.length; i++) {
   +    for (int i = 1; i <= fields.length; i++) {
          if (delimitIndexes[i - 1] != -1) {
            int start = delimitIndexes[i - 1] + fieldDelimit.length;
            startPosition[i] = start - i * diff;
   @@ -305,7 +307,6 @@ public void parseMultiDelimit(byte[] rawRow, byte[] fieldDelimit) {
            startPosition[i] = length + 1;
          }
        }
   -    startPosition[fields.length] = length + 1;
        Arrays.fill(fieldInited, false);
        parsed = true;
      }
   @@ -315,7 +316,7 @@ public void parseMultiDelimit(byte[] rawRow, byte[] fieldDelimit) {
        if (fields.length <= 1) {
          return new int[0];
        }
   -    int[] indexes = new int[fields.length - 1];
   +    int[] indexes = new int[fields.length];
        Arrays.fill(indexes, -1);
        indexes[0] = Bytes.indexOf(array, target);
        if (indexes[0] == -1) {
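
For reference, a small walk-through of the index arithmetic the restored
parseMultiDelimit relies on, assuming REPLACEMENT_DELIM_LENGTH is 1 (i.e. the
multi-byte delimiter is collapsed to a single byte before lazy parsing); the
row and delimiter values are made up for the example:

{code}
import java.nio.charset.StandardCharsets;

public class MultiDelimIndexSketch {
  public static void main(String[] args) {
    byte[] fieldDelimit = "|+|".getBytes(StandardCharsets.UTF_8);
    int replacementDelimLength = 1;                            // assumed value
    int diff = fieldDelimit.length - replacementDelimLength;   // 3 - 1 = 2
    // In the raw row "ab|+|cd|+|ef" the delimiters start at byte offsets 2 and 7.
    int[] delimitIndexes = {2, 7};
    // Field i starts right after delimiter i-1, shifted left by i * diff
    // because each 3-byte delimiter was replaced by a single byte.
    for (int i = 1; i <= 2; i++) {
      int start = delimitIndexes[i - 1] + fieldDelimit.length;
      System.out.println("field " + i + " starts at byte " + (start - i * diff));
    }
    // Prints 3 and 6: the field offsets in the collapsed row "ab^Acd^Aef".
  }
}
{code}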
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482552)
Time Spent: 40m  (was: 0.5h)

> MultiDelimitSerDe shifts data if strings contain non-ASCII characters
> -
>
> Key: HIVE-24151
> URL: https://issues.apache.org/jira/browse/HIVE-24151
> Project: Hive
>  Issue Type: Bug
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> HIVE-22360 intended to fix another MultiDelimitSerDe problem (with NULL last 
> columns) but introduced a regression: the approach of that fix was wrong, as 
> the existing logic that operated on bytes was replaced by regex matcher 
> logic that deals in character positions rather than byte positions. Since 
> some non-ASCII characters consist of more than one byte, the whole record 
> may get shifted as a result.
> With this ticket I'm going to restore the old logic and apply the proper fix 
> on top of it, while keeping (and extending) the test cases added with 
> HIVE-22360, so that we have a solution for both issues.
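
A tiny, self-contained illustration of the character-position vs.
byte-position mismatch described above (the sample string and delimiter byte
are made up):

{code}
import java.nio.charset.StandardCharsets;

public class ByteVsCharPosition {
  public static void main(String[] args) {
    // "é" is one char but two bytes in UTF-8, so indexes diverge after it.
    String row = "h\u00e9llo\u0002world";       // \u0002 stands in for a delimiter
    byte[] bytes = row.getBytes(StandardCharsets.UTF_8);

    int charIndex = row.indexOf('\u0002');      // 5, counted in characters
    int byteIndex = -1;
    for (int i = 0; i < bytes.length; i++) {
      if (bytes[i] == 0x02) { byteIndex = i; break; }
    }
    // Prints "char 5 vs byte 6": parsing with char positions shifts the record.
    System.out.println("char " + charIndex + " vs byte " + byteIndex);
  }
}
{code}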



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24138) Llap external client flow is broken due to netty shading

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24138?focusedWorklogId=482488&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482488
 ]

ASF GitHub Bot logged work on HIVE-24138:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 15:00
Start Date: 12/Sep/20 15:00
Worklog Time Spent: 10m 
  Work Description: ayushtkn opened a new pull request #1491:
URL: https://github.com/apache/hive/pull/1491


   https://issues.apache.org/jira/browse/HIVE-24138



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482488)
Remaining Estimate: 0h
Time Spent: 10m

> Llap external client flow is broken due to netty shading
> 
>
> Key: HIVE-24138
> URL: https://issues.apache.org/jira/browse/HIVE-24138
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Shubham Chaurasia
>Assignee: Ayush Saxena
>Priority: Critical
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We shaded netty in hive-exec in - 
> https://issues.apache.org/jira/browse/HIVE-23073
> This breaks LLAP external client flow on LLAP daemon side - 
> LLAP daemon stacktrace - 
> {code}
> 2020-09-09T18:22:13,413  INFO [TezTR-222977_4_0_0_0_0 
> (497418324441977_0004_0_00_00_0)] llap.LlapOutputFormat: Returning 
> writer for: attempt_497418324441977_0004_0_00_00_0
> 2020-09-09T18:22:13,419 ERROR [TezTR-222977_4_0_0_0_0 
> (497418324441977_0004_0_00_00_0)] tez.MapRecordSource: 
> java.lang.NoSuchMethodError: 
> org.apache.arrow.memory.BufferAllocator.buffer(I)Lorg/apache/hive/io/netty/buffer/ArrowBuf;
>   at 
> org.apache.hadoop.hive.llap.WritableByteChannelAdapter.write(WritableByteChannelAdapter.java:96)
>   at org.apache.arrow.vector.ipc.WriteChannel.write(WriteChannel.java:74)
>   at org.apache.arrow.vector.ipc.WriteChannel.write(WriteChannel.java:57)
>   at 
> org.apache.arrow.vector.ipc.WriteChannel.writeIntLittleEndian(WriteChannel.java:89)
>   at 
> org.apache.arrow.vector.ipc.message.MessageSerializer.serialize(MessageSerializer.java:88)
>   at 
> org.apache.arrow.vector.ipc.ArrowWriter.ensureStarted(ArrowWriter.java:130)
>   at 
> org.apache.arrow.vector.ipc.ArrowWriter.writeBatch(ArrowWriter.java:102)
>   at 
> org.apache.hadoop.hive.llap.LlapArrowRecordWriter.write(LlapArrowRecordWriter.java:85)
>   at 
> org.apache.hadoop.hive.llap.LlapArrowRecordWriter.write(LlapArrowRecordWriter.java:46)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.filesink.VectorFileSinkArrowOperator.process(VectorFileSinkArrowOperator.java:137)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:969)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:969)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:172)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.deliverVectorizedRowBatch(VectorMapOperator.java:809)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:842)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38)
>   at 

[jira] [Updated] (HIVE-24138) Llap external client flow is broken due to netty shading

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24138:
--
Labels: pull-request-available  (was: )

> Llap external client flow is broken due to netty shading
> 
>
> Key: HIVE-24138
> URL: https://issues.apache.org/jira/browse/HIVE-24138
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Shubham Chaurasia
>Assignee: Ayush Saxena
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We shaded netty in hive-exec in - 
> https://issues.apache.org/jira/browse/HIVE-23073
> This breaks LLAP external client flow on LLAP daemon side - 
> LLAP daemon stacktrace - 
> {code}
> 2020-09-09T18:22:13,413  INFO [TezTR-222977_4_0_0_0_0 
> (497418324441977_0004_0_00_00_0)] llap.LlapOutputFormat: Returning 
> writer for: attempt_497418324441977_0004_0_00_00_0
> 2020-09-09T18:22:13,419 ERROR [TezTR-222977_4_0_0_0_0 
> (497418324441977_0004_0_00_00_0)] tez.MapRecordSource: 
> java.lang.NoSuchMethodError: 
> org.apache.arrow.memory.BufferAllocator.buffer(I)Lorg/apache/hive/io/netty/buffer/ArrowBuf;
>   at 
> org.apache.hadoop.hive.llap.WritableByteChannelAdapter.write(WritableByteChannelAdapter.java:96)
>   at org.apache.arrow.vector.ipc.WriteChannel.write(WriteChannel.java:74)
>   at org.apache.arrow.vector.ipc.WriteChannel.write(WriteChannel.java:57)
>   at 
> org.apache.arrow.vector.ipc.WriteChannel.writeIntLittleEndian(WriteChannel.java:89)
>   at 
> org.apache.arrow.vector.ipc.message.MessageSerializer.serialize(MessageSerializer.java:88)
>   at 
> org.apache.arrow.vector.ipc.ArrowWriter.ensureStarted(ArrowWriter.java:130)
>   at 
> org.apache.arrow.vector.ipc.ArrowWriter.writeBatch(ArrowWriter.java:102)
>   at 
> org.apache.hadoop.hive.llap.LlapArrowRecordWriter.write(LlapArrowRecordWriter.java:85)
>   at 
> org.apache.hadoop.hive.llap.LlapArrowRecordWriter.write(LlapArrowRecordWriter.java:46)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.filesink.VectorFileSinkArrowOperator.process(VectorFileSinkArrowOperator.java:137)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:969)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:969)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:172)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.deliverVectorizedRowBatch(VectorMapOperator.java:809)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:842)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
> The Arrow method signature mismatch happens mainly because arrow ships some 
> classes that are packaged under {{io.netty.buffer.*}} - 
> {code}
> io.netty.buffer.ArrowBuf
> io.netty.buffer.ExpandableByteBuf
> 
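
A hedged diagnostic sketch (not part of the patch) that makes the mismatch
visible: it prints the return type that BufferAllocator.buffer(int) actually
declares on the runtime classpath, which no longer matches the relocated
descriptor the shaded caller was compiled against:

{code}
import java.lang.reflect.Method;

public class ShadedDescriptorCheck {
  public static void main(String[] args) throws Exception {
    // Resolve the Arrow allocator actually on the classpath and inspect
    // which ArrowBuf type its buffer(int) method declares.
    Class<?> alloc = Class.forName("org.apache.arrow.memory.BufferAllocator");
    Method buffer = alloc.getMethod("buffer", int.class);
    System.out.println(buffer.getReturnType().getName());
    // Stock Arrow prints io.netty.buffer.ArrowBuf, while the shaded hive-exec
    // caller expects org.apache.hive.io.netty.buffer.ArrowBuf - hence the
    // NoSuchMethodError in the stack trace above.
  }
}
{code}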

[jira] [Work logged] (HIVE-22782) Consolidate metastore call to fetch constraints

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22782?focusedWorklogId=482447&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482447
 ]

ASF GitHub Bot logged work on HIVE-22782:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 07:57
Start Date: 12/Sep/20 07:57
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on a change in pull 
request #1419:
URL: https://github.com/apache/hive/pull/1419#discussion_r487375471



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java
##########
@@ -1176,145 +1166,101 @@ public void setStatsStateLikeNewTable() {
    *  Note that set apis are used by DESCRIBE only, although get apis return RELY or ENABLE
    *  constraints DESCRIBE could set all type of constraints
    * */
-
-  /* This only return PK which are created with RELY */
-  public PrimaryKeyInfo getPrimaryKeyInfo() {
-    if(!this.isPKFetched) {
+  public TableConstraintsInfo getTableConstraintsInfo() {
+    if (!this.isTableConstraintsFetched) {
       try {
-        pki = Hive.get().getReliablePrimaryKeys(this.getDbName(), this.getTableName());
-        this.isPKFetched = true;
+        tableConstraintsInfo = Hive.get().getReliableAndEnableTableConstraints(this.getDbName(), this.getTableName());
+        this.isTableConstraintsFetched = true;
       } catch (HiveException e) {
-        LOG.warn("Cannot retrieve PK info for table : " + this.getTableName()
-            + " ignoring exception: " + e);
+        LOG.warn(
+            "Cannot retrieve table constraints info for table : " + this.getTableName() + " ignoring exception: " + e);
       }
     }
-    return pki;
+    return tableConstraintsInfo;
   }
 
-  public void setPrimaryKeyInfo(PrimaryKeyInfo pki) {
-    this.pki = pki;
-    this.isPKFetched = true;
+  /**
+   * TableConstraintsInfo setter
+   * @param tableConstraintsInfo
+   */
+  public void setTableConstraintsInfo(TableConstraintsInfo tableConstraintsInfo) {
+    this.tableConstraintsInfo = tableConstraintsInfo;
+    this.isTableConstraintsFetched = true;
   }
 
-  /* This only return FK constraints which are created with RELY */
-  public ForeignKeyInfo getForeignKeyInfo() {
-    if(!isFKFetched) {
-      try {
-        fki = Hive.get().getReliableForeignKeys(this.getDbName(), this.getTableName());
-        this.isFKFetched = true;
-      } catch (HiveException e) {
-        LOG.warn(
-            "Cannot retrieve FK info for table : " + this.getTableName()
-            + " ignoring exception: " + e);
-      }
+  /**
+   * This only return PK which are created with RELY
+   * @return primary key constraint list
+   */
+  public PrimaryKeyInfo getPrimaryKeyInfo() {
+    if (!this.isTableConstraintsFetched) {
+      getTableConstraintsInfo();
     }
-    return fki;
+    return tableConstraintsInfo.getPrimaryKeyInfo();
   }
 
-  public void setForeignKeyInfo(ForeignKeyInfo fki) {
-    this.fki = fki;
-    this.isFKFetched = true;
+  /**
+   * This only return FK constraints which are created with RELY
+   * @return foreign key constraint list
+   */
+  public ForeignKeyInfo getForeignKeyInfo() {
+    if (!isTableConstraintsFetched) {
+      getTableConstraintsInfo();
+    }
+    return tableConstraintsInfo.getForeignKeyInfo();
   }
 
-  /* This only return UNIQUE constraint defined with RELY */
+  /**
+   * This only return UNIQUE constraint defined with RELY
+   * @return unique constraint list
+   */
   public UniqueConstraint getUniqueKeyInfo() {
-    if(!isUniqueFetched) {
-      try {
-        uki = Hive.get().getReliableUniqueConstraints(this.getDbName(), this.getTableName());
-        this.isUniqueFetched = true;
-      } catch (HiveException e) {
-        LOG.warn(
-            "Cannot retrieve Unique Key info for table : " + this.getTableName()
-            + " ignoring exception: " + e);
-      }
+    if (!isTableConstraintsFetched) {
+      getTableConstraintsInfo();
     }
-    return uki;
-  }
-
-  public void setUniqueKeyInfo(UniqueConstraint uki) {
-    this.uki = uki;
-    this.isUniqueFetched = true;
+    return tableConstraintsInfo.getUniqueConstraint();
   }
 
-  /* This only return NOT NULL constraint defined with RELY */
+  /**
+   * This only return NOT NULL constraint defined with RELY
+   * @return not null constraint list
+   */
   public NotNullConstraint getNotNullConstraint() {
-    if(!isNotNullFetched) {
-      try {
-        nnc = Hive.get().getReliableNotNullConstraints(this.getDbName(), this.getTableName());
-        this.isNotNullFetched = true;
-      } catch (HiveException e) {
-        LOG.warn("Cannot retrieve Not Null constraint info for table : "
-            + this.getTableName() + " ignoring exception: " + e);
-      }
+    if (!isTableConstraintsFetched) {
+      getTableConstraintsInfo();
     }
-    return nnc;
-  }
-
-  public void 

[jira] [Work logged] (HIVE-22782) Consolidate metastore call to fetch constraints

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22782?focusedWorklogId=482446&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482446
 ]

ASF GitHub Bot logged work on HIVE-22782:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 07:56
Start Date: 12/Sep/20 07:56
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on a change in pull 
request #1419:
URL: https://github.com/apache/hive/pull/1419#discussion_r487382720



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java
##########
@@ -116,22 +116,12 @@
   private transient Boolean outdatedForRewritingMaterializedView;
 
   /** Constraint related objects */
-  private transient PrimaryKeyInfo pki;
-  private transient ForeignKeyInfo fki;
-  private transient UniqueConstraint uki;
-  private transient NotNullConstraint nnc;
-  private transient DefaultConstraint dc;
-  private transient CheckConstraint cc;
+  private transient TableConstraintsInfo tableConstraintsInfo;
 
   /** Constraint related flags
    *  This is to track if constraints are retrieved from metastore or not
    */
-  private transient boolean isPKFetched=false;
-  private transient boolean isFKFetched=false;
-  private transient boolean isUniqueFetched=false;
-  private transient boolean isNotNullFetched=false;
-  private transient boolean isDefaultFetched=false;
-  private transient boolean isCheckFetched=false;
+  private transient boolean isTableConstraintsFetched=false;

Review comment:
   Since we have the wrapper in place, we no longer need an extra flag to 
track whether the object has been fetched; we can simply check whether the 
wrapper is null.
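
A sketch of that suggestion (a hypothetical simplification, not the committed
code), where a null wrapper doubles as the "not fetched yet" signal:

{code}
// Hypothetical: drop the boolean flag and treat a null wrapper as unfetched.
public TableConstraintsInfo getTableConstraintsInfo() {
  if (tableConstraintsInfo == null) {
    tableConstraintsInfo = fetchConstraintsFromMetastore(); // assumed helper
  }
  return tableConstraintsInfo;
}
{code}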





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482446)
Time Spent: 4.5h  (was: 4h 20m)

> Consolidate metastore call to fetch constraints
> ---
>
> Key: HIVE-22782
> URL: https://issues.apache.org/jira/browse/HIVE-22782
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Currently separate calls are made to the metastore to fetch constraints like 
> PK, FK, not null, etc. Since the planner always retrieves these constraints, 
> we should retrieve all of them in one call.
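
A minimal sketch of the consolidated call from the client side, assuming the
request/response types shown in the review diffs in this thread
(AllTableConstraintsRequest, SQLAllTableConstraints); the exact constructor
and signatures may differ in the final patch:

{code}
import org.apache.hadoop.hive.metastore.IMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.AllTableConstraintsRequest;
import org.apache.hadoop.hive.metastore.api.SQLAllTableConstraints;
import org.apache.thrift.TException;

public class ConstraintFetchSketch {
  // One RPC fetches what previously took six calls (PK, FK, unique,
  // not-null, default, and check constraints).
  static SQLAllTableConstraints fetchAll(IMetaStoreClient client,
      String dbName, String tableName) throws TException {
    AllTableConstraintsRequest req =
        new AllTableConstraintsRequest(dbName, tableName); // assumed ctor
    return client.getAllTableConstraints(req);
  }
}
{code}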



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22782) Consolidate metastore call to fetch constraints

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22782?focusedWorklogId=482443&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482443
 ]

ASF GitHub Bot logged work on HIVE-22782:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 07:54
Start Date: 12/Sep/20 07:54
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on a change in pull 
request #1419:
URL: https://github.com/apache/hive/pull/1419#discussion_r487382545



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java
##########
@@ -116,22 +116,12 @@
   private transient Boolean outdatedForRewritingMaterializedView;
 
   /** Constraint related objects */
-  private transient PrimaryKeyInfo pki;
-  private transient ForeignKeyInfo fki;
-  private transient UniqueConstraint uki;
-  private transient NotNullConstraint nnc;
-  private transient DefaultConstraint dc;
-  private transient CheckConstraint cc;
+  private transient TableConstraintsInfo tableConstraintsInfo;
 
   /** Constraint related flags
    *  This is to track if constraints are retrieved from metastore or not
    */
-  private transient boolean isPKFetched=false;
-  private transient boolean isFKFetched=false;
-  private transient boolean isUniqueFetched=false;
-  private transient boolean isNotNullFetched=false;
-  private transient boolean isDefaultFetched=false;
-  private transient boolean isCheckFetched=false;
+  private transient boolean isTableConstraintsFetched=false;

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482443)
Time Spent: 4h 10m  (was: 4h)

> Consolidate metastore call to fetch constraints
> ---
>
> Key: HIVE-22782
> URL: https://issues.apache.org/jira/browse/HIVE-22782
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Currently separate calls are made to the metastore to fetch constraints like 
> PK, FK, not null, etc. Since the planner always retrieves these constraints, 
> we should retrieve all of them in one call.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22782) Consolidate metastore call to fetch constraints

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22782?focusedWorklogId=482445&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482445
 ]

ASF GitHub Bot logged work on HIVE-22782:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 07:54
Start Date: 12/Sep/20 07:54
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on a change in pull 
request #1419:
URL: https://github.com/apache/hive/pull/1419#discussion_r487382568



##
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java
##
@@ -1176,145 +1166,101 @@ public void setStatsStateLikeNewTable() {
*  Note that set apis are used by DESCRIBE only, although get apis return 
RELY or ENABLE
*  constraints DESCRIBE could set all type of constraints
* */
-
-  /* This only return PK which are created with RELY */
-  public PrimaryKeyInfo getPrimaryKeyInfo() {
-if(!this.isPKFetched) {
+  public TableConstraintsInfo getTableConstraintsInfo() {
+if (!this.isTableConstraintsFetched) {
   try {
-pki = Hive.get().getReliablePrimaryKeys(this.getDbName(), 
this.getTableName());
-this.isPKFetched = true;
+tableConstraintsInfo = 
Hive.get().getReliableAndEnableTableConstraints(this.getDbName(), 
this.getTableName());
+this.isTableConstraintsFetched = true;
   } catch (HiveException e) {
-LOG.warn("Cannot retrieve PK info for table : " + this.getTableName()
-+ " ignoring exception: " + e);
+LOG.warn(
+"Cannot retrieve table constraints info for table : " + 
this.getTableName() + " ignoring exception: " + e);

Review comment:
   replaced with complete name





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482445)
Time Spent: 4h 20m  (was: 4h 10m)

> Consolidate metastore call to fetch constraints
> ---
>
> Key: HIVE-22782
> URL: https://issues.apache.org/jira/browse/HIVE-22782
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Currently separate calls are made to the metastore to fetch constraints like 
> PK, FK, not null, etc. Since the planner always retrieves these constraints, 
> we should retrieve all of them in one call.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22782) Consolidate metastore call to fetch constraints

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22782?focusedWorklogId=482435&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482435
 ]

ASF GitHub Bot logged work on HIVE-22782:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 06:27
Start Date: 12/Sep/20 06:27
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on a change in pull 
request #1419:
URL: https://github.com/apache/hive/pull/1419#discussion_r487375675



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
##########
@@ -9200,6 +9200,31 @@ public CheckConstraintsResponse get_check_constraints(CheckConstraintsRequest re
       return new CheckConstraintsResponse(ret);
     }
 
+    /**
+     * Api to fetch all table constraints at once
+     * @param request it consist of catalog name, database name and table name to identify the table in metastore
+     * @return all cnstraint attached to given table
+     * @throws TException
+     */
+    @Override
+    public AllTableConstraintsResponse get_all_table_constraints(AllTableConstraintsRequest request) throws TException {
+      String catName = request.isSetCatName() ? request.getCatName() : getDefaultCatalog(conf);
+      String dbName = request.getDbName();
+      String tblName = request.getTblName();
+      startTableFunction("get_all_table_constraints", catName, dbName, tblName);
+      SQLAllTableConstraints ret = null;
+      Exception ex = null;
+      try {
+        ret = getMS().getAllTableConstraints(catName,dbName,tblName);

Review comment:
   fixed





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482435)
Time Spent: 3h 50m  (was: 3h 40m)

> Consolidate metastore call to fetch constraints
> ---
>
> Key: HIVE-22782
> URL: https://issues.apache.org/jira/browse/HIVE-22782
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Currently separate calls are made to the metastore to fetch constraints like 
> PK, FK, not null, etc. Since the planner always retrieves these constraints, 
> we should retrieve all of them in one call.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22782) Consolidate metastore call to fetch constraints

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22782?focusedWorklogId=482437&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482437
 ]

ASF GitHub Bot logged work on HIVE-22782:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 06:27
Start Date: 12/Sep/20 06:27
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on a change in pull 
request #1419:
URL: https://github.com/apache/hive/pull/1419#discussion_r487375692



##########
File path: standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java
##########
@@ -3631,6 +3631,17 @@ boolean cacheFileMetadata(String dbName, String tableName, String partName,
   List<SQLCheckConstraint> getCheckConstraints(CheckConstraintsRequest request) throws MetaException,
       NoSuchObjectException, TException;
 
+  /**
+   * Get all constraints of given table
+   * @param request Request info
+   * @return all constraints of this table
+   * @throws MetaException
+   * @throws NoSuchObjectException
+   * @throws TException
+   */
+  SQLAllTableConstraints getAllTableConstraints(AllTableConstraintsRequest request)

Review comment:
   removed





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482437)
Time Spent: 4h  (was: 3h 50m)

> Consolidate metastore call to fetch constraints
> ---
>
> Key: HIVE-22782
> URL: https://issues.apache.org/jira/browse/HIVE-22782
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> Currently separate calls are made to the metastore to fetch constraints like 
> PK, FK, not null, etc. Since the planner always retrieves these constraints, 
> we should retrieve all of them in one call.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22782) Consolidate metastore call to fetch constraints

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22782?focusedWorklogId=482434&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482434
 ]

ASF GitHub Bot logged work on HIVE-22782:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 06:26
Start Date: 12/Sep/20 06:26
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on a change in pull 
request #1419:
URL: https://github.com/apache/hive/pull/1419#discussion_r487375660



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/TableConstraintsInfo.java
##########
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.metadata;
+
+public class TableConstraintsInfo {
+  PrimaryKeyInfo primaryKeyInfo;
+  ForeignKeyInfo foreignKeyInfo;
+  UniqueConstraint uniqueConstraint;
+  DefaultConstraint defaultConstraint;
+  CheckConstraint checkConstraint;
+  NotNullConstraint notNullConstraint;
+
+  public TableConstraintsInfo() {
+  }
+
+  public TableConstraintsInfo(PrimaryKeyInfo primaryKeyInfo, ForeignKeyInfo foreignKeyInfo,
+      UniqueConstraint uniqueConstraint, DefaultConstraint defaultConstraint, CheckConstraint checkConstraint,
+      NotNullConstraint notNullConstraint) {
+    this.primaryKeyInfo = primaryKeyInfo;
+    this.foreignKeyInfo = foreignKeyInfo;
+    this.uniqueConstraint = uniqueConstraint;
+    this.defaultConstraint = defaultConstraint;
+    this.checkConstraint = checkConstraint;
+    this.notNullConstraint = notNullConstraint;
+  }
+
+  public PrimaryKeyInfo getPrimaryKeyInfo() {
+    return primaryKeyInfo;
+  }
+
+  public void setPrimaryKeyInfo(PrimaryKeyInfo primaryKeyInfo) {
+    this.primaryKeyInfo = primaryKeyInfo;
+  }
+
+  public ForeignKeyInfo getForeignKeyInfo() {
+    return foreignKeyInfo;
+  }
+
+  public void setForeignKeyInfo(ForeignKeyInfo foreignKeyInfo) {
+    this.foreignKeyInfo = foreignKeyInfo;
+  }
+
+  public UniqueConstraint getUniqueConstraint() {
+    return uniqueConstraint;
+  }
+
+  public void setUniqueConstraint(UniqueConstraint uniqueConstraint) {
+    this.uniqueConstraint = uniqueConstraint;
+  }
+
+  public DefaultConstraint getDefaultConstraint() {
+    return defaultConstraint;
+  }
+
+  public void setDefaultConstraint(DefaultConstraint defaultConstraint) {
+    this.defaultConstraint = defaultConstraint;
+  }
+
+  public CheckConstraint getCheckConstraint() {
+    return checkConstraint;
+  }
+
+  public void setCheckConstraint(CheckConstraint checkConstraint) {
+    this.checkConstraint = checkConstraint;
+  }
+
+  public NotNullConstraint getNotNullConstraint() {
+    return notNullConstraint;
+  }
+
+  public void setNotNullConstraint(NotNullConstraint notNullConstraint) {
+    this.notNullConstraint = notNullConstraint;
+  }
+
+  public static boolean isTableConstraintsInfoNotEmpty(TableConstraintsInfo info) {

Review comment:
   I was trying to maintain code uniformity, but since we have a wrapper 
class in place we can remove all the static methods and follow the OO 
paradigm.
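
An instance-method version of that check, as the comment suggests
(hypothetical; whether an "absent" constraint is a null member or an empty
list depends on the final patch):

{code}
// Hypothetical replacement for the static helper on TableConstraintsInfo;
// assumes absent constraints are represented as null members.
public boolean isNotEmpty() {
  return primaryKeyInfo != null || foreignKeyInfo != null
      || uniqueConstraint != null || notNullConstraint != null
      || defaultConstraint != null || checkConstraint != null;
}
{code}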





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482434)
Time Spent: 3h 40m  (was: 3.5h)

> Consolidate metastore call to fetch constraints
> ---
>
> Key: HIVE-22782
> URL: https://issues.apache.org/jira/browse/HIVE-22782
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Currently separate calls are made to the metastore to fetch constraints like 
> PK, FK, not null, etc. Since the planner always retrieves 

[jira] [Work logged] (HIVE-22782) Consolidate metastore call to fetch constraints

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22782?focusedWorklogId=482431&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482431
 ]

ASF GitHub Bot logged work on HIVE-22782:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 06:24
Start Date: 12/Sep/20 06:24
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on a change in pull 
request #1419:
URL: https://github.com/apache/hive/pull/1419#discussion_r487375420



##########
File path: standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift
##########
@@ -720,6 +729,15 @@ struct CheckConstraintsResponse {
   1: required list<SQLCheckConstraint> checkConstraints
 }
 
+struct AllTableConstraintsRequest {
+  1: required string db_name,

Review comment:
   Set all variables to camel case.

##########
File path: standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
##########
@@ -2811,6 +2811,26 @@ public GetFieldsResponse getFieldsRequest(GetFieldsRequest req)
     return client.get_check_constraints(req).getCheckConstraints();
   }
 
+  @Override
+  public SQLAllTableConstraints getAllTableConstraints(AllTableConstraintsRequest req)
+      throws MetaException, NoSuchObjectException, TException {
+    long t1 = System.currentTimeMillis();
+
+    try {
+      if (!req.isSetCatName()) {

Review comment:
   removed





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482431)
Time Spent: 3h 10m  (was: 3h)

> Consolidate metastore call to fetch constraints
> ---
>
> Key: HIVE-22782
> URL: https://issues.apache.org/jira/browse/HIVE-22782
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Currently separate calls are made to the metastore to fetch constraints like 
> PK, FK, not null, etc. Since the planner always retrieves these constraints, 
> we should retrieve all of them in one call.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22782) Consolidate metastore call to fetch constraints

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22782?focusedWorklogId=482432&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482432
 ]

ASF GitHub Bot logged work on HIVE-22782:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 06:24
Start Date: 12/Sep/20 06:24
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on a change in pull 
request #1419:
URL: https://github.com/apache/hive/pull/1419#discussion_r487375443



##########
File path: standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift
##########
@@ -122,6 +122,15 @@ struct SQLCheckConstraint {
   9: bool rely_cstr  // Rely/No Rely
 }
 
+struct SQLAllTableConstraints {
+  1: optional list<SQLPrimaryKey> primaryKeys,
+  2: optional list<SQLForeignKey> foreignKeys,
+  3: optional list<SQLUniqueConstraint> uniqueConstraints,
+  4: optional list<SQLNotNullConstraint> notNullConstraints,
+  5: optional list<SQLDefaultConstraint> defaultConstraints,
+  6: optional list<SQLCheckConstraint> checkConstraints,

Review comment:
   removed





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482432)
Time Spent: 3h 20m  (was: 3h 10m)

> Consolidate metastore call to fetch constraints
> ---
>
> Key: HIVE-22782
> URL: https://issues.apache.org/jira/browse/HIVE-22782
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Currently separate calls are made to the metastore to fetch constraints like 
> PK, FK, not null, etc. Since the planner always retrieves these constraints, 
> we should retrieve all of them in one call.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22782) Consolidate metastore call to fetch constraints

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22782?focusedWorklogId=482433&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482433
 ]

ASF GitHub Bot logged work on HIVE-22782:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 06:24
Start Date: 12/Sep/20 06:24
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on a change in pull 
request #1419:
URL: https://github.com/apache/hive/pull/1419#discussion_r487375471



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java
##########
@@ -1176,145 +1166,101 @@ public void setStatsStateLikeNewTable() {
    *  Note that set apis are used by DESCRIBE only, although get apis return RELY or ENABLE
    *  constraints DESCRIBE could set all type of constraints
    * */
-
-  /* This only return PK which are created with RELY */
-  public PrimaryKeyInfo getPrimaryKeyInfo() {
-    if(!this.isPKFetched) {
+  public TableConstraintsInfo getTableConstraintsInfo() {
+    if (!this.isTableConstraintsFetched) {
       try {
-        pki = Hive.get().getReliablePrimaryKeys(this.getDbName(), this.getTableName());
-        this.isPKFetched = true;
+        tableConstraintsInfo = Hive.get().getReliableAndEnableTableConstraints(this.getDbName(), this.getTableName());
+        this.isTableConstraintsFetched = true;
       } catch (HiveException e) {
-        LOG.warn("Cannot retrieve PK info for table : " + this.getTableName()
-            + " ignoring exception: " + e);
+        LOG.warn(
+            "Cannot retrieve table constraints info for table : " + this.getTableName() + " ignoring exception: " + e);
       }
     }
-    return pki;
+    return tableConstraintsInfo;
   }
 
-  public void setPrimaryKeyInfo(PrimaryKeyInfo pki) {
-    this.pki = pki;
-    this.isPKFetched = true;
+  /**
+   * TableConstraintsInfo setter
+   * @param tableConstraintsInfo
+   */
+  public void setTableConstraintsInfo(TableConstraintsInfo tableConstraintsInfo) {
+    this.tableConstraintsInfo = tableConstraintsInfo;
+    this.isTableConstraintsFetched = true;
  }
 
-  /* This only return FK constraints which are created with RELY */
-  public ForeignKeyInfo getForeignKeyInfo() {
-    if(!isFKFetched) {
-      try {
-        fki = Hive.get().getReliableForeignKeys(this.getDbName(), this.getTableName());
-        this.isFKFetched = true;
-      } catch (HiveException e) {
-        LOG.warn(
-            "Cannot retrieve FK info for table : " + this.getTableName()
-            + " ignoring exception: " + e);
-      }
+  /**
+   * This only return PK which are created with RELY
+   * @return primary key constraint list
+   */
+  public PrimaryKeyInfo getPrimaryKeyInfo() {
+    if (!this.isTableConstraintsFetched) {
+      getTableConstraintsInfo();
     }
-    return fki;
+    return tableConstraintsInfo.getPrimaryKeyInfo();
   }
 
-  public void setForeignKeyInfo(ForeignKeyInfo fki) {
-    this.fki = fki;
-    this.isFKFetched = true;
+  /**
+   * This only return FK constraints which are created with RELY
+   * @return foreign key constraint list
+   */
+  public ForeignKeyInfo getForeignKeyInfo() {
+    if (!isTableConstraintsFetched) {
+      getTableConstraintsInfo();
+    }
+    return tableConstraintsInfo.getForeignKeyInfo();
   }
 
-  /* This only return UNIQUE constraint defined with RELY */
+  /**
+   * This only return UNIQUE constraint defined with RELY
+   * @return unique constraint list
+   */
   public UniqueConstraint getUniqueKeyInfo() {
-    if(!isUniqueFetched) {
-      try {
-        uki = Hive.get().getReliableUniqueConstraints(this.getDbName(), this.getTableName());
-        this.isUniqueFetched = true;
-      } catch (HiveException e) {
-        LOG.warn(
-            "Cannot retrieve Unique Key info for table : " + this.getTableName()
-            + " ignoring exception: " + e);
-      }
+    if (!isTableConstraintsFetched) {
+      getTableConstraintsInfo();
    }
-    return uki;
-  }
-
-  public void setUniqueKeyInfo(UniqueConstraint uki) {
-    this.uki = uki;
-    this.isUniqueFetched = true;
+    return tableConstraintsInfo.getUniqueConstraint();
   }
 
-  /* This only return NOT NULL constraint defined with RELY */
+  /**
+   * This only return NOT NULL constraint defined with RELY
+   * @return not null constraint list
+   */
   public NotNullConstraint getNotNullConstraint() {
-    if(!isNotNullFetched) {
-      try {
-        nnc = Hive.get().getReliableNotNullConstraints(this.getDbName(), this.getTableName());
-        this.isNotNullFetched = true;
-      } catch (HiveException e) {
-        LOG.warn("Cannot retrieve Not Null constraint info for table : "
-            + this.getTableName() + " ignoring exception: " + e);
-      }
+    if (!isTableConstraintsFetched) {
+      getTableConstraintsInfo();
    }
-    return nnc;
-  }
-
-  public void 

[jira] [Work logged] (HIVE-22782) Consolidate metastore call to fetch constraints

2020-09-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22782?focusedWorklogId=482430&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-482430
 ]

ASF GitHub Bot logged work on HIVE-22782:
-

Author: ASF GitHub Bot
Created on: 12/Sep/20 06:23
Start Date: 12/Sep/20 06:23
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on a change in pull 
request #1419:
URL: https://github.com/apache/hive/pull/1419#discussion_r487375367



##########
File path: standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
##########
@@ -1014,6 +1015,11 @@ public FileMetadataHandler getFileMetadataHandler(FileMetadataExprType type) {
     return null;
   }
 
+  @Override public SQLAllTableConstraints getAllTableConstraints(String catName, String db_name, String tbl_name)

Review comment:
   Done

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java
##########
@@ -1485,6 +1485,16 @@ void getFileMetadataByExpr(List<Long> fileIds, FileMetadataExprType type, byte[]
   List<SQLCheckConstraint> getCheckConstraints(String catName, String db_name,
                                                String tbl_name) throws MetaException;
 
+  /**
+   *  Get all constraints of the table
+   * @param catName catalog name
+   * @param db_name database name
+   * @param tbl_name table name
+   * @return all constraints for this table
+   * @throws MetaException error accessing the RDBMS
+   */
+  SQLAllTableConstraints getAllTableConstraints(String catName, String db_name, String tbl_name)

Review comment:
   Done

##########
File path: standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
##########
@@ -2811,6 +2811,26 @@ public GetFieldsResponse getFieldsRequest(GetFieldsRequest req)
     return client.get_check_constraints(req).getCheckConstraints();
   }
 
+  @Override
+  public SQLAllTableConstraints getAllTableConstraints(AllTableConstraintsRequest req)
+      throws MetaException, NoSuchObjectException, TException {
+    long t1 = System.currentTimeMillis();
+
+    try {
+      if (!req.isSetCatName()) {
+        req.setCatName(getDefaultCatalog(conf));
+      }
+
+      return client.get_all_table_constraints(req).getAllTableConstraints();
+    } finally {
+      long diff = System.currentTimeMillis() - t1;
Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 482430)
Time Spent: 3h  (was: 2h 50m)

> Consolidate metastore call to fetch constraints
> ---
>
> Key: HIVE-22782
> URL: https://issues.apache.org/jira/browse/HIVE-22782
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Currently separate calls are made to the metastore to fetch constraints like 
> PK, FK, not null, etc. Since the planner always retrieves these constraints, 
> we should retrieve all of them in one call.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)