[jira] [Work logged] (HIVE-24527) Allow triggering materialized view rewriting for external tables

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24527?focusedWorklogId=523358&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523358
 ]

ASF GitHub Bot logged work on HIVE-24527:
-

Author: ASF GitHub Bot
Created on: 12/Dec/20 01:48
Start Date: 12/Dec/20 01:48
Worklog Time Spent: 10m 
  Work Description: jcamachor opened a new pull request #1769:
URL: https://github.com/apache/hive/pull/1769


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523358)
Remaining Estimate: 0h
Time Spent: 10m

> Allow triggering materialized view rewriting for external tables
> 
>
> Key: HIVE-24527
> URL: https://issues.apache.org/jira/browse/HIVE-24527
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Although we will not be able to check data staleness, this can be useful for 
> debugging purposes.
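
A rough illustration of what this sub-task enables (not the attached patch itself): a materialized view declared over an external table, created through the HiveServer2 JDBC endpoint. The connection URL, table names, and the enable-rewrite step are placeholders/assumptions.

```java
// Hedged sketch: endpoint, names, and the enable-rewrite step are assumptions,
// not taken from the attached pull request.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class ExternalTableMvRewriteExample {
  public static void main(String[] args) throws Exception {
    try (Connection conn =
             DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
         Statement stmt = conn.createStatement()) {
      // External table: the metastore cannot track staleness of its data.
      stmt.execute("CREATE EXTERNAL TABLE ext_sales (id INT, amount DECIMAL(10,2)) "
          + "STORED AS ORC LOCATION '/data/ext_sales'");
      // Materialized view over the external table.
      stmt.execute("CREATE MATERIALIZED VIEW mv_sales AS "
          + "SELECT id, SUM(amount) AS total FROM ext_sales GROUP BY id");
      // With this sub-task, rewriting can be enabled for such a view (useful for
      // debugging), even though staleness checks are not possible.
      stmt.execute("ALTER MATERIALIZED VIEW mv_sales ENABLE REWRITE");
    }
  }
}
```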



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24527) Allow triggering materialized view rewriting for external tables

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24527:
--
Labels: pull-request-available  (was: )

> Allow triggering materialized view rewriting for external tables
> 
>
> Key: HIVE-24527
> URL: https://issues.apache.org/jira/browse/HIVE-24527
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Although we will not be able to check data staleness, this can be useful for 
> debugging purposes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24527) Allow triggering materialized view rewriting for external tables

2020-12-11 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-24527:
---
Status: Patch Available  (was: Open)

> Allow triggering materialized view rewriting for external tables
> 
>
> Key: HIVE-24527
> URL: https://issues.apache.org/jira/browse/HIVE-24527
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> Although we will not be able to check data staleness, this can be useful for 
> debugging purposes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-24527) Allow triggering materialized view rewriting for external tables

2020-12-11 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-24527:
--


> Allow triggering materialized view rewriting for external tables
> 
>
> Key: HIVE-24527
> URL: https://issues.apache.org/jira/browse/HIVE-24527
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> Although we will not be able to check data staleness, this can be useful for 
> debugging purposes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24139) VectorGroupByOperator is not flushing hash table entries as needed

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24139?focusedWorklogId=523349&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523349
 ]

ASF GitHub Bot logged work on HIVE-24139:
-

Author: ASF GitHub Bot
Created on: 12/Dec/20 00:51
Start Date: 12/Dec/20 00:51
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1481:
URL: https://github.com/apache/hive/pull/1481#issuecomment-743539618


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523349)
Time Spent: 1h  (was: 50m)

> VectorGroupByOperator is not flushing hash table entries as needed
> --
>
> Key: HIVE-24139
> URL: https://issues.apache.org/jira/browse/HIVE-24139
> Project: Hive
>  Issue Type: Bug
>Reporter: Mustafa İman
>Assignee: Mustafa İman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> https://issues.apache.org/jira/browse/HIVE-23975 introduced a bug where 
> copyKey mutates some key wrappers while copying. This Jira fixes it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24228) Support complex types in LLAP

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24228?focusedWorklogId=523348&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523348
 ]

ASF GitHub Bot logged work on HIVE-24228:
-

Author: ASF GitHub Bot
Created on: 12/Dec/20 00:51
Start Date: 12/Dec/20 00:51
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #1551:
URL: https://github.com/apache/hive/pull/1551


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523348)
Time Spent: 0.5h  (was: 20m)

> Support complex types in LLAP
> -
>
> Key: HIVE-24228
> URL: https://issues.apache.org/jira/browse/HIVE-24228
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Yuriy Baltovskyy
>Assignee: Yuriy Baltovskyy
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The idea of this improvement is to support complex types (arrays, maps, 
> structs) returned from the LLAP data reader. This is useful when consuming 
> LLAP data later in Spark.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24514) UpdateMDatabaseURI does not update managed location URI

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24514?focusedWorklogId=523331&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523331
 ]

ASF GitHub Bot logged work on HIVE-24514:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 22:07
Start Date: 11/Dec/20 22:07
Worklog Time Spent: 10m 
  Work Description: nrg4878 commented on a change in pull request #1761:
URL: https://github.com/apache/hive/pull/1761#discussion_r541346293



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
##
@@ -8556,6 +8556,9 @@ public UpdateMDatabaseURIRetVal updateMDatabaseURI(URI 
oldLoc, URI newLoc, boole
 updateLocations.put(locationURI.toString(), dbLoc);
 if (!dryRun) {
   mDB.setLocationUri(dbLoc);
+  if 
(org.apache.commons.lang3.StringUtils.isNotBlank(mDB.getManagedLocationUri())) {
+mDB.setManagedLocationUri(dbLoc);

Review comment:
   It looks like we are setting the value derived from locationURI for 
managedLocation: dbLoc is based on mDB.getLocationURI(). Instead, we should 
have similar logic where the string substitution is performed on the DB's 
managed location as well.
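
A minimal sketch of the substitution this comment is asking for, assuming the goal is to rewrite the managed location by the same old-root/new-root prefix replacement. The helper and names below are illustrative, not the actual ObjectStore code.

```java
import java.net.URI;

// Hedged sketch: derive the new managed location from the *managed* location
// URI, not from dbLoc (which is based on the plain location URI).
public final class ManagedLocationRewrite {
  static String rewritePrefix(String current, URI oldLoc, URI newLoc) {
    String oldPrefix = oldLoc.toString();
    // Only rewrite when the current location actually starts with the old root.
    if (current != null && current.startsWith(oldPrefix)) {
      return newLoc.toString() + current.substring(oldPrefix.length());
    }
    return current;
  }

  public static void main(String[] args) throws Exception {
    URI oldLoc = new URI("hdfs://old-nn:8020");
    URI newLoc = new URI("hdfs://new-nn:8020");
    System.out.println(
        rewritePrefix("hdfs://old-nn:8020/warehouse/managed/db1.db", oldLoc, newLoc));
  }
}
```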





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523331)
Time Spent: 20m  (was: 10m)

> UpdateMDatabaseURI does not update managed location URI
> ---
>
> Key: HIVE-24514
> URL: https://issues.apache.org/jira/browse/HIVE-24514
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When the FS root is updated using metatool, if the DB has a managed location 
> defined, the updateMDatabaseURI API should update the managed location as 
> well. Currently it only updates the location URI.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24332) Make AbstractSerDe Superclass of all Classes

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24332?focusedWorklogId=523313&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523313
 ]

ASF GitHub Bot logged work on HIVE-24332:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 21:23
Start Date: 11/Dec/20 21:23
Worklog Time Spent: 10m 
  Work Description: belugabehr opened a new pull request #1634:
URL: https://github.com/apache/hive/pull/1634


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523313)
Time Spent: 0.5h  (was: 20m)

> Make AbstractSerDe Superclass of all Classes
> 
>
> Key: HIVE-24332
> URL: https://issues.apache.org/jira/browse/HIVE-24332
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Rework how the {{AbstractSerDe}}, {{Deserializer}}, and {{Serializer}} classes 
> are designed.
> Simplify and consolidate more functionality into {{AbstractSerDe}}.  Remove 
> functionality that is not commonly used.  Remove methods that were 
> deprecated in 3.x (or maybe even older).
> Make it like Java's {{ByteChannel}}, which brings the {{ReadableByteChannel}} 
> and {{WritableByteChannel}} interfaces together under one type.
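
A minimal sketch of the {{ByteChannel}} analogy, using hypothetical read-side and write-side interfaces; these are not Hive's actual SerDe signatures.

```java
// Hedged sketch of the analogy: one abstract type that unifies a read-side and
// a write-side interface, the way java.nio.channels.ByteChannel unifies
// ReadableByteChannel and WritableByteChannel. Interface and method names here
// are illustrative only.
interface RowDeserializer {
  Object deserialize(byte[] blob) throws Exception;
}

interface RowSerializer {
  byte[] serialize(Object row) throws Exception;
}

abstract class AbstractRowSerDe implements RowDeserializer, RowSerializer {
  // Shared configuration/lifecycle hooks would live here, so concrete SerDes
  // only implement the two conversion methods.
}
```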



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24332) Make AbstractSerDe Superclass of all Classes

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24332?focusedWorklogId=523312&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523312
 ]

ASF GitHub Bot logged work on HIVE-24332:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 21:22
Start Date: 11/Dec/20 21:22
Worklog Time Spent: 10m 
  Work Description: belugabehr closed pull request #1634:
URL: https://github.com/apache/hive/pull/1634


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523312)
Time Spent: 20m  (was: 10m)

> Make AbstractSerDe Superclass of all Classes
> 
>
> Key: HIVE-24332
> URL: https://issues.apache.org/jira/browse/HIVE-24332
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Rework how the {{AbstractSerDe}}, {{Deserializer}}, and {{Serializer}} classes 
> are designed.
> Simplify and consolidate more functionality into {{AbstractSerDe}}.  Remove 
> functionality that is not commonly used.  Remove methods that were 
> deprecated in 3.x (or maybe even older).
> Make it like Java's {{ByteChannel}}, which brings the {{ReadableByteChannel}} 
> and {{WritableByteChannel}} interfaces together under one type.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-24488) Make docker host configurable for metastoredb/perf tests

2020-12-11 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich resolved HIVE-24488.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Merged into master. Thank you, Krisztian, for reviewing the changes!

> Make docker host configurable for metastoredb/perf tests
> 
>
> Key: HIVE-24488
> URL: https://issues.apache.org/jira/browse/HIVE-24488
> Project: Hive
>  Issue Type: Improvement
>  Components: Test
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I tend to develop patches inside containers (hive-dev-box) to be able to work 
> on multiple patches in parallel.
> Running tests which use docker has always been a bit problematic for me: when 
> I wanted to do it before, I manually exposed /var/lib/docker and added a 
> rinetd forward by hand (which is not nice).
> The current move to also run the Perf tests against a dockerized metastore 
> exposes this problem a bit more for me.
> I'm also considering adding the ability to use minikube with hive-dev-box, but 
> that still needs exploring.
> It would be much easier to expose the address of the docker host I'm using...
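
A minimal sketch of the general idea, assuming a hypothetical property/environment override for the docker host; the actual configuration key added by the patch is not shown here.

```java
// Hedged sketch (not the actual patch): resolve the docker host for dockerized
// test databases from an override instead of assuming localhost. The property
// "hive.test.docker.host" and variable "DOCKER_HOST_ADDRESS" are hypothetical.
final class DockerHostResolver {
  static String dockerHost() {
    String fromProp = System.getProperty("hive.test.docker.host");
    if (fromProp != null && !fromProp.isEmpty()) {
      return fromProp;
    }
    String fromEnv = System.getenv("DOCKER_HOST_ADDRESS");
    return (fromEnv != null && !fromEnv.isEmpty()) ? fromEnv : "localhost";
  }
}
```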



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24488) Make docker host configurable for metastoredb/perf tests

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24488?focusedWorklogId=523295&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523295
 ]

ASF GitHub Bot logged work on HIVE-24488:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 20:13
Start Date: 11/Dec/20 20:13
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk merged pull request #1766:
URL: https://github.com/apache/hive/pull/1766


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523295)
Time Spent: 20m  (was: 10m)

> Make docker host configurable for metastoredb/perf tests
> 
>
> Key: HIVE-24488
> URL: https://issues.apache.org/jira/browse/HIVE-24488
> Project: Hive
>  Issue Type: Improvement
>  Components: Test
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I tend to develop patches inside containers (hive-dev-box) to be able to work 
> on multiple patches in parallel.
> Running tests which use docker has always been a bit problematic for me: when 
> I wanted to do it before, I manually exposed /var/lib/docker and added a 
> rinetd forward by hand (which is not nice).
> The current move to also run the Perf tests against a dockerized metastore 
> exposes this problem a bit more for me.
> I'm also considering adding the ability to use minikube with hive-dev-box, but 
> that still needs exploring.
> It would be much easier to expose the address of the docker host I'm using...



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24432) Delete Notification Events in Batches

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24432?focusedWorklogId=523271&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523271
 ]

ASF GitHub Bot logged work on HIVE-24432:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 18:38
Start Date: 11/Dec/20 18:38
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on a change in pull request #1710:
URL: https://github.com/apache/hive/pull/1710#discussion_r541149698



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
##
@@ -10800,53 +10801,89 @@ public void addNotificationEvent(NotificationEvent 
entry) throws MetaException {
 
   @Override
   public void cleanNotificationEvents(int olderThan) {
-boolean commited = false;
-Query query = null;
+final int eventBatchSize = MetastoreConf.getIntVar(conf, 
MetastoreConf.ConfVars.EVENT_CLEAN_MAX_EVENTS);
+
+final long ageSec = olderThan;
+final Instant now = Instant.now();
+
+final int tooOld = Math.toIntExact(now.getEpochSecond() - ageSec);
+
+final Optional<Integer> batchSize = (eventBatchSize > 0) ? Optional.of(eventBatchSize) : Optional.empty();
+
+final long start = System.nanoTime();
+int deleteCount = doCleanNotificationEvents(tooOld, batchSize);
+
+if (deleteCount == 0) {
+  LOG.info("No Notification events found to be cleaned with eventTime < 
{}", tooOld);
+} else {
+  int batchCount = 0;
+  do {
+batchCount = doCleanNotificationEvents(tooOld, batchSize);
+deleteCount += batchCount;
+  } while (batchCount > 0);
+}
+
+final long finish = System.nanoTime();
+
+LOG.info("Deleted {} notification events older than epoch:{} in {}ms", 
deleteCount, tooOld,
+TimeUnit.NANOSECONDS.toMillis(finish - start));
+  }
+
+  private int doCleanNotificationEvents(final int ageSec, final Optional<Integer> batchSize) {
+final Transaction tx = pm.currentTransaction();
+int eventsCount = 0;
+
 try {
-  openTransaction();
-  long tmp = System.currentTimeMillis() / 1000 - olderThan;
-  int tooOld = (tmp > Integer.MAX_VALUE) ? 0 : (int) tmp;
-  query = pm.newQuery(MNotificationLog.class, "eventTime < tooOld");
-  query.declareParameters("java.lang.Integer tooOld");
+  tx.begin();
 
-  int max_events = MetastoreConf.getIntVar(conf, 
MetastoreConf.ConfVars.EVENT_CLEAN_MAX_EVENTS);
-  max_events = max_events > 0 ? max_events : Integer.MAX_VALUE;
-  query.setRange(0, max_events);
-  query.setOrdering("eventId ascending");
+  try (Query query = pm.newQuery(MNotificationLog.class, "eventTime < 
tooOld")) {
+query.declareParameters("java.lang.Integer tooOld");
+query.setOrdering("eventId ascending");
+if (batchSize.isPresent()) {
+  query.setRange(0, batchSize.get());
+}
 
-  List toBeRemoved = (List) query.execute(tooOld);
-  int iteration = 0;
-  int eventCount = 0;
-  long minEventId = 0;
-  long minEventTime = 0;
-  long maxEventId = 0;
-  long maxEventTime = 0;
-  while (CollectionUtils.isNotEmpty(toBeRemoved)) {
-int listSize = toBeRemoved.size();
-if (iteration == 0) {
-  MNotificationLog firstNotification = toBeRemoved.get(0);
-  minEventId = firstNotification.getEventId();
-  minEventTime = firstNotification.getEventTime();
+List<MNotificationLog> events = (List<MNotificationLog>) query.execute(ageSec);
+if (CollectionUtils.isNotEmpty(events)) {
+  eventsCount = events.size();
+
+  if (LOG.isDebugEnabled()) {
+int minEventTime, maxEventTime;
+long minEventId, maxEventId;
+Iterator<MNotificationLog> iter = events.iterator();
+MNotificationLog firstNotification = iter.next();
+
+minEventTime = maxEventTime = firstNotification.getEventTime();
+minEventId = maxEventId = firstNotification.getEventId();
+
+while (iter.hasNext()) {
+  MNotificationLog notification = iter.next();

Review comment:
   @aasha I updated existing unit tests to enforce small batch sizes (size 
= 1) so that when it deletes records, it does so in batches of 1.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523271)
Time Spent: 3h 20m  (was: 3h 10m)

> Delete Notification Events in Batches
> -
>
> Key: HIVE-24432
> URL: https://issues.apache.org/jira/browse/HIVE-24432
> Project: Hive
>  Issue Type: Improvement

[jira] [Work logged] (HIVE-17709) remove sun.misc.Cleaner references

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17709?focusedWorklogId=523260&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523260
 ]

ASF GitHub Bot logged work on HIVE-17709:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 18:00
Start Date: 11/Dec/20 18:00
Worklog Time Spent: 10m 
  Work Description: abstractdog commented on pull request #1739:
URL: https://github.com/apache/hive/pull/1739#issuecomment-743340734


   Could you please take a look, @ashutoshc, @prasanthj? This is a must-have 
patch on JDK11, and I tested it with an MSTR workload.
   
   FYI, without the patch, I hit this on JDK11:
   ```
], TaskAttempt 1 failed, info=[Error: Error while running task ( failure ) 
: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
java.io.IOException: java.lang.NoClassDefFoundError: sun/misc/Cleaner
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:437)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75)
at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62)
at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at 
org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.io.IOException: java.io.IOException: 
java.lang.NoClassDefFoundError: sun/misc/Cleaner
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:367)
at 
org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
at 
org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:117)
at 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:115)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
... 17 more
Caused by: java.io.IOException: java.lang.NoClassDefFoundError: 
sun/misc/Cleaner
at 
org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:580)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:430)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:279)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:276)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:276)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:117)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at 
org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer$CpuRecordingCallable.call(EncodedDataConsumer.java:88)
at 
org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer$CpuRecordingCallable.call(EncodedDataConsumer.java:73)
... 5 more

[jira] [Work logged] (HIVE-24526) Get grouped locations of external table data using metatool.

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24526?focusedWorklogId=523255&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523255
 ]

ASF GitHub Bot logged work on HIVE-24526:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 17:46
Start Date: 11/Dec/20 17:46
Worklog Time Spent: 10m 
  Work Description: ArkoSharma opened a new pull request #1768:
URL: https://github.com/apache/hive/pull/1768


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523255)
Remaining Estimate: 0h
Time Spent: 10m

> Get grouped locations of external table data using metatool.
> 
>
> Key: HIVE-24526
> URL: https://issues.apache.org/jira/browse/HIVE-24526
> Project: Hive
>  Issue Type: Task
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Add functionality to metatool to get a list of locations which cover all 
> external-table data locations for a user-specified database.
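
A minimal sketch of the "covering locations" idea, assuming grouping by a fixed path depth; this only illustrates the goal, not the metatool implementation.

```java
import java.net.URI;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class ExternalLocationGrouping {
  // Collapse individual table locations to their covering prefixes at a fixed depth.
  static Set<String> coveringPrefixes(List<URI> locations, int depth) {
    Set<String> prefixes = new TreeSet<>();
    for (URI loc : locations) {
      String[] parts = loc.getPath().split("/");
      StringBuilder p = new StringBuilder(loc.getScheme() + "://" + loc.getAuthority());
      for (int i = 1; i <= depth && i < parts.length; i++) {
        p.append('/').append(parts[i]);
      }
      prefixes.add(p.toString());
    }
    return prefixes;
  }

  public static void main(String[] args) {
    System.out.println(coveringPrefixes(
        List.of(URI.create("hdfs://nn:8020/data/ext/db1/t1"),
                URI.create("hdfs://nn:8020/data/ext/db1/t2")), 2));
    // prints [hdfs://nn:8020/data/ext]
  }
}
```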



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24526) Get grouped locations of external table data using metatool.

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24526:
--
Labels: pull-request-available  (was: )

> Get grouped locations of external table data using metatool.
> 
>
> Key: HIVE-24526
> URL: https://issues.apache.org/jira/browse/HIVE-24526
> Project: Hive
>  Issue Type: Task
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Add functionality to metatool to get a list of locations which cover all 
> external-table data locations for a user-specified database.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24474) Failed compaction always logs TxnAbortedException (again)

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24474?focusedWorklogId=523254&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523254
 ]

ASF GitHub Bot logged work on HIVE-24474:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 17:37
Start Date: 11/Dec/20 17:37
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #1735:
URL: https://github.com/apache/hive/pull/1735#discussion_r541113701



##
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java
##
@@ -636,4 +621,66 @@ private static boolean isDynPartAbort(Table t, 
CompactionInfo ci) {
 return t.getPartitionKeys() != null && t.getPartitionKeys().size() > 0
 && ci.partName == null;
   }
+
+  /**
+   * Keep track of the compaction's transaction and its operations.
+   */
+  private class CompactionTxn {

Review comment:
   What if we implement AutoCloseable?
   Then we can use try-with-resources without the finally magic, and close() 
could commit if needed.
   
   I think this is the last comment.
   How do you like this code? Is it better?
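
For illustration, a minimal sketch of that suggestion; names are placeholders rather than the actual Worker.CompactionTxn code.

```java
// Hedged sketch of the reviewer's idea: make the compaction-transaction wrapper
// AutoCloseable so try-with-resources replaces the finally block, and close()
// commits only when appropriate.
final class CompactionTxnSketch implements AutoCloseable {
  private boolean open;
  private boolean aborted;

  void open() { open = true; }
  void abort() { aborted = true; }

  @Override
  public void close() {
    // "Commit if needed": only commit a transaction that was opened and not aborted.
    if (open && !aborted) {
      // commit the underlying txn here
    }
    open = false;
  }
}

// Usage: try (CompactionTxnSketch txn = new CompactionTxnSketch()) { txn.open(); ... }
```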





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523254)
Time Spent: 1h 40m  (was: 1.5h)

> Failed compaction always logs TxnAbortedException (again)
> -
>
> Key: HIVE-24474
> URL: https://issues.apache.org/jira/browse/HIVE-24474
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Re-introduced with HIVE-24096.
> If there is an error during compaction, the compaction's txn is aborted but 
> in the finally clause, we try to commit it (commitTxnIfSet), so Worker throws 
> a TxnAbortedException.
> We should set compactorTxnId to TXN_ID_NOT_SET if the compaction's txn is 
> aborted.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24474) Failed compaction always logs TxnAbortedException (again)

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24474?focusedWorklogId=523253&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523253
 ]

ASF GitHub Bot logged work on HIVE-24474:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 17:34
Start Date: 11/Dec/20 17:34
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #1735:
URL: https://github.com/apache/hive/pull/1735#discussion_r541112397



##
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java
##
@@ -579,7 +567,7 @@ protected Boolean findNextCompactionAndExecute(boolean 
computeStats) throws Inte
 } catch (Throwable t) {
   LOG.error("Caught an exception in the main loop of compactor worker " + 
workerName, t);
 } finally {
-  commitTxnIfSet(compactorTxnId);
+  compactionTxn.commit();

Review comment:
   Maybe rename it to commitIfOpen, or whatever.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523253)
Time Spent: 1.5h  (was: 1h 20m)

> Failed compaction always logs TxnAbortedException (again)
> -
>
> Key: HIVE-24474
> URL: https://issues.apache.org/jira/browse/HIVE-24474
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Re-introduced with HIVE-24096.
> If there is an error during compaction, the compaction's txn is aborted but 
> in the finally clause, we try to commit it (commitTxnIfSet), so Worker throws 
> a TxnAbortedException.
> We should set compactorTxnId to TXN_ID_NOT_SET if the compaction's txn is 
> aborted.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-24526) Get grouped locations of external table data using metatool.

2020-12-11 Thread Arko Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arko Sharma reassigned HIVE-24526:
--


> Get grouped locations of external table data using metatool.
> 
>
> Key: HIVE-24526
> URL: https://issues.apache.org/jira/browse/HIVE-24526
> Project: Hive
>  Issue Type: Task
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>
> Add functionality to metatool to get a list of locations which cover all 
> external-table data locations for a user-specified database.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24432) Delete Notification Events in Batches

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24432?focusedWorklogId=523204&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523204
 ]

ASF GitHub Bot logged work on HIVE-24432:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 14:27
Start Date: 11/Dec/20 14:27
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on a change in pull request #1710:
URL: https://github.com/apache/hive/pull/1710#discussion_r540984939



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
##
@@ -10800,53 +10801,89 @@ public void addNotificationEvent(NotificationEvent 
entry) throws MetaException {
 
   @Override
   public void cleanNotificationEvents(int olderThan) {
-boolean commited = false;
-Query query = null;
+final int eventBatchSize = MetastoreConf.getIntVar(conf, 
MetastoreConf.ConfVars.EVENT_CLEAN_MAX_EVENTS);
+
+final long ageSec = olderThan;
+final Instant now = Instant.now();
+
+final int tooOld = Math.toIntExact(now.getEpochSecond() - ageSec);
+
+final Optional<Integer> batchSize = (eventBatchSize > 0) ? Optional.of(eventBatchSize) : Optional.empty();
+
+final long start = System.nanoTime();
+int deleteCount = doCleanNotificationEvents(tooOld, batchSize);
+
+if (deleteCount == 0) {
+  LOG.info("No Notification events found to be cleaned with eventTime < 
{}", tooOld);
+} else {
+  int batchCount = 0;
+  do {
+batchCount = doCleanNotificationEvents(tooOld, batchSize);
+deleteCount += batchCount;
+  } while (batchCount > 0);
+}
+
+final long finish = System.nanoTime();
+
+LOG.info("Deleted {} notification events older than epoch:{} in {}ms", 
deleteCount, tooOld,
+TimeUnit.NANOSECONDS.toMillis(finish - start));
+  }
+
+  private int doCleanNotificationEvents(final int ageSec, final Optional<Integer> batchSize) {
+final Transaction tx = pm.currentTransaction();
+int eventsCount = 0;
+
 try {
-  openTransaction();
-  long tmp = System.currentTimeMillis() / 1000 - olderThan;
-  int tooOld = (tmp > Integer.MAX_VALUE) ? 0 : (int) tmp;
-  query = pm.newQuery(MNotificationLog.class, "eventTime < tooOld");
-  query.declareParameters("java.lang.Integer tooOld");
+  tx.begin();
 
-  int max_events = MetastoreConf.getIntVar(conf, 
MetastoreConf.ConfVars.EVENT_CLEAN_MAX_EVENTS);
-  max_events = max_events > 0 ? max_events : Integer.MAX_VALUE;
-  query.setRange(0, max_events);
-  query.setOrdering("eventId ascending");
+  try (Query query = pm.newQuery(MNotificationLog.class, "eventTime < 
tooOld")) {
+query.declareParameters("java.lang.Integer tooOld");
+query.setOrdering("eventId ascending");
+if (batchSize.isPresent()) {
+  query.setRange(0, batchSize.get());
+}
 
-  List toBeRemoved = (List) query.execute(tooOld);
-  int iteration = 0;
-  int eventCount = 0;
-  long minEventId = 0;
-  long minEventTime = 0;
-  long maxEventId = 0;
-  long maxEventTime = 0;
-  while (CollectionUtils.isNotEmpty(toBeRemoved)) {
-int listSize = toBeRemoved.size();
-if (iteration == 0) {
-  MNotificationLog firstNotification = toBeRemoved.get(0);
-  minEventId = firstNotification.getEventId();
-  minEventTime = firstNotification.getEventTime();
+List<MNotificationLog> events = (List<MNotificationLog>) query.execute(ageSec);
+if (CollectionUtils.isNotEmpty(events)) {
+  eventsCount = events.size();
+
+  if (LOG.isDebugEnabled()) {
+int minEventTime, maxEventTime;
+long minEventId, maxEventId;
+Iterator<MNotificationLog> iter = events.iterator();
+MNotificationLog firstNotification = iter.next();
+
+minEventTime = maxEventTime = firstNotification.getEventTime();
+minEventId = maxEventId = firstNotification.getEventId();
+
+while (iter.hasNext()) {
+  MNotificationLog notification = iter.next();

Review comment:
   If you are speaking about the eventId portion only, then yes, one could 
look at the first and last item.  Since the list needs to be iterated anyway, 
might as well just keep it simple and consistent with regards to how min/max is 
determined.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523204)
Time Spent: 3h 10m  (was: 3h)

> Delete Notification Events in Batches
> -
>
> Key: HIVE-24432
> URL: https://issues.apache.org/jira/browse/HIVE-24432

[jira] [Work logged] (HIVE-24432) Delete Notification Events in Batches

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24432?focusedWorklogId=523202&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523202
 ]

ASF GitHub Bot logged work on HIVE-24432:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 14:23
Start Date: 11/Dec/20 14:23
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on a change in pull request #1710:
URL: https://github.com/apache/hive/pull/1710#discussion_r540981739



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
##
@@ -10800,53 +10801,89 @@ public void addNotificationEvent(NotificationEvent 
entry) throws MetaException {
 
   @Override
   public void cleanNotificationEvents(int olderThan) {
-boolean commited = false;
-Query query = null;
+final int eventBatchSize = MetastoreConf.getIntVar(conf, 
MetastoreConf.ConfVars.EVENT_CLEAN_MAX_EVENTS);
+
+final long ageSec = olderThan;
+final Instant now = Instant.now();
+
+final int tooOld = Math.toIntExact(now.getEpochSecond() - ageSec);
+
+final Optional<Integer> batchSize = (eventBatchSize > 0) ? Optional.of(eventBatchSize) : Optional.empty();
+
+final long start = System.nanoTime();
+int deleteCount = doCleanNotificationEvents(tooOld, batchSize);
+
+if (deleteCount == 0) {
+  LOG.info("No Notification events found to be cleaned with eventTime < 
{}", tooOld);
+} else {
+  int batchCount = 0;
+  do {
+batchCount = doCleanNotificationEvents(tooOld, batchSize);
+deleteCount += batchCount;
+  } while (batchCount > 0);
+}
+
+final long finish = System.nanoTime();
+
+LOG.info("Deleted {} notification events older than epoch:{} in {}ms", 
deleteCount, tooOld,
+TimeUnit.NANOSECONDS.toMillis(finish - start));
+  }
+
+  private int doCleanNotificationEvents(final int ageSec, final Optional<Integer> batchSize) {
+final Transaction tx = pm.currentTransaction();
+int eventsCount = 0;
+
 try {
-  openTransaction();
-  long tmp = System.currentTimeMillis() / 1000 - olderThan;
-  int tooOld = (tmp > Integer.MAX_VALUE) ? 0 : (int) tmp;
-  query = pm.newQuery(MNotificationLog.class, "eventTime < tooOld");
-  query.declareParameters("java.lang.Integer tooOld");
+  tx.begin();
 
-  int max_events = MetastoreConf.getIntVar(conf, 
MetastoreConf.ConfVars.EVENT_CLEAN_MAX_EVENTS);
-  max_events = max_events > 0 ? max_events : Integer.MAX_VALUE;
-  query.setRange(0, max_events);
-  query.setOrdering("eventId ascending");
+  try (Query query = pm.newQuery(MNotificationLog.class, "eventTime < 
tooOld")) {
+query.declareParameters("java.lang.Integer tooOld");
+query.setOrdering("eventId ascending");
+if (batchSize.isPresent()) {
+  query.setRange(0, batchSize.get());
+}
 
-  List toBeRemoved = (List) query.execute(tooOld);
-  int iteration = 0;
-  int eventCount = 0;
-  long minEventId = 0;
-  long minEventTime = 0;
-  long maxEventId = 0;
-  long maxEventTime = 0;
-  while (CollectionUtils.isNotEmpty(toBeRemoved)) {
-int listSize = toBeRemoved.size();
-if (iteration == 0) {
-  MNotificationLog firstNotification = toBeRemoved.get(0);
-  minEventId = firstNotification.getEventId();
-  minEventTime = firstNotification.getEventTime();
+List<MNotificationLog> events = (List<MNotificationLog>) query.execute(ageSec);
+if (CollectionUtils.isNotEmpty(events)) {
+  eventsCount = events.size();
+
+  if (LOG.isDebugEnabled()) {
+int minEventTime, maxEventTime;
+long minEventId, maxEventId;
+Iterator<MNotificationLog> iter = events.iterator();
+MNotificationLog firstNotification = iter.next();
+
+minEventTime = maxEventTime = firstNotification.getEventTime();
+minEventId = maxEventId = firstNotification.getEventId();
+
+while (iter.hasNext()) {
+  MNotificationLog notification = iter.next();

Review comment:
   Hey, so this goes back to the discussion I had over here... PR #1728 
   
   There is really no guarantee that the eventTime is always increasing.  So, 
the ID is always increasing, but if the min/max eventTime are to be discovered 
for each batch, then it needs to iterate.  I'm happy to remove the logging and 
remove the iteration, but that was already in place and I didn't want to remove it.
   
   Edit: This is an out-of-band process that doesn't block anything else, so a 
few extra CPU cycles aren't the end of the world.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Work logged] (HIVE-24432) Delete Notification Events in Batches

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24432?focusedWorklogId=523200&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523200
 ]

ASF GitHub Bot logged work on HIVE-24432:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 14:22
Start Date: 11/Dec/20 14:22
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on a change in pull request #1710:
URL: https://github.com/apache/hive/pull/1710#discussion_r540981739



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
##
@@ -10800,53 +10801,89 @@ public void addNotificationEvent(NotificationEvent 
entry) throws MetaException {
 
   @Override
   public void cleanNotificationEvents(int olderThan) {
-boolean commited = false;
-Query query = null;
+final int eventBatchSize = MetastoreConf.getIntVar(conf, 
MetastoreConf.ConfVars.EVENT_CLEAN_MAX_EVENTS);
+
+final long ageSec = olderThan;
+final Instant now = Instant.now();
+
+final int tooOld = Math.toIntExact(now.getEpochSecond() - ageSec);
+
+final Optional<Integer> batchSize = (eventBatchSize > 0) ? Optional.of(eventBatchSize) : Optional.empty();
+
+final long start = System.nanoTime();
+int deleteCount = doCleanNotificationEvents(tooOld, batchSize);
+
+if (deleteCount == 0) {
+  LOG.info("No Notification events found to be cleaned with eventTime < 
{}", tooOld);
+} else {
+  int batchCount = 0;
+  do {
+batchCount = doCleanNotificationEvents(tooOld, batchSize);
+deleteCount += batchCount;
+  } while (batchCount > 0);
+}
+
+final long finish = System.nanoTime();
+
+LOG.info("Deleted {} notification events older than epoch:{} in {}ms", 
deleteCount, tooOld,
+TimeUnit.NANOSECONDS.toMillis(finish - start));
+  }
+
+  private int doCleanNotificationEvents(final int ageSec, final Optional<Integer> batchSize) {
+final Transaction tx = pm.currentTransaction();
+int eventsCount = 0;
+
 try {
-  openTransaction();
-  long tmp = System.currentTimeMillis() / 1000 - olderThan;
-  int tooOld = (tmp > Integer.MAX_VALUE) ? 0 : (int) tmp;
-  query = pm.newQuery(MNotificationLog.class, "eventTime < tooOld");
-  query.declareParameters("java.lang.Integer tooOld");
+  tx.begin();
 
-  int max_events = MetastoreConf.getIntVar(conf, 
MetastoreConf.ConfVars.EVENT_CLEAN_MAX_EVENTS);
-  max_events = max_events > 0 ? max_events : Integer.MAX_VALUE;
-  query.setRange(0, max_events);
-  query.setOrdering("eventId ascending");
+  try (Query query = pm.newQuery(MNotificationLog.class, "eventTime < 
tooOld")) {
+query.declareParameters("java.lang.Integer tooOld");
+query.setOrdering("eventId ascending");
+if (batchSize.isPresent()) {
+  query.setRange(0, batchSize.get());
+}
 
-  List toBeRemoved = (List) query.execute(tooOld);
-  int iteration = 0;
-  int eventCount = 0;
-  long minEventId = 0;
-  long minEventTime = 0;
-  long maxEventId = 0;
-  long maxEventTime = 0;
-  while (CollectionUtils.isNotEmpty(toBeRemoved)) {
-int listSize = toBeRemoved.size();
-if (iteration == 0) {
-  MNotificationLog firstNotification = toBeRemoved.get(0);
-  minEventId = firstNotification.getEventId();
-  minEventTime = firstNotification.getEventTime();
+List<MNotificationLog> events = (List<MNotificationLog>) query.execute(ageSec);
+if (CollectionUtils.isNotEmpty(events)) {
+  eventsCount = events.size();
+
+  if (LOG.isDebugEnabled()) {
+int minEventTime, maxEventTime;
+long minEventId, maxEventId;
+Iterator<MNotificationLog> iter = events.iterator();
+MNotificationLog firstNotification = iter.next();
+
+minEventTime = maxEventTime = firstNotification.getEventTime();
+minEventId = maxEventId = firstNotification.getEventId();
+
+while (iter.hasNext()) {
+  MNotificationLog notification = iter.next();

Review comment:
   Hey, so this goes back to the discussion I had over here... PR #1728 
   
   There is really no guarantee that the eventTime is always increasing.  So, 
the ID is always increasing, but if the min/max eventTime are to be discovered 
for each batch, then it needs to iterate.  I'm happy to remove the logging and 
remove the iteration, but that was already in place and I didn't want to remove it.
   
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523200)
Time Spent: 2h

[jira] [Work logged] (HIVE-24432) Delete Notification Events in Batches

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24432?focusedWorklogId=523199&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523199
 ]

ASF GitHub Bot logged work on HIVE-24432:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 14:20
Start Date: 11/Dec/20 14:20
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on a change in pull request #1710:
URL: https://github.com/apache/hive/pull/1710#discussion_r540980013



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
##
@@ -10800,53 +10801,89 @@ public void addNotificationEvent(NotificationEvent 
entry) throws MetaException {
 
   @Override
   public void cleanNotificationEvents(int olderThan) {

Review comment:
   Will do in a separate JIRA after this new code gets some usage.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523199)
Time Spent: 2h 40m  (was: 2.5h)

> Delete Notification Events in Batches
> -
>
> Key: HIVE-24432
> URL: https://issues.apache.org/jira/browse/HIVE-24432
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.2.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Notification events are loaded in batches (which reduces memory pressure on 
> the HMS), but all of the deletes happen in a single transaction and, when 
> deleting many records, this can put a lot of pressure on the backend database.
> Instead, delete events in batches (in different transactions) as well.
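
A minimal sketch of that batching loop, under the assumption that each per-batch call runs in its own small transaction; the class and method names are illustrative, not the patch itself.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hedged sketch: repeat small delete transactions until no qualifying events
// remain, rather than one large delete. deleteOldEventsBatch is a hypothetical
// stand-in for the per-batch, per-transaction delete.
final class BatchedEventCleaner {
  private final AtomicInteger remaining = new AtomicInteger(10); // fake backlog for the demo

  int deleteOldEventsBatch(long cutoffEpochSeconds, int batchSize) {
    // In the real code this would open its own transaction, select up to
    // batchSize events with eventTime < cutoffEpochSeconds, and delete them.
    int toDelete = Math.min(batchSize, remaining.get());
    remaining.addAndGet(-toDelete);
    return toDelete;
  }

  int cleanAll(long cutoffEpochSeconds, int batchSize) {
    int total = 0;
    int batch;
    do {
      batch = deleteOldEventsBatch(cutoffEpochSeconds, batchSize);
      total += batch;
    } while (batch > 0);
    return total;
  }
}
```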



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24468) Use Event Time instead of Current Time in Notification Log DB Entry

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24468?focusedWorklogId=523195&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523195
 ]

ASF GitHub Bot logged work on HIVE-24468:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 14:19
Start Date: 11/Dec/20 14:19
Worklog Time Spent: 10m 
  Work Description: belugabehr opened a new pull request #1728:
URL: https://github.com/apache/hive/pull/1728


   …g DB Entry
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523195)
Time Spent: 2h 20m  (was: 2h 10m)

> Use Event Time instead of Current Time in Notification Log DB Entry
> ---
>
> Key: HIVE-24468
> URL: https://issues.apache.org/jira/browse/HIVE-24468
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24468) Use Event Time instead of Current Time in Notification Log DB Entry

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24468?focusedWorklogId=523193&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523193
 ]

ASF GitHub Bot logged work on HIVE-24468:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 14:14
Start Date: 11/Dec/20 14:14
Worklog Time Spent: 10m 
  Work Description: belugabehr closed pull request #1728:
URL: https://github.com/apache/hive/pull/1728


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523193)
Time Spent: 2h 10m  (was: 2h)

> Use Event Time instead of Current Time in Notification Log DB Entry
> ---
>
> Key: HIVE-24468
> URL: https://issues.apache.org/jira/browse/HIVE-24468
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24274) Implement Query Text based MaterializedView rewrite

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24274?focusedWorklogId=523191&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523191
 ]

ASF GitHub Bot logged work on HIVE-24274:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 13:47
Start Date: 11/Dec/20 13:47
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #1706:
URL: https://github.com/apache/hive/pull/1706#discussion_r540957859



##
File path: ql/src/test/queries/clientnegative/materialized_view_no_cbo_rewrite.q
##
@@ -1,11 +0,0 @@
-set hive.support.concurrency=true;
-set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
-set hive.strict.checks.cartesian.product=false;
-set hive.materializedview.rewriting=true;
-
-create table cmv_basetable (a int, b varchar(256), c decimal(10,2))
-stored as orc TBLPROPERTIES ('transactional'='true');
-
-insert into cmv_basetable values (1, 'alfred', 10.30),(2, 'bob', 3.14),(2, 
'bonnie', 172342.2),(3, 'calvin', 978.76),(3, 'charlie', 9.8);
-
-create materialized view cmv_mat_view as select a, b, c from cmv_basetable 
sort by a;

Review comment:
   Could you please describe this?
   I checked locally that if CBO is turned off, materialized view creation will 
fail with `SemanticException: Cannot enable automatic rewriting for 
materialized view.`, no matter what the query definition was.
   Should a general negative test be added, like:
   ```
   set hive.cbo.enable=false;
   create materialized view mat1 as select col from t;
   ```
   I haven't found tests targeting this.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523191)
Time Spent: 3h 40m  (was: 3.5h)

> Implement Query Text based MaterializedView rewrite
> ---
>
> Key: HIVE-24274
> URL: https://issues.apache.org/jira/browse/HIVE-24274
> Project: Hive
>  Issue Type: Improvement
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Besides the way queries are currently rewritten to use materialized views in 
> Hive, this project provides an alternative:
> Compare the query text with the stored query text of the materialized views. 
> If we find a match, the original query's logical plan can be replaced by a 
> scan on the materialized view.
> - Only materialized views which are enabled for rewriting can participate
> - Use the existing *HiveMaterializedViewsRegistry* through the *Hive* object 
> by adding a lookup method by query text.
> - There might be more than one materialized view with the same query text. In 
> this case choose the first valid one.
> - Validation can be done by calling 
> *Hive.validateMaterializedViewsFromRegistry()*
> - The scope of this first patch is rewriting only queries whose entire text 
> can be matched.
> - Use the expanded query text (fully qualified column and table names) for 
> the comparison.
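
A minimal sketch of the lookup described above, assuming a plain map keyed by expanded query text and a caller-supplied validity check; this is not the actual HiveMaterializedViewsRegistry API.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hedged sketch: key materialized views by their expanded (fully qualified)
// query text and return the first valid match.
final class QueryTextMvLookup {
  static Optional<String> findRewrite(Map<String, List<String>> mvByQueryText,
                                      String expandedQueryText,
                                      java.util.function.Predicate<String> isValid) {
    return mvByQueryText.getOrDefault(expandedQueryText, List.of()).stream()
        .filter(isValid)   // e.g. still enabled for rewriting and usable
        .findFirst();      // "choose the first valid one"
  }
}
```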



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24274) Implement Query Text based MaterializedView rewrite

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24274?focusedWorklogId=523185&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523185
 ]

ASF GitHub Bot logged work on HIVE-24274:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 13:15
Start Date: 11/Dec/20 13:15
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #1706:
URL: https://github.com/apache/hive/pull/1706#discussion_r540938631



##
File path: 
ql/src/test/queries/clientpositive/materialized_view_create_rewrite_dummy.q
##
@@ -6,6 +6,7 @@ set hive.support.concurrency=true;
 set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
 set hive.strict.checks.cartesian.product=false;
 set hive.materializedview.rewriting=true;
+set hive.materializedview.rewriting.query.text=false;

Review comment:
   This test was written for the Calcite-based rewrite.

##
File path: 
ql/src/test/queries/clientpositive/materialized_view_create_rewrite_multi_db.q
##
@@ -3,6 +3,7 @@ set hive.support.concurrency=true;
 set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
 set hive.strict.checks.cartesian.product=false;
 set hive.materializedview.rewriting=true;
+set hive.materializedview.rewriting.query.text=false;

Review comment:
   This test was written for the Calcite-based rewrite.

##
File path: ql/src/test/queries/clientpositive/materialized_view_rewrite_1.q
##
@@ -5,6 +5,7 @@ set 
hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
 set hive.strict.checks.cartesian.product=false;
 set hive.stats.fetch.column.stats=true;
 set hive.materializedview.rewriting=true;
+set hive.materializedview.rewriting.query.text=false;

Review comment:
   This test was written for the Calcite-based rewrite.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523185)
Time Spent: 3.5h  (was: 3h 20m)

> Implement Query Text based MaterializedView rewrite
> ---
>
> Key: HIVE-24274
> URL: https://issues.apache.org/jira/browse/HIVE-24274
> Project: Hive
>  Issue Type: Improvement
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Besides the way queries are currently rewritten to use materialized views in 
> Hive, this project provides an alternative:
> Compare the query text with the stored query text of the materialized views. If 
> a match is found, the original query's logical plan can be replaced by a scan on 
> the materialized view.
> - Only materialized views that are enabled for rewriting can participate.
> - Use the existing *HiveMaterializedViewsRegistry* through the *Hive* object by 
> adding a lookup method by query text.
> - There might be more than one materialized view with the same query 
> text. In this case choose the first valid one.
> - Validation can be done by calling 
> *Hive.validateMaterializedViewsFromRegistry()*.
> - The scope of this first patch is limited to queries whose entire text can be 
> matched.
> - Use the expanded query text (fully qualified column and table names) for 
> the comparison.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24274) Implement Query Text based MaterializedView rewrite

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24274?focusedWorklogId=523183&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523183
 ]

ASF GitHub Bot logged work on HIVE-24274:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 13:14
Start Date: 11/Dec/20 13:14
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #1706:
URL: https://github.com/apache/hive/pull/1706#discussion_r540938046



##
File path: ql/src/test/results/clientpositive/llap/avrotblsjoin.q.out
##
@@ -73,6 +73,7 @@ POSTHOOK: Lineage: table1_1.col1 SCRIPT []
 POSTHOOK: Lineage: table1_1.col2 SCRIPT []
 WARNING: Comparing a bigint and a string may result in a loss of precision.
 WARNING: Comparing a bigint and a string may result in a loss of precision.
+WARNING: Comparing a bigint and a string may result in a loss of precision.

Review comment:
   It is a side effect of unparsing being enabled for all queries.
   When the Calcite plan is generated, `genJoinRelNode` is called, and it calls 
`genAllExprNodeDesc` because unparsing is enabled, which is required for the 
expanded query text.
   ```
   } else if (unparseTranslator != null && 
unparseTranslator.isEnabled()) {
 genAllExprNodeDesc(joinCond, input, jCtx);
   }
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523183)
Time Spent: 3h 10m  (was: 3h)

> Implement Query Text based MaterializedView rewrite
> ---
>
> Key: HIVE-24274
> URL: https://issues.apache.org/jira/browse/HIVE-24274
> Project: Hive
>  Issue Type: Improvement
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Besides the way queries are currently rewritten to use materialized views in 
> Hive, this project provides an alternative:
> Compare the query text with the stored query text of the materialized views. If 
> a match is found, the original query's logical plan can be replaced by a scan on 
> the materialized view.
> - Only materialized views that are enabled for rewriting can participate.
> - Use the existing *HiveMaterializedViewsRegistry* through the *Hive* object by 
> adding a lookup method by query text.
> - There might be more than one materialized view with the same query 
> text. In this case choose the first valid one.
> - Validation can be done by calling 
> *Hive.validateMaterializedViewsFromRegistry()*.
> - The scope of this first patch is limited to queries whose entire text can be 
> matched.
> - Use the expanded query text (fully qualified column and table names) for 
> the comparison.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24274) Implement Query Text based MaterializedView rewrite

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24274?focusedWorklogId=523184&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523184
 ]

ASF GitHub Bot logged work on HIVE-24274:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 13:14
Start Date: 11/Dec/20 13:14
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #1706:
URL: https://github.com/apache/hive/pull/1706#discussion_r540938046



##
File path: ql/src/test/results/clientpositive/llap/avrotblsjoin.q.out
##
@@ -73,6 +73,7 @@ POSTHOOK: Lineage: table1_1.col1 SCRIPT []
 POSTHOOK: Lineage: table1_1.col2 SCRIPT []
 WARNING: Comparing a bigint and a string may result in a loss of precision.
 WARNING: Comparing a bigint and a string may result in a loss of precision.
+WARNING: Comparing a bigint and a string may result in a loss of precision.

Review comment:
   It is a side effect of unparsing being enabled for all queries.
   When the Calcite plan is generated, `genJoinRelNode` is called, and it calls 
`genAllExprNodeDesc` because unparsing is enabled, which is required for the 
expanded query text.
   ```
   } else if (unparseTranslator != null && 
unparseTranslator.isEnabled()) {
 genAllExprNodeDesc(joinCond, input, jCtx);
   }
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523184)
Time Spent: 3h 20m  (was: 3h 10m)

> Implement Query Text based MaterializedView rewrite
> ---
>
> Key: HIVE-24274
> URL: https://issues.apache.org/jira/browse/HIVE-24274
> Project: Hive
>  Issue Type: Improvement
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Besides the way queries are currently rewritten to use materialized views in 
> Hive, this project provides an alternative:
> Compare the query text with the stored query text of the materialized views. If 
> a match is found, the original query's logical plan can be replaced by a scan on 
> the materialized view.
> - Only materialized views that are enabled for rewriting can participate.
> - Use the existing *HiveMaterializedViewsRegistry* through the *Hive* object by 
> adding a lookup method by query text.
> - There might be more than one materialized view with the same query 
> text. In this case choose the first valid one.
> - Validation can be done by calling 
> *Hive.validateMaterializedViewsFromRegistry()*.
> - The scope of this first patch is limited to queries whose entire text can be 
> matched.
> - Use the expanded query text (fully qualified column and table names) for 
> the comparison.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24274) Implement Query Text based MaterializedView rewrite

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24274?focusedWorklogId=523181&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523181
 ]

ASF GitHub Bot logged work on HIVE-24274:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 13:04
Start Date: 11/Dec/20 13:04
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #1706:
URL: https://github.com/apache/hive/pull/1706#discussion_r540932337



##
File path: 
ql/src/test/queries/clientpositive/materialized_view_rewrite_by_text_2.q
##
@@ -0,0 +1,65 @@
+-- SORT_QUERY_RESULTS
+
+SET hive.vectorized.execution.enabled=false;
+set hive.server2.materializedviews.registry.impl=DUMMY;
+set hive.support.concurrency=true;
+set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
+set hive.strict.checks.cartesian.product=false;
+set hive.materializedview.rewriting.query.text=true;

Review comment:
   Yes, removed this line.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523181)
Time Spent: 3h  (was: 2h 50m)

> Implement Query Text based MaterializedView rewrite
> ---
>
> Key: HIVE-24274
> URL: https://issues.apache.org/jira/browse/HIVE-24274
> Project: Hive
>  Issue Type: Improvement
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Besides the way queries are currently rewritten to use materialized views in 
> Hive, this project provides an alternative:
> Compare the query text with the stored query text of the materialized views. If 
> a match is found, the original query's logical plan can be replaced by a scan on 
> the materialized view.
> - Only materialized views that are enabled for rewriting can participate.
> - Use the existing *HiveMaterializedViewsRegistry* through the *Hive* object by 
> adding a lookup method by query text.
> - There might be more than one materialized view with the same query 
> text. In this case choose the first valid one.
> - Validation can be done by calling 
> *Hive.validateMaterializedViewsFromRegistry()*.
> - The scope of this first patch is limited to queries whose entire text can be 
> matched.
> - Use the expanded query text (fully qualified column and table names) for 
> the comparison.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24525) Invite reviewers automatically by file name patterns

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24525?focusedWorklogId=523147&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523147
 ]

ASF GitHub Bot logged work on HIVE-24525:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 10:56
Start Date: 11/Dec/20 10:56
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk opened a new pull request #1767:
URL: https://github.com/apache/hive/pull/1767


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523147)
Remaining Estimate: 0h
Time Spent: 10m

> Invite reviewers automatically by file name patterns
> 
>
> Key: HIVE-24525
> URL: https://issues.apache.org/jira/browse/HIVE-24525
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I wrote about this in an 
> [email|http://mail-archives.apache.org/mod_mbox/hive-dev/202006.mbox/%3c324a0a23-5841-09fe-a993-1a095035e...@rxd.hu%3e]
>  a long time ago...
> it could help in keeping an eye on some specific parts, e.g. thrift and 
> parser changes.
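As a rough illustration of the idea only (the actual change may well be a CI or bot configuration rather than Java code), matching changed file paths against reviewer rules could look like the sketch below; the patterns and reviewer handles are made up.

{code:java}
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;

class ReviewerSuggester {
  // Made-up rules; real patterns and handles would live in the repository's configuration.
  private static final Map<Pattern, String> RULES = new LinkedHashMap<>();
  static {
    RULES.put(Pattern.compile(".*\\.thrift$"), "thrift-reviewer");
    RULES.put(Pattern.compile(".*/parse/.*\\.g4?$"), "parser-reviewer");
  }

  // Collect the reviewers whose patterns match any of the changed files.
  static Set<String> suggest(Iterable<String> changedFiles) {
    Set<String> reviewers = new TreeSet<>();
    for (String file : changedFiles) {
      RULES.forEach((pattern, reviewer) -> {
        if (pattern.matcher(file).matches()) {
          reviewers.add(reviewer);
        }
      });
    }
    return reviewers;
  }
}
{code}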



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24525) Invite reviewers automatically by file name patterns

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24525:
--
Labels: pull-request-available  (was: )

> Invite reviewers automatically by file name patterns
> 
>
> Key: HIVE-24525
> URL: https://issues.apache.org/jira/browse/HIVE-24525
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I wrote about this in an 
> [email|http://mail-archives.apache.org/mod_mbox/hive-dev/202006.mbox/%3c324a0a23-5841-09fe-a993-1a095035e...@rxd.hu%3e]
>  a long time ago...
> it could help in keeping an eye on some specific parts, e.g. thrift and 
> parser changes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-24525) Invite reviewers automatically by file name patterns

2020-12-11 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich reassigned HIVE-24525:
---


> Invite reviewers automatically by file name patterns
> 
>
> Key: HIVE-24525
> URL: https://issues.apache.org/jira/browse/HIVE-24525
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>
> I wrote about this in an 
> [email|http://mail-archives.apache.org/mod_mbox/hive-dev/202006.mbox/%3c324a0a23-5841-09fe-a993-1a095035e...@rxd.hu%3e]
>  a long time ago...
> it could help in keeping an eye on some specific parts, e.g. thrift and 
> parser changes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24274) Implement Query Text based MaterializedView rewrite

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24274?focusedWorklogId=523142&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523142
 ]

ASF GitHub Bot logged work on HIVE-24274:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 10:49
Start Date: 11/Dec/20 10:49
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #1706:
URL: https://github.com/apache/hive/pull/1706#discussion_r540858913



##
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java
##
@@ -2349,6 +2362,50 @@ private RelNode 
applyMaterializedViewRewriting(RelOptPlanner planner, RelNode ba
   return basePlan;
 }
 
+private RelOptMaterialization 
copyMaterializationToNewCluster(RelOptCluster optCluster, RelOptMaterialization 
materialization) {
+  final RelNode viewScan = materialization.tableRel;
+  final RelNode newViewScan = HiveMaterializedViewUtils.copyNodeNewCluster(
+  optCluster, viewScan);
+  return new RelOptMaterialization(newViewScan, materialization.queryRel, 
null,
+  materialization.qualifiedTableName);
+}
+
+private boolean isMaterializedViewRewritingByTextEnabled() {
+  return 
conf.getBoolVar(ConfVars.HIVE_MATERIALIZED_VIEW_ENABLE_AUTO_REWRITING_QUERY_TEXT)
 &&
+  mvRebuildMode == MaterializationRebuildMode.NONE &&
+  !getQB().isMaterializedView() && 
!ctx.isLoadingMaterializedView() && !getQB().isCTAS() &&
+  getQB().getIsQuery() &&
+  getQB().hasTableDefined();
+}
+
+private RelNode applyMaterializedViewRewritingByText(RelNode 
calciteGenPlan, RelOptCluster optCluster) {
+  unparseTranslator.applyTranslations(ctx.getTokenRewriteStream(), 
EXPANDED_QUERY_TOKEN_REWRITE_PROGRAM);
+  String expandedQueryText = ctx.getTokenRewriteStream()
+  .toString(EXPANDED_QUERY_TOKEN_REWRITE_PROGRAM, 
ast.getTokenStartIndex(), ast.getTokenStopIndex());
+  try {
+List relOptMaterializationList = 
db.getMaterialization(
+expandedQueryText, getTablesUsed(calciteGenPlan), getTxnMgr());
+for (RelOptMaterialization relOptMaterialization : 
relOptMaterializationList) {
+  try {
+Table hiveTableMD = extractTable(relOptMaterialization);
+if (db.validateMaterializedViewsFromRegistry(

Review comment:
   Added a check and tests.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523142)
Time Spent: 2h 50m  (was: 2h 40m)

> Implement Query Text based MaterializedView rewrite
> ---
>
> Key: HIVE-24274
> URL: https://issues.apache.org/jira/browse/HIVE-24274
> Project: Hive
>  Issue Type: Improvement
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Besides the way queries are currently rewritten to use materialized views in 
> Hive, this project provides an alternative:
> Compare the query text with the stored query text of the materialized views. If 
> a match is found, the original query's logical plan can be replaced by a scan on 
> the materialized view.
> - Only materialized views that are enabled for rewriting can participate.
> - Use the existing *HiveMaterializedViewsRegistry* through the *Hive* object by 
> adding a lookup method by query text.
> - There might be more than one materialized view with the same query 
> text. In this case choose the first valid one.
> - Validation can be done by calling 
> *Hive.validateMaterializedViewsFromRegistry()*.
> - The scope of this first patch is limited to queries whose entire text can be 
> matched.
> - Use the expanded query text (fully qualified column and table names) for 
> the comparison.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23728) Run metastore verification tests during precommit

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23728?focusedWorklogId=523133&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523133
 ]

ASF GitHub Bot logged work on HIVE-23728:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 10:30
Start Date: 11/Dec/20 10:30
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk opened a new pull request #1154:
URL: https://github.com/apache/hive/pull/1154


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523133)
Time Spent: 1h  (was: 50m)

> Run metastore verification tests during precommit
> -
>
> Key: HIVE-23728
> URL: https://issues.apache.org/jira/browse/HIVE-23728
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24503) Optimize vector row serde by avoiding type check at run time

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24503?focusedWorklogId=523129&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523129
 ]

ASF GitHub Bot logged work on HIVE-24503:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 10:25
Start Date: 11/Dec/20 10:25
Worklog Time Spent: 10m 
  Work Description: pgaref commented on a change in pull request #1753:
URL: https://github.com/apache/hive/pull/1753#discussion_r540839969



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorSerializeRow.java
##
@@ -328,6 +345,13 @@ private void serializeUnionWrite(
 serializeWrite.finishUnion();
   }
 
+  class VectorSerializeStructWriter extends VectorSerializeWriter {
+@Override
+void serialize(Object colInfo, Field field, int adjustedBatchIndex) throws 
IOException {
+  serializeStructWrite((StructColumnVector)colInfo, field, 
adjustedBatchIndex);
+}
+  }
+
   private void serializeStructWrite(

Review comment:
   Nit: it would be cleaner if the serializeStructWrite method were part of the 
VectorSerializeStructWriter class.

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorSerializeRow.java
##
@@ -355,6 +379,13 @@ private void serializeStructWrite(
 serializeWrite.finishStruct();
   }
 
+  class VectorSerializeMapWriter extends VectorSerializeWriter {
+@Override
+void serialize(Object colInfo, Field field, int adjustedBatchIndex) throws 
IOException {
+  serializeMapWrite((MapColumnVector)colInfo, field, adjustedBatchIndex);
+}
+  }
+
   private void serializeMapWrite(

Review comment:
   Nit: should serializeMapWrite be part of the VectorSerializeMapWriter class?
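A minimal sketch of the layout both nits point at (illustrative only, not the actual Hive classes): each type-specific writer owns its serialization logic instead of delegating to a shared helper method.

```java
// Made-up class names; this only shows the shape of the suggested refactor.
abstract class WriterSketch {
  abstract void serialize(Object column, int batchIndex);
}

final class StructWriterSketch extends WriterSketch {
  @Override
  void serialize(Object column, int batchIndex) {
    // struct-specific write logic lives here, inside the subclass
  }
}

final class MapWriterSketch extends WriterSketch {
  @Override
  void serialize(Object column, int batchIndex) {
    // map-specific write logic lives here as well
  }
}
```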

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorDeserializeRow.java
##
@@ -564,173 +621,337 @@ public void init() throws HiveException {
 init(0);
   }
 
-  private void storePrimitiveRowColumn(ColumnVector colVector, Field field,
-  int batchIndex, boolean canRetainByteRef) throws IOException {
-
-switch (field.getPrimitiveCategory()) {
-case VOID:
+  class VectorVoidDeserializer extends VectorBatchDeserializer {
+@Override
+void store(ColumnVector colVector, Field field, int batchIndex, boolean 
canRetainByteRef)
+throws IOException {
   VectorizedBatchUtil.setNullColIsNullValue(colVector, batchIndex);
-  return;
-case BOOLEAN:
-  ((LongColumnVector) colVector).vector[batchIndex] = 
(deserializeRead.currentBoolean ? 1 : 0);
-  break;
-case BYTE:
+}
+
+@Override
+Object convert(ColumnVector batch, int batchIndex, Field field) throws 
IOException {
+  return convertVoid();
+}
+  }
+
+  class VectorBooleanDeserializer extends VectorBatchDeserializer {
+@Override
+void store(ColumnVector colVector, Field field, int batchIndex, boolean 
canRetainByteRef)
+throws IOException {
+  ((LongColumnVector) colVector).vector[batchIndex] =
+  (deserializeRead.currentBoolean ? 1 : 0);
+}
+
+@Override
+Object convert(ColumnVector batch, int batchIndex, Field field) throws 
IOException {
+  return convertBoolean(field.getConversionWritable());
+}
+  }
+
+  class VectorByteDeserializer extends VectorBatchDeserializer {
+@Override
+void store(ColumnVector colVector, Field field, int batchIndex, boolean 
canRetainByteRef)
+throws IOException {
   ((LongColumnVector) colVector).vector[batchIndex] = 
deserializeRead.currentByte;
-  break;
-case SHORT:
+}
+
+@Override
+Object convert(ColumnVector batch, int batchIndex, Field field) throws 
IOException {
+  return convertByte(field.getConversionWritable());
+}
+  }
+
+  class VectorShortDeserializer extends VectorBatchDeserializer {
+@Override
+void store(ColumnVector colVector, Field field, int batchIndex, boolean 
canRetainByteRef)
+throws IOException {
   ((LongColumnVector) colVector).vector[batchIndex] = 
deserializeRead.currentShort;
-  break;
-case INT:
+}
+
+@Override
+Object convert(ColumnVector batch, int batchIndex, Field field) throws 
IOException {
+  return convertShort(field.getConversionWritable());
+}
+  }
+
+  class VectorIntDeserializer extends VectorBatchDeserializer {
+@Override
+void store(ColumnVector colVector, Field field, int batchIndex, boolean 
canRetainByteRef)
+throws IOException {
   ((LongColumnVector) colVector).vector[batchIndex] = 
deserializeRead.currentInt;
-  break;
-case LONG:
+}
+
+@Override
+Object convert(ColumnVector batch, int batchIndex, Field field) throws 
IOException {
+  return convertInt(field.getConversionWritable());
+}
+  }
+
+  class VectorLongDeserializer extends VectorBatchDeserializer {
+@Override
+void store(ColumnVector

[jira] [Work logged] (HIVE-24474) Failed compaction always logs TxnAbortedException (again)

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24474?focusedWorklogId=523128&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523128
 ]

ASF GitHub Bot logged work on HIVE-24474:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 10:20
Start Date: 11/Dec/20 10:20
Worklog Time Spent: 10m 
  Work Description: klcopp commented on a change in pull request #1735:
URL: https://github.com/apache/hive/pull/1735#discussion_r540841418



##
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java
##
@@ -579,7 +567,7 @@ protected Boolean findNextCompactionAndExecute(boolean 
computeStats) throws Inte
 } catch (Throwable t) {
   LOG.error("Caught an exception in the main loop of compactor worker " + 
workerName, t);
 } finally {
-  commitTxnIfSet(compactorTxnId);
+  compactionTxn.commit();

Review comment:
   If compaction failed, it should be aborted before this. 
CompactionTxn#commit will only commit the txn if it's still open.
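To make that concrete, a minimal sketch (not the real CompactionTxn) of a transaction wrapper whose commit() only acts while the txn is still open:

```java
// Illustrative sketch only; the state machine and names are assumptions, not
// the actual Hive implementation.
class CompactionTxnSketch implements AutoCloseable {
  private enum State { NONE, OPEN, COMMITTED, ABORTED }
  private State state = State.NONE;
  private long txnId = -1;

  void open(long newTxnId) {
    txnId = newTxnId;
    state = State.OPEN;
  }

  void abort() {
    if (state == State.OPEN) {
      state = State.ABORTED; // abort the metastore txn here
    }
  }

  // Only commits while the txn is still open, so a finally-block call after an
  // abort no longer triggers a TxnAbortedException.
  void commit() {
    if (state == State.OPEN) {
      state = State.COMMITTED; // commit the metastore txn here
    }
  }

  @Override
  public void close() {
    commit();
  }

  long getTxnId() {
    return txnId;
  }
}
```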





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523128)
Time Spent: 1h 20m  (was: 1h 10m)

> Failed compaction always logs TxnAbortedException (again)
> -
>
> Key: HIVE-24474
> URL: https://issues.apache.org/jira/browse/HIVE-24474
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Re-introduced with HIVE-24096.
> If there is an error during compaction, the compaction's txn is aborted but 
> in the finally clause, we try to commit it (commitTxnIfSet), so Worker throws 
> a TxnAbortedException.
> We should set compactorTxnId to TXN_ID_NOT_SET if the compaction's txn is 
> aborted.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24488) Make docker host configurable for metastoredb/perf tests

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24488:
--
Labels: pull-request-available  (was: )

> Make docker host configurable for metastoredb/perf tests
> 
>
> Key: HIVE-24488
> URL: https://issues.apache.org/jira/browse/HIVE-24488
> Project: Hive
>  Issue Type: Improvement
>  Components: Test
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I tend to develop patches inside containers (hive-dev-box) so that I can work 
> on multiple patches in parallel.
> Running tests that use docker has always been a bit problematic for me - when 
> I wanted to do it before, I manually exposed /var/lib/docker and added a 
> rinetd forward by hand (which is not nice).
> The current move to run Perf tests against a dockerized metastore as well 
> exposes this problem a bit more for me.
> I'm also considering adding the ability to use minikube with hive-dev-box, 
> but that still needs exploring.
> It would be much easier to expose the address of the docker host I'm using...



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24474) Failed compaction always logs TxnAbortedException (again)

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24474?focusedWorklogId=523127&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523127
 ]

ASF GitHub Bot logged work on HIVE-24474:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 10:17
Start Date: 11/Dec/20 10:17
Worklog Time Spent: 10m 
  Work Description: klcopp commented on a change in pull request #1735:
URL: https://github.com/apache/hive/pull/1735#discussion_r540839467



##
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java
##
@@ -556,20 +544,20 @@ protected Boolean findNextCompactionAndExecute(boolean 
computeStats) throws Inte
 verifyTableIdHasNotChanged(ci, t1);
 
 LOG.info("Completed " + ci.type.toString() + " compaction for " + 
ci.getFullPartitionName() + " in txn "
-+ JavaUtils.txnIdToString(compactorTxnId) + ", marking as 
compacted.");
++ JavaUtils.txnIdToString(compactionTxn.getTxnId()) + ", marking 
as compacted.");

Review comment:
   Done

##
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java
##
@@ -483,12 +471,12 @@ protected Boolean findNextCompactionAndExecute(boolean 
computeStats) throws Inte
* multiple statements in it (for query based compactor) which is not 
supported (and since
* this case some of the statements are DDL, even in the future will not 
be allowed in a
* multi-stmt txn. {@link Driver#setCompactionWriteIds(ValidWriteIdList, 
long)} */
-  compactorTxnId = msc.openTxn(ci.runAs, TxnType.COMPACTION);
+  compactionTxn.open(ci);
 
-  heartbeater = new CompactionHeartbeater(compactorTxnId, fullTableName, 
conf);
+  heartbeater = new CompactionHeartbeater(compactionTxn.getTxnId(), 
fullTableName, conf);

Review comment:
   Done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523127)
Time Spent: 1h 10m  (was: 1h)

> Failed compaction always logs TxnAbortedException (again)
> -
>
> Key: HIVE-24474
> URL: https://issues.apache.org/jira/browse/HIVE-24474
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Re-introduced with HIVE-24096.
> If there is an error during compaction, the compaction's txn is aborted but 
> in the finally clause, we try to commit it (commitTxnIfSet), so Worker throws 
> a TxnAbortedException.
> We should set compactorTxnId to TXN_ID_NOT_SET if the compaction's txn is 
> aborted.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24488) Make docker host configurable for metastoredb/perf tests

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24488?focusedWorklogId=523126&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523126
 ]

ASF GitHub Bot logged work on HIVE-24488:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 10:17
Start Date: 11/Dec/20 10:17
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk opened a new pull request #1766:
URL: https://github.com/apache/hive/pull/1766


   bugfixes:
   * force remove the container prior to starting the test
   * also remove the volume on deletion (tpcds30tb left a ~600M garbage volume 
behind on each run)
   
   additions:
   * add option to enable specifying the docker host address (enables to run 
them from hive-dev-box containers)
   * container name suffix support (needed to possibly run multiple instance of 
the same test at the same time)
   
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523126)
Remaining Estimate: 0h
Time Spent: 10m

> Make docker host configurable for metastoredb/perf tests
> 
>
> Key: HIVE-24488
> URL: https://issues.apache.org/jira/browse/HIVE-24488
> Project: Hive
>  Issue Type: Improvement
>  Components: Test
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I tend to develop patches inside containers (hive-dev-box) so that I can work 
> on multiple patches in parallel.
> Running tests that use docker has always been a bit problematic for me - when 
> I wanted to do it before, I manually exposed /var/lib/docker and added a 
> rinetd forward by hand (which is not nice).
> The current move to run Perf tests against a dockerized metastore as well 
> exposes this problem a bit more for me.
> I'm also considering adding the ability to use minikube with hive-dev-box, 
> but that still needs exploring.
> It would be much easier to expose the address of the docker host I'm using...



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24523) Vectorized read path for LazySimpleSerde does not honor the SERDEPROPERTIES for timestamp

2020-12-11 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-24523:
--
Affects Version/s: 4.0.0

> Vectorized read path for LazySimpleSerde does not honor the SERDEPROPERTIES 
> for timestamp
> -
>
> Key: HIVE-24523
> URL: https://issues.apache.org/jira/browse/HIVE-24523
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 3.2.0, 4.0.0
>Reporter: Rajkumar Singh
>Priority: Major
>
> Steps to repro:
> {code:java}
>   create external  table tstable(date_created timestamp)   ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'   WITH SERDEPROPERTIES ( 
>  'timestamp.formats'='MMddHHmmss') stored as textfile;
> cat sampledata 
> 2020120517
> hdfs dfs -put sampledata /warehouse/tablespace/external/hive/tstable
> {code}
> Disable fetch task conversion and run select * from tstable, which produces no 
> results; setting 
> hive.vectorized.use.vector.serde.deserialize=false returns the expected 
> output.
> While parsing the string to a timestamp, 
> https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/lazy/fast/LazySimpleDeserializeRead.java#L812
>  does not set the DateTimeFormatter, which results in an IllegalArgumentException 
> when parsing the timestamp through TimestampUtils.stringToTimestamp(strValue).
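For context, a small JDK-only sketch of the kind of pattern-based parsing the serde property is meant to enable (the pattern below is illustrative, since the property value in the repro looks truncated):

{code:java}
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

public class TimestampFormatSketch {
  public static void main(String[] args) {
    // Illustrative pattern for values like "2020120517"; the real serde
    // property in the repro may differ.
    DateTimeFormatter custom = new DateTimeFormatterBuilder()
        .appendPattern("yyyyMMddHH")
        .parseDefaulting(ChronoField.MINUTE_OF_HOUR, 0)
        .parseDefaulting(ChronoField.SECOND_OF_MINUTE, 0)
        .toFormatter();

    // The custom formatter accepts the sample value...
    System.out.println(LocalDateTime.parse("2020120517", custom));

    // ...while default ISO parsing rejects it, which is roughly what happens
    // when the vectorized path ignores the configured formats.
    try {
      LocalDateTime.parse("2020120517");
    } catch (Exception e) {
      System.out.println("default parse failed: " + e.getMessage());
    }
  }
}
{code}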



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-24498) Package facebook thrift classes into hive-exec jar

2020-12-11 Thread Peter Vary (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary resolved HIVE-24498.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Pushed to master.

Thanks for the patch [~Marton Bod]!

> Package facebook thrift classes into hive-exec jar
> --
>
> Key: HIVE-24498
> URL: https://issues.apache.org/jira/browse/HIVE-24498
> Project: Hive
>  Issue Type: Task
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When using Iceberg tables, the Tez AM needs to communicate with the HMS to 
> fetch table information in the SerDe and InputFormat operations. The libfb303 
> classes are currently missing from the hive-exec jar, which leaves Tez without 
> these classes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24498) Package facebook thrift classes into hive-exec jar

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24498?focusedWorklogId=523105&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523105
 ]

ASF GitHub Bot logged work on HIVE-24498:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 09:23
Start Date: 11/Dec/20 09:23
Worklog Time Spent: 10m 
  Work Description: pvary merged pull request #1751:
URL: https://github.com/apache/hive/pull/1751


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523105)
Time Spent: 20m  (was: 10m)

> Package facebook thrift classes into hive-exec jar
> --
>
> Key: HIVE-24498
> URL: https://issues.apache.org/jira/browse/HIVE-24498
> Project: Hive
>  Issue Type: Task
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When using Iceberg tables, the Tez AM needs to communicate with the HMS to 
> fetch table information in the SerDe and InputFormat operations. The libfb303 
> classes are currently missing from the hive-exec jar, which leaves Tez without 
> these classes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24474) Failed compaction always logs TxnAbortedException (again)

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24474?focusedWorklogId=523102&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523102
 ]

ASF GitHub Bot logged work on HIVE-24474:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 09:11
Start Date: 11/Dec/20 09:11
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #1735:
URL: https://github.com/apache/hive/pull/1735#discussion_r540797373



##
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java
##
@@ -579,7 +567,7 @@ protected Boolean findNextCompactionAndExecute(boolean 
computeStats) throws Inte
 } catch (Throwable t) {
   LOG.error("Caught an exception in the main loop of compactor worker " + 
workerName, t);
 } finally {
-  commitTxnIfSet(compactorTxnId);
+  compactionTxn.commit();

Review comment:
   Maybe commitOrAbort? Or close? This is not always a commit, if I understand 
correctly.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523102)
Time Spent: 1h  (was: 50m)

> Failed compaction always logs TxnAbortedException (again)
> -
>
> Key: HIVE-24474
> URL: https://issues.apache.org/jira/browse/HIVE-24474
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Re-introduced with HIVE-24096.
> If there is an error during compaction, the compaction's txn is aborted but 
> in the finally clause, we try to commit it (commitTxnIfSet), so Worker throws 
> a TxnAbortedException.
> We should set compactorTxnId to TXN_ID_NOT_SET if the compaction's txn is 
> aborted.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24474) Failed compaction always logs TxnAbortedException (again)

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24474?focusedWorklogId=523100&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523100
 ]

ASF GitHub Bot logged work on HIVE-24474:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 09:10
Start Date: 11/Dec/20 09:10
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #1735:
URL: https://github.com/apache/hive/pull/1735#discussion_r540796707



##
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java
##
@@ -556,20 +544,20 @@ protected Boolean findNextCompactionAndExecute(boolean 
computeStats) throws Inte
 verifyTableIdHasNotChanged(ci, t1);
 
 LOG.info("Completed " + ci.type.toString() + " compaction for " + 
ci.getFullPartitionName() + " in txn "
-+ JavaUtils.txnIdToString(compactorTxnId) + ", marking as 
compacted.");
++ JavaUtils.txnIdToString(compactionTxn.getTxnId()) + ", marking 
as compacted.");

Review comment:
   Maybe toString for CompactionTxnId?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523100)
Time Spent: 50m  (was: 40m)

> Failed compaction always logs TxnAbortedException (again)
> -
>
> Key: HIVE-24474
> URL: https://issues.apache.org/jira/browse/HIVE-24474
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Re-introduced with HIVE-24096.
> If there is an error during compaction, the compaction's txn is aborted but 
> in the finally clause, we try to commit it (commitTxnIfSet), so Worker throws 
> a TxnAbortedException.
> We should set compactorTxnId to TXN_ID_NOT_SET if the compaction's txn is 
> aborted.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24474) Failed compaction always logs TxnAbortedException (again)

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24474?focusedWorklogId=523099&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523099
 ]

ASF GitHub Bot logged work on HIVE-24474:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 09:10
Start Date: 11/Dec/20 09:10
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #1735:
URL: https://github.com/apache/hive/pull/1735#discussion_r540796283



##
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java
##
@@ -483,12 +471,12 @@ protected Boolean findNextCompactionAndExecute(boolean 
computeStats) throws Inte
* multiple statements in it (for query based compactor) which is not 
supported (and since
* this case some of the statements are DDL, even in the future will not 
be allowed in a
* multi-stmt txn. {@link Driver#setCompactionWriteIds(ValidWriteIdList, 
long)} */
-  compactorTxnId = msc.openTxn(ci.runAs, TxnType.COMPACTION);
+  compactionTxn.open(ci);
 
-  heartbeater = new CompactionHeartbeater(compactorTxnId, fullTableName, 
conf);
+  heartbeater = new CompactionHeartbeater(compactionTxn.getTxnId(), 
fullTableName, conf);

Review comment:
   Maybe change CompactionHeartbeater to use the new class instead of the txn id?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523099)
Time Spent: 40m  (was: 0.5h)

> Failed compaction always logs TxnAbortedException (again)
> -
>
> Key: HIVE-24474
> URL: https://issues.apache.org/jira/browse/HIVE-24474
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Re-introduced with HIVE-24096.
> If there is an error during compaction, the compaction's txn is aborted but 
> in the finally clause, we try to commit it (commitTxnIfSet), so Worker throws 
> a TxnAbortedException.
> We should set compactorTxnId to TXN_ID_NOT_SET if the compaction's txn is 
> aborted.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24274) Implement Query Text based MaterializedView rewrite

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24274?focusedWorklogId=523095&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523095
 ]

ASF GitHub Bot logged work on HIVE-24274:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 09:05
Start Date: 11/Dec/20 09:05
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #1706:
URL: https://github.com/apache/hive/pull/1706#discussion_r540793497



##
File path: 
ql/src/test/results/clientpositive/llap/materialized_view_rewrite_by_text_2.q.out
##
@@ -0,0 +1,334 @@
+PREHOOK: query: create table cmv_basetable_n0 (a int, b varchar(256), c 
decimal(10,2), d int) stored as orc TBLPROPERTIES ('transactional'='true')
+PREHOOK: type: CREATETABLE
+PREHOOK: Output: database:default
+PREHOOK: Output: default@cmv_basetable_n0
+POSTHOOK: query: create table cmv_basetable_n0 (a int, b varchar(256), c 
decimal(10,2), d int) stored as orc TBLPROPERTIES ('transactional'='true')
+POSTHOOK: type: CREATETABLE
+POSTHOOK: Output: database:default
+POSTHOOK: Output: default@cmv_basetable_n0
+PREHOOK: query: insert into cmv_basetable_n0 values
+ (1, 'alfred', 10.30, 2),
+ (2, 'bob', 3.14, 3),
+ (2, 'bonnie', 172342.2, 3),
+ (3, 'calvin', 978.76, 3),
+ (3, 'charlie', 9.8, 1)
+PREHOOK: type: QUERY
+PREHOOK: Input: _dummy_database@_dummy_table
+PREHOOK: Output: default@cmv_basetable_n0
+POSTHOOK: query: insert into cmv_basetable_n0 values
+ (1, 'alfred', 10.30, 2),
+ (2, 'bob', 3.14, 3),
+ (2, 'bonnie', 172342.2, 3),
+ (3, 'calvin', 978.76, 3),
+ (3, 'charlie', 9.8, 1)
+POSTHOOK: type: QUERY
+POSTHOOK: Input: _dummy_database@_dummy_table
+POSTHOOK: Output: default@cmv_basetable_n0
+POSTHOOK: Lineage: cmv_basetable_n0.a SCRIPT []
+POSTHOOK: Lineage: cmv_basetable_n0.b SCRIPT []
+POSTHOOK: Lineage: cmv_basetable_n0.c SCRIPT []
+POSTHOOK: Lineage: cmv_basetable_n0.d SCRIPT []
+PREHOOK: query: create materialized view cmv_mat_view_n0
+as select a, b, c from cmv_basetable_n0 where a = 2
+PREHOOK: type: CREATE_MATERIALIZED_VIEW
+PREHOOK: Input: default@cmv_basetable_n0
+PREHOOK: Output: database:default
+PREHOOK: Output: default@cmv_mat_view_n0
+POSTHOOK: query: create materialized view cmv_mat_view_n0
+as select a, b, c from cmv_basetable_n0 where a = 2
+POSTHOOK: type: CREATE_MATERIALIZED_VIEW
+POSTHOOK: Input: default@cmv_basetable_n0
+POSTHOOK: Output: database:default
+POSTHOOK: Output: default@cmv_mat_view_n0
+PREHOOK: query: select * from cmv_mat_view_n0
+PREHOOK: type: QUERY
+PREHOOK: Input: default@cmv_mat_view_n0
+ A masked pattern was here 
+POSTHOOK: query: select * from cmv_mat_view_n0
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@cmv_mat_view_n0
+ A masked pattern was here 
+2  bob 3.14
+2  bonnie  172342.20
+PREHOOK: query: show tblproperties cmv_mat_view_n0
+PREHOOK: type: SHOW_TBLPROPERTIES
+POSTHOOK: query: show tblproperties cmv_mat_view_n0
+POSTHOOK: type: SHOW_TBLPROPERTIES
+COLUMN_STATS_ACCURATE  
{"BASIC_STATS":"true","COLUMN_STATS":{"a":"true","b":"true","c":"true"}}
+bucketing_version  2
+numFiles   1
+numFilesErasureCoded   0
+numRows2
+rawDataSize408
+totalSize  468
+ A masked pattern was here 
+PREHOOK: query: create materialized view if not exists cmv_mat_view2
+as select a, c from cmv_basetable_n0 where a = 3
+PREHOOK: type: CREATE_MATERIALIZED_VIEW
+PREHOOK: Input: default@cmv_basetable_n0
+PREHOOK: Output: database:default
+PREHOOK: Output: default@cmv_mat_view2
+POSTHOOK: query: create materialized view if not exists cmv_mat_view2
+as select a, c from cmv_basetable_n0 where a = 3
+POSTHOOK: type: CREATE_MATERIALIZED_VIEW
+POSTHOOK: Input: default@cmv_basetable_n0
+POSTHOOK: Output: database:default
+POSTHOOK: Output: default@cmv_mat_view2
+PREHOOK: query: select * from cmv_mat_view2
+PREHOOK: type: QUERY
+PREHOOK: Input: default@cmv_mat_view2
+ A masked pattern was here 
+POSTHOOK: query: select * from cmv_mat_view2
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@cmv_mat_view2
+ A masked pattern was here 
+3  9.80
+3  978.76
+PREHOOK: query: show tblproperties cmv_mat_view2
+PREHOOK: type: SHOW_TBLPROPERTIES
+POSTHOOK: query: show tblproperties cmv_mat_view2
+POSTHOOK: type: SHOW_TBLPROPERTIES
+COLUMN_STATS_ACCURATE  
{"BASIC_STATS":"true","COLUMN_STATS":{"a":"true","c":"true"}}
+bucketing_version  2
+numFiles   1
+numFilesErasureCoded   0
+numRows2
+rawDataSize232
+totalSize  334
+ A masked pattern was here 
+PREHOOK: query: explain
+select a, c from cmv_basetable_n0 where a = 3
+PREHOOK: type: QUERY
+PREHOOK: Input: default@cmv_basetable_n0
+PREHOOK: Input: default@cmv_mat_view2
+ A masked pattern was here 
+POSTHOOK: query: explain
+select a, c from cmv_basetable_n0 where a = 3
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@cmv_basetable_n0
+POSTHOOK: Input: d

[jira] [Work logged] (HIVE-24274) Implement Query Text based MaterializedView rewrite

2020-12-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24274?focusedWorklogId=523094&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523094
 ]

ASF GitHub Bot logged work on HIVE-24274:
-

Author: ASF GitHub Bot
Created on: 11/Dec/20 08:58
Start Date: 11/Dec/20 08:58
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #1706:
URL: https://github.com/apache/hive/pull/1706#discussion_r540789256



##
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
##
@@ -12565,15 +12566,17 @@ void analyzeInternal(ASTNode ast, 
Supplier pcf) throws SemanticE
 sinkOp = genOPTree(ast, plannerCtx);
 
 boolean usesMasking = false;

Review comment:
   Added tests.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 523094)
Time Spent: 2.5h  (was: 2h 20m)

> Implement Query Text based MaterializedView rewrite
> ---
>
> Key: HIVE-24274
> URL: https://issues.apache.org/jira/browse/HIVE-24274
> Project: Hive
>  Issue Type: Improvement
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Besides the way queries are currently rewritten to use materialized views in 
> Hive, this project provides an alternative:
> Compare the query text with the stored query text of the materialized views. If 
> a match is found, the original query's logical plan can be replaced by a scan on 
> the materialized view.
> - Only materialized views that are enabled for rewriting can participate.
> - Use the existing *HiveMaterializedViewsRegistry* through the *Hive* object by 
> adding a lookup method by query text.
> - There might be more than one materialized view with the same query 
> text. In this case choose the first valid one.
> - Validation can be done by calling 
> *Hive.validateMaterializedViewsFromRegistry()*.
> - The scope of this first patch is limited to queries whose entire text can be 
> matched.
> - Use the expanded query text (fully qualified column and table names) for 
> the comparison.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-24517) Add mapreduce workflow information to job configuration in MergeFileTask

2020-12-11 Thread Ning Sheng (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17247742#comment-17247742
 ] 

Ning Sheng commented on HIVE-24517:
---

[~sunchao] Please review this patch if you're free.

> Add mapreduce workflow information to job configuration in MergeFileTask
> 
>
> Key: HIVE-24517
> URL: https://issues.apache.org/jira/browse/HIVE-24517
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2
>Reporter: Ning Sheng
>Assignee: Ning Sheng
>Priority: Blocker
> Fix For: 3.1.2
>
> Attachments: HIVE-24517.branch-3.1.path
>
>
> When using tools to visualize and monitor multiple MR jobs, MergeFileTask's 
> mapreduce.workflow.node.name is the previous stage's name.
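For reference, the workflow metadata consists of plain job configuration keys; a hedged sketch of setting them on the merge job follows (only mapreduce.workflow.node.name is taken from the issue; the other keys and all values are illustrative):

{code:java}
import org.apache.hadoop.mapred.JobConf;

// Illustrative only: shows where the workflow metadata lives, not the actual
// MergeFileTask change. All values below are made up.
public class WorkflowNodeNameSketch {
  public static void main(String[] args) {
    JobConf job = new JobConf();
    // Monitoring tools read these keys to stitch multiple MR jobs into one workflow.
    job.set("mapreduce.workflow.id", "hive_query_12345");
    job.set("mapreduce.workflow.name", "example-hive-query");
    // The issue: the merge job keeps the previous stage's node name instead of
    // getting one of its own (e.g. a dedicated merge-stage name).
    job.set("mapreduce.workflow.node.name", "Stage-3");
  }
}
{code}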



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24524) LLAP ShuffleHandler: upgrade to netty4

2020-12-11 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-24524:

Summary: LLAP ShuffleHandler: upgrade to netty4  (was: ShuffleHandler: 
upgrade to netty4)

> LLAP ShuffleHandler: upgrade to netty4
> --
>
> Key: HIVE-24524
> URL: https://issues.apache.org/jira/browse/HIVE-24524
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> Tez already has a WIP patch for upgrading its shuffle handler to netty4. 
> Netty4 is said to offer a possible performance improvement over Netty3. 
> However, the refactor is not trivial; TEZ-4157 covers that more or less (the 
> code bases are very similar).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24524) ShuffleHandler: upgrade to netty4

2020-12-11 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-24524:

Description: Tez already has a WIP patch for upgrading its shuffle handler 
to netty4. Netty4 is said to offer a possible performance improvement over 
Netty3. However, the refactor is not trivial; TEZ-4157 covers that more or less 
(the code bases are very similar).

> ShuffleHandler: upgrade to netty4
> -
>
> Key: HIVE-24524
> URL: https://issues.apache.org/jira/browse/HIVE-24524
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> Tez already has a WIP patch for upgrading its shuffle handler to netty4. 
> Netty4 is said to offer a possible performance improvement over Netty3. 
> However, the refactor is not trivial; TEZ-4157 covers that more or less (the 
> code bases are very similar).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24524) LLAP ShuffleHandler: upgrade to netty4

2020-12-11 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-24524:

Description: 
Tez already has a WIP patch for upgrading its shuffle handler to netty4. Netty4 
is said to offer a possible performance improvement over Netty3. However, 
the refactor is not trivial; TEZ-4157 covers that more or less (the code bases 
are very similar).

Background:
netty4 migration guideline: https://netty.io/wiki/new-and-noteworthy-in-4.0.html
articles on the possible performance improvement:
https://blog.twitter.com/engineering/en_us/a/2013/netty-4-at-twitter-reduced-gc-overhead.html
https://developer.squareup.com/blog/upgrading-a-reverse-proxy-from-netty-3-to-4/

  was:Tez already has a WIP patch for upgrading its shuffle handler to netty4. 
Netty4 is said to offer a possible performance improvement over Netty3. 
However, the refactor is not trivial; TEZ-4157 covers that more or less (the 
code bases are very similar).


> LLAP ShuffleHandler: upgrade to netty4
> --
>
> Key: HIVE-24524
> URL: https://issues.apache.org/jira/browse/HIVE-24524
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> Tez already has a WIP patch for upgrading its shuffle handler to netty4. 
> Netty4 is said to offer a possible performance improvement over Netty3. 
> However, the refactor is not trivial; TEZ-4157 covers that more or less (the 
> code bases are very similar).
> Background:
> netty4 migration guideline: 
> https://netty.io/wiki/new-and-noteworthy-in-4.0.html
> articles on the possible performance improvement:
> https://blog.twitter.com/engineering/en_us/a/2013/netty-4-at-twitter-reduced-gc-overhead.html
> https://developer.squareup.com/blog/upgrading-a-reverse-proxy-from-netty-3-to-4/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)