[jira] [Work logged] (HIVE-24270) Move scratchdir cleanup to background

2020-10-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24270?focusedWorklogId=506427=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-506427
 ]

ASF GitHub Bot logged work on HIVE-24270:
-

Author: ASF GitHub Bot
Created on: 29/Oct/20 21:48
Start Date: 29/Oct/20 21:48
Worklog Time Spent: 10m 
  Work Description: mustafaiman opened a new pull request #1627:
URL: https://github.com/apache/hive/pull/1627


   …e cases
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 506427)
Time Spent: 2.5h  (was: 2h 20m)

> Move scratchdir cleanup to background
> -
>
> Key: HIVE-24270
> URL: https://issues.apache.org/jira/browse/HIVE-24270
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mustafa Iman
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> In cloud environment, scratchdir cleaning at the end of the query may take 
> long time. This causes client to hang up to 1 minute even after the results 
> were streamed back. During this time client just waits for cleanup to finish. 
> Cleanup can take place in the background in HiveServer.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24270) Move scratchdir cleanup to background

2020-10-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24270?focusedWorklogId=505862=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505862
 ]

ASF GitHub Bot logged work on HIVE-24270:
-

Author: ASF GitHub Bot
Created on: 28/Oct/20 18:48
Start Date: 28/Oct/20 18:48
Worklog Time Spent: 10m 
  Work Description: mustafaiman closed pull request #1577:
URL: https://github.com/apache/hive/pull/1577


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 505862)
Time Spent: 2h 20m  (was: 2h 10m)

> Move scratchdir cleanup to background
> -
>
> Key: HIVE-24270
> URL: https://issues.apache.org/jira/browse/HIVE-24270
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mustafa Iman
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> In cloud environment, scratchdir cleaning at the end of the query may take 
> long time. This causes client to hang up to 1 minute even after the results 
> were streamed back. During this time client just waits for cleanup to finish. 
> Cleanup can take place in the background in HiveServer.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24270) Move scratchdir cleanup to background

2020-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24270?focusedWorklogId=504904=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504904
 ]

ASF GitHub Bot logged work on HIVE-24270:
-

Author: ASF GitHub Bot
Created on: 26/Oct/20 20:16
Start Date: 26/Oct/20 20:16
Worklog Time Spent: 10m 
  Work Description: mustafaiman commented on a change in pull request #1577:
URL: https://github.com/apache/hive/pull/1577#discussion_r512240902



##
File path: 
service/src/java/org/apache/hive/service/cli/session/SessionManager.java
##
@@ -135,6 +138,13 @@ public synchronized void init(HiveConf hiveConf) {
 userIpAddressLimit = 
hiveConf.getIntVar(ConfVars.HIVE_SERVER2_LIMIT_CONNECTIONS_PER_USER_IPADDRESS);
 LOG.info("Connections limit are user: {} ipaddress: {} user-ipaddress: 
{}", userLimit, ipAddressLimit,
   userIpAddressLimit);
+
+int cleanupThreadCount = 
hiveConf.getIntVar(ConfVars.HIVE_ASYNC_CLEANUP_SERVICE_THREAD_COUNT);
+int cleanupQueueSize = 
hiveConf.getIntVar(ConfVars.HIVE_ASYNC_CLEANUP_SERVICE_QUEUE_SIZE);
+if (cleanupThreadCount > 0) {
+  cleanupService = new EventualCleanupService(cleanupThreadCount, 
cleanupQueueSize);
+}
+cleanupService.start();

Review comment:
   fixed





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504904)
Time Spent: 2h 10m  (was: 2h)

> Move scratchdir cleanup to background
> -
>
> Key: HIVE-24270
> URL: https://issues.apache.org/jira/browse/HIVE-24270
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mustafa Iman
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> In cloud environment, scratchdir cleaning at the end of the query may take 
> long time. This causes client to hang up to 1 minute even after the results 
> were streamed back. During this time client just waits for cleanup to finish. 
> Cleanup can take place in the background in HiveServer.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24270) Move scratchdir cleanup to background

2020-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24270?focusedWorklogId=504820=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504820
 ]

ASF GitHub Bot logged work on HIVE-24270:
-

Author: ASF GitHub Bot
Created on: 26/Oct/20 17:32
Start Date: 26/Oct/20 17:32
Worklog Time Spent: 10m 
  Work Description: mustafaiman commented on a change in pull request #1577:
URL: https://github.com/apache/hive/pull/1577#discussion_r512144358



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/cleanup/EventualCleanupService.java
##
@@ -0,0 +1,145 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.cleanup;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.util.concurrent.ThreadFactoryBuilder;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.ThreadFactory;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicBoolean;
+
+public class EventualCleanupService implements CleanupService {
+  private final int threadCount;
+  private final int queueSize;
+  private final ThreadFactory factory;
+  private final Logger LOG = 
LoggerFactory.getLogger(EventualCleanupService.class.getName());
+  private final AtomicBoolean isRunning = new AtomicBoolean(true);
+  private final BlockingQueue deleteActions;
+  private ExecutorService cleanerExecutorService;
+
+  public EventualCleanupService(int threadCount, int queueSize) {
+if (queueSize < threadCount) {
+  throw new IllegalArgumentException("Queue size should be greater or 
equal to thread count. Queue size: "
+  + queueSize + ", thread count: " + threadCount);
+}
+this.factory = new 
ThreadFactoryBuilder().setDaemon(true).setNameFormat("EventualCleanupService 
thread %d").build();
+this.threadCount = threadCount;
+this.queueSize = queueSize;
+this.deleteActions = new LinkedBlockingQueue<>(queueSize);
+  }
+
+  @Override
+  public synchronized void start() {
+if (cleanerExecutorService != null) {
+  LOG.debug("EventualCleanupService is already running.");
+  return;
+}
+cleanerExecutorService = Executors.newFixedThreadPool(threadCount, 
factory);
+for (int i = 0; i < threadCount; i++) {
+  cleanerExecutorService.submit(new CleanupRunnable());

Review comment:
   CleanupService is one per HiveServer, not per session. So there will be 
N threads running all the time (actually blocked on blocking queue when there 
is no cleanup to be done, so not scheduled when unncessary).





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504820)
Time Spent: 2h  (was: 1h 50m)

> Move scratchdir cleanup to background
> -
>
> Key: HIVE-24270
> URL: https://issues.apache.org/jira/browse/HIVE-24270
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mustafa Iman
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> In cloud environment, scratchdir cleaning at the end of the query may take 
> long time. This causes client to hang up to 1 minute even after the results 
> were streamed back. During this time client just waits for cleanup to finish. 
> Cleanup can take place in the background in HiveServer.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24270) Move scratchdir cleanup to background

2020-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24270?focusedWorklogId=504814=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504814
 ]

ASF GitHub Bot logged work on HIVE-24270:
-

Author: ASF GitHub Bot
Created on: 26/Oct/20 17:21
Start Date: 26/Oct/20 17:21
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on a change in pull request #1577:
URL: https://github.com/apache/hive/pull/1577#discussion_r512136737



##
File path: 
service/src/java/org/apache/hive/service/cli/session/SessionManager.java
##
@@ -135,6 +138,13 @@ public synchronized void init(HiveConf hiveConf) {
 userIpAddressLimit = 
hiveConf.getIntVar(ConfVars.HIVE_SERVER2_LIMIT_CONNECTIONS_PER_USER_IPADDRESS);
 LOG.info("Connections limit are user: {} ipaddress: {} user-ipaddress: 
{}", userLimit, ipAddressLimit,
   userIpAddressLimit);
+
+int cleanupThreadCount = 
hiveConf.getIntVar(ConfVars.HIVE_ASYNC_CLEANUP_SERVICE_THREAD_COUNT);
+int cleanupQueueSize = 
hiveConf.getIntVar(ConfVars.HIVE_ASYNC_CLEANUP_SERVICE_QUEUE_SIZE);
+if (cleanupThreadCount > 0) {
+  cleanupService = new EventualCleanupService(cleanupThreadCount, 
cleanupQueueSize);
+}
+cleanupService.start();

Review comment:
   it seems to me that `cleanupService` is only initialized in case 
`cleanupThreadCount>0` - wouldn't there be an `NPE` when that is not satisfied?

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/cleanup/EventualCleanupService.java
##
@@ -0,0 +1,145 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.cleanup;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.util.concurrent.ThreadFactoryBuilder;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.ThreadFactory;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicBoolean;
+
+public class EventualCleanupService implements CleanupService {
+  private final int threadCount;
+  private final int queueSize;
+  private final ThreadFactory factory;
+  private final Logger LOG = 
LoggerFactory.getLogger(EventualCleanupService.class.getName());
+  private final AtomicBoolean isRunning = new AtomicBoolean(true);
+  private final BlockingQueue deleteActions;
+  private ExecutorService cleanerExecutorService;
+
+  public EventualCleanupService(int threadCount, int queueSize) {
+if (queueSize < threadCount) {
+  throw new IllegalArgumentException("Queue size should be greater or 
equal to thread count. Queue size: "
+  + queueSize + ", thread count: " + threadCount);
+}
+this.factory = new 
ThreadFactoryBuilder().setDaemon(true).setNameFormat("EventualCleanupService 
thread %d").build();
+this.threadCount = threadCount;
+this.queueSize = queueSize;
+this.deleteActions = new LinkedBlockingQueue<>(queueSize);
+  }
+
+  @Override
+  public synchronized void start() {
+if (cleanerExecutorService != null) {
+  LOG.debug("EventualCleanupService is already running.");
+  return;
+}
+cleanerExecutorService = Executors.newFixedThreadPool(threadCount, 
factory);
+for (int i = 0; i < threadCount; i++) {
+  cleanerExecutorService.submit(new CleanupRunnable());

Review comment:
   I might be wrong but:
   * I think these threads will be there all the time - even if there is no 
work to be done (let's use `N` for this)
   * it seems to me that every session will have these cleaner threads for them 
(current session count: `M`)
   
   ...so there will be `N*M` threads running all the time?
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above 

[jira] [Work logged] (HIVE-24270) Move scratchdir cleanup to background

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24270?focusedWorklogId=504332=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504332
 ]

ASF GitHub Bot logged work on HIVE-24270:
-

Author: ASF GitHub Bot
Created on: 23/Oct/20 21:36
Start Date: 23/Oct/20 21:36
Worklog Time Spent: 10m 
  Work Description: nareshpr commented on pull request #1577:
URL: https://github.com/apache/hive/pull/1577#issuecomment-715600895


   LGTM



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504332)
Time Spent: 1h 40m  (was: 1.5h)

> Move scratchdir cleanup to background
> -
>
> Key: HIVE-24270
> URL: https://issues.apache.org/jira/browse/HIVE-24270
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mustafa Iman
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> In cloud environment, scratchdir cleaning at the end of the query may take 
> long time. This causes client to hang up to 1 minute even after the results 
> were streamed back. During this time client just waits for cleanup to finish. 
> Cleanup can take place in the background in HiveServer.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24270) Move scratchdir cleanup to background

2020-10-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24270?focusedWorklogId=504250=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504250
 ]

ASF GitHub Bot logged work on HIVE-24270:
-

Author: ASF GitHub Bot
Created on: 23/Oct/20 16:56
Start Date: 23/Oct/20 16:56
Worklog Time Spent: 10m 
  Work Description: mustafaiman commented on pull request #1577:
URL: https://github.com/apache/hive/pull/1577#issuecomment-715459208


   @kgyrtkirk @nareshpr I significantly changed the patch. Please let me know 
if you have further concerns.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504250)
Time Spent: 1.5h  (was: 1h 20m)

> Move scratchdir cleanup to background
> -
>
> Key: HIVE-24270
> URL: https://issues.apache.org/jira/browse/HIVE-24270
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mustafa Iman
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> In cloud environment, scratchdir cleaning at the end of the query may take 
> long time. This causes client to hang up to 1 minute even after the results 
> were streamed back. During this time client just waits for cleanup to finish. 
> Cleanup can take place in the background in HiveServer.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24270) Move scratchdir cleanup to background

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24270?focusedWorklogId=503523=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503523
 ]

ASF GitHub Bot logged work on HIVE-24270:
-

Author: ASF GitHub Bot
Created on: 22/Oct/20 04:00
Start Date: 22/Oct/20 04:00
Worklog Time Spent: 10m 
  Work Description: rbalamohan commented on pull request #1577:
URL: https://github.com/apache/hive/pull/1577#issuecomment-714206578


   Thanks for revising the patch @mustafaiman . Recent patch LGTM. +1 pending 
tests.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 503523)
Time Spent: 1h 20m  (was: 1h 10m)

> Move scratchdir cleanup to background
> -
>
> Key: HIVE-24270
> URL: https://issues.apache.org/jira/browse/HIVE-24270
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mustafa Iman
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> In cloud environment, scratchdir cleaning at the end of the query may take 
> long time. This causes client to hang up to 1 minute even after the results 
> were streamed back. During this time client just waits for cleanup to finish. 
> Cleanup can take place in the background in HiveServer.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24270) Move scratchdir cleanup to background

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24270?focusedWorklogId=503203=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503203
 ]

ASF GitHub Bot logged work on HIVE-24270:
-

Author: ASF GitHub Bot
Created on: 21/Oct/20 14:00
Start Date: 21/Oct/20 14:00
Worklog Time Spent: 10m 
  Work Description: mustafaiman commented on a change in pull request #1577:
URL: https://github.com/apache/hive/pull/1577#discussion_r509312017



##
File path: ql/src/java/org/apache/hadoop/hive/ql/DriverUtils.java
##
@@ -95,7 +95,7 @@ public static SessionState setUpSessionState(HiveConf conf, 
String user, boolean
 if (sessionState == null) {
   // Note: we assume that workers run on the same threads repeatedly, so 
we can set up
   //   the session here and it will be reused without explicitly 
storing in the worker.
-  sessionState = new SessionState(conf, user);
+  sessionState = new SessionState(conf, user, true);

Review comment:
   background threads do not need async delete. Many compaction tests 
specifically have sync assumptions. I dont see any benefit in moving background 
operations to async cleanup model.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 503203)
Time Spent: 1h 10m  (was: 1h)

> Move scratchdir cleanup to background
> -
>
> Key: HIVE-24270
> URL: https://issues.apache.org/jira/browse/HIVE-24270
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mustafa Iman
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> In cloud environment, scratchdir cleaning at the end of the query may take 
> long time. This causes client to hang up to 1 minute even after the results 
> were streamed back. During this time client just waits for cleanup to finish. 
> Cleanup can take place in the background in HiveServer.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24270) Move scratchdir cleanup to background

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24270?focusedWorklogId=503046=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503046
 ]

ASF GitHub Bot logged work on HIVE-24270:
-

Author: ASF GitHub Bot
Created on: 21/Oct/20 07:00
Start Date: 21/Oct/20 07:00
Worklog Time Spent: 10m 
  Work Description: nareshpr commented on a change in pull request #1577:
URL: https://github.com/apache/hive/pull/1577#discussion_r509033974



##
File path: ql/src/java/org/apache/hadoop/hive/ql/DriverUtils.java
##
@@ -95,7 +95,7 @@ public static SessionState setUpSessionState(HiveConf conf, 
String user, boolean
 if (sessionState == null) {
   // Note: we assume that workers run on the same threads repeatedly, so 
we can set up
   //   the session here and it will be reused without explicitly 
storing in the worker.
-  sessionState = new SessionState(conf, user);
+  sessionState = new SessionState(conf, user, true);

Review comment:
   Are we targeting specific queries like auto-gather background stats 
threads & compaction ? why are we not providing a config to toggle sync/async 
delete ?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 503046)
Time Spent: 1h  (was: 50m)

> Move scratchdir cleanup to background
> -
>
> Key: HIVE-24270
> URL: https://issues.apache.org/jira/browse/HIVE-24270
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mustafa Iman
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In cloud environment, scratchdir cleaning at the end of the query may take 
> long time. This causes client to hang up to 1 minute even after the results 
> were streamed back. During this time client just waits for cleanup to finish. 
> Cleanup can take place in the background in HiveServer.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24270) Move scratchdir cleanup to background

2020-10-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24270?focusedWorklogId=502415=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-502415
 ]

ASF GitHub Bot logged work on HIVE-24270:
-

Author: ASF GitHub Bot
Created on: 19/Oct/20 21:36
Start Date: 19/Oct/20 21:36
Worklog Time Spent: 10m 
  Work Description: mustafaiman commented on a change in pull request #1577:
URL: https://github.com/apache/hive/pull/1577#discussion_r508077019



##
File path: ql/src/java/org/apache/hadoop/hive/ql/DriverUtils.java
##
@@ -61,6 +61,7 @@ public static void runOnDriver(HiveConf conf, String user,
   throw new 
IllegalArgumentException(JavaUtils.txnIdToString(compactorTxnId) +
   " is not valid. Context: " + query);
 }
+sessionState.setSyncCleanup();

Review comment:
   This switch is actually meant to be constant for a specific session. 
I've put it as a setter as it seemed easier than putting it in constructor(many 
changes to many file). Sync cleanup is used for compactor and stats updater 
thread. Everything else will be async. I believe compactor and stats updater 
use their own session repeatedly. Unless a regular query re-uses 
compactor/statsupdater's session, there is no need to set it back to async. 
I'll double check if this is the case.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 502415)
Time Spent: 50m  (was: 40m)

> Move scratchdir cleanup to background
> -
>
> Key: HIVE-24270
> URL: https://issues.apache.org/jira/browse/HIVE-24270
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mustafa Iman
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> In cloud environment, scratchdir cleaning at the end of the query may take 
> long time. This causes client to hang up to 1 minute even after the results 
> were streamed back. During this time client just waits for cleanup to finish. 
> Cleanup can take place in the background in HiveServer.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24270) Move scratchdir cleanup to background

2020-10-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24270?focusedWorklogId=502191=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-502191
 ]

ASF GitHub Bot logged work on HIVE-24270:
-

Author: ASF GitHub Bot
Created on: 19/Oct/20 13:31
Start Date: 19/Oct/20 13:31
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on a change in pull request #1577:
URL: https://github.com/apache/hive/pull/1577#discussion_r507744113



##
File path: ql/src/java/org/apache/hadoop/hive/ql/PathCleaner.java
##
@@ -0,0 +1,117 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql;
+
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.concurrent.BlockingDeque;
+import java.util.concurrent.LinkedBlockingDeque;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicBoolean;
+
+/**
+ * This class is used to asynchronously remove directories after query 
execution
+ */
+public class PathCleaner {
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(PathCleaner.class.getName());
+  private static final AsyncDeleteAction END_OF_PROCESS = new 
AsyncDeleteAction(null, null);
+
+  private final BlockingDeque deleteActions = new 
LinkedBlockingDeque<>();
+  private final AtomicBoolean isShutdown = new AtomicBoolean();

Review comment:
   note: all usages are negated for this boolean - instead of using 
negative logic (shutdown) ; using a positive name like "run" might make things 
easier to read/follow

##
File path: ql/src/java/org/apache/hadoop/hive/ql/Context.java
##
@@ -673,22 +673,27 @@ public void removeScratchDir() {
 if(this.fsResultCacheDirs != null) {
   resultCacheDir = this.fsResultCacheDirs.toUri().getPath();
 }
-for (Map.Entry entry : fsScratchDirs.entrySet()) {
+SessionState sessionState = SessionState.get();
+for (Path p: fsScratchDirs.values()) {
   try {
-Path p = entry.getValue();
 if (p.toUri().getPath().contains(stagingDir) && subDirOf(p, 
fsScratchDirs.values())  ) {
   LOG.debug("Skip deleting stagingDir: " + p);
   FileSystem fs = p.getFileSystem(conf);
   fs.cancelDeleteOnExit(p);
   continue; // staging dir is deleted when deleting the scratch dir
 }
-if(resultCacheDir == null || 
!p.toUri().getPath().contains(resultCacheDir)) {
+if (resultCacheDir == null || 
!p.toUri().getPath().contains(resultCacheDir)) {
   // delete only the paths which aren't result cache dir path
   // because that will be taken care by removeResultCacheDir
-FileSystem fs = p.getFileSystem(conf);
-LOG.debug("Deleting scratch dir: {}",  p);
-fs.delete(p, true);
-fs.cancelDeleteOnExit(p);
+  FileSystem fs = p.getFileSystem(conf);
+  if (sessionState.isSyncContextCleanup()) {

Review comment:
   I think another approach could be to create 2 PathCleaner 
implementations - the existing Async and the synch one; that could create a 
tighter contract seal between the usage and the implementations.

##
File path: ql/src/java/org/apache/hadoop/hive/ql/DriverUtils.java
##
@@ -61,6 +61,7 @@ public static void runOnDriver(HiveConf conf, String user,
   throw new 
IllegalArgumentException(JavaUtils.txnIdToString(compactorTxnId) +
   " is not valid. Context: " + query);
 }
+sessionState.setSyncCleanup();

Review comment:
   this is a one-way switch; so there is no way to switch back to the 
earlier behaviour

##
File path: ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java
##
@@ -388,6 +397,19 @@ public boolean getIsQtestLogging() {
 return isQtestLogging;
   }
 
+  public PathCleaner getPathCleaner() {
+if (pathCleaner == null) {
+  pathCleaner = new PathCleaner(getSessionId());
+  pathCleaner.start();
+}
+return 

[jira] [Work logged] (HIVE-24270) Move scratchdir cleanup to background

2020-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24270?focusedWorklogId=500883=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500883
 ]

ASF GitHub Bot logged work on HIVE-24270:
-

Author: ASF GitHub Bot
Created on: 14/Oct/20 22:56
Start Date: 14/Oct/20 22:56
Worklog Time Spent: 10m 
  Work Description: mustafaiman commented on a change in pull request #1577:
URL: https://github.com/apache/hive/pull/1577#discussion_r505043441



##
File path: ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java
##
@@ -1852,6 +1863,9 @@ public void close() throws IOException {
   closeSparkSession();
   registry.closeCUDFLoaders();
   dropSessionPaths(sessionConf);
+  if (pathCleaner != null) {
+pathCleaner.shutdown();

Review comment:
   I dont see how this happens. Everything that needs to be deleted is 
guaranteed to be in PathCleaner's deleteActions list by the time shutdown is 
called. PathCleaner goes over everything in the list before exiting. See: 
https://github.com/apache/hive/pull/1577/files#diff-80bcef5e921dc6ac67a42ad691b771943f0ffa608454dae389c76b29139c61f9R65
 
   
   I didn't get your point with 1024 entries either. That is just the initial 
capacity of intermediate ArrayList. That is not an upper bound.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 500883)
Time Spent: 0.5h  (was: 20m)

> Move scratchdir cleanup to background
> -
>
> Key: HIVE-24270
> URL: https://issues.apache.org/jira/browse/HIVE-24270
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mustafa Iman
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In cloud environment, scratchdir cleaning at the end of the query may take 
> long time. This causes client to hang up to 1 minute even after the results 
> were streamed back. During this time client just waits for cleanup to finish. 
> Cleanup can take place in the background in HiveServer.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24270) Move scratchdir cleanup to background

2020-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24270?focusedWorklogId=500396=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500396
 ]

ASF GitHub Bot logged work on HIVE-24270:
-

Author: ASF GitHub Bot
Created on: 14/Oct/20 03:05
Start Date: 14/Oct/20 03:05
Worklog Time Spent: 10m 
  Work Description: rbalamohan commented on a change in pull request #1577:
URL: https://github.com/apache/hive/pull/1577#discussion_r504372394



##
File path: ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java
##
@@ -1852,6 +1863,9 @@ public void close() throws IOException {
   closeSparkSession();
   registry.closeCUDFLoaders();
   dropSessionPaths(sessionConf);
+  if (pathCleaner != null) {
+pathCleaner.shutdown();

Review comment:
   dropSessionPaths() would have created the thread and would have started 
deleting. As shutdown() is called immediately, it would stop further processing 
and quit without purging other entries. (i.e This would prevent the thread from 
processing > 1024 entries in PathCleaner. It would be good to wait until all 
files are purged before quitting the thread.

##
File path: ql/src/java/org/apache/hadoop/hive/ql/PathCleaner.java
##
@@ -0,0 +1,130 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql;
+
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.concurrent.BlockingDeque;
+import java.util.concurrent.LinkedBlockingDeque;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicBoolean;
+
+/**
+ * This class is used to asynchronously remove directories after query 
execution
+ */
+public class PathCleaner {
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(PathCleaner.class.getName());
+
+  private final BlockingDeque deleteActions = new 
LinkedBlockingDeque<>();
+  private final AtomicBoolean isShutdown = new AtomicBoolean();
+  private final Thread cleanupThread;
+
+  public PathCleaner(String name) {
+cleanupThread = new Thread(new Runnable() {
+  @Override
+  public void run() {
+while (!isShutdown.get()) {
+  Path path = null;
+  FileSystem fs;
+  try {
+AsyncDeleteAction firstAction = deleteActions.poll(30, 
TimeUnit.SECONDS);
+if (firstAction != null) {

Review comment:
   Refer to the other comment plz. If firstAction is null, shutdown 
condition can be checked to quit the thread.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 500396)
Time Spent: 20m  (was: 10m)

> Move scratchdir cleanup to background
> -
>
> Key: HIVE-24270
> URL: https://issues.apache.org/jira/browse/HIVE-24270
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mustafa Iman
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In cloud environment, scratchdir cleaning at the end of the query may take 
> long time. This causes client to hang up to 1 minute even after the results 
> were streamed back. During this time client just waits for cleanup to finish. 
> Cleanup can take place in the background in HiveServer.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24270) Move scratchdir cleanup to background

2020-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24270?focusedWorklogId=500299=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500299
 ]

ASF GitHub Bot logged work on HIVE-24270:
-

Author: ASF GitHub Bot
Created on: 13/Oct/20 21:18
Start Date: 13/Oct/20 21:18
Worklog Time Spent: 10m 
  Work Description: mustafaiman opened a new pull request #1577:
URL: https://github.com/apache/hive/pull/1577


   Change-Id: Ife47a373720bca86edab5f911eade9a471ac5b35
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 500299)
Remaining Estimate: 0h
Time Spent: 10m

> Move scratchdir cleanup to background
> -
>
> Key: HIVE-24270
> URL: https://issues.apache.org/jira/browse/HIVE-24270
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mustafa Iman
>Assignee: Mustafa Iman
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In cloud environment, scratchdir cleaning at the end of the query may take 
> long time. This causes client to hang up to 1 minute even after the results 
> were streamed back. During this time client just waits for cleanup to finish. 
> Cleanup can take place in the background in HiveServer.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)