[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16420568#comment-16420568
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/1105


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.14.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419083#comment-16419083
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/1105
  
+1


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.14.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416189#comment-16416189
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user vrozov commented on the issue:

https://github.com/apache/drill/pull/1105
  
@arina-ielchiieva Please review.


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16395574#comment-16395574
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1105
  
Squashed commits. @arina-ielchiieva please let me know if you have any 
comments.


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16393419#comment-16393419
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1105#discussion_r173545351
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java
 ---
@@ -227,11 +280,19 @@ public void run() {
 @Override
 public Void run() throws Exception {
   injector.injectChecked(fragmentContext.getExecutionControls(), 
"fragment-execution", IOException.class);
-  /*
-   * Run the query until root.next returns false OR we no longer 
need to continue.
-   */
-  while (shouldContinue() && root.next()) {
-// loop
+
+  while (shouldContinue()) {
+// Fragment is not cancelled
+
+for (FragmentHandle fragmentHandle; (fragmentHandle = 
receiverFinishedQueue.poll()) != null;) {
--- End diff --

OK, I see.


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16393411#comment-16393411
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1105#discussion_r173543287
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java
 ---
@@ -245,19 +306,17 @@ public Void run() throws Exception {
 // we have a heap out of memory error. The JVM in unstable, exit.
 CatastrophicFailure.exit(e, "Unable to handle out of memory 
condition in FragmentExecutor.", -2);
   }
+} catch (InterruptedException e) {
+  // Swallow interrupted exceptions since we intentionally interrupt 
the root when cancelling a query
+  logger.trace("Interruped root: {}", e);
 } catch (Throwable t) {
   fail(t);
 } finally {
 
-  // no longer allow this thread to be interrupted. We synchronize 
here to make sure that cancel can't set an
-  // interruption after we have moved beyond this block.
-  synchronized (myThreadRef) {
-myThreadRef.set(null);
-Thread.interrupted();
-  }
-
-  // Make sure the event processor is started at least once
-  eventProcessor.start();
+  // Don't process any more termination requests, we are done.
+  eventProcessor.terminate();
--- End diff --

There is a corner case. If we didn't include eventProcessor.terminate() we 
could theoretically receive a cancellation request for the first time after the 
interrupts were cleared for the FragmentExecutor#run thread. The cancellation 
would then interrupt the Thread again, and our FragmentExecutor would finish 
and leave the thread it used in an interrupted state. This could cause problems 
for the next FragmentExecutor that uses the same thread.


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16393397#comment-16393397
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1105#discussion_r173542042
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java
 ---
@@ -227,11 +280,19 @@ public void run() {
 @Override
 public Void run() throws Exception {
   injector.injectChecked(fragmentContext.getExecutionControls(), 
"fragment-execution", IOException.class);
-  /*
-   * Run the query until root.next returns false OR we no longer 
need to continue.
-   */
-  while (shouldContinue() && root.next()) {
-// loop
+
+  while (shouldContinue()) {
+// Fragment is not cancelled
+
+for (FragmentHandle fragmentHandle; (fragmentHandle = 
receiverFinishedQueue.poll()) != null;) {
--- End diff --

The original code allowed 
PartitionSenderRootExec#receivingFragmentFinished() to be called before, after 
or concurrently with PartitionSenderRootExec#innerNext(). The idea was that 
PartitionSenderRootExec#receivingFragmentFinished would terminate a partition. 
The terminate method of OutgoingRecordBatch would cause all outgoing records to 
a finished receiver to be dropped. So the finished downstream receivers would 
stop receiving data from the PartitionSenderRootExec, but the downstream 
receivers that did not finish would continue to receive records. There is a 
gray area here because I'm not sure why only some of the downstream receivers 
would send a RECEIVER_FINISHED message and others wouldn't, but the original 
design seems to make an assumption that this is a very common thing and is 
optimized for it. So I assume the original authors know something that we don't 
with respect to that.

Also calling receivingFragmentFinished after we finished processing all the 
data would defeat the purpose, since the intention was to allow our fragment to 
terminate early if the downstream receivers decided they don't need anymore 
data (ex. A select limit query which only asks the first 100 rows of a result 
with 10 million rows). If we called it only after the next() loop was done we 
would always process all upstream records even when we didn't have to and we 
would see significant degradation in the performance of limit queries.




> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16393371#comment-16393371
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1105#discussion_r173537455
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java
 ---
@@ -488,47 +548,74 @@ void receiverFinished(FragmentHandle handle) {
   sendEvent(new FragmentEvent(EventType.RECEIVER_FINISHED, handle));
 }
 
+/**
+ * Tell the {@link FragmentEventProcessor} not to process anymore 
events. This keeps stray cancellation requests
+ * from being processed after the root has finished running and 
interrupts in the root thread have been cleared.
+ */
+public void terminate() {
+  terminate.set(true);
+}
+
 @Override
 protected void processEvent(FragmentEvent event) {
+  if (event.type.equals(EventType.RECEIVER_FINISHED)) {
+// Finish request
+if (terminate.get()) {
+  // We have already recieved a cancellation or we have terminated 
the event processor. Do not process anymore finish requests.
+  return;
+}
+  } else {
+// Cancel request
+if (!terminate.compareAndSet(false, true)) {
+  // We have already received a cancellation or we have terminated 
the event processor. Do not process anymore cancellation requests.
+  // This prevents the root thread from being interrupted at an 
inappropriate time.
+  return;
+}
+  }
+
   switch (event.type) {
 case CANCEL:
-  /*
-   * We set the cancel requested flag but the actual cancellation 
is managed by the run() loop, if called.
-   */
+  // We set the cancel requested flag but the actual cancellation 
is managed by the run() loop, if called.
   updateState(FragmentState.CANCELLATION_REQUESTED);
-
-  /*
-   * Interrupt the thread so that it exits from any blocking 
operation it could be executing currently. We
-   * synchronize here to ensure we don't accidentally create a 
race condition where we interrupt the close out
-   * procedure of the main thread.
-  */
-  synchronized (myThreadRef) {
-final Thread myThread = myThreadRef.get();
-if (myThread != null) {
-  logger.debug("Interrupting fragment thread {}", 
myThread.getName());
-  myThread.interrupt();
-}
-  }
+  // The root was started so we have to interrupt it in case it is 
performing a blocking operation.
+  killThread();
+  terminate.set(true);
--- End diff --

Doh! Thanks for catching


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16393372#comment-16393372
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1105#discussion_r173537487
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java
 ---
@@ -488,47 +548,74 @@ void receiverFinished(FragmentHandle handle) {
   sendEvent(new FragmentEvent(EventType.RECEIVER_FINISHED, handle));
 }
 
+/**
+ * Tell the {@link FragmentEventProcessor} not to process anymore 
events. This keeps stray cancellation requests
+ * from being processed after the root has finished running and 
interrupts in the root thread have been cleared.
+ */
+public void terminate() {
+  terminate.set(true);
+}
+
 @Override
 protected void processEvent(FragmentEvent event) {
+  if (event.type.equals(EventType.RECEIVER_FINISHED)) {
+// Finish request
+if (terminate.get()) {
+  // We have already recieved a cancellation or we have terminated 
the event processor. Do not process anymore finish requests.
+  return;
+}
+  } else {
+// Cancel request
+if (!terminate.compareAndSet(false, true)) {
+  // We have already received a cancellation or we have terminated 
the event processor. Do not process anymore cancellation requests.
+  // This prevents the root thread from being interrupted at an 
inappropriate time.
+  return;
+}
+  }
+
   switch (event.type) {
 case CANCEL:
-  /*
-   * We set the cancel requested flag but the actual cancellation 
is managed by the run() loop, if called.
-   */
+  // We set the cancel requested flag but the actual cancellation 
is managed by the run() loop, if called.
   updateState(FragmentState.CANCELLATION_REQUESTED);
-
-  /*
-   * Interrupt the thread so that it exits from any blocking 
operation it could be executing currently. We
-   * synchronize here to ensure we don't accidentally create a 
race condition where we interrupt the close out
-   * procedure of the main thread.
-  */
-  synchronized (myThreadRef) {
-final Thread myThread = myThreadRef.get();
-if (myThread != null) {
-  logger.debug("Interrupting fragment thread {}", 
myThread.getName());
-  myThread.interrupt();
-}
-  }
+  // The root was started so we have to interrupt it in case it is 
performing a blocking operation.
+  killThread();
+  terminate.set(true);
   break;
-
 case CANCEL_AND_FINISH:
+  // In this case the root was never started so we do not have to 
interrupt the thread.
   updateState(FragmentState.CANCELLATION_REQUESTED);
+  // The FragmentExecutor#run() loop will not execute in this case 
so we have to cleanup resources here
   cleanup(FragmentState.FINISHED);
+  terminate.set(true);
--- End diff --

Thanks for catching


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not 

[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16393325#comment-16393325
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1105#discussion_r173527617
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java
 ---
@@ -227,11 +280,19 @@ public void run() {
 @Override
 public Void run() throws Exception {
   injector.injectChecked(fragmentContext.getExecutionControls(), 
"fragment-execution", IOException.class);
-  /*
-   * Run the query until root.next returns false OR we no longer 
need to continue.
-   */
-  while (shouldContinue() && root.next()) {
-// loop
+
+  while (shouldContinue()) {
+// Fragment is not cancelled
+
+for (FragmentHandle fragmentHandle; (fragmentHandle = 
receiverFinishedQueue.poll()) != null;) {
--- End diff --

How this processing affects `root.next()`? Should it be also/only done 
after fragment is canceled or done (after while loop)?


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16393326#comment-16393326
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1105#discussion_r173528013
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java
 ---
@@ -488,47 +548,74 @@ void receiverFinished(FragmentHandle handle) {
   sendEvent(new FragmentEvent(EventType.RECEIVER_FINISHED, handle));
 }
 
+/**
+ * Tell the {@link FragmentEventProcessor} not to process anymore 
events. This keeps stray cancellation requests
+ * from being processed after the root has finished running and 
interrupts in the root thread have been cleared.
+ */
+public void terminate() {
+  terminate.set(true);
+}
+
 @Override
 protected void processEvent(FragmentEvent event) {
+  if (event.type.equals(EventType.RECEIVER_FINISHED)) {
+// Finish request
+if (terminate.get()) {
+  // We have already recieved a cancellation or we have terminated 
the event processor. Do not process anymore finish requests.
+  return;
+}
+  } else {
+// Cancel request
+if (!terminate.compareAndSet(false, true)) {
+  // We have already received a cancellation or we have terminated 
the event processor. Do not process anymore cancellation requests.
+  // This prevents the root thread from being interrupted at an 
inappropriate time.
+  return;
+}
+  }
+
   switch (event.type) {
 case CANCEL:
-  /*
-   * We set the cancel requested flag but the actual cancellation 
is managed by the run() loop, if called.
-   */
+  // We set the cancel requested flag but the actual cancellation 
is managed by the run() loop, if called.
   updateState(FragmentState.CANCELLATION_REQUESTED);
-
-  /*
-   * Interrupt the thread so that it exits from any blocking 
operation it could be executing currently. We
-   * synchronize here to ensure we don't accidentally create a 
race condition where we interrupt the close out
-   * procedure of the main thread.
-  */
-  synchronized (myThreadRef) {
-final Thread myThread = myThreadRef.get();
-if (myThread != null) {
-  logger.debug("Interrupting fragment thread {}", 
myThread.getName());
-  myThread.interrupt();
-}
-  }
+  // The root was started so we have to interrupt it in case it is 
performing a blocking operation.
+  killThread();
+  terminate.set(true);
   break;
-
 case CANCEL_AND_FINISH:
+  // In this case the root was never started so we do not have to 
interrupt the thread.
   updateState(FragmentState.CANCELLATION_REQUESTED);
+  // The FragmentExecutor#run() loop will not execute in this case 
so we have to cleanup resources here
   cleanup(FragmentState.FINISHED);
+  terminate.set(true);
--- End diff --

it is not necessary (terminate should be already true).


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> 

[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16393323#comment-16393323
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1105#discussion_r173526625
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java
 ---
@@ -245,19 +306,17 @@ public Void run() throws Exception {
 // we have a heap out of memory error. The JVM in unstable, exit.
 CatastrophicFailure.exit(e, "Unable to handle out of memory 
condition in FragmentExecutor.", -2);
   }
+} catch (InterruptedException e) {
+  // Swallow interrupted exceptions since we intentionally interrupt 
the root when cancelling a query
+  logger.trace("Interruped root: {}", e);
 } catch (Throwable t) {
   fail(t);
 } finally {
 
-  // no longer allow this thread to be interrupted. We synchronize 
here to make sure that cancel can't set an
-  // interruption after we have moved beyond this block.
-  synchronized (myThreadRef) {
-myThreadRef.set(null);
-Thread.interrupted();
-  }
-
-  // Make sure the event processor is started at least once
-  eventProcessor.start();
+  // Don't process any more termination requests, we are done.
+  eventProcessor.terminate();
--- End diff --

Is this terminate() required?


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16393324#comment-16393324
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1105#discussion_r173527951
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java
 ---
@@ -488,47 +548,74 @@ void receiverFinished(FragmentHandle handle) {
   sendEvent(new FragmentEvent(EventType.RECEIVER_FINISHED, handle));
 }
 
+/**
+ * Tell the {@link FragmentEventProcessor} not to process anymore 
events. This keeps stray cancellation requests
+ * from being processed after the root has finished running and 
interrupts in the root thread have been cleared.
+ */
+public void terminate() {
+  terminate.set(true);
+}
+
 @Override
 protected void processEvent(FragmentEvent event) {
+  if (event.type.equals(EventType.RECEIVER_FINISHED)) {
+// Finish request
+if (terminate.get()) {
+  // We have already recieved a cancellation or we have terminated 
the event processor. Do not process anymore finish requests.
+  return;
+}
+  } else {
+// Cancel request
+if (!terminate.compareAndSet(false, true)) {
+  // We have already received a cancellation or we have terminated 
the event processor. Do not process anymore cancellation requests.
+  // This prevents the root thread from being interrupted at an 
inappropriate time.
+  return;
+}
+  }
+
   switch (event.type) {
 case CANCEL:
-  /*
-   * We set the cancel requested flag but the actual cancellation 
is managed by the run() loop, if called.
-   */
+  // We set the cancel requested flag but the actual cancellation 
is managed by the run() loop, if called.
   updateState(FragmentState.CANCELLATION_REQUESTED);
-
-  /*
-   * Interrupt the thread so that it exits from any blocking 
operation it could be executing currently. We
-   * synchronize here to ensure we don't accidentally create a 
race condition where we interrupt the close out
-   * procedure of the main thread.
-  */
-  synchronized (myThreadRef) {
-final Thread myThread = myThreadRef.get();
-if (myThread != null) {
-  logger.debug("Interrupting fragment thread {}", 
myThread.getName());
-  myThread.interrupt();
-}
-  }
+  // The root was started so we have to interrupt it in case it is 
performing a blocking operation.
+  killThread();
+  terminate.set(true);
--- End diff --

it is not necessary (terminate should be already true).


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392513#comment-16392513
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1105
  
@vrozov @arina-ielchiieva Applied review comments, please let me know if 
there are anymore comments.


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392432#comment-16392432
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1105#discussion_r173367485
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java
 ---
@@ -231,9 +261,9 @@ public Void run() throws Exception {
   while (shouldContinue()) {
 // Fragment is not cancelled
 
-if (eventProcessor.hasFinishedRequests()) {
+if (!recieverFinishedQueue.isEmpty()) {
--- End diff --

Thanks for catching removed.


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392431#comment-16392431
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1105#discussion_r173367462
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java
 ---
@@ -488,47 +527,66 @@ void receiverFinished(FragmentHandle handle) {
   sendEvent(new FragmentEvent(EventType.RECEIVER_FINISHED, handle));
 }
 
+/**
+ * Tell the {@link FragmentEventProcessor} not to process anymore 
events. This keeps stray cancellation requests
+ * from being processed after the root has finished running and 
interrupts in the root thread have been cleared.
+ */
+public synchronized void terminate() {
+  terminate.set(true);
+}
+
 @Override
 protected void processEvent(FragmentEvent event) {
+  if (terminate.get()) {
--- End diff --

Thanks for catching. Fixed.


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392427#comment-16392427
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1105#discussion_r173367313
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java
 ---
@@ -475,6 +483,8 @@ public void drillbitUnregistered(final 
Set
* This is especially important as fragments can take longer to start
*/
   private class FragmentEventProcessor extends 
EventProcessor {
+private boolean terminate = false;
--- End diff --

Done


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392429#comment-16392429
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1105#discussion_r173367348
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java
 ---
@@ -67,6 +88,11 @@
 
   private volatile RootExec root;
   private final AtomicReference fragmentState = new 
AtomicReference<>(FragmentState.AWAITING_ALLOCATION);
+  /**
+   * Holds all of the messages sent by downstream recievers that have 
finished. The {@link FragmentExecutor#run()} thread reads from this queue and 
passes the
+   * finished messages to the fragment's {@link RootExec} via the {@link 
RootExec#receivingFragmentFinished(FragmentHandle)} method.
+   */
+  private final Queue recieverFinishedQueue = new 
ConcurrentLinkedQueue<>();
--- End diff --

Thanks for catching. Fixed all the misspellings of receiver.


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392426#comment-16392426
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1105#discussion_r173367305
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java
 ---
@@ -475,6 +483,8 @@ public void drillbitUnregistered(final 
Set
* This is especially important as fragments can take longer to start
*/
   private class FragmentEventProcessor extends 
EventProcessor {
+private boolean terminate = false;
+private List finishedHandles = Lists.newArrayList();
--- End diff --

Done


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16391959#comment-16391959
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1105#discussion_r173303074
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java
 ---
@@ -488,47 +527,66 @@ void receiverFinished(FragmentHandle handle) {
   sendEvent(new FragmentEvent(EventType.RECEIVER_FINISHED, handle));
 }
 
+/**
+ * Tell the {@link FragmentEventProcessor} not to process anymore 
events. This keeps stray cancellation requests
+ * from being processed after the root has finished running and 
interrupts in the root thread have been cleared.
+ */
+public synchronized void terminate() {
+  terminate.set(true);
+}
+
 @Override
 protected void processEvent(FragmentEvent event) {
+  if (terminate.get()) {
--- End diff --

use atomic compare and set to avoid race condition.


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16391958#comment-16391958
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1105#discussion_r173300352
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java
 ---
@@ -67,6 +88,11 @@
 
   private volatile RootExec root;
   private final AtomicReference fragmentState = new 
AtomicReference<>(FragmentState.AWAITING_ALLOCATION);
+  /**
+   * Holds all of the messages sent by downstream recievers that have 
finished. The {@link FragmentExecutor#run()} thread reads from this queue and 
passes the
+   * finished messages to the fragment's {@link RootExec} via the {@link 
RootExec#receivingFragmentFinished(FragmentHandle)} method.
+   */
+  private final Queue recieverFinishedQueue = new 
ConcurrentLinkedQueue<>();
--- End diff --

please fix typo (receiver)


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16391960#comment-16391960
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1105#discussion_r173300621
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java
 ---
@@ -231,9 +261,9 @@ public Void run() throws Exception {
   while (shouldContinue()) {
 // Fragment is not cancelled
 
-if (eventProcessor.hasFinishedRequests()) {
+if (!recieverFinishedQueue.isEmpty()) {
--- End diff --

I don't think `isEmpty()` is necessary, it is the same as `poll() != null`.


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16391562#comment-16391562
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1105#discussion_r173229346
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java
 ---
@@ -475,6 +483,8 @@ public void drillbitUnregistered(final 
Set
* This is especially important as fragments can take longer to start
*/
   private class FragmentEventProcessor extends 
EventProcessor {
+private boolean terminate = false;
--- End diff --

Consider using AtomicBoolean and avoiding `synchronized`.


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16391563#comment-16391563
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1105#discussion_r173224242
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java
 ---
@@ -475,6 +483,8 @@ public void drillbitUnregistered(final 
Set
* This is especially important as fragments can take longer to start
*/
   private class FragmentEventProcessor extends 
EventProcessor {
+private boolean terminate = false;
+private List finishedHandles = Lists.newArrayList();
--- End diff --

Consider using Queue and possibly concurrent Queue and moving 
`finishedHandles` to FragmentExecutor.


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390683#comment-16390683
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1105
  
@vrozov @arina-ielchiieva Handled multiple receiver finished messages 
correctly. This PR is ready for another round of review.


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390184#comment-16390184
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1105
  
Nope never mind I think I spoke too soon. I just realized we may get 
multiple receivingFragmentFinished requests, one for each downstream receiver. 
Back to the drawing board.


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390169#comment-16390169
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1105
  
@arina-ielchiieva @vrozov In light of Vlad's comments I have reworked the 
synchronization model yet again. This change now removes all synchronization 
from PartitionSenderRootExec and enforces the guarantee that all the lifecycle 
methods of the PartitionSenderRootExec will only be called by a single Run 
thread in the FragmentExecutor. Also while making this change I discovered a 
few other bugs with how cancellations and receiver finishes are handled, so I 
have addressed those bugs as well. I will go into more detail about what I 
changed below.

# Motivation

As Vlad pointed out **close** and **innerNext** are never called 
concurrently. After closer inspection of the code I also released that 
currently (in apache master) innerNext and close will always be called by the 
**FragmentExecutor#run** thread. The only method of PartitionSenderRootExec 
that is not called by the **FragmentExecutor#run** thread is 
**receivingFragmentFinished**. In order to simplify the implementation of 
PartitionSenderRootExec and also simplify the design of the FragmentExecutor I 
changed the code so that only the **FragmentExecutor#run** thread calls 
**receivingFragmentFinished** as well. In this way we can remove all the 
synchronization from PartitionSenderRootExec. This was done by by:
 1. Making the event processor save the FragmentHandle in the event that a 
receiver finish request was sent. 
 2. After the **root.next()** loop terminates in **FragmentExecutor#run** 
the eventProcessor is checked to see if a finish request was received. If so 
**receivingFragmentFinished** is called on root by the **FragmentExecutor#run** 
method.

# Other Bugs

## Processing of multiple termination requests

The event processor would process all cancellation and finish requests, 
even if there is more than one. This doesn't really make sense, since we can 
only cancel or finish once. So I have changed the event processor to execute 
only the first termination request and ignore all the others.

## Simultaneous Cancellation and Finishing

Since the event processor would process multiple termination requests 
concurrently it was possible for a cancel and a finish message to be received 
and processed simultaneously. The results of this were not well defined since 
**close** and **receivingFragmentFinished** could be called concurrently.

# Other Improvements

Vlad also pointed out that we did not need the hasCloseoutThread atomic 
reference, since we were already using the myThreadRef atomic reference. That 
cleaned up the code a bit.


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16388572#comment-16388572
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1105
  
@vrozov Thanks for catching this, I believe you are right. 
hasClouseoutThread guarantees innerNext and close won't be called concurrently. 
However, I still believe innerNext and receivingFragmentFinished could be 
called concurrently, since the ControlMessageHandler thread executes 
recievingFragmentFinished. Additionally in rare cases where a limit query is 
cancelled recievingFragmentFinished and close could be called concurrently as 
well. While reflecting on your comments I also saw another issue where the root 
could be blocked on a next call but a Finished event would not cause root to 
terminate.

In light of all of this I actually think the **synchronized** is not 
sufficient. We will have to have some way to interrupt the execution of the 
root when we received a finish signal and only close out the resources after 
receivingFragmentFinished has been called. Similarly if we receive a finish 
signal we should ignore any cancellation requests instead of trying to cancel 
and finish simultaneously and vice versa.

I will rework the solution to address these issues and let you know when I 
have an update.


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.13.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16388402#comment-16388402
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user vrozov commented on the issue:

https://github.com/apache/drill/pull/1105
  
@ilooner @arina-ielchiieva How `innerNext()` and `close()` can execute 
concurrently? Does not `FragmentExecutor.hasCloseoutThread` ensure that either 
`close()` is called on the `run()` thread or `run()` exits if the fragment is 
already cancelled?


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.13.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16388261#comment-16388261
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/1105
  
Looks good, @vrozov are you ok with the changes?


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.13.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386765#comment-16386765
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1105#discussion_r172327159
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java
 ---
@@ -358,7 +357,7 @@ public void close() throws Exception {
 }
   }
 
-  public void sendEmptyBatch(boolean isLast) {
+  public synchronized void sendEmptyBatch(boolean isLast) {
--- End diff --

Done


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.13.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386766#comment-16386766
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1105
  
@arina-ielchiieva Applied review comments.


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.13.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386762#comment-16386762
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1105#discussion_r172327066
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java
 ---
@@ -245,6 +244,8 @@ public Void run() throws Exception {
 // we have a heap out of memory error. The JVM in unstable, exit.
 CatastrophicFailure.exit(e, "Unable to handle out of memory 
condition in FragmentExecutor.", -2);
   }
+} catch (InterruptedException e) {
+  // Swallow interrupted exceptions since we intentionally interrupt 
the root when cancelling a query
--- End diff --

Done


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.13.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386763#comment-16386763
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1105#discussion_r172327106
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java
 ---
@@ -231,7 +231,7 @@ public boolean innerNext() {
   }
 
   @VisibleForTesting
-  protected void createPartitioner() throws SchemaChangeException {
+  protected synchronized void createPartitioner() throws 
SchemaChangeException {
--- End diff --

Removed


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.13.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386690#comment-16386690
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1105
  
@vrozov The main issue is that InnerNext and Close should not execute 
concurrently. Even when we are using AtomicReferences and volatile the 
following sequence of events could happen which could cause a memory leak:

  1. Let's say there are two Threads. The **Close thread** which starts at 
the beginning of the close method, and the **Next thread** which starts at the 
beginning of the innerNext method.
  1. Now let's say the **Next Thread** runs and checks **ok**. Since close 
has not been called yet **ok** is true.
  1. Now the **Next Thread** is after the ok check.
  1. The **Close thread** now starta executing. And the close thread clears 
the partitioner.
  1. Now after the partitioner is cleared the **Next Thread** can resume 
executing. If the next thread receives an OK_SCHEMA he will allocate a new 
partitioner. Since the OK_SCHEMA message may include records the partitioner 
may partition some data as well.
  1. Now the **Close thread** is done, but there is a partitioner that has 
not been closed, and we will leak memory.

In order to property solve this problem we need to make sure that the 
innerNext and close methods are mutually exclusive so the above scenario can 
never happen. The easiest way to do that is to use the synchronized key word. 
If we use the synchronized keyword then we don't have to use volatile or atomic 
reference.

Also as a side note using synchronized will probably be more efficient 
since a cache flush would only be triggered at the start of the innerNext and 
close method. Alternatively if we used volatile and AtomicReference a cache 
flush would be triggered every time we accessed the ok and partitioner 
variables.



> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.13.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386506#comment-16386506
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user vrozov commented on the issue:

https://github.com/apache/drill/pull/1105
  
@ilooner What state is protected by `syncrhonized`? Why is it not 
sufficient to use `volatile` and `AtomicReference`?


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.13.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16384784#comment-16384784
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1105#discussion_r172024863
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java
 ---
@@ -231,7 +231,7 @@ public boolean innerNext() {
   }
 
   @VisibleForTesting
-  protected void createPartitioner() throws SchemaChangeException {
+  protected synchronized void createPartitioner() throws 
SchemaChangeException {
--- End diff --

Why you need to synchronized`createParitioner`? It's being called from 
`innerNext` which is already synchronized.


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.13.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16384786#comment-16384786
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1105#discussion_r172024881
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java
 ---
@@ -245,6 +244,8 @@ public Void run() throws Exception {
 // we have a heap out of memory error. The JVM in unstable, exit.
 CatastrophicFailure.exit(e, "Unable to handle out of memory 
condition in FragmentExecutor.", -2);
   }
+} catch (InterruptedException e) {
+  // Swallow interrupted exceptions since we intentionally interrupt 
the root when cancelling a query
--- End diff --

Maybe add logger.trace?


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.13.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16384785#comment-16384785
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1105#discussion_r172024966
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java
 ---
@@ -358,7 +357,7 @@ public void close() throws Exception {
 }
   }
 
-  public void sendEmptyBatch(boolean isLast) {
+  public synchronized void sendEmptyBatch(boolean isLast) {
--- End diff --

Method can be made private. Plus maybe not be synchronized as well since is 
being called from `innerNext`.


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.13.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-03-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16383023#comment-16383023
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1105
  
@arina-ielchiieva @vrozov I believe I have a solution. There were several 
issues with the original code.

1. It made incorrect assumptions about how cache invalidation works with 
java **synchronized**.
2. It assumed **innerNext** and **close** would be called sequentially.

I believe this PR fixes these issues now and I have gone into more detail 
about the problems below.

# 1. Incorrect Cache Invalidation Assumptions

The original code was trying to be smart by trying to reduce 
synchronization overhead on **innerNext**. So the code in **innerNext** did not 
synchronize before changing the partitioner object since this would be called 
often. The code in **close()** and ** receivingFragmentFinished()** 
synchronized before accessing the partitioner with the intention that this 
would trigger an update of the partitioner variable state across all threads. 
Unfortunately, this assumption is invalid (see 
https://stackoverflow.com/questions/22706739/does-synchronized-guarantee-a-thread-will-see-the-latest-value-of-a-non-volatile).
 Every thread that accesses a variable must synchronize before accessing a 
variable in order to properly invalidate cached data on a core. 

For example if **Thread A** modifies **Variable 1** then **Thread B** 
synchronizes before accessing **Variable 1** then there is no guarantee 
**Thread B** will see the most updated value for **Variable 1** since it might .

## Solution

In summary the right thing to do is the simple thing. Make the methods 
synchronized. Unfortunately there is no way to outsmart the system and reduce 
synchronization overhead without causing race conditions.

# 2. Concurrent InnerNext and Close Calls

The original code did not consider the case that innerNext was in the 
middle of execution when close was called. It did try to handle the case where 
**innerNext** could be called after **close** by setting the **ok** variable. 
But it didn't even do that right because there was no synchronization around 
the **ok** variable.

## Solution

The right thing to do is the simple thing. Make sure the methods are 
synchronized so close has to wait until innerNext is done before executing. 
Also when a query is cancelled the executing thread should be interrupted the 
thread running innerNext incase it is on a blocking call.


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.13.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-02-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16377840#comment-16377840
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1105#discussion_r170782805
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java
 ---
@@ -62,7 +62,7 @@
   private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(PartitionSenderRootExec.class);
   private RecordBatch incoming;
   private HashPartitionSender operator;
-  private PartitionerDecorator partitioner;
+  private volatile PartitionerDecorator partitioner;
--- End diff --

Consider using `AtomicReference` instead of `volatile` 


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.13.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-02-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374057#comment-16374057
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1105
  
@priteshm @arina-ielchiieva I should have updated this PR earlier this 
week, here is my update. After reflecting on Arina's comments and reading some 
more docs about how java implements volatile and synchronization, I think this 
solution might not fix the original race condition. I need to to more reading 
to get a better understanding. Additionally I realized there is another race 
condition where two threads are simultaneously calling close and innerNext 
which could cause a memory leak. Haven't had a chance to dig further this week, 
so I will try to wrap this up next week.


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.13.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-02-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373941#comment-16373941
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user priteshm commented on the issue:

https://github.com/apache/drill/pull/1105
  
@arina-ielchiieva is this bug ready to commit?


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.13.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16369007#comment-16369007
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1105#discussion_r169048352
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java
 ---
@@ -348,9 +348,12 @@ public void close() throws Exception {
 logger.debug("Partition sender stopping.");
 super.close();
 ok = false;
-if (partitioner != null) {
-  updateAggregateStats();
-  partitioner.clear();
+
+synchronized (this) {
--- End diff --

1. Should partitioner be volatile?
2. Should we check if partitioner is not null before synchronization as 
well (DCL)?


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.13.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-02-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359167#comment-16359167
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1105
  
@sachouche I traced through the code. The updateAggregateStats method is 
called, which then calls the getOutgoingBatches method of the code generated 
Partitioners. That method is just a simple getter. So no one is acquiring the 
same lock. But even if someone else was, the code in the close method is single 
threaded, and synchronize blocks are reentrant. 


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.13.0
>
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-01-31 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347395#comment-16347395
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1105
  
@sachouche @arina-ielchiieva 


> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6125) PartitionSenderRootExec can leak memory because close method is not synchronized

2018-01-31 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347393#comment-16347393
 ] 

ASF GitHub Bot commented on DRILL-6125:
---

GitHub user ilooner opened a pull request:

https://github.com/apache/drill/pull/1105

DRILL-6125: Fix possible memory leak when query is cancelled.

A detailed description of the problem and solution can be found here: 

https://issues.apache.org/jira/browse/DRILL-6125

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ilooner/drill DRILL-6125

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1105.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1105


commit 1d1725a276c058e8c09e456963bac928d1f062ed
Author: Timothy Farkas 
Date:   2018-01-30T23:55:41Z

DRILL-6125: Fix possible memory leak when query is cancelled.




> PartitionSenderRootExec can leak memory because close method is not 
> synchronized
> 
>
> Key: DRILL-6125
> URL: https://issues.apache.org/jira/browse/DRILL-6125
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> PartitionSenderRootExec creates a PartitionerDecorator and saves it in the 
> *partitioner* field. The creation of the partitioner happens in the 
> createPartitioner method. This method get's called by the main fragment 
> thread. The partitioner field is accessed by the fragment thread during 
> normal execution but it can also be accessed by the receivingFragmentFinished 
> method which is a callback executed by the event processor thread. Because 
> multiple threads can access the partitioner field synchronization is done on 
> creation and on when receivingFragmentFinished. However, the close method can 
> also be called by the event processor thread, and the close method does not 
> synchronize before accessing the partitioner field. Since synchronization is 
> not done the event processor thread may have an old reference to the 
> partitioner when a query cancellation is done. Since it has an old reference 
> the current partitioner can may not be cleared and a memory leak may occur.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)