[ https://issues.apache.org/jira/browse/DRILL-7487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Rogers updated DRILL-7487:
-------------------------------
    Description: 
Drill has long supported the {{OUT_OF_MEMORY}} iterator status. The idea is 
that an operator can realize it has encountered memory pressure and ask its 
downstream operator to free up some memory. However, an inspection of the code 
shows that the status is actually sent in only one place 
({{UnorderedReceiverBatch}}), and then only when that operator hits its own 
allocator limit (which no other batch can do anything about).
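
For reference, {{OUT_OF_MEMORY}} is one value of Drill's 
{{RecordBatch.IterOutcome}} enum. A caller that wanted to honor the status 
would need retry handling roughly like the following sketch (hypothetical; 
{{releaseWhatWeCan()}} is a stand-in, not an existing Drill method):

{noformat}
// Sketch: the downstream operator (the caller) sees the status from
// its upstream child and is expected to free some memory.
IterOutcome outcome = incoming.next();
if (outcome == IterOutcome.OUT_OF_MEMORY) {
  releaseWhatWeCan();        // hypothetical; downstream rarely holds much
  outcome = incoming.next(); // retry; the upstream may simply fail again
}
{noformat}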

If an operator did choose to try to use this status, there are two key problems:

1. The operator must be able to suspend itself at any point where it might 
need memory. For example, an operator that allocates a dozen vectors must be 
able to stop on, say, the 9th vector, then resume from that point on the 
subsequent call to {{next()}}. The state machine needed to do this is very 
complex (see the sketch after this list).
2. The *downstream* operators (which may not yet have seen any rows) are the 
least likely to be able to release memory. It is the *upstream* operators 
(such as spillable operators) that might be able to spill some of the rows 
they are holding.
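
To make problem 1 concrete, the allocation step alone would need bookkeeping 
along these lines (a hypothetical sketch, not actual Drill code; 
{{tryAllocate()}} and {{buildOutputBatch()}} are stand-ins):

{noformat}
private static final int VECTOR_COUNT = 12;
private int vectorsAllocated;   // must survive across calls to next()

public IterOutcome next() {
  while (vectorsAllocated < VECTOR_COUNT) {
    if (!tryAllocate(vectorsAllocated)) {  // hypothetical helper
      // Suspend mid-allocation; the next call to next() resumes here.
      // Every other allocation site needs similar save/restore logic.
      return IterOutcome.OUT_OF_MEMORY;
    }
    vectorsAllocated++;
  }
  vectorsAllocated = 0;         // reset for the next batch
  return buildOutputBatch();    // hypothetical: the operator's real work
}
{noformat}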

Presto suggests a nice alternative:

* An operator which encounters memory pressure asks the fragment executor for 
more memory.
* The fragment executor asks all *other* operators in that fragment to release 
memory if possible.

This allows a very simple memory recovery strategy, sketched below with a 
placeholder {{allocateSomething()}} standing in for whatever allocation the 
operator performs:

{noformat}
try {
  allocateSomething();
} catch (OutOfMemoryException e) {
  // Ask the fragment executor to free up memory held elsewhere in
  // this fragment, then retry the allocation once.
  context.requestMemory(this);
  allocateSomething();  // throws OutOfMemoryException if it fails again
}
{noformat}

Note that, since the fragment runs on a single thread, the above is simple to 
implement. Each operator is either idle (not executing) or blocked in a call 
to {{next()}} on a child operator. Both are stable points at which to consider 
invoking spilling. Further, a sender could use this opportunity to write 
partially-filled batches to the network and release them rather than waiting 
for more data.
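
Under that model, {{requestMemory()}} could eventually be implemented in the 
fragment executor along these lines (a hypothetical sketch; 
{{releaseMemoryIfPossible()}} is an assumed callback, not an existing Drill 
API):

{noformat}
// Broadcast the request to every *other* operator in the fragment.
// Safe because the fragment's operators all run on this one thread.
public void requestMemory(OperatorContext requestor) {
  for (OperatorContext op : operators) {
    if (op != requestor) {
      op.releaseMemoryIfPossible();  // e.g., spill, or flush a partial batch
    }
  }
}
{noformat}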

The only case that can't be handled is, say, having an interior operator 
flush a batch to its downstream operator within the same call to {{next()}}.

Proposed are two changes:

1. Retire the {{OUT_OF_MEMORY}} status. Simply remove all references to it, 
since it is never sent.
2. Create a stub {{requestMemory()}} method in the operator context that does 
nothing now, but could be expanded to perform the work suggested above (a 
possible shape is sketched below).
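
For change 2, the stub might take a shape like the following (a sketch only; 
the exact signature and its home in {{OperatorContext}} would be settled in 
the patch):

{noformat}
/**
 * Ask the fragment executor for more memory on behalf of the given
 * operator. A no-op for now; a later change could have the executor
 * ask the fragment's other operators to release memory, as sketched
 * above.
 */
public void requestMemory(RecordBatch requestor) {
  // Intentionally does nothing yet.
}
{noformat}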


> Retire unused OUT_OF_MEMORY iterator status
> -------------------------------------------
>
>                 Key: DRILL-7487
>                 URL: https://issues.apache.org/jira/browse/DRILL-7487
>             Project: Apache Drill
>          Issue Type: Improvement
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Minor
>