IMO, further discussion should continue on the dev@apex list.
As I mentioned, there are several enhancement requests open for the
buffer server related to back pressure behavior, and limiting memory
usage when spooling is disabled is only one of them. It is also
necessary to limit the amount of disk usage when spooling is enabled.
Those are the major ones. There is another JIRA for changing how the
buffer server reads blocks spooled to disk. I plan to work on those
requests once the network prototype is ready, but possibly somebody
from the Apex community wants to take a look.
The netlet memory usage is separate from the buffer server memory usage.
The buffer server has its own pool of memory blocks separate from the
netlet queue. The netlet queue is already upper bounded, but the problem
is that it limits the number of byte arrays that may be waiting in the
queue, not the total memory consumption. There is a pull request in
progress to limit netlet memory usage. It is still necessary to look
into the buffer server JIRAs.
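
To illustrate the difference between bounding a queue by entry count and
bounding it by total bytes, here is a minimal sketch. It is not the netlet
code and not the pending pull request; the class ByteBoundedQueue and its
methods are hypothetical names made up for this example.

```java
import java.util.ArrayDeque;

// Hypothetical illustration only; this is NOT the netlet implementation.
final class ByteBoundedQueue {
    private final ArrayDeque<byte[]> queue = new ArrayDeque<>();
    private final int maxEntries;  // bound on the number of byte arrays
    private final long maxBytes;   // bound on the total payload size
    private long queuedBytes;

    ByteBoundedQueue(int maxEntries, long maxBytes) {
        this.maxEntries = maxEntries;
        this.maxBytes = maxBytes;
    }

    /** Returns false when accepting the array would exceed either bound. */
    boolean offer(byte[] data) {
        if (queue.size() >= maxEntries || queuedBytes + data.length > maxBytes) {
            return false;
        }
        queue.addLast(data);
        queuedBytes += data.length;
        return true;
    }

    /** Removes the oldest array, or returns null when the queue is empty. */
    byte[] poll() {
        byte[] data = queue.pollFirst();
        if (data != null) {
            queuedBytes -= data.length;
        }
        return data;
    }

    long queuedBytes() {
        return queuedBytes;
    }

    int size() {
        return queue.size();
    }
}
```

With only the entry bound in effect (maxBytes set to Long.MAX_VALUE), four
1 MB arrays are all accepted and the queue holds 4 MB. With maxBytes set to
2 MB, the third array is rejected even though the entry bound has not been
reached.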
Vlad
On 1/31/16 23:15, Pramod Immaneni wrote:
For natural back pressure to work in all cases, the memory issues need
to be addressed, and work is being done on that. That was my point in
the earlier email.
Thanks
On Sun, Jan 31, 2016 at 9:54 PM, Chinmay Kolhatkar wrote:
When an upstream operator pushes data faster than the buffer
server can spool it to disk, the buffer server disables reads from
the upstream operator, putting back pressure on the upstream
operator once the limit is reached.
[CK] If I read it correctly, this means that the point at which
back pressure takes effect on upstream operators depends on
disk performance. If that is true, would it make sense to make
this independent of disk performance?
On Mon, Feb 1, 2016 at 9:37 AM, Vlad Rozov wrote:
Thomas is correct. When buffer server spooling is enabled
(the default behavior since 3.0), the buffer server limits its
memory usage and starts spooling to disk once half of the
specified limit is reached. When an upstream operator pushes
data faster than the buffer server can spool it to disk, the
buffer server disables reads from the upstream operator,
putting back pressure on the upstream operator once the
limit is reached. This gives the downstream operator(s) a
chance to catch up with the data already pushed to the
buffer server, reducing the amount of memory the buffer
server uses. Once usage drops below the limit, the buffer
server enables reads from the upstream operator again. By
default, the buffer server is allowed to use 512 MB, and this
can be configured using the BUFFER_MEMORY_MB port attribute.
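
The thresholds described above can be modeled in a few lines. This is a
toy model written only for illustration, not buffer server code: the class
SpoolingBufferModel, its methods, and the unit (MB) are assumptions, and it
ignores blocks, threads, and actual disk I/O.

```java
// Hypothetical model of the thresholds described above; NOT buffer server code.
final class SpoolingBufferModel {
    private final long limitMb;  // e.g. 512, the default mentioned above
    private long inMemoryMb;
    private long spooledMb;

    SpoolingBufferModel(long limitMb) {
        this.limitMb = limitMb;
    }

    /** Spooling starts once half of the limit is in use. */
    boolean spoolingActive() {
        return inMemoryMb >= limitMb / 2;
    }

    /** Reads from upstream are enabled only while usage is below the limit. */
    boolean readsEnabled() {
        return inMemoryMb < limitMb;
    }

    /** Upstream pushes data; rejected (back pressure) while reads are disabled. */
    boolean publish(long mb) {
        if (!readsEnabled()) {
            return false;
        }
        inMemoryMb += mb;
        return true;
    }

    /** Moves up to mb from memory to disk, but only while spooling is active. */
    long spool(long mb) {
        if (!spoolingActive()) {
            return 0;
        }
        long moved = Math.min(mb, inMemoryMb);
        inMemoryMb -= moved;
        spooledMb += moved;
        return moved;
    }

    long inMemoryMb() {
        return inMemoryMb;
    }

    long spooledMb() {
        return spooledMb;
    }
}
```

In this model, with a 512 MB limit, spooling becomes active at 256 MB,
publish() is refused at 512 MB, and it is accepted again as soon as spool()
brings usage back under 512 MB. How quickly that happens depends on how
fast spool() is called, which is the dependence on disk performance raised
elsewhere in this thread.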
When spooling is disabled (using another port attribute,
BUFFER_SPOOLING), the buffer server does not limit its memory
usage, so if there is a temporary slowdown in a downstream
operator and there is a sufficient amount of memory, the buffer
server will not crash. Only when the downstream operator(s) are
unable to keep up with the upstream operator may the buffer
server and JVM run out of allocated memory.
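
As a rough illustration of the case without spooling, the backlog held in
memory grows at the difference between the upstream and downstream rates.
The helper below is a back-of-the-envelope sketch; the class name and all
numbers are made up for the example and are not measurements.

```java
// Hypothetical estimate; the rates and sizes used in main() are made up.
final class BacklogEstimate {
    /**
     * Seconds until freeMb of memory is consumed when data arrives at
     * inMbPerSec and is drained at outMbPerSec. Returns positive infinity
     * when the downstream keeps up, since the backlog never grows.
     */
    static double secondsUntilExhausted(double freeMb, double inMbPerSec, double outMbPerSec) {
        double growthMbPerSec = inMbPerSec - outMbPerSec;
        if (growthMbPerSec <= 0) {
            return Double.POSITIVE_INFINITY;
        }
        return freeMb / growthMbPerSec;
    }

    public static void main(String[] args) {
        // Downstream is 20 MB/s slower than upstream, with 2048 MB free.
        System.out.println(secondsUntilExhausted(2048, 100, 80));
        // Downstream keeps up, so memory is never exhausted.
        System.out.println(secondsUntilExhausted(2048, 100, 100));
    }
}
```

A temporary slowdown only matters if it lasts longer than this estimate; a
sustained rate mismatch always exhausts memory eventually.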
There are several JIRAs open to enhance the buffer server to
enable a back pressure mechanism when spooling is disabled and
also to limit the amount of disk storage that the buffer server
may use for spooling.
Vlad
On 1/31/16 16:24, Thomas Weise wrote:
That's incorrect. Backpressure works when spooling is enabled
(which is the default). It's not handled only when you turn
spooling off explicitly.
On Sun, Jan 31, 2016 at 3:50 PM, Sandesh Hegde wrote:
According to Vlad, disabling the spooling will crash the
buffer server after it runs out of memory.
It means Apex doesn't have a mechanism to handle
backpressure yet.
On Fri, Jan 29, 2016 at 9:34 AM, Pramod Immaneni wrote:
By default, buffer spooling is enabled, so data gets
spooled to the file system once the buffer limits are
reached. There will be some slowdown, but upstream
will continue to process. If buffer spooling is
disabled, then when the buffers are filled the sender
is blocked, and this back pressure will propagate
upstream to the first operator.
On Fri, Jan 29, 2016 at 7:47 AM, Sandesh Hegde wrote:
Hello Team,
My understanding of backpressure in Apex is that the
buffer server will slow down the upstream operator
(because of TCP/IP congestion control) if the
downstream is slow. Is there more to it?
I don't see this topic covered in the docs.
Thanks
Sandesh