Hi, I assume you're using the Dataflow runner. Have you tried using the OOM
debugging flags at
https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineDebugOptions.java#L169-L193
 ?
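
For example, dumpHeapOnOOM can be switched on programmatically. A rough
sketch (check the linked file for the full set of options available in your
SDK version; the snippet below only assumes that one):

import org.apache.beam.runners.dataflow.options.DataflowPipelineDebugOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

// Parse the usual pipeline arguments and view them as Dataflow debug options.
DataflowPipelineDebugOptions debugOptions =
    PipelineOptionsFactory.fromArgs(args).withValidation()
        .as(DataflowPipelineDebugOptions.class);

// Dump the worker JVM heap when it runs out of memory, so the dump can be
// retrieved and inspected offline to see what is actually filling it up.
debugOptions.setDumpHeapOnOOM(true);

The same can be passed on the command line as --dumpHeapOnOOM=true.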

On Mon, Jan 29, 2018 at 2:05 AM Carlos Alonso <[email protected]> wrote:

> Hi everyone!!
>
> In our pipeline we're experiencing OOM errors. We've been trying to nail
> them down for a while now, but without any luck.
>
> Our pipeline:
> 1. Reads messages of different types from PubSub (about 50 different
> types).
> 2. Outputs KV elements where the key is the message type and the value is
> the message itself (JSON).
> 3. Applies windowing: fixed windows of one hour, with early and late
> firings one minute after the first element in the pane is processed, two
> days of allowed lateness and discarding fired panes (a rough sketch of this
> setup follows the list).
> 4. Buffers the elements (using stateful and timely processing) for 15
> minutes or until the buffer reaches a maximum size of 16Mb. This step's
> objective is to prevent the window's panes from growing too big.
> 5. Writes the output elements to files in GCS using dynamic routing, the
> route being: type/window/buffer_index-paneInfo.
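>
> For reference, the windowing in step 3 looks roughly like this (a
> simplified sketch; "keyed" stands for the PCollection of KV<type, json>
> that comes out of step 2):
>
> import org.apache.beam.sdk.transforms.windowing.*;
> import org.apache.beam.sdk.values.KV;
> import org.apache.beam.sdk.values.PCollection;
> import org.joda.time.Duration;
>
> // One-hour fixed windows, firing one minute after the first element in
> // the pane (both before and after the watermark), keeping data up to two
> // days late and discarding fired panes.
> PCollection<KV<String, String>> windowed =
>     keyed.apply(
>         Window.<KV<String, String>>into(
>                 FixedWindows.of(Duration.standardHours(1)))
>             .triggering(
>                 AfterWatermark.pastEndOfWindow()
>                     .withEarlyFirings(
>                         AfterProcessingTime.pastFirstElementInPane()
>                             .plusDelayOf(Duration.standardMinutes(1)))
>                     .withLateFirings(
>                         AfterProcessingTime.pastFirstElementInPane()
>                             .plusDelayOf(Duration.standardMinutes(1))))
>             .withAllowedLateness(Duration.standardDays(2))
>             .discardingFiredPanes());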
>
> Our first attempt was without buffering, just windows with early and late
> firings after 15 minutes, but, guessing that the OOMs were caused by panes
> growing too big, we built the buffering step so that it also triggers on
> size.
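>
> And the buffering DoFn in step 4 is roughly this shape (again a simplified
> sketch with placeholder names, reusing "windowed" from the previous
> snippet; the real one measures the serialised size of the JSON rather than
> the string length):
>
> import java.util.ArrayList;
> import java.util.List;
> import org.apache.beam.sdk.coders.IterableCoder;
> import org.apache.beam.sdk.coders.KvCoder;
> import org.apache.beam.sdk.coders.StringUtf8Coder;
> import org.apache.beam.sdk.coders.VarLongCoder;
> import org.apache.beam.sdk.state.*;
> import org.apache.beam.sdk.transforms.DoFn;
> import org.apache.beam.sdk.transforms.ParDo;
> import org.apache.beam.sdk.values.KV;
> import org.joda.time.Duration;
>
> class BufferMessagesFn
>     extends DoFn<KV<String, String>, KV<String, Iterable<String>>> {
>
>   private static final long MAX_BUFFER_BYTES = 16L * 1024 * 1024;  // 16Mb cap
>   private static final Duration MAX_BUFFER_AGE = Duration.standardMinutes(15);
>
>   @StateId("elements")
>   private final StateSpec<BagState<String>> elementsSpec =
>       StateSpecs.bag(StringUtf8Coder.of());
>   @StateId("key")
>   private final StateSpec<ValueState<String>> keySpec =
>       StateSpecs.value(StringUtf8Coder.of());
>   @StateId("bytes")
>   private final StateSpec<ValueState<Long>> bytesSpec =
>       StateSpecs.value(VarLongCoder.of());
>   @TimerId("flush")
>   private final TimerSpec flushSpec = TimerSpecs.timer(TimeDomain.PROCESSING_TIME);
>
>   @ProcessElement
>   public void process(
>       ProcessContext c,
>       @StateId("elements") BagState<String> elements,
>       @StateId("key") ValueState<String> key,
>       @StateId("bytes") ValueState<Long> bytes,
>       @TimerId("flush") Timer flush) {
>     Long buffered = bytes.read();
>     if (buffered == null || buffered == 0L) {
>       // First element for this key and window: remember the key and make
>       // sure the buffer gets flushed after at most 15 minutes.
>       key.write(c.element().getKey());
>       flush.offset(MAX_BUFFER_AGE).setRelative();
>       buffered = 0L;
>     }
>     String json = c.element().getValue();
>     elements.add(json);
>     buffered += json.length();  // rough estimate of the buffered size
>     bytes.write(buffered);
>     if (buffered >= MAX_BUFFER_BYTES) {
>       flushBuffer(c, key, elements, bytes);  // size-based flush
>     }
>   }
>
>   @OnTimer("flush")
>   public void onFlush(
>       OnTimerContext c,
>       @StateId("elements") BagState<String> elements,
>       @StateId("key") ValueState<String> key,
>       @StateId("bytes") ValueState<Long> bytes) {
>     flushBuffer(c, key, elements, bytes);  // time-based flush
>   }
>
>   private void flushBuffer(
>       WindowedContext c, ValueState<String> key, BagState<String> elements,
>       ValueState<Long> bytes) {
>     // Materialise the bag before clearing it and emit one batch per flush.
>     List<String> batch = new ArrayList<>();
>     for (String element : elements.read()) {
>       batch.add(element);
>     }
>     if (!batch.isEmpty()) {
>       c.output(KV.of(key.read(), batch));
>     }
>     elements.clear();
>     bytes.write(0L);
>   }
> }
>
> windowed
>     .apply("BufferPerType", ParDo.of(new BufferMessagesFn()))
>     .setCoder(KvCoder.of(StringUtf8Coder.of(),
>         IterableCoder.of(StringUtf8Coder.of())));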
>
> The full trace we're seeing can be found here:
> https://pastebin.com/0vfE6pUg
>
> One annoying thing we've noticed is that the smaller the buffer size, the
> more OOM errors we see, which was a bit disappointing...
>
> Can anyone please give us a hint?
>
> Thanks in advance!
>
