Hi,

Have you succeeded in saving a heap dump? I also ran into this a while ago
and was able neither to save a heap dump nor to increase the boot disk size. If
you have any update on this, could you please share it?

Thanks in advance,
Frantisek

On Wed, Nov 20, 2019 at 1:46 AM Luke Cwik <[email protected]> wrote:

> You might want to reach out to cloud support for help debugging this, or
> for guidance on how to debug it.
>
> On Mon, Nov 18, 2019 at 10:56 AM Jeff Klukas <[email protected]> wrote:
>
>> On Mon, Nov 18, 2019 at 1:32 PM Reynaldo Baquerizo <
>> [email protected]> wrote:
>>
>>>
>>> Does it mean anything that the GCP console does not show the options
>>> --dumpHeapOnOOM and --saveHeapDumpsToGcsPath for a running job under
>>> PipelineOptions (it does show diskSizeGb)?
>>>
>>
>> That's normal; I also never saw those heap dump options display in the
>> Dataflow UI. I think Dataflow doesn't show any options that originate from
>> "Debug" options interfaces.
>>
>>
>>
>>> On Mon, Nov 18, 2019 at 11:59 AM Jeff Klukas <[email protected]>
>>> wrote:
>>>
>>>> Using default Dataflow workers, this is the set of options I passed:
>>>>
>>>> --dumpHeapOnOOM --saveHeapDumpsToGcsPath=$MYBUCKET/heapdump
>>>> --diskSizeGb=100
>>>>
>>>>
>>>> On Mon, Nov 18, 2019 at 11:57 AM Jeff Klukas <[email protected]>
>>>> wrote:
>>>>
>>>>> It sounds like you're generally doing the right thing. I've
>>>>> successfully used --saveHeapDumpsToGcsPath in a Java pipeline running on
>>>>> Dataflow and inspected the results in Eclipse MAT.
>>>>>
>>>>> I think that --saveHeapDumpsToGcsPath will automatically turn on
>>>>> --dumpHeapOnOOM, but it's worth setting that explicitly too.
>>>>>
>>>>> Are your boot disks large enough to store the heap dumps? The docs for
>>>>> getSaveHeapDumpsToGcsPath [0] mention "CAUTION: This option implies
>>>>> dumpHeapOnOOM, and has similar caveats. Specifically, heap dumps can [be]
>>>>> of comparable size to the default boot disk. Consider increasing the
>>>>> boot disk size before setting this flag to true."
>>>>>
>>>>> When I've done this in the past, I definitely had to increase boot
>>>>> disk size (though I forget now what the relevant Dataflow option was).
>>>>>
>>>>> [0]
>>>>> https://beam.apache.org/releases/javadoc/2.16.0/org/apache/beam/runners/dataflow/options/DataflowPipelineDebugOptions.html
>>>>>
>>>>> On Mon, Nov 18, 2019 at 11:35 AM Reynaldo Baquerizo <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> We are running into OOM issues with one of our pipelines. They are
>>>>>> not reproducible with DirectRunner, only with Dataflow.
>>>>>> I tried --saveHeapDumpsToGcsPath, but it does not save any heap dump
>>>>>> (MyOptions extends DataflowPipelineDebugOptions).
>>>>>> I looked at the Java process inside the Docker container and it has
>>>>>> remote JMX enabled through port 5555, but outside traffic is firewalled.
>>>>>>
>>>>>> Beam SDK: 2.15.0
>>>>>>
>>>>>> Any ideas?
>>>>>>
>>>>>> Cheers,
>>>>>> --
>>>>>> Reynaldo
>>>>>>
>>>>>

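For reference, the three flags quoted in the thread can be assembled as below. This is only a sketch of the option set Jeff describes, not a verified fix for the missing heap dumps. The bucket name is hypothetical, and it is an assumption that $MYBUCKET already carries the gs:// prefix, since the thread does not say so.

```shell
#!/bin/sh
# Hypothetical bucket; assumed to include the gs:// prefix.
MYBUCKET="gs://example-bucket"

# Flags exactly as quoted in the thread. --diskSizeGb enlarges the worker
# boot disk so that a heap dump has room to be written before upload.
PIPELINE_ARGS="--dumpHeapOnOOM \
--saveHeapDumpsToGcsPath=$MYBUCKET/heapdump \
--diskSizeGb=100"

# Print the arguments rather than launching anything; append them to
# whatever command normally starts the pipeline.
echo "$PIPELINE_ARGS"
```

Keeping the flags in one variable makes it easy to confirm, before launching, that the heap dump path and the larger disk size are both being passed.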