[
https://issues.apache.org/jira/browse/BEAM-13130?focusedWorklogId=670999&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-670999
]
ASF GitHub Bot logged work on BEAM-13130:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 27/Oct/21 20:00
Start Date: 27/Oct/21 20:00
Worklog Time Spent: 10m
Work Description: lostluck commented on a change in pull request #15815:
URL: https://github.com/apache/beam/pull/15815#discussion_r737801496
##########
File path: sdks/go/pkg/beam/core/runtime/exec/pardo.go
##########
@@ -324,6 +324,11 @@ func (n *ParDo) postInvoke() error {
func (n *ParDo) fail(err error) error {
Review comment:
Fail's only called if the *local* DoFn fails. We need to protect against
panics with defer.
Consider a Pipeline with DoFns `A -> B -> C`. Let's say `C` has an error and
is able to return it.
If `B` and `A` are the simplest 1:1 DoFns, then the error is passed through
and eventually gets caught. Great.
If `B` is using an emitter though, then the emitter code is what's receiving
the error, because the emitter is what calls the downstream
ParDo.ProcessElement for `C`.
Emitters don't pass out errors (by design), so the only way to send the
error back up is to panic. This is true for reflectively generated emitters
and code generated ones, but you can see the reflective one here along with
another explanation of this:
https://github.com/apache/beam/blob/e373d92f3e89396971038072e0e0d2489764ea30/sdks/go/pkg/beam/core/runtime/exec/emit.go#L119
Any reset behavior *has* to be happening in a `defer` to be safe during
panics. Just like errors, we recover the panic, and report that upstream as a
failed bundle, for the runner to possibly retry.
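To make the `A -> B -> C` case concrete, here is a minimal standalone sketch, not the actual exec package code. The `step` type, `emitTo`, `runBundle`, and `run` are hypothetical names made up for illustration, and `cached` stands in for whatever per-element state needs resetting. It assumes only what is described above: the emitter has no error return, so it panics, and the bundle-level caller recovers.

```go
package main

import (
	"errors"
	"fmt"
)

// step is a hypothetical stand-in for an exec unit such as ParDo.
type step struct {
	name       string
	cached     []string // stand-in for per-element state that must be reset
	deferReset bool     // true: reset in a defer; false: reset after the call
	fn         func(elem int) error
}

func (s *step) reset() { s.cached = nil }

// processElement acquires state, invokes the user function, and resets.
func (s *step) processElement(elem int) error {
	s.cached = append(s.cached, "side input")
	if s.deferReset {
		defer s.reset() // runs on normal return, error return, and panic
		return s.fn(elem)
	}
	err := s.fn(elem)
	s.reset() // never reached if fn panics
	return err
}

// emitTo builds an emitter for the downstream step. Emitters have no error
// return, so a downstream error can only travel back up as a panic.
func emitTo(next *step) func(int) {
	return func(elem int) {
		if err := next.processElement(elem); err != nil {
			panic(err)
		}
	}
}

// runBundle recovers a panic and reports it as a failed bundle.
func runBundle(root *step, elem int) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("bundle failed: %v", r)
		}
	}()
	return root.processElement(elem)
}

// run wires up A -> B -> C, where B uses an emitter and C returns an error.
// It returns how many entries B is still holding after the bundle fails.
func run(deferReset bool) (leaked int, err error) {
	c := &step{name: "C", deferReset: deferReset,
		fn: func(int) error { return errors.New("C failed") }}
	emitC := emitTo(c)
	b := &step{name: "B", deferReset: deferReset,
		fn: func(e int) error { emitC(e); return nil }}
	a := &step{name: "A", deferReset: deferReset,
		fn: func(e int) error { return b.processElement(e) }}
	err = runBundle(a, 1)
	return len(b.cached), err
}
```

In this sketch, `run(true)` reports a failed bundle with nothing left in `B`'s `cached`, while `run(false)` reports the same failure but leaves `B` holding its entry, because the panic from the emitter unwinds past the non-deferred reset.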
Issue Time Tracking
-------------------
Worklog Id: (was: 670999)
Time Spent: 1h (was: 50m)
> [Go SDK] Side Input Memory Leak
> -------------------------------
>
> Key: BEAM-13130
> URL: https://issues.apache.org/jira/browse/BEAM-13130
> Project: Beam
> Issue Type: Bug
> Components: sdk-go
> Reporter: Jack McCluskey
> Assignee: Jack McCluskey
> Priority: P2
> Time Spent: 1h
> Remaining Estimate: 0h
>
> Keeping a list of the open stateKeyReader structs in the ScopedStateReader is
> holding on to memory for the life of bundles, leading to significant memory
> usage in large batch jobs. This list needs to be removed so the garbage
> collector can clean up stateKeyReader structs sooner.
>
> stateKeyReader:
> [https://github.com/apache/beam/blob/d916c1f55e57a61b54135d0922ad8660735bd287/sdks/go/pkg/beam/core/runtime/harness/statemgr.go#L105]
> List kept by ScopedStateReader:
> https://github.com/apache/beam/blob/d916c1f55e57a61b54135d0922ad8660735bd287/sdks/go/pkg/beam/core/runtime/harness/statemgr.go#L39