Re: [I] [Bug][Prism]: Colab cannot run PrismRunner multiple times [beam]

via GitHub Mon, 14 Apr 2025 10:09:40 -0700


lostluck commented on issue #33623:
URL: https://github.com/apache/beam/issues/33623#issuecomment-2802309243

The one catch with the singleton approach, absent an occasional idle
timeout, is that Prism isn't yet set up to run indefinitely. So a heads
up: users may hit OOMs if they keep an instance around long term, which may
happen depending on how many iterations someone does in Colab.

1. Artifacts are kept in memory indefinitely:
https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/runners/prism/internal/jobservices/server.go#L63
2. In principle, job metadata and pipeline protos are kept in memory indefinitely:
https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/runners/prism/internal/jobservices/server.go#L48

I have one suggested immediate resolution and one longer-term one:
1. Recognize that when the UI isn't turned on, we likely only need to keep
active jobs in memory.
2. Offload some of the "archive" stuff out of memory into a persistent
storage location if configured.

The "default" mode for Prism is really the SDK spinning it up itself, with
Prism lingering until the SDK process is done with it. Artifacts can be GC'd
right away once the job is done, and metrics can probably stick around, up to
some limit on the number of cached job stats.
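
As a rough illustration of that retention policy (not Prism's actual code), here is a minimal Go sketch. The `jobStore` and `jobStats` types, their fields, and the `maxStats` bound are all hypothetical names made up for this example. It assumes artifacts are dropped as soon as a job finishes and that finished-job stats are kept in a small first-in, first-out cache.

```go
package main

import "fmt"

// jobStats is a hypothetical summary retained after a job finishes.
type jobStats struct {
	JobID   string
	Metrics map[string]int64
}

// jobStore is a hypothetical in-memory store; it is NOT Prism's Server type.
type jobStore struct {
	artifacts map[string][][]byte // job ID -> staged artifact payloads
	stats     map[string]jobStats // job ID -> retained stats of finished jobs
	order     []string            // finished job IDs, oldest first
	maxStats  int                 // assumed bound on retained finished-job stats
}

func newJobStore(maxStats int) *jobStore {
	return &jobStore{
		artifacts: make(map[string][][]byte),
		stats:     make(map[string]jobStats),
		maxStats:  maxStats,
	}
}

// stage records an artifact payload for an active job.
func (s *jobStore) stage(jobID string, payload []byte) {
	s.artifacts[jobID] = append(s.artifacts[jobID], payload)
}

// finish drops the job's artifacts immediately and retains its stats,
// evicting the oldest finished job once the bound is exceeded.
func (s *jobStore) finish(jobID string, metrics map[string]int64) {
	delete(s.artifacts, jobID)
	if s.maxStats <= 0 {
		return // stats retention disabled
	}
	if _, seen := s.stats[jobID]; !seen {
		s.order = append(s.order, jobID)
	}
	s.stats[jobID] = jobStats{JobID: jobID, Metrics: metrics}
	for len(s.order) > s.maxStats {
		oldest := s.order[0]
		s.order = s.order[1:]
		delete(s.stats, oldest)
	}
}

func main() {
	store := newJobStore(2)
	for i := 1; i <= 3; i++ {
		id := fmt.Sprintf("job-%d", i)
		store.stage(id, []byte("artifact bytes"))
		store.finish(id, map[string]int64{"elements": int64(i * 10)})
	}
	// Print how many jobs still hold artifacts and how many have cached stats.
	fmt.Println(len(store.artifacts), len(store.stats))
}
```

With a bound of two, running three jobs back to back leaves no artifacts in memory and stats for only the two most recent jobs. A real implementation would also need locking and a decision about what the UI may still want to display.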

That's also complementary to the long-term setup, but a long-term job is a
priority, and doing it right/completely is a much larger task (e.g., also
putting logs in the durable storage, putting durable intermediates there
for larger jobs, restarting in-progress jobs, job update, etc.).

