scwhittle commented on issue #28776: URL: https://github.com/apache/beam/issues/28776#issuecomment-2141868476
Without the SDK cache enabled (it is not on by default), removing the PerWindowInvoker cache means that there is a lot of FnApi traffic to fetch side inputs, which contributes to latency on the Dataflow runner. This could be a noticeable regression, with no improvement in functionality, for pipelines that don't actually refresh the global window side input.

I think we should postpone making this always on until the SDK cache is enabled by default. If that is too far out, we could modify this so that finish_bundle does not clear the cache after every bundle but only after some timeout. Or perhaps we can somehow access the triggering of the side input to observe whether it refreshes or is calculated once. Enabling the SDK cache will let the runner control the refresh via the side-input cache token.
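To make the timeout idea concrete, here is a rough sketch of a cache that is cleared from finish_bundle only once a time-to-live has elapsed. This is not Beam's actual PerWindowInvoker code: the class name `TimedSideInputCache`, the `ttl_seconds` and `clock` parameters, and the `fetch` callback (standing in for an FnApi side input request) are all hypothetical and only illustrate the shape of the change.

```python
import time


class TimedSideInputCache:
    """Hypothetical stand-in for a per-window side input cache.

    Not Beam's PerWindowInvoker; all names here are assumptions.
    """

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl_seconds = ttl_seconds
        self._clock = clock  # injectable so the timeout can be tested
        self._values = {}
        self._last_cleared = clock()

    def get(self, key, fetch):
        # Call fetch (standing in for an FnApi request) only on a cache miss.
        if key not in self._values:
            self._values[key] = fetch()
        return self._values[key]

    def finish_bundle(self):
        # Clear only once the timeout has elapsed, not after every bundle.
        now = self._clock()
        if now - self._last_cleared >= self._ttl_seconds:
            self._values.clear()
            self._last_cleared = now
```

With something like this, bundles that finish within the timeout reuse the cached side input, and a refreshed side input is picked up at most `ttl_seconds` late. Choosing that value is the trade-off: a shorter timeout means fresher data but more FnApi traffic.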
