ahmedabu98 commented on issue #32746: URL: https://github.com/apache/beam/issues/32746#issuecomment-2407681282
Hmmm, I'm seeing an old metric that we dropped (`manifestFilesWritten`). Can you try using Beam version `2.61.0-SNAPSHOT`?

> Is it similar to the Spark repartition? Does it shuffle data?

Yes, they're similar. The idea is to redistribute data across workers.

> How will it work with autoscaling enabled?

This is hard to predict. In general, autoscaling reacts to your backlog and throughput, and it may autoscale to more than the number of keys in your Redistribute.

> Right now, I have autoscaling disabled an I will try to set N=2 and machineType=n1-standard-4

That's a good first step! Let me know how it goes. Honestly, you may end up only needing the Redistribute.
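To illustrate the point about the number of keys, here is a rough, plain-Python sketch (not Beam code) of why scaling past the key count may not help. It assumes a redistribute step assigns each element a random key in `[0, N)` and that all elements sharing a key are processed by a single worker. The function names and the `key % num_workers` placement are hypothetical, and real runners decide placement differently.

```python
import random
from collections import defaultdict


def redistribute(elements, num_buckets, seed=0):
    """Tag each element with a random key in [0, num_buckets) and group by key."""
    rng = random.Random(seed)
    buckets = defaultdict(list)
    for element in elements:
        buckets[rng.randrange(num_buckets)].append(element)
    return dict(buckets)


def assign_to_workers(buckets, num_workers):
    """Hypothetical placement: every key is handled by exactly one worker."""
    workers = defaultdict(list)
    for key, group in buckets.items():
        workers[key % num_workers].extend(group)
    return dict(workers)


# N=2 keys, but the (simulated) autoscaler has provisioned 5 workers.
buckets = redistribute(range(1000), num_buckets=2)
workers = assign_to_workers(buckets, num_workers=5)
busy_workers = len(workers)  # can never exceed the number of keys
```

Under these assumptions, any workers beyond the key count receive nothing from this step, which is one reason a small N with autoscaling disabled is a reasonable starting point.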
