Looks like https://ci.apache.org/projects/flink/flink-docs-stable/dev/parallel.html#operator-level

Regards,
Amit

On Mon, Jun 29, 2020 at 12:59 PM Kenneth Knowles <k...@apache.org> wrote:

> This exact issue has been discussed before, though I can't find the older
> threads. Basically, specifying parallelism is a workaround (aka a cost),
> not a feature (aka a benefit). Sometimes you have to pay that cost, as it is
> the only solution currently understood or implemented. It depends on what
> your reason is for having to set parallelism.
>
> A lot of the time, the parallelism is a property of the combination of the
> pipeline and the data. The same pipeline with different data should have
> this tuned differently. For composite transforms in a library (not the
> top-level pipeline) this is even more likely. It sounds like the suggestions
> here fit this case.
>
> Some of the time, max parallelism has to do with not overwhelming another
> service. This depends on the particular endpoint. That is usually
> construction-time information. In this case you want to have portable
> mandatory limits.
>
> Could you clarify your use case?
>
> Kenn
>
> On Mon, Jun 29, 2020 at 8:58 AM Luke Cwik <lc...@google.com> wrote:
>
>> Check out this thread [1] about adding "runner-determined sharding" as a
>> general concept. This could be used to enhance the reshuffle implementation
>> significantly and might remove the need for per-transform parallelism from
>> that specific use case, and likely from most others.
>>
>> 1: https://lists.apache.org/thread.html/rfd1ca93268eb215fbbcfe098c1dfb330f1b84fb89673325135dfd9a8%40%3Cdev.beam.apache.org%3E
>>
>> On Mon, Jun 29, 2020 at 4:03 AM Maximilian Michels <m...@apache.org> wrote:
>>
>>> We could allow parameterizing transforms by using transform identifiers
>>> from the pipeline, e.g.
>>>
>>>     options = ['--parameterize=MyTransform;parallelism=5']
>>>     with Pipeline.create(PipelineOptions(options)) as p:
>>>         p | Create(1, 2, 3) | 'MyTransform' >> ParDo(..)
>>>
>>> Those hints should always be optional, such that a pipeline continues to
>>> run on all runners.
>>>
>>> -Max
>>>
>>> On 28.06.20 14:30, Reuven Lax wrote:
>>> > However, such a parameter would be specific to a single transform,
>>> > whereas maxNumWorkers is a global parameter today.
>>> >
>>> > On Sat, Jun 27, 2020 at 10:31 PM Daniel Collins <dpcoll...@google.com> wrote:
>>> >
>>> >     I could imagine, for example, a 'parallelismHint' field in the base
>>> >     parameters that could be set to maxNumWorkers when running on
>>> >     Dataflow, or to an equivalent parameter when running on Flink. It would
>>> >     be useful to get a default value for the sharding in the Reshuffle
>>> >     changes here https://github.com/apache/beam/pull/11919, but more
>>> >     generally to have some decent guess at how to best shard work. It
>>> >     would also be runner-agnostic; you could set it to something like
>>> >     numCpus on the local runner, for instance.
>>> >
>>> >     On Sat, Jun 27, 2020 at 2:04 AM Reuven Lax <re...@google.com> wrote:
>>> >
>>> >         It's an interesting question - this parameter is clearly very
>>> >         runner-specific (e.g. it would be meaningless for the Dataflow
>>> >         runner, where parallelism is not a static constant). How should
>>> >         we go about passing runner-specific options per transform?
>>> >
>>> >         On Fri, Jun 26, 2020 at 1:14 PM Akshay Iyangar <aiyan...@godaddy.com> wrote:
>>> >
>>> >             Hi Beam community,
>>> >
>>> >             I had brought this issue up in our Slack channel, but I think
>>> >             it warrants a deeper discussion and, if we decide to go ahead,
>>> >             a plan of action.
>>> >
>>> >             Currently the Flink runner does not support operator-level
>>> >             parallelism, which native Flink provides out of the box.
>>> >             I was wondering how the community feels about having some way
>>> >             to pass parallelism for individual operators, especially for
>>> >             some of the existing IOs.
>>> >
>>> >             Wanted to know what people think of this.
>>> >
>>> >             Thanks,
>>> >             Akshay I
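[Editor's note: the two concrete ideas in this thread - Max's optional
`--parameterize=MyTransform;parallelism=5` flag and Daniel's fallback of
using something like numCpus on the local runner - can be sketched as a
small, runner-agnostic parser. This is a hypothetical illustration only:
no such flag exists in Beam today, and the function names here are made up
for the sketch; only the option format comes from Max's example.]

```python
import os

def parse_parameterize(args):
    """Parse hypothetical --parameterize=<Transform>;<key>=<value> flags
    into a {transform_name: {key: value}} dict of per-transform hints.

    Hints are purely advisory: a runner that does not understand them can
    ignore the dict entirely, so the same pipeline still runs everywhere.
    """
    hints = {}
    prefix = '--parameterize='
    for arg in args:
        if not arg.startswith(prefix):
            continue
        transform, _, assignment = arg[len(prefix):].partition(';')
        key, _, value = assignment.partition('=')
        # Store numeric values as ints (e.g. parallelism=5), others as strings.
        hints.setdefault(transform, {})[key] = int(value) if value.isdigit() else value
    return hints

def parallelism_hint(hints, transform):
    """Look up a per-transform parallelism hint, falling back to the local
    CPU count (Daniel's numCpus suggestion for a local runner)."""
    return hints.get(transform, {}).get('parallelism', os.cpu_count())

if __name__ == '__main__':
    options = ['--parameterize=MyTransform;parallelism=5']
    hints = parse_parameterize(options)
    print(parallelism_hint(hints, 'MyTransform'))  # 5
    print(parallelism_hint(hints, 'OtherTransform'))  # local CPU count
```

A runner could consult such a dict at translation time (e.g. the Flink
runner calling its native per-operator parallelism setting when a hint is
present) while other runners simply never read it, which keeps the hint
optional in the sense Max describes.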