ibzib commented on issue #11052: [BEAM-9446] Add missing parallelism and execution mode args.
URL: https://github.com/apache/beam/pull/11052#issuecomment-597375999

> Have you tried working around this by not discarding these options? AFAIK the json parser is smart enough to read the stringified version of all option values.

I think this may be the best strategy for the uber jar job server; however, I don't think we should change this behavior for other runners. (Not sure if that's what you were proposing, just organizing my thoughts here:)

- In Dataflow, we seem to duplicate every runner option for each SDK, perhaps because there is no better choice due to the runner architecture. In that case, since all the args are presumably known by the SDK, it makes more sense to drop unknown arguments (status quo) or maybe even raise an error on them, because an unknown argument usually means the user made a mistake.
- With the "old" Flink job server, retrieving args from the job server is an adequate workaround, so again, there should be no need for unrecognized arguments.

I discussed this with @angoenka today and he suggested that we consider the runner-Flink boundary as well, i.e., whether we should have some way of enabling _all_ Flink environment options to be set through Beam pipeline options instead of just adding the ones we need as we go. This would potentially save users from having to wait for a new release just for us to add a pipeline option that trivially maps 1:1 to a Flink option. (Of course, they can always change Flink's conf files, which was going to be my proposed workaround here, but AFAIK that requires a restart of the cluster and would affect all jobs run on the cluster.)

WDYT?

+cc @tweise
