[ 
https://issues.apache.org/jira/browse/BEAM-9446?focusedWorklogId=401057&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-401057
 ]

ASF GitHub Bot logged work on BEAM-9446:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 10/Mar/20 23:55
            Start Date: 10/Mar/20 23:55
    Worklog Time Spent: 10m 
      Work Description: ibzib commented on issue #11052: [BEAM-9446] Add 
missing parallelism and execution mode args.
URL: https://github.com/apache/beam/pull/11052#issuecomment-597375999
 
 
   > Have you tried working around this by not discarding these options? AFAIK 
the JSON parser is smart enough to read the stringified version of all option 
values.
   
   I think this may be the best strategy for the uber jar job server, however I 
don't think we should change this behavior for other runners. (Not sure if 
that's what you were proposing, just organizing my thoughts here:)
   - In Dataflow, we seem to duplicate every runner option for each SDK, 
perhaps because there is no better choice due to the runner architecture. In 
that case, since all the args are presumably known by the SDK, it makes more 
sense to drop them (status quo) or maybe even error when arguments are unknown, 
because it usually means the user made a mistake.
   - With the "old" Flink job server, retrieving args from the job server is an 
adequate workaround, so again, there should be no need for unrecognized 
arguments.
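
   The two behaviors weighed in the first bullet (silently dropping unknown arguments versus rejecting them) can be sketched with plain `argparse`. This is a stand-alone illustration, not Beam's actual option-parsing code: the function name `parse_options` and the `strict` flag are hypothetical, and only the two option names come from this issue.

```python
import argparse


def parse_options(argv, strict=False):
    """Parse the options this sketch knows about; drop or reject the rest.

    Hypothetical helper for illustration only. With strict=False, unknown
    arguments are returned separately (and could simply be discarded);
    with strict=True, any unknown argument raises an error.
    """
    parser = argparse.ArgumentParser(allow_abbrev=False)
    parser.add_argument("--parallelism", type=int, default=None)
    parser.add_argument("--execution_mode_for_batch", default=None)

    # parse_known_args returns the parsed namespace plus leftover arguments.
    known, unknown = parser.parse_known_args(argv)
    if strict and unknown:
        raise ValueError("unrecognized pipeline options: %s" % unknown)
    return vars(known), unknown


if __name__ == "__main__":
    opts, dropped = parse_options(["--parallelism=4", "--typo_option=1"])
    print(opts, dropped)
```

   In the lenient mode a misspelled option disappears without any signal to the user, which is the argument for erroring when the SDK is expected to know every option.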
   
   I discussed this with @angoenka today and he suggested that we consider the 
runner-Flink boundary as well -- i.e., if we should have some way of enabling 
_all_ Flink environment options to be set through Beam pipeline options instead 
of just adding the ones we need as we go. This would potentially save users 
from having to wait for a new release just for us to add a pipeline option that 
trivially maps 1:1 to Flink (of course, they can always change Flink's conf 
files, which was going to be my proposed workaround here, but AFAIK that 
requires a restart of the cluster and would affect all jobs run on the 
cluster). WDYT?
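
   One way the generic pass-through suggested above might look, as a rough sketch only: options carrying an agreed prefix are forwarded to Flink unchanged, so no new Beam pipeline option is needed per Flink setting. The `flink_conf.` prefix and the `split_flink_conf` helper are assumptions made for this example; nothing like them is confirmed to exist in Beam.

```python
def split_flink_conf(options, prefix="flink_conf."):
    """Separate pass-through Flink settings from ordinary pipeline options.

    Hypothetical helper: `prefix` is an invented naming convention, not an
    existing Beam feature. Keys that start with the prefix are stripped of
    it and collected as Flink configuration entries (values as strings).
    """
    beam_options = {}
    flink_conf = {}
    for name, value in options.items():
        if name.startswith(prefix):
            flink_conf[name[len(prefix):]] = str(value)
        else:
            beam_options[name] = value
    return beam_options, flink_conf


if __name__ == "__main__":
    beam, flink = split_flink_conf({
        "runner": "FlinkRunner",
        "flink_conf.parallelism.default": 4,
    })
    print(beam)
    print(flink)
```

   A scheme like this would apply per job, unlike editing Flink's conf files, though it would also bypass any validation Beam performs on its named options.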
   
   +cc @tweise 
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 401057)
    Time Spent: 1h 40m  (was: 1.5h)

> FlinkRunner discards parallelism and execution_mode_for_batch pipeline options
> ------------------------------------------------------------------------------
>
>                 Key: BEAM-9446
>                 URL: https://issues.apache.org/jira/browse/BEAM-9446
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink
>            Reporter: Kyle Weaver
>            Assignee: Kyle Weaver
>            Priority: Major
>              Labels: portability-flink
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> I need these options for TFX, but they're being discarded (I believe they are 
> normally supplied by the job server).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
