[jira] [Commented] (BEAM-5442) PortableRunner swallows custom options for Runner

2018-10-07 Thread Thomas Weise (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16641279#comment-16641279
 ] 

Thomas Weise commented on BEAM-5442:


[~mxm] I checked and see that an unknown option such as 
--checkpointing_interval=3 is now passed through to the runner. BTW 
checkpointing doesn't seem to work for other reasons, but that's a separate 
issue.

> PortableRunner swallows custom options for Runner
> -
>
> Key: BEAM-5442
> URL: https://issues.apache.org/jira/browse/BEAM-5442
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core, sdk-py-core
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Labels: portability, portability-flink
> Fix For: 2.8.0
>
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> The PortableRunner doesn't pass custom PipelineOptions to the executing 
> Runner.
> Example: {{--parallelism=4}} won't be forwarded to the FlinkRunner.
> (The option is just removed during proto translation without any warning)
> We should allow some form of customization through the options, even for the 
> PortableRunner. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5442) PortableRunner swallows custom options for Runner

2018-10-05 Thread Thomas Weise (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640145#comment-16640145
 ] 

Thomas Weise commented on BEAM-5442:


[~udim] as a stopgap we could probably skip --beam_plugins here: 
[https://github.com/apache/beam/pull/6557/files#diff-525d5d65bedd7ea5e6fce6e4cd57e153R221]
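
A minimal sketch of what such a stopgap could look like, assuming unknown arguments are registered one by one with argparse as the worker traceback on this issue suggests. The helper name add_unknown_args and its skip parameter are hypothetical and are not taken from the PR.

```python
import argparse


def add_unknown_args(parser, unknown_args, skip=("--beam_plugins",)):
    """Register unknown --flag[=value] args on parser, skipping some flags.

    Hypothetical helper for illustration only; not Beam's actual code.
    """
    seen = set()
    for arg in unknown_args:
        if not arg.startswith("--"):
            continue
        flag = arg.split("=", 1)[0]
        # Skip flags we were told to ignore, and flags already registered,
        # so that argparse never sees the same option string twice.
        if flag in skip or flag in seen:
            continue
        seen.add(flag)
        parser.add_argument(flag, nargs="?")
    return parser


parser = argparse.ArgumentParser()
unknown = [
    "--checkpointing_interval=3",
    "--beam_plugins=apache_beam.io.filesystem.FileSystem",
    "--beam_plugins=apache_beam.io.localfilesystem.LocalFileSystem",
]
add_unknown_args(parser, unknown)
known, remaining = parser.parse_known_args(unknown)
```

Here known carries checkpointing_interval, while both --beam_plugins entries stay in remaining instead of triggering a conflict.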

 



[jira] [Commented] (BEAM-5442) PortableRunner swallows custom options for Runner

2018-10-05 Thread Udi Meiri (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640114#comment-16640114
 ] 

Udi Meiri commented on BEAM-5442:
-

I believe PR 6557 broke integration tests using Dataflow.


Cloud console:

"Parsing unknown args: 
[u'--dataflowJobId=2018-10-05_07_00_20-5526009939236014896',
 u'--autoscalingAlgorithm=NONE',
 u'--direct_runner_use_stacked_bundle',
 u'--maxNumWorkers=0',
 u'--style=scrambled',
 u'--sleep_secs=20',
 u'--pipeline_type_check',
 u'--gcpTempLocation=gs://temp-storage-for-end-to-end-tests/temp-it/beamapp-jenkins-1005140012-917021.1538748012.917145',
 u'--numWorkers=1',
 u'--beam_plugins=apache_beam.io.filesystem.FileSystem',
 u'--beam_plugins=apache_beam.io.hadoopfilesystem.HadoopFileSystem',
 u'--beam_plugins=apache_beam.io.localfilesystem.LocalFileSystem',
 u'--beam_plugins=apache_beam.io.gcp.gcsfilesystem.GCSFileSystem',
 u'--beam_plugins=apache_beam.io.filesystem_test.TestingFileSystem',
 u'--beam_plugins=apache_beam.runners.interactive.display.pipeline_graph_renderer.PipelineGraphRenderer',
 u'--beam_plugins=apache_beam.runners.interactive.display.pipeline_graph_renderer.MuteRenderer',
 u'--beam_plugins=apache_beam.runners.interactive.display.pipeline_graph_renderer.TextRenderer',
 u'--beam_plugins=apache_beam.runners.interactive.display.pipeline_graph_renderer.PydotRenderer',
 u'--pipelineUrl=gs://temp-storage-for-end-to-end-tests/staging-it/beamapp-jenkins-1005140012-917021.1538748012.917145/pipeline.pb']"
 
"Python sdk harness failed: 
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/sdk_worker_main.py", line 133, in main
    sdk_pipeline_options.get_all_options(drop_default=True))
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/options/pipeline_options.py", line 224, in get_all_options
    parser.add_argument(arg.split('=', 1)[0], nargs='?')
  File "/usr/lib/python2.7/argparse.py", line 1308, in add_argument
    return self._add_action(action)
  File "/usr/lib/python2.7/argparse.py", line 1682, in _add_action
    self._optionals._add_action(action)
  File "/usr/lib/python2.7/argparse.py", line 1509, in _add_action
    action = super(_ArgumentGroup, self)._add_action(action)
  File "/usr/lib/python2.7/argparse.py", line 1322, in _add_action
    self._check_conflict(action)
  File "/usr/lib/python2.7/argparse.py", line 1460, in _check_conflict
    conflict_handler(action, confl_optionals)
  File "/usr/lib/python2.7/argparse.py", line 1467, in _handle_conflict_error
    raise ArgumentError(action, message % conflict_string)
ArgumentError: argument --beam_plugins: conflicting option string(s): --beam_plugins"
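
The failure in the traceback above can be reproduced with the standard library alone. This is a sketch, written for Python 3 rather than the Python 2.7 worker in the log, and it only mirrors the add_argument call shape shown in the traceback; the two-element argument list is a shortened stand-in for the repeated --beam_plugins entries in the console output.

```python
import argparse

# Shortened stand-in for the repeated --beam_plugins entries in the log.
unknown_args = [
    "--beam_plugins=apache_beam.io.filesystem.FileSystem",
    "--beam_plugins=apache_beam.io.localfilesystem.LocalFileSystem",
]

parser = argparse.ArgumentParser()
error = None
try:
    for arg in unknown_args:
        # Same call shape as the pipeline_options.py line in the traceback:
        # the flag name is everything before the first '='.
        parser.add_argument(arg.split("=", 1)[0], nargs="?")
except argparse.ArgumentError as exc:
    # The second --beam_plugins collides with the first registration.
    error = exc

print(error)
```

Because the flag name is registered once per occurrence, any option that appears more than once among the unknown arguments triggers the conflict.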


Test output:
07:28:37 ==
07:28:37 FAIL: test_streaming_with_attributes (apache_beam.io.gcp.pubsub_integration_test.PubSubIntegrationTest)
07:28:37 --
07:28:37 Traceback (most recent call last):
07:28:37   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/apache_beam/io/gcp/pubsub_integration_test.py", line 172, in test_streaming_with_attributes
07:28:37     self._test_streaming(with_attributes=True)
07:28:37   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/apache_beam/io/gcp/pubsub_integration_test.py", line 164, in _test_streaming
07:28:37     timestamp_attribute=self.TIMESTAMP_ATTRIBUTE)
07:28:37   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/apache_beam/io/gcp/pubsub_it_pipeline.py", line 91, in run_pipeline
07:28:37     result = p.run()
07:28:37   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/apache_beam/pipeline.py", line 416, in run
07:28:37     return self.runner.run_pipeline(self)
07:28:37   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py", line 65, in run_pipeline
07:28:37     hc_assert_that(self.result, pickler.loads(on_success_matcher))
07:28:37 AssertionError: 
07:28:37 Expected: (Test pipeline expected terminated in state: RUNNING and Expected 2 messages.)
07:28:37  but: Expected 2 messages. Got 0 messages. Diffs (item, count):
07:28:37   Expected but not in actual: [(PubsubMessage(data001-seen, {'processed': 'IT'}), 1), (PubsubMessage(data002-seen, {'timestamp_out': '2018-07-11T02:02:50.149000Z', 'processed': 'IT'}), 1)]
07:28:37   Unexpected: []
07:28:37   Stripped attributes: ['id', 'timestamp']
07:28:37 
07:28:37  >> begin captured stdout << -
07:28:37 Found: 
https://console.cloud.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-10-05_07_00_20-5526009939236014896?project=apache-beam-testing.
07:28:37 
07:28:37 

[jira] [Commented] (BEAM-5442) PortableRunner swallows custom options for Runner

2018-10-04 Thread Maximilian Michels (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16637961#comment-16637961
 ] 

Maximilian Michels commented on BEAM-5442:
--

Yes, this should go into 2.8.0. [~altay] You can check out the PR if you're 
interested: https://github.com/apache/beam/pull/6557



[jira] [Commented] (BEAM-5442) PortableRunner swallows custom options for Runner

2018-10-03 Thread Thomas Weise (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16637627#comment-16637627
 ] 

Thomas Weise commented on BEAM-5442:


Yes, the PR is open.



[jira] [Commented] (BEAM-5442) PortableRunner swallows custom options for Runner

2018-10-03 Thread Ahmet Altay (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16637617#comment-16637617
 ] 

Ahmet Altay commented on BEAM-5442:
---

What is the plan for this JIRA? Are we targeting 2.8.0 with cut date 10/10?



[jira] [Commented] (BEAM-5442) PortableRunner swallows custom options for Runner

2018-09-21 Thread Robert Bradshaw (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16623368#comment-16623368
 ] 

Robert Bradshaw commented on BEAM-5442:
---

It seems like we should validate the options we know about, but pass all 
options through. (Maybe warn about ones we don't know about?)
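
One way to read this suggestion is sketched below, using only argparse from the standard library. The function name parse_and_forward is hypothetical, --job_endpoint is used here purely as an example of a "known" option, and the sketch handles unknown options only in the --name=value or bare --name form.

```python
import argparse
import warnings


def parse_and_forward(argv):
    """Validate known options; forward every option, warning on unknown ones.

    Hypothetical helper for illustration; not Beam's implementation.
    """
    parser = argparse.ArgumentParser()
    # Example of an option the SDK knows about and can validate up front.
    parser.add_argument("--job_endpoint", default=None)
    known, unknown = parser.parse_known_args(argv)

    # Known options that were actually set are forwarded as parsed.
    forwarded = {k: v for k, v in vars(known).items() if v is not None}
    for arg in unknown:
        if not arg.startswith("--"):
            continue  # "--name value" style is out of scope for this sketch
        name, _, value = arg[2:].partition("=")
        warnings.warn("Forwarding unknown option: --%s" % name)
        # A bare flag without a value is forwarded as True.
        forwarded[name] = value if value else True
    return forwarded


options = parse_and_forward(["--job_endpoint=localhost:8099", "--parallelism=4"])
```

With this approach --parallelism=4 reaches the runner as a string, and the runner or job server remains responsible for interpreting and validating it.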



[jira] [Commented] (BEAM-5442) PortableRunner swallows custom options for Runner

2018-09-20 Thread Maximilian Michels (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622500#comment-16622500
 ] 

Maximilian Michels commented on BEAM-5442:
--

There are some advantages to parsing and validating the options upfront. 
Maybe the two approaches can complement each other: the options can be 
parsed/validated, but unknown options should still be forwarded.



[jira] [Commented] (BEAM-5442) PortableRunner swallows custom options for Runner

2018-09-20 Thread Thomas Weise (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622452#comment-16622452
 ] 

Thomas Weise commented on BEAM-5442:


Yeah, except that it would be nice to solve this generically (not depending on 
FlinkOptions) since other runners will have their own options. Can we not 
simply delegate the responsibility for validating those unknown or 
runner-specific options to the job server?



[jira] [Commented] (BEAM-5442) PortableRunner swallows custom options for Runner

2018-09-20 Thread Maximilian Michels (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622449#comment-16622449
 ] 

Maximilian Michels commented on BEAM-5442:
--

Actually, if you look at the {{get_all_options}} method, you can see that it 
filters out all options that are not part of a {{PipelineOptions}} class. See 
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/options/pipeline_options.py#L212

So we just need to add a FlinkOptions class derived from PipelineOptions to the 
Python SDK.
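
The filtering behaviour described above can be illustrated with a simplified, standard-library-only stand-in. The Options base class and get_all_options function below are not Beam's actual PipelineOptions code; they only mimic the pattern in which each options subclass registers its flags on a shared argparse parser, and FlinkOptions with its --parallelism flag is a hypothetical example.

```python
import argparse


class Options:
    """Simplified stand-in for an options base class (not Beam's own)."""

    @classmethod
    def _add_argparse_args(cls, parser):
        pass


def get_all_options(argv):
    """Return only the options declared by some Options subclass."""
    parser = argparse.ArgumentParser()
    for cls in Options.__subclasses__():
        cls._add_argparse_args(parser)
    # Flags that no subclass declared end up in _unknown and are dropped.
    known, _unknown = parser.parse_known_args(argv)
    return vars(known)


class StandardOptions(Options):
    @classmethod
    def _add_argparse_args(cls, parser):
        parser.add_argument("--runner", default=None)


argv = ["--runner=PortableRunner", "--parallelism=4"]
before = get_all_options(argv)  # --parallelism is silently dropped


class FlinkOptions(Options):
    """Hypothetical runner-specific options class."""

    @classmethod
    def _add_argparse_args(cls, parser):
        parser.add_argument("--parallelism", type=int, default=None)


after = get_all_options(argv)  # --parallelism is now recognized
```

The drawback of this design is that every runner needs its own options class in every SDK before its flags survive.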



[jira] [Commented] (BEAM-5442) PortableRunner swallows custom options for Runner

2018-09-20 Thread Thomas Weise (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622268#comment-16622268
 ] 

Thomas Weise commented on BEAM-5442:


Ideally, all of the options would be passed through; I suppose that was the 
intention:

[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/portable_runner.py#L108]

There are options for the runner, options for the sdk harness, ... that the 
user needs to be able to control. Perhaps we need to scope the runner specific 
options differently?

 

 
