Re: Specifying environment for cross-language transform expansion

2021-07-02 Thread Jan Lukavský
I'm not sure what the conclusion is here. I'd suggest the following:  a) merge the fix for BEAM-12538 ([1]), so that it can be fixed in 2.32.0, because the current state is not good  b) make a decision about whether to send expansion options via ExpansionRequest, or not  c) in either case, we p

Re: Specifying environment for cross-language transform expansion

2021-07-01 Thread Jan Lukavský
This really does not match my experience. Passing the correct "use_deprecated_read" flag to the expansion service had the expected impact on Flink's execution DAG and - most of all - it started to work (or at least it seems so). The UI in Flink also started to reflect that and stopped using SDF (

Re: Specifying environment for cross-language transform expansion

2021-07-01 Thread Jan Lukavský
I don't have a complete understanding of the topic, but from what I have observed, the runner gets a (possibly cross-language) proto description of the pipeline, and the post-processing there might be limited.  That is mainly due to the fact that we have inverted the expansion flow - we expand Rea

Re: Specifying environment for cross-language transform expansion

2021-07-01 Thread Jan Lukavský
Hi, after today's experience I think I have some arguments about why we *should* pass (at least some of) the PipelineOptions from the SDK to the expansion service.  1) there are lots, and lots, and lots of bugs around SDF and around the "use_deprecated_read", sorry, but the switch to SDF as the defa

Re: Specifying environment for cross-language transform expansion

2021-07-01 Thread Jan Lukavský
On 7/1/21 3:26 AM, Kyle Weaver wrote: I think it should accept complete list of PipelineOptions (or at least some defined subset - PortabilityPipelineOptions, ExperimentalOptions, ...?) I'm not totally opposed to redefining some options, either. Using PipelineOptions could be conf

Re: Specifying environment for cross-language transform expansion

2021-06-30 Thread Jan Lukavský
> Not sure why we need the hacks with NoOpRunner As noted earlier (and that was why I started this thread in the first place :)), adding :runners:direct-java as a runtime dependency of the expansion service causes something like 200 tests in pre-commit to fail. Looks like there is some kind of c

Re: Specifying environment for cross-language transform expansion

2021-06-30 Thread Jan Lukavský
> I'm totally fine with changes to ExpansionService (java) to support additional features. Looks like this is the consensus; I'm fine with it as well, for the first round. The problem is how exactly to modify it. I think it should accept a complete list of PipelineOptions (or at least some defined subset

Re: Specifying environment for cross-language transform expansion

2021-06-30 Thread Jan Lukavský
> java -jar beam-sdks-java-io-expansion-service-2.30.0.jar This does not accept any parameters other than the port. That is the first part of this thread - the intent was to enable this to accept additional arguments, but there are unresolved issues (still waiting to be addressed). There curr

Re: Specifying environment for cross-language transform expansion

2021-06-30 Thread Jan Lukavský
On 6/30/21 1:16 AM, Robert Bradshaw wrote: Why doesn't docker in docker just work, rather than having to do ugly hacks when composing two technologies that both rely on docker... Presumably you're setting up a node for Kafka and Flink; why not set one up for the expansion service as well? The UX

Re: Specifying environment for cross-language transform expansion

2021-06-29 Thread Jan Lukavský
I believe we could change that in more or less the same way as we can deprecate / stop supporting any other parameter of any method. If Python starts to support Kafka IO natively, then we can simply log a warning / raise an exception (one after the other). That seems like natural development. Maybe I shou

Re: Specifying environment for cross-language transform expansion

2021-06-29 Thread Jan Lukavský
I would absolutely understand this if it were mostly impossible or at least really hard to get the user-friendly behavior. But we are mostly there in this case. When we can actually quite simply pass the supported environment via a parameter, I think we should go for it. I have created a sk

Re: Specifying environment for cross-language transform expansion

2021-06-29 Thread Jan Lukavský
On 6/29/21 11:04 PM, Robert Bradshaw wrote: You can configure the environment in the current state, you just have to run your own expansion service that has a different environment baked into it (or, makes this configurable). Yes, that is true. On the other hand that lacks some user-friendliness

Re: Specifying environment for cross-language transform expansion

2021-06-29 Thread Jan Lukavský
Thanks for pointing to that thread. 1) I'm - as well as Kyle - fine with the approach that we use a "preferred environment" for the expansion service. We only need to pass it via command line. Yes, the command line might be generally SDK-dependent, and that makes it expansion dependent, becaus

Re: Specifying environment for cross-language transform expansion

2021-06-29 Thread Jan Lukavský
I would slightly disagree that this breaks the black box nature of the expansion: the "how the transform expands" remains unknown to the SDK requesting the expansion, while the "how the transform executes" - on the other hand - is something that the SDK must cooperate on - it knows (or could or shoul

Re: Specifying environment for cross-language transform expansion

2021-06-29 Thread Jan Lukavský
The argument for being able to accept a (possibly ordered) list of execution environments is that this could make a single instance of the execution service reusable by various clients with different requirements. Moreover, the two approaches are probably orthogonal - users could specify 'defaultE

Re: Specifying environment for cross-language transform expansion

2021-06-29 Thread Jan Lukavský
If I understand it correctly, there is currently no place to set the defaultEnvironmentType - Python's KafkaIO uses either the 'expansion_service' given by the user (which might be a host:port, or an object that has the appropriate method), or calls 'default_io_expansion_service' - which in turn runs E

Re: Specifying environment for cross-language transform expansion

2021-06-29 Thread Alexey Romanenko
I’m sorry if I missed something, but do you mean that PortablePipelineOptions.setDefaultEnvironmentType(String) doesn’t work for you? Or is it only a specific case when using portable KafkaIO? > On 29 Jun 2021, at 09:51, Jan Lukavský wrote: > > Hi, > > I have come across an issue with cross-la