I'm not sure what the conclusion here is. I'd suggest the following:
a) merge the fix for BEAM-12538 ([1]), so that it can be fixed in 2.32.0,
because the current state is not good
b) make a decision about whether or not to send expansion options via
ExpansionRequest
c) in either case, we p…
This really does not match my experience. Passing the correct
"use_deprecated_read" flag to the expansion service had the expected
impact on Flink's execution DAG and, most of all, it started to
work (at least it seems so). The Flink UI also started to reflect that
and stopped using SDF (…
I don't have a complete understanding of the topic, but from what I have
observed, the runner gets a (possibly cross-language) proto description of
the pipeline, and the post-processing there might be limited. That is
mainly due to the fact that we have inverted the expansion flow; we
expand Rea…
Hi,
after today's experience I think I have some arguments about why we
*should* pass (at least some of) the PipelineOptions from the SDK to the
expansion service.
1) there are lots, and lots, and lots of bugs around SDF and around
"use_deprecated_read"; sorry, but the switch to SDF as the defa…
On 7/1/21 3:26 AM, Kyle Weaver wrote:
I think it should accept a complete list of PipelineOptions (or at
least some defined subset - PortabilityPipelineOptions,
ExperimentalOptions, ...?)
I'm not totally opposed to redefining some options, either. Using
PipelineOptions could be conf…
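The "defined subset" idea above can be sketched in plain Python: forward only a whitelisted set of option keys to the expansion service and keep runner-side options out. This is an illustrative sketch, not the actual Beam code; the option names and the whitelist are assumptions standing in for PortabilityPipelineOptions / ExperimentalOptions fields.

```python
# Hypothetical whitelist of options that are safe to forward to the
# expansion service; the names are illustrative, not real Beam identifiers.
FORWARDABLE_OPTIONS = {
    "environment_type",    # would come from PortabilityPipelineOptions
    "environment_config",
    "experiments",         # would come from ExperimentalOptions
}

def options_for_expansion(all_options):
    """Keep only the options the expansion service should receive."""
    return {k: v for k, v in all_options.items() if k in FORWARDABLE_OPTIONS}

sdk_options = {
    "runner": "FlinkRunner",                  # runner-side, not forwarded
    "environment_type": "DOCKER",
    "experiments": ["use_deprecated_read"],
    "job_name": "my-job",                     # runner-side, not forwarded
}

print(options_for_expansion(sdk_options))
```

The point of the whitelist is exactly the concern raised above: a full PipelineOptions dump would leak runner-specific settings into the expansion, while a defined subset keeps the contract explicit.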
> Not sure why we need the hacks with NoOpRunner
As noted earlier (and that was why I started this thread in the first
place :)), adding :runners:direct-java as a runtime dependency of the
expansion service causes something like 200 tests in pre-commit to fail.
Looks like there is some kind of c…
> I'm totally fine with changes to ExpansionService (java) to support
additional features.
Looks like this is the consensus; I'm with it as well, for the first round.
The problem is how exactly to modify it. I think it should accept a
complete list of PipelineOptions (or at least some defined subset…
> java -jar beam-sdks-java-io-expansion-service-2.30.0.jar
This does not accept any parameters other than the port. That is the
first part of this thread - the intent was to enable this to accept
additional arguments, but there are unresolved issues (still waiting
to be addressed). There curr…
On 6/30/21 1:16 AM, Robert Bradshaw wrote:
Why doesn't docker in docker just work, rather than having to do
ugly hacks when composing two technologies that both rely on
docker...
Presumably you're setting up a node for Kafka and Flink; why not set
one up for the expansion service as well? The UX…
I believe we could change that more or less the same way we can deprecate
/ stop supporting any other parameter of any method. If Python starts to
natively support Kafka IO, then we can simply log a warning / raise an
exception (one after the other). That seems like natural development.
Maybe I shou…
I would absolutely understand this if it were mostly impossible, or
at least really hard, to get the user-friendly behavior. But we are
mostly there in this case. When we can actually quite simply pass the
supported environment via a parameter, I think we should go for it.
I have created a sk…
On 6/29/21 11:04 PM, Robert Bradshaw wrote:
You can configure the environment in the current state, you just have
to run your own expansion service that has a different environment
baked into it (or makes this configurable).
Yes, that is true. On the other hand, that lacks some user-friendliness…
Thanks for pointing to that thread.
1) I'm - as well as Kyle - fine with the approach that we use a
"preferred environment" for the expansion service. We only need to pass
it via the command line. Yes, the command line might be generally
SDK-dependent, and that makes it expansion-dependent, becaus…
I would slightly disagree that this breaks the black-box nature of the
expansion; the "how the transform expands" remains unknown to the SDK
requesting the expansion, while the "how the transform executes", on the
other hand, is something that the SDK must cooperate on - it knows (or
could or shoul…
The argument for being able to accept a (possibly ordered) list of
execution environments is that this could make a single instance of the
expansion service reusable by various clients with different
requirements. Moreover, the two approaches are probably orthogonal -
users could specify 'defaultE…
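The "ordered list of environments" idea reduces to a simple negotiation: the client sends its preferences in order, and the service picks the first one it supports. A minimal sketch, with illustrative environment names:

```python
def choose_environment(preferred, supported):
    """Pick the first environment from the client's ordered preference
    list that this expansion-service instance supports (sketch only)."""
    for env in preferred:
        if env in supported:
            return env
    raise ValueError("no mutually supported environment")

# One service instance serving two clients with different requirements:
service_supports = {"DOCKER", "PROCESS"}
print(choose_environment(["EXTERNAL", "PROCESS", "DOCKER"], service_supports))
print(choose_environment(["DOCKER"], service_supports))
```

This is what makes a single service reusable: the per-client preference replaces a single hard-coded default, while a 'defaultEnvironment'-style setting could still act as the fallback when the client sends no list.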
If I understand it correctly, there is currently no place to set the
defaultEnvironmentType - Python's KafkaIO uses either the
'expansion_service' given by the user (which might be a host:port, or an
object that has an appropriate method), or calls
'default_io_expansion_service' - which in turn runs E…
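The resolution order described above can be mimicked in plain Python. This is a duck-typed stand-in for illustration, not the actual apache_beam code; the returned dicts and the `Expand` method name are assumptions standing in for the real service handles.

```python
def default_io_expansion_service():
    # Stand-in for Beam's default_io_expansion_service(), which starts a
    # bundled expansion-service JAR; here we just return a marker dict.
    return {"kind": "default", "target": "bundled expansion-service jar"}

def resolve_expansion_service(expansion_service=None):
    """Mimics the described lookup: a user-supplied host:port string, an
    object exposing the expansion API, or the default service."""
    if expansion_service is None:
        return default_io_expansion_service()
    if isinstance(expansion_service, str):    # treat as host:port address
        return {"kind": "address", "target": expansion_service}
    if hasattr(expansion_service, "Expand"):  # object with a suitable method
        return {"kind": "object", "target": expansion_service}
    raise TypeError("unsupported expansion_service value")

print(resolve_expansion_service("localhost:8097")["kind"])
```

Note that in this flow there is indeed no hook for a 'defaultEnvironmentType': whichever branch is taken, the environment is whatever the chosen service was built with.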
I’m sorry if I missed something, but do you mean that
PortablePipelineOptions.setDefaultEnvironmentType(String) doesn’t work for you?
Or is it only a specific case when using portable KafkaIO?
> On 29 Jun 2021, at 09:51, Jan Lukavský wrote:
>
> Hi,
>
> I have come across an issue with cross-la…