After the first value for a triggered side input, that side input is always
"ready" for the particular window, so there is no longer any
synchronization with the main input. You probably want to have tests with
the updated and the old value. I would expect that your steps 1-4 would
work with TestStream.
As far as I know, that behavior is not specified. I do not think that
Dataflow streaming supports this sort of updating to side inputs, though
I've added Slava who might have more to add.
If updating side inputs is really not supported in Dataflow, you may be
able to use a LoadingCache, like so:
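To illustrate the idea behind that suggestion: Guava's LoadingCache reloads entries on read once they are stale, so each worker can look schemas up directly instead of waiting on side-input updates. A minimal, stdlib-only sketch of the same refresh-on-read pattern (the loader function, e.g. a BQ schema lookup, is hypothetical):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch of the refresh-on-read idea behind Guava's LoadingCache:
// a DoFn instance keeps a local cache and reloads an entry once it is
// older than the TTL, instead of relying on side-input re-emission.
public class RefreshingCache<K, V> {
  private static class Entry<V> {
    final V value;
    final long loadedAtNanos;
    Entry(V value, long loadedAtNanos) {
      this.value = value;
      this.loadedAtNanos = loadedAtNanos;
    }
  }

  private final Map<K, Entry<V>> cache = new ConcurrentHashMap<>();
  private final Function<K, V> loader;  // hypothetical, e.g. a BQ schema fetch
  private final long ttlNanos;

  public RefreshingCache(Function<K, V> loader, long ttlNanos) {
    this.loader = loader;
    this.ttlNanos = ttlNanos;
  }

  public V get(K key) {
    long now = System.nanoTime();
    // Reload when the entry is missing or older than the TTL.
    Entry<V> e = cache.compute(key, (k, old) ->
        (old == null || now - old.loadedAtNanos > ttlNanos)
            ? new Entry<>(loader.apply(k), now)
            : old);
    return e.value;
  }
}
```

With Guava on the classpath, `CacheBuilder.newBuilder().refreshAfterWrite(...)` with a `CacheLoader` gives the same behavior with less code; the sketch above only shows the mechanism.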
Hi Lukasz, many thanks for your responses.
I'm actually using them, but I don't think I'm able to synchronise the
following steps:
1: The side input gets its first value (v1)
2: The main stream gets that side input applied and finds that v1 value
3: The side one gets updated (v2)
4: The main st
Your best bet is to use TestStream [1], as it is used to validate
window/triggering behavior. Note that the transform requires special
runner-based execution and currently only works with the DirectRunner. All
examples are marked with the JUnit category "UsesTestStream", for example
[2].
1:
https:/
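As a sketch of what such a test could look like (window size, element values, and timestamps are made up for illustration), using TestStream on the DirectRunner:

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.coders.StringUtf8Coder;
import org.apache.beam.sdk.testing.PAssert;
import org.apache.beam.sdk.testing.TestStream;
import org.apache.beam.sdk.transforms.Count;
import org.apache.beam.sdk.transforms.windowing.FixedWindows;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.TimestampedValue;
import org.joda.time.Duration;
import org.joda.time.Instant;

public class TestStreamSketch {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create();  // DirectRunner, which supports TestStream

    // Emit v1, advance the watermark past the first 1-minute window,
    // then emit v2 into the following window.
    TestStream<String> events = TestStream.create(StringUtf8Coder.of())
        .addElements(TimestampedValue.of("v1", new Instant(0L)))
        .advanceWatermarkTo(new Instant(60_000L))
        .addElements(TimestampedValue.of("v2", new Instant(61_000L)))
        .advanceWatermarkToInfinity();

    PAssert.that(
            p.apply(events)
             .apply(Window.<String>into(FixedWindows.of(Duration.standardMinutes(1))))
             .apply(Count.perElement()))
        .containsInAnyOrder(KV.of("v1", 1L), KV.of("v2", 1L));

    p.run().waitUntilFinish();
  }
}
```

The explicit `advanceWatermarkTo` calls are what let the test control the ordering of "side input gets v1, main input fires, side input gets v2" deterministically.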
Hi all!!
Basically, that's what I'm trying to do. I'm building a pipeline that has a
refreshing multimap side input (BQ schemas) that I then apply to the main
stream of data (records that are ultimately saved to the corresponding BQ
table).
My job, although streaming in nature, runs on the
This has been a long-requested feature within Apache Beam:
https://issues.apache.org/jira/browse/BEAM-106
The short story is that this requires a lot of support from execution
engines since watermarks become a concept of time + loop iteration
(possibly multiple loop iterations if they are nested).
Hi Sahayaraj,
Yes, there is a module “beam-sdks-java-io-amazon-web-services” which can help
with this.
Also, I’d suggest you take a look at this example, which reads data from an
S3 bucket:
https://github.com/jbonofre/beam-samples/blob/master/amazon-web-services/src/main/java/org/apache/beam/sa
Beam works with S3 out of the box: you can provide s3://... paths to
anything that works with files. I don't remember if it's already available
in 2.4 or only starting with 2.5. Does it currently not work for you?
On Tue, May 29, 2018, 2:26 PM S. Sahayaraj wrote:
> Hello All,
>
> The d
Hello All,
The data source for our Beam pipeline is in an S3 bucket. Is
there any built-in I/O connector available with Java samples? If so, can you
please guide me on how to integrate with them?
I am using the Beam SDK for Java version 2.4.0 and the Spark runner in
clustere
Hi Jean-Baptiste,
Yes, I want to process data with the timestamp available in each record
(event time).
For the use case, I have a dataset (JSON) with a timestamp in the past
available inside each record. This dataset is pushed to a Kafka topic.
With Beam, I want to detect some records
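For the event-time part, the timestamp embedded in each record has to be extracted and attached to the element, for example via KafkaIO's timestamp policy or `outputWithTimestamp` in a DoFn. A stdlib-only sketch of just the extraction step, assuming a JSON field named `"ts"` in ISO-8601 form (both the field name and format are made-up details; a real pipeline would use a proper JSON parser):

```java
import java.time.Instant;

// Sketch: pull an ISO-8601 timestamp out of a JSON record so it can be
// attached as the element's event time (e.g. with outputWithTimestamp,
// or inside a KafkaIO TimestampPolicy). The "ts" field is hypothetical.
public class EventTimeExtractor {
  public static Instant extractEventTime(String jsonRecord) {
    String marker = "\"ts\":\"";
    int start = jsonRecord.indexOf(marker);
    if (start < 0) {
      throw new IllegalArgumentException("record has no ts field: " + jsonRecord);
    }
    start += marker.length();
    int end = jsonRecord.indexOf('"', start);
    return Instant.parse(jsonRecord.substring(start, end));
  }
}
```

Note that if the embedded timestamps are far in the past relative to processing time, the watermark configuration on the Kafka source matters, or late records will be dropped.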