Hey all,
Is there any mechanism in place (or available out of the box) for reading
the results of statements executed by the JdbcSink, or would I need to
implement my own?
The problem that I'm trying to solve relates to observability (i.e.
metrics) and incrementing specific counters bas
Is anyone aware of a way to get Prometheus metrics emitted via Beam (i.e.
using them through the Beam abstractions)? I seem to be able to define them
as expected; however, I don’t see them being emitted with the rest of my
Prometheus metrics:
private class RouteEvent(): DoFn<...>() {
private
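A minimal, illustrative sketch of roughly what that counter definition looks like inside the DoFn (the class, namespace, and counter names here are placeholders, not the actual ones from the pipeline):

import org.apache.beam.sdk.metrics.Counter
import org.apache.beam.sdk.metrics.Metrics
import org.apache.beam.sdk.transforms.DoFn
import org.apache.beam.sdk.transforms.DoFn.ProcessElement

// Counters defined via Beam's Metrics API generally surface through the runner's
// own metrics integration rather than through a Prometheus client registry directly.
private class RouteEventSketch : DoFn<String, String>() {
    private val eventsRouted: Counter = Metrics.counter("routing", "events_routed")

    @ProcessElement
    fun processElement(context: ProcessContext) {
        eventsRouted.inc()                 // increments the Beam counter per element
        context.output(context.element())
    }
}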
+user for additional reach
---------- Forwarded message ----------
From: Rion Williams
Date: Mon, Feb 15, 2021 at 7:58 PM
Subject: Defining Custom Labels / Label Support in Beam Metrics
To: dev
Hey all,
I've been working extensively on the observability stories surrounding some
o
sages after it has
> started? We use this approach for some tests against Cloud PubSub.
> Note if using the DirectRunner you need to set the blockOnRun pipeline option
> to False to do this.
>
> Brian
>
>> On Mon, Feb 8, 2021 at 2:10 PM Rion Williams wrote:
>> Hey f
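A rough sketch of the non-blocking run Brian describes above, assuming the DirectRunner's DirectOptions (publishTestMessages is a hypothetical placeholder for however the test data actually gets produced):

import org.apache.beam.runners.direct.DirectOptions
import org.apache.beam.sdk.Pipeline
import org.apache.beam.sdk.options.PipelineOptionsFactory

fun runTestPipeline() {
    // With blockOnRun disabled, pipeline.run() returns immediately on the DirectRunner
    // instead of blocking until the pipeline terminates.
    val options = PipelineOptionsFactory.create().`as`(DirectOptions::class.java)
    options.setBlockOnRun(false)

    val pipeline = Pipeline.create(options)
    // ... build the pipeline against the test source here ...

    val result = pipeline.run()   // returns without blocking
    publishTestMessages()         // hypothetical helper: publish messages after startup
    result.waitUntilFinish()
}

// Placeholder for whatever publishes test messages to the source (e.g. Cloud PubSub).
fun publishTestMessages() { /* ... */ }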
Hey folks,
I’ve been working on fleshing out a proof-of-concept pipeline that deals with
some out-of-order data (i.e. mismatched processing times / event times) and
does quite a bit of windowing depending on the data. Most of the work I’ve done
in a lot of streaming systems relies heavily on w
> watermark instead of having a watermark over these 3 partitions. Do I
> understand this correctly?
>
> If so, I think you need separate pipelines. If you only want to know
> which records come from which partitions, ReadFromKafkaDoFn emits a KV pair
> where the KafkaSourc
hat generates KafkaSourceDescriptor)
> .apply(ParDo.of(ReadFromKafkaDoFn))
> .apply(other parts)
>
>> On Mon, Feb 1, 2021 at 8:06 AM Rion Williams wrote:
>> Hey all,
>>
>> I'm currently in a situation where I have a single Kafka topic with data
>> acro
Hey all,
I'm currently in a situation where I have a single Kafka topic with data
across multiple partitions, covering data from multiple sources. I'm
trying to see if there's a way that I'd be able to read from
these different sources as different pipelines and if a Splittable DoFn
Hey folks,
I’ve been mulling over how to solve a given problem in Beam and thought I’d
reach out to a larger audience for some advice. At present, things seem to
be working only sparsely, and I was curious whether someone could act as a
sounding board to see if this workflow makes sense.
The primary high-l
+user
Begin forwarded message:
> From: Rion Williams
> Date: January 12, 2021 at 4:09:34 PM CST
> To: d...@beam.apache.org
> Subject: Accessing Custom Beam Metrics in Dataproc
> Reply-To: d...@beam.apache.org
>
> Hi all,
>
> I'm currently in the process of ad
Hi Teodor,
Although I’m sure you’ve come across it, this might have some valuable
resources or methodologies to consider as you explore this a bit more:
https://arxiv.org/pdf/1907.08302.pdf
I’m looking forward to reading about your findings, especially with a more
recent iteration of Beam!
Ri
+user for visibility
Begin forwarded message:
> From: Rion Williams
> Date: October 27, 2020 at 12:28:37 PM CDT
> To: d...@beam.apache.org
> Subject: Transform Logging Issues with Spark/Dataproc in GCP
>
> Hi all,
>
> Recently, I deployed a very simple Apache B
Hi Wesley,
In essence, yes, that’s what they did.
The three authors of Streaming Systems are folks who work on Google’s Dataflow
project, which is, for all intents and purposes, an implementation of the Beam
Model. Two of them are members of the Beam PMC (essentially a steering
comm
Hi Wesley,
I considered that one as well but was in the same boat in terms of not pulling
the trigger (lack of reviews, price point, etc.). I eventually landed on
Streaming Systems, which I highly, highly recommend if you want to learn more
about the Beam model:
- http://streamingsystems.net/
process, as
well as Pablo Estrada for helping this thing get merged in.
Thanks everyone and feel free to reach out if you run into any issues with the
course!
Rion Williams
.apache.org/contribute/
> 2: https://s.apache.org/beam-starter-tasks
>
>
>> On Sun, May 17, 2020 at 8:15 AM Rion Williams wrote:
>> Hi all,
>>
>> As I’ve been digging into Beam more, I’d love to have a chance to contribute
>> to the project (eithe
Hi all,
As I’ve been digging into Beam more, I’d love to have a chance to contribute to
the project (either through the existing ones or by assigning new ones to
myself).
I already have an existing ASF account (rionmonster), but it would be great to
have access to the Beam project itself.
Tha
+1 on the contributions front. My team and I have been working with Beam
primarily in Kotlin; I recently added the appropriate dependencies to
Gradle, performed a bit of conversion, and have it working as expected
against the existing Java course (a rough Gradle sketch follows below).
I don’t know how many others are active
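For anyone curious, the Gradle side of that looks roughly like the following (a sketch in the Kotlin DSL; version numbers are illustrative placeholders rather than the exact ones used for the course conversion):

// build.gradle.kts
plugins {
    kotlin("jvm") version "1.4.30"
}

repositories {
    mavenCentral()
}

dependencies {
    implementation(kotlin("stdlib"))
    // Core Beam SDK plus a runner to execute against (the direct runner for local runs).
    implementation("org.apache.beam:beam-sdks-java-core:2.28.0")
    runtimeOnly("org.apache.beam:beam-runners-direct-java:2.28.0")
}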
I've been exploring the pattern that Luke's thread refers to (as I'm the one
who asked the question in that thread); however, I've been briefly pulled away
from implementing it. Just to chime in, another pattern that
I've seen recommended quite frequently when dealing with enric
> I have a lot of work stuff going on so I'll try to provide a response but it
> might take days. Also, if you find an answer to one of your questions or have
> additional questions while you investigate, feel free to update this thread.
>
>> On Mon, May 4, 2020 at 2:5
g the stateful DoFn would mean that you would only be
> retrieving/writing as much data as is ever associated with the key that
> you use, and would have good parallelism if you have a lot of keys.
>
> 1: https://beam.apache.org/documentation/patterns/side-inputs/
> 2:
> https://lists.ap
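To make the stateful DoFn suggestion above a bit more concrete, here is a minimal illustrative sketch (my own example, not code from the thread) in which each element only reads and writes the state associated with its own key:

import org.apache.beam.sdk.coders.StringUtf8Coder
import org.apache.beam.sdk.state.StateSpec
import org.apache.beam.sdk.state.StateSpecs
import org.apache.beam.sdk.state.ValueState
import org.apache.beam.sdk.transforms.DoFn
import org.apache.beam.sdk.transforms.DoFn.ProcessElement
import org.apache.beam.sdk.transforms.DoFn.StateId
import org.apache.beam.sdk.values.KV

// State declared with @StateId is partitioned by key (and window), so a given
// element only touches the state for its own key, and distinct keys run in parallel.
private class EnrichPerKey : DoFn<KV<String, String>, KV<String, String>>() {

    @StateId("lastSeen")
    private val lastSeenSpec: StateSpec<ValueState<String>> =
        StateSpecs.value(StringUtf8Coder.of())

    @ProcessElement
    fun processElement(
        context: ProcessContext,
        @StateId("lastSeen") lastSeen: ValueState<String>
    ) {
        val previous = lastSeen.read()            // null the first time this key is seen
        lastSeen.write(context.element().value)   // writes only this key's state
        context.output(KV.of(context.element().key, previous ?: context.element().value))
    }
}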
sts.apache.org/thread.html/r2272040a06457cfdb867832a61f2933d1a3ba832057cffda89ee248a%40%3Cuser.beam.apache.org%3E
> ?
>
>
> On Tue, Apr 28, 2020 at 9:26 AM Rion Williams wrote:
>
> > Hi all,
> >
> > I'm trying to implement a process and I'm not quite sure what the best
> > approac
Hi all,
I'm in the process of migrating an existing pipeline over from the Kafka
Streams framework to Apache Beam and I'm reaching out in hopes of finding a
pattern that could address the use-case I'm currently dealing with.
In the most trivial sense, the scenario just involves an incoming mess
Hi all,
I'm trying to implement a process and I'm not quite sure what the best approach
to implementing it efficiently might be while taking advantage of Beam's
parallelism and recommended patterns. Basically, the problem can be
summarized as follows:
I have a series of incoming events whic
Hi all,
I posed this question over on the Apache Slack community but didn't get
much of a response, so I thought I'd reach out here. I've been looking for some
additional resources, specifically books, surrounding Apache Beam, the Beam
Model, etc., and was wondering if anyone had any recommen
use only own wrapper.
> Also, the user won’t need to convert types between Read and Write (as in this
> topic's case).
>
>> On 17 Apr 2020, at 19:28, Rion Williams wrote:
>>
>> Hi Alexey,
>>
>> So this is currently the approach that I'm taking. Bas
will help, but KafkaIO allows you to keep all of the meta information
> while reading (using KafkaRecord) and writing (using ProducerRecord).
> So, you can keep your tracing id in the record headers, as you did with Kafka
> Streams.
>
> > On 17 Apr 2020, at 18:58, Rion Williams wrote:
>
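As a rough illustration of the KafkaRecord approach described above (broker, topic, and header names are placeholders), pulling a tracing id back out of the record headers might look something like this:

import org.apache.beam.sdk.Pipeline
import org.apache.beam.sdk.io.kafka.KafkaIO
import org.apache.beam.sdk.io.kafka.KafkaRecord
import org.apache.beam.sdk.transforms.DoFn
import org.apache.beam.sdk.transforms.DoFn.ProcessElement
import org.apache.beam.sdk.transforms.ParDo
import org.apache.kafka.common.serialization.StringDeserializer

fun buildTracedRead(pipeline: Pipeline) {
    // Reading without .withoutMetadata() yields KafkaRecords, which retain the headers.
    val records = pipeline.apply(
        KafkaIO.read<String, String>()
            .withBootstrapServers("broker:9092")   // placeholder broker
            .withTopic("events")                   // placeholder topic
            .withKeyDeserializer(StringDeserializer::class.java)
            .withValueDeserializer(StringDeserializer::class.java)
    )

    records.apply(ParDo.of(object : DoFn<KafkaRecord<String, String>, String>() {
        @ProcessElement
        fun processElement(context: ProcessContext) {
            val record = context.element()
            // Read the tracing id the producer attached as a header (name is illustrative).
            val traceId = record.headers
                ?.lastHeader("trace-id")
                ?.let { String(it.value()) }
            context.output("${traceId ?: "no-trace"}: ${record.kv.value}")
        }
    }))
}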
e how your elements go to the pipeline or do you want to
> see how every ParDo interacts with external systems?
>
> On Fri, Apr 17, 2020, 17:38 Rion Williams wrote:
>
> > Hi all,
> >
> > I'm reaching out today to inquire if Apache Beam has any support or
>
Hi all,
I'm reaching out today to inquire if Apache Beam has any support or mechanisms
for some type of distributed tracing, similar to something like Jaeger
(https://www.jaegertracing.io/). Jaeger itself has a Java SDK; however, due to
the nature of Beam working with transforms that yield