Ben - yes, there is still some ambiguity regarding the querying of the
metrics results.
You've discussed in this thread the notion that metrics step names should
be converted to unique names when aggregating metrics, so that each step
will aggregate its own metrics, and not join with other steps.
Fantastic!
Let me take a look at the Spark runner ;)
Thanks!
Regards
JB
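An illustrative sketch of that uniquification idea (not any runner's
actual code; the class name and numeric-suffix scheme are invented):

import java.util.HashMap;
import java.util.Map;

// Give repeated step names unique suffixes so each step aggregates its
// own metrics instead of joining with others.
class StepNameUniquifier {
  private final Map<String, Integer> seen = new HashMap<>();

  String uniquify(String stepName) {
    int n = seen.merge(stepName, 1, Integer::sum);
    return n == 1 ? stepName : stepName + n;
  }
}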
On 01/26/2017 03:34 PM, Kenneth Knowles wrote:
Hi Etienne,
I was drafting a proposal about @OnWindowExpiration when this email
arrived. I thought I would try to quickly unblock you by responding with a
TL;DR: you can
The default timestamp should be BoundedWindow.TIMESTAMP_MIN_VALUE, which is
equivalent to -2**63 microseconds. We also occasionally refer to this
timestamp as "negative infinity".
The default watermark policy for a bounded source should be negative
infinity until all of the data is read, then
Wonderful !
Thanks Kenn !
Etienne
Le 26/01/2017 à 15:34, Kenneth Knowles a écrit :
Hi Etienne,
I was drafting a proposal about @OnWindowExpiration when this email
arrived. I thought I would try to quickly unblock you by responding with a
TL;DR: you can achieve your goals with state & timers
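For reference, a minimal sketch of what batching via state & timers could
look like (illustrative only: the class name and batch size are made up,
the input must be keyed, and import paths follow the current Java SDK
layout):

import com.google.common.collect.ImmutableList;
import org.apache.beam.sdk.coders.StringUtf8Coder;
import org.apache.beam.sdk.coders.VarLongCoder;
import org.apache.beam.sdk.state.BagState;
import org.apache.beam.sdk.state.StateSpec;
import org.apache.beam.sdk.state.StateSpecs;
import org.apache.beam.sdk.state.ValueState;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.values.KV;

// Buffer elements per key and window; emit a batch once BATCH_SIZE
// elements have accumulated.
class BatchingFn extends DoFn<KV<String, String>, Iterable<String>> {
  private static final long BATCH_SIZE = 1000;

  @StateId("buffer")
  private final StateSpec<BagState<String>> bufferSpec =
      StateSpecs.bag(StringUtf8Coder.of());

  @StateId("count")
  private final StateSpec<ValueState<Long>> countSpec =
      StateSpecs.value(VarLongCoder.of());

  @ProcessElement
  public void process(
      ProcessContext context,
      @StateId("buffer") BagState<String> buffer,
      @StateId("count") ValueState<Long> count) {
    Long stored = count.read();
    long n = (stored == null ? 0 : stored) + 1;
    buffer.add(context.element().getValue());
    if (n >= BATCH_SIZE) {
      context.output(ImmutableList.copyOf(buffer.read()));
      buffer.clear();
      count.write(0L);
    } else {
      count.write(n);
    }
  }
}

A complete version would also set a timer (per the @OnWindowExpiration
discussion above) to flush a partial batch when the window closes.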
Hi Etienne,
Could you post some snippets of how your transform is to be used in a
pipeline? I think that would make it easier to discuss on this thread and
could save a lot of churn if the discussion ends up leading to a different
API.
On Thu, Jan 26, 2017 at 8:29 AM Etienne Chauchot
First off, let me say that a *correctly* batching DoFn is a lot of
value, especially because it's (too) easy to (often unknowingly)
implement it incorrectly.
My take is that a BatchingParDo should be a PTransform that takes a
DoFn<InputT, ? extends Iterable<OutputT>> as a parameter, as
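Roughly this shape, as an illustration (all names hypothetical, the body
elided):

import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.PTransform;
import org.apache.beam.sdk.values.PCollection;

// Hypothetical API shape only, per the proposal above.
class BatchingParDo<InputT, OutputT>
    extends PTransform<PCollection<InputT>, PCollection<OutputT>> {
  private final long batchSize;
  private final DoFn<InputT, ? extends Iterable<OutputT>> batchFn;

  BatchingParDo(long batchSize,
                DoFn<InputT, ? extends Iterable<OutputT>> batchFn) {
    this.batchSize = batchSize;
    this.batchFn = batchFn;
  }

  @Override
  public PCollection<OutputT> expand(PCollection<InputT> input) {
    // A real implementation would invoke batchFn and flatten its
    // Iterable outputs, batching batchSize elements at a time; elided.
    throw new UnsupportedOperationException("sketch only");
  }
}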
I think relaxing the query to not be an exact match is reasonable. I'm
wondering if it should be substring or regex. Either one preserves the
existing behavior of returning only the metrics for a specific step when
passed its full path, but adds the ability to just know
approximately
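As an illustration of the query being discussed (a sketch, not committed
API; "pipeline" and the step name are assumed, and accessor names vary
across SDK versions):

import org.apache.beam.sdk.PipelineResult;
import org.apache.beam.sdk.metrics.MetricQueryResults;
import org.apache.beam.sdk.metrics.MetricResult;
import org.apache.beam.sdk.metrics.MetricsFilter;

// With relaxed matching, addStep("MyStep") would match any step whose
// full path contains (or, with regex, matches) "MyStep".
PipelineResult result = pipeline.run();
MetricQueryResults metrics =
    result.metrics().queryMetrics(
        MetricsFilter.builder().addStep("MyStep").build());
for (MetricResult<Long> counter : metrics.counters()) {
  System.out.println(counter.step() + ": " + counter.attempted());
}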
Agree, I'm curious as well.
I guess it would be something like:
.apply(ParDo.of(new DoFn<InputT, OutputT>() {
  @Override
  public long batchSize() {
    return 1000;
  }

  @ProcessElement
  public void processElement(ProcessContext context) {
    ...
  }
}));
If batchSize (overridden by the user) returns a
On Thu, Jan 26, 2017 at 9:48 AM, Thomas Groh
wrote:
>
> The default watermark policy for a bounded source should be negative
> infinity until all of the data is read, then positive infinity.
Just to elaborate - there isn't a way for a bounded source to communicate a
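In other words (an illustrative sketch; the method is made up, but the
constants are the SDK's):

import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
import org.joda.time.Instant;

// The watermark a runner could report for a bounded source, per the
// behavior described above.
Instant boundedSourceWatermark(boolean allDataRead) {
  return allDataRead
      ? BoundedWindow.TIMESTAMP_MAX_VALUE   // "positive infinity"
      : BoundedWindow.TIMESTAMP_MIN_VALUE;  // "negative infinity"
}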
On Thu, Jan 26, 2017 at 4:15 PM, Robert Bradshaw <
rober...@google.com.invalid> wrote:
> On Thu, Jan 26, 2017 at 3:42 PM, Eugene Kirpichov
> wrote:
> >
> > you can't wrap DoFn's, period
>
> As a simple example, given a DoFn it's perfectly natural to want
> to
Congratulations and thank you for your contributions thus far!
On Thu, Jan 26, 2017 at 6:00 PM, Robert Bradshaw <
rober...@google.com.invalid> wrote:
> Welcome and congratulations!
>
> On Thu, Jan 26, 2017 at 5:05 PM, Sourabh Bajaj
> wrote:
> > Congrats!!
> >
>
Le 25/01/2017 à 20:34, Kenneth Knowles a écrit :
There's actually not a JIRA filed beyond BEAM-25 for what Eugene is
referring to. Context: Prior to windowing and streaming it was safe to
buffer elements in @ProcessElement and then actually perform output in
@FinishBundle. This pattern is only
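The pattern in question looks roughly like this (illustrative, written
against the Java SDK of that era):

import java.util.ArrayList;
import java.util.List;
import org.apache.beam.sdk.transforms.DoFn;

// Buffer in @ProcessElement, output in @FinishBundle. Once windowing is
// involved this is unsafe: bundles are not aligned with windows, so the
// buffered elements' windows and timestamps are lost.
class BufferInBundleFn extends DoFn<String, List<String>> {
  private transient List<String> buffer;

  @StartBundle
  public void startBundle(Context c) {
    buffer = new ArrayList<>();
  }

  @ProcessElement
  public void processElement(ProcessContext c) {
    buffer.add(c.element());
  }

  @FinishBundle
  public void finishBundle(Context c) {
    c.output(buffer);  // no well-defined window/timestamp for this output
  }
}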
It makes sense.
Agreed.
Regards
JB
On 01/25/2017 08:34 PM, Kenneth Knowles wrote:
There's actually not a JIRA filed beyond BEAM-25 for what Eugene is
referring to. Context: Prior to windowing and streaming it was safe to
buffer elements in @ProcessElement and then actually perform output in
+1 to what Dan said
On Wed, 25 Jan 2017 at 21:40 Kenneth Knowles wrote:
> +1
>
> On Jan 25, 2017 11:15, "Jean-Baptiste Onofré" wrote:
>
> > +1
> >
> > It sounds good to me.
> >
> > Thanks Dan!
> >
> > Regards
> > JB
> >
> > On Jan 25, 2017, 19:39,
I don't think we should make batching a core feature of the Beam
programming model (by adding it to DoFn as this code snippet implies). I'm
reasonably sure there are less invasive ways of implementing it.
On Thu, Jan 26, 2017 at 12:22 PM Jean-Baptiste Onofré
wrote:
> Agree,
Congrats!
On Thu, Jan 26, 2017 at 7:49 PM, María García Herrero <
mari...@google.com.invalid> wrote:
> Congratulations and thank you for your contributions thus far!
>
> On Thu, Jan 26, 2017 at 6:00 PM, Robert Bradshaw <
> rober...@google.com.invalid> wrote:
>
> > Welcome and congratulations!
>
Congrats!
On Fri, Jan 27, 2017, 06:25 Thomas Weise wrote:
> Congrats!
>
>
> On Thu, Jan 26, 2017 at 7:49 PM, María García Herrero <
> mari...@google.com.invalid> wrote:
>
> > Congratulations and thank you for your contributions thus far!
> >
> > On Thu, Jan 26, 2017 at 6:00 PM,
Welcome aboard!
Regards
JB
On Jan 27, 2017, 01:27, Davor Bonaci wrote:
>Please join me and the rest of Beam PMC in welcoming the following
>contributors as our newest committers. They have significantly
>contributed
>to the project in different ways, and we look
Hi Eugene
A simple way would be to create a BatchedDoFn in an extension.
WDYT?
Regards
JB
On Jan 26, 2017, 21:48, Eugene Kirpichov
wrote:
>I don't think we should make batching a core feature of the Beam
>programming model (by adding it to DoFn as
Congrats! Well deserved Stas
On Jan 27, 2017 at 7:26, "Frances Perry" wrote:
> Woohoo! Congrats ;-)
>
> On Thu, Jan 26, 2017 at 9:05 PM, Jean-Baptiste Onofré
> wrote:
>
> > Welcome aboard!
> >
> > Regards
> > JB
> >
> > On Jan 27, 2017, 01:27, at
On Thu, Jan 26, 2017 at 6:58 PM, Kenneth Knowles
wrote:
> On Thu, Jan 26, 2017 at 4:15 PM, Robert Bradshaw <
> rober...@google.com.invalid> wrote:
>
>> On Thu, Jan 26, 2017 at 3:42 PM, Eugene Kirpichov
>> wrote:
>> >
>> > you can't wrap
Congrats all! Very exciting. :)
On Thu, Jan 26, 2017 at 4:48 PM, Jesse Anderson
wrote:
> Welcome!
>
> On Thu, Jan 26, 2017, 7:27 PM Davor Bonaci wrote:
>
> > Please join me and the rest of Beam PMC in welcoming the following
> > contributors as our
Congrats!!
On Thu, Jan 26, 2017 at 5:02 PM Jason Kuster
wrote:
> Congrats all! Very exciting. :)
>
> On Thu, Jan 26, 2017 at 4:48 PM, Jesse Anderson
> wrote:
>
> > Welcome!
> >
> > On Thu, Jan 26, 2017, 7:27 PM Davor Bonaci
On Thu, Jan 26, 2017 at 4:20 PM, Ben Chambers
wrote:
> Here's an example API that would make this part of a DoFn. The idea here is
> that it would still be run as `ParDo.of(new MyBatchedDoFn())`, but the
> runner (and DoFnRunner) could see that it has asked for
Welcome and congratulations!
On Thu, Jan 26, 2017 at 5:05 PM, Sourabh Bajaj
wrote:
> Congrats!!
>
> On Thu, Jan 26, 2017 at 5:02 PM Jason Kuster
> wrote:
>
>> Congrats all! Very exciting. :)
>>
>> On Thu, Jan 26, 2017 at 4:48 PM,
Welcome!
On Thu, Jan 26, 2017, 7:27 PM Davor Bonaci wrote:
> Please join me and the rest of Beam PMC in welcoming the following
> contributors as our newest committers. They have significantly contributed
> to the project in different ways, and we look forward to many more
>
I think that wrapping the DoFn is tricky -- we backed out
IntraBundleParallelization because it did that, and it has weird
interactions with both the reflective DoFn and windowing. We could maybe
make some kind of "DoFnDelegatingDoFn" that could act as a base class and
get some of that right,
On Thu, Jan 26, 2017 at 3:31 PM, Ben Chambers
wrote:
> I think that wrapping the DoFn is tricky -- we backed out
> IntraBundleParallelization because it did that, and it has weird
> interactions with both the reflective DoFn and windowing. We could maybe
> make some
I agree that wrapping the DoFn is probably not the way to go, because the
DoFn may be quite tricky due to all the reflective features: e.g. how do
you automatically "batch" a DoFn that uses state and timers? What about a
DoFn that uses a BoundedWindow parameter? What about a splittable DoFn?
What
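For instance (illustrative), a wrapper that only forwards the
ProcessContext has no way to supply a reflective parameter like this one:

import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.windowing.BoundedWindow;

// A DoFn taking a reflectively-supplied BoundedWindow parameter; a naive
// wrapping DoFn cannot forward it without the DoFnInvokers machinery.
class WindowTaggingFn extends DoFn<String, String> {
  @ProcessElement
  public void process(ProcessContext c, BoundedWindow window) {
    c.output(window.maxTimestamp() + ": " + c.element());
  }
}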
On Thu, Jan 26, 2017 at 3:42 PM, Eugene Kirpichov <
kirpic...@google.com.invalid> wrote:
> The class for invoking DoFn's,
> DoFnInvokers, is absent from the SDK (and present in runners-core) for a
> good reason.
>
This would be true if it weren't for that pesky DoFnTester :-)
And even if we
The third option for batching:
- Functionality within the DoFn and DoFnRunner built as part of the SDK.
I haven't thought through Batching, but at least for the
IntraBundleParallelization use case this actually does make sense to expose
as a part of the model. Knowing that a DoFn supports
On Thu, Jan 26, 2017 at 12:48 PM, Eugene Kirpichov
wrote:
> I don't think we should make batching a core feature of the Beam
> programming model (by adding it to DoFn as this code snippet implies). I'm
> reasonably sure there are less invasive ways of implementing