Specifically, "We have no way of telling from the Runner side, if a length
prefix has been used or not." seems false. The runner has all the
information since the length prefix is a model coder. Didn't we agree that all
coders should be self-delimiting in runner/SDK interactions, requiring
It is a good question, and the answer is good to remember. TL;DR it runs
against the merge commit from the moment you last pushed.
You can learn the answer by inspection of Jenkins logs and some knowledge
of GitHub. See
Hi,
this may be a dumb question. Let's imagine a hypothetical case, where I
open a pull request against master. I wrote the change on top of COMMIT#11,
so:
My branch:
COMMIT#11 -> MyCommit
Let's suppose that master has received a bunch of new commits (and a fix on
COMMIT#12), so it looks like
Aww.. that Hoover beaver is cute. But then lemur is also "taken" [1] and
the owl too [2].
Personally, I don't think it matters much which mascots are taken, as long
as the project is not too close in the same space as Beam. Also, it's good
to just get all ideas out. We should still consider
+1 to what Robert said.
On Tue, Nov 5, 2019 at 2:36 PM Robert Bradshaw wrote:
> The Coder used for State/Timers in a StatefulDoFn is pulled out of the
> input PCollection. If a Runner needs to partition by this coder, it
> should ensure the coder of this PCollection matches with the Coder
>
Sounds like we have consensus. Let's move forward. I'll follow up with
the discussions on the PRs themselves.
On Wed, Oct 30, 2019 at 2:38 PM Robert Bradshaw wrote:
>
> On Wed, Oct 30, 2019 at 1:26 PM Chad Dombrova wrote:
> >
> >> Do you believe that a future mypy plugin could replace pipeline
The Coder used for State/Timers in a StatefulDoFn is pulled out of the
input PCollection. If a Runner needs to partition by this coder, it
should ensure the coder of this PCollection matches with the Coder
used to create the serialized bytes that are used for partitioning
(whether or not this is
Hi,
I wanted to get your opinion on something that I have been struggling
with. It is about the coders for state requests in portable pipelines.
In contrast to "classic" Beam, the Runner is not guaranteed to know
which coder is used by the SDK. If the SDK happens to use a standard
coder
On Tue, Nov 5, 2019 at 10:32 AM Hai Lu wrote:
>
> Starting the expansion service in the job server is helpful. But having to
> expose the port number and to include the address in the
> beam.ExternalTransform is still a hassle. Giving a hard-coded port number
> might be the only solution right
I'm not sure if it's currently legal. However the watermark is generally
defined to be monotonic, so if it was allowed it would result in late data
in the pipeline.
On Tue, Nov 5, 2019 at 10:29 AM Aaron Dixon wrote:
> Thanks Reuven,
>
> So is my conclusion correct? That it is illegal for any
Starting the expansion service in the job server is helpful. But having to
expose the port number and to include the address in
the beam.ExternalTransform is still a hassle. Giving a hard-coded port
number might be the only solution right now but it's not a very clean
solution in our case.
Thanks Reuven,
So is my conclusion correct? That it is illegal for any custom window
function (+ combiner policy) to merge in a way that would regress the
watermark?
What do Runners (e.g. Dataflow) do if this occurs?
Does the API obligate runners to fail, or can insanity ensue? :)
On Tue, Nov 5,
+1 to moving the GCP tests outside of core. If there are issues that only
show up on GCP tests but not in core, it might be an indication that there
needs to be another test in core covering that, but I think that should be
pretty rare.
On Mon, Nov 4, 2019 at 8:33 PM Kenneth Knowles wrote:
> +1
On Tue, Nov 5, 2019 at 8:07 AM Aaron Dixon wrote:
> I noticed that if I use TimestampCombiner/EARLIEST for session windows
> that the watermark appears to get held up for sessions that never "close"
> (or that extend for a long time).
>
Correct - because the watermark is then being held up by
I noticed that if I use TimestampCombiner/EARLIEST for session windows that
the watermark appears to get held up for sessions that never "close" (or
that extend for a long time).
But if I use default (TimestampCombiner/END_OF_WINDOW) the watermark
doesn't get held.
Does this mean that the
Hi all,
As those of you that work on Jenkins jobs know, they can be a pain to work
with. Even simple changes are painful to run in the PR because of the seed
job - it reloads all the jobs sequentially and runs for over 10 minutes. If
someone else runs it against another branch - tough luck, you
Quick update: The mentioned designer has gotten back to me and offered
to sketch something until the end of the week. I've pointed him to this
thread and the existing logo material:
https://beam.apache.org/community/logos/
[I don't want to interrupt the discussion in any way, I just think
Hi,
I'd like to open a vote on accepting design document [1] as a base for
implementation of @RequiresTimeSortedInput annotation for stateful
DoFns. The associated JIRA [2] and PR [3] contain only a subset of the whole
functionality (allowed lateness ignored and no possibility to specify
UDF for
How about fireflies in the Beam light rays? ;)
Feels like "Beam" would go well with an animal that has glowing bright eyes
such as a lemur
I love the lemur idea because it has almost orange eyes.
Thanks for starting this Aizhamal! I've recently talked to a designer
who is somewhat famous