done.
On Sun, Jan 12, 2020 at 6:27 PM Tomo Suzuki wrote:
> Hi Beam committers,
>
> Four Jenkins jobs did not report back for this PR
> https://github.com/apache/beam/pull/10554 .
> Can somebody trigger them?
>
> On Fri, Jan 10, 2020 at 4:51 PM Andrew Pilloud
> wrote:
> >
> > Done.
> >
> > On
High Priority Dependency Updates Of Beam Python SDK:
Dependency Name | Current Version | Latest Version | Release Date Of The Current Used Version | Release Date Of The Latest Release | JIRA Issue
cachetools | 3.1.1 | 4.0.0 | 2019-12-23
Hi,
I wanted to decouple the conversation about solutions to the issue from job
execution requests.
We have 131 open PRs right now and 64 committers with job running
privileges. From what I counted, more than 80 of those PRs are not authored
by committers.
I think that having committers answer
Reuven, thank you very much for your help and the clarity here; it's very
helpful.
Per your solution #2 -- This approach makes sense, seems semantically
right, and is something I'll explore once timer.withOutputTimestamp(t)
is released. Just for clarity, there is no other way in Beam
I don't think that should be the case. Also SchemaCoder will automatically
set the UUID for such logical types.
On Mon, Jan 13, 2020 at 8:24 AM Alex Van Boxel wrote:
> OK, I've rechecked everything and eventually found the problem. The
> problem is when you use a LogicalType backed by a Row,
... what that means is that you can tag me on github, and I'll take a look,
yes : ) I'm 'pabloem'.
On Mon, Jan 13, 2020 at 9:59 AM Pablo Estrada wrote:
> I reviewed the first PR, so I'm happy to review others.
>
> On Mon, Jan 13, 2020 at 9:42 AM Robert Bradshaw
> wrote:
>
>> One thing you
SchemaCoder today recursively sets UUIDs for all schemas, including logical
types, in setSchemaIds. Is it possible that your changes modified that
logic somehow?
On Mon, Jan 13, 2020 at 9:39 AM Alex Van Boxel wrote:
> This is the stacktrace:
>
>
> java.lang.IllegalStateException at
>
Thanks Yifan (but Java Precommit is still missing).
Can somebody run "Run Java PreCommit" on
https://github.com/apache/beam/pull/10554?
On Mon, Jan 13, 2020 at 2:59 AM Yifan Zou wrote:
>
> done.
>
> On Sun, Jan 12, 2020 at 6:27 PM Tomo Suzuki wrote:
>>
>> Hi Beam committers,
>>
>> Four Jenkins
Thanks Matthias!
On Sun, Jan 12, 2020 at 7:51 AM Matthias Baetens
wrote:
> Hi everyone,
>
> It's our pleasure to share the recordings from the Beam Summit North
> America 2019.
> Please find them in the YouTube playlist
>
[RESULT] [VOTE] Vendored Dependencies Release
I'm happy to announce that we have unanimously approved this release.
There are 6 approving votes, 4 of which are binding:
* Luke Cwik
* Pablo Estrada
* Ahmet Altay
* Kenneth Knowles
There are no disapproving votes.
Thanks everyone!
On 2020/01/11
This is the stacktrace:
java.lang.IllegalStateException at
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkState(Preconditions.java:491)
at
org.apache.beam.sdk.coders.RowCoderGenerator.getCoder(RowCoderGenerator.java:380)
at
Correct. This API is merged into Beam, so should be included in the next
Beam release.
On Mon, Jan 13, 2020 at 4:00 AM Aaron Dixon wrote:
> Reuven, thank you very much for your help and the clarity here; it's very
> helpful.
>
> Per your solution #2 -- This approach makes sense, seems semantically
I want to open a pull request for BEAM-9094 (
https://issues.apache.org/jira/browse/BEAM-9094)
My tree is https://github.com/ocworld/beam/tree/BEAM-9094-add-aws-s3-options
When creating the pull request, I need to assign
reviewers.
Who can review my request?
One thing you could do is look at the history [1] of the file and see if
there are any possible candidates (e.g. Apache Beam committers [2]).
[1]
https://github.com/ocworld/beam/blame/259f6174ce52e6317a5b4fe7ed3a126153d3/sdks/python/apache_beam/io/aws/clients/s3/boto3_client.py
[2]
Hello everyone!
I have noticed that Jenkins tests for Dataflow runner [1] are failing with
a runtime exception. It looks like the issue originated here [2], failed
Dataflow job [3].
We should look into fixing it.
Failing test:
:runners:google-cloud-dataflow-java:validatesRunnerLegacyWorkerTest »
done
On Mon, Jan 13, 2020 at 2:39 PM Yoshiki Obata
wrote:
> Hi Beam committers
>
> It would be appreciated if anyone could trigger the Python precommit job for
> this PR:
> https://github.com/apache/beam/pull/10141
>
> Regards,
> Yoshiki
>
OK, I've rechecked everything and eventually found the problem. The problem
is that when you use a LogicalType backed by a Row, the UUID needs to be
set to make it work. (This is the case for Proto-based Timestamps.) I'll
create a fix.
_/
_/ Alex Van Boxel
On Mon, Jan 13, 2020 at 8:36 AM
Hi Beam committers
It would be appreciated if anyone could trigger the Python precommit job for this PR:
https://github.com/apache/beam/pull/10141
Regards,
Yoshiki
Hi Devs and Users,
We are looking for speakers for future Meetups and Events. Who is
building cool things with Beam? We are looking at hosting a Meetup at
Spotify in February, and ideally keep some meetups going throughout
the year. For this to occur, we need to hear about what people are
This is being tracked in BEAM-9083
On Mon, Jan 13, 2020 at 11:23 AM Boyuan Zhang wrote:
> Thanks Kirill! I'm going to look into it.
>
> On Mon, Jan 13, 2020 at 11:18 AM Kirill Kozlov
> wrote:
>
>> Hello everyone!
>>
>> I have noticed that Jenkins tests for Dataflow runner [1] are failing
>>
If it indeed happened as you have described, I will be very interested in
the expected behaviour.
Something I remember from before: meeting the trigger condition only gives the
runner/engine "permission" to fire; the runner/engine may not fire
immediately. But I don't know if the engine/runner will
So I think the following happens:
1. the schema tree is initialized at construction time. The tree gets
serialized and sent to the workers
2. the workers deserialize the tree, but as the Timestamp logical type
has a logical type with a *static* schema, the schema will be
I have the following trigger:
.apply(Window
.configure()
.triggering(AfterWatermark
.pastEndOfWindow()
.withEarlyFirings(AfterPane
.elementCountAtLeast(1)))
.accumulatingFiredPanes()
.withAllowedLateness(Duration.ZERO)
But in Dataflow
I would have expected an empty on time pane since the default on time
behavior is FIRE_ALWAYS.
On Mon, Jan 13, 2020 at 1:54 PM Aaron Dixon wrote:
> Can anyone confirm?
>
> This is intermittent. Some (it seems, sparse) windows don't get an ON_TIME
> firing after watermark. Is this a bug or is
Yes. I'm using calendar day-based windows, and the watermark is completely caught up
to today ... the calendar window ended several days ago. I got EARLY panes for
each element but never an ON_TIME pane.
On Mon, Jan 13, 2020 at 4:16 PM Luke Cwik wrote:
> Is the watermark advancing past the end of the window?
>
Hello,
I've got a PR that affects the java gcp component, specifically
BigQueryUtils. Can anyone help me with a review? I've tagged the owner of
the component on the PR but haven't heard anything for a week, so I figured
I'd send an e-mail to this list.
I guess these are the first logical types we've defined with a base type of
row. It does seem reasonable that a static schema for a logical type could
have some fixed id, but it feels odd to have a fixed UUID, it would be nice
if we could give the schema some meaningful static identifier.
I think
I can do talks in either DC or NYC meetups. I can coordinate with
CapitalOne to see if they would be willing to host the DC meetup.
On Mon, Jan 13, 2020 at 4:02 PM Austin Bennett
wrote:
> Hi Devs and Users,
>
> We are looking for speakers for future Meetups and Events. Who is
> building cool
The window is not empty fwiw; it has elements; I get an early firing pane
for the window but well after the watermark passes there is no ON_TIME
pane. Would this be a bug in Dataflow? Seems fundamental, so I'm concerned
perhaps the Beam spec doesn't obligate ON_TIME firings?
On Mon, Jan 13,
Is the watermark advancing past the end of the window?
On Mon, Jan 13, 2020 at 2:02 PM Aaron Dixon wrote:
> The window is not empty fwiw; it has elements; I get an early firing pane
> for the window but well after the watermark passes there is no ON_TIME
> pane. Would this be a bug in Dataflow?
Thanks for taking care of this!
On Mon, Jan 13, 2020 at 2:00 PM Boyuan Zhang wrote:
> This problem is addressed by PR10564. Now all affected tests are back to
> green.
>
> On Mon, Jan 13, 2020 at 1:11 PM Luke Cwik wrote:
>
>> This is being tracked in BEAM-9083
>>
>> On Mon, Jan 13, 2020 at
Hi everyone,
I have a proposal that I think can unify two problem sets:
1) adding more IOs for Beam SQL, and
2) making more (Row-based) Java IOs available in Python as cross-language
transforms
The basic idea is to create a single cross-language transform that exposes
all Beam SQL IOs via the
Any confirmation on this from anyone? Whether per Beam spec, runners are
obligated to send ON_TIME panes for AfterWatermark triggers? I'm stuck
because this seems fundamental, so it's hard to imagine this is a Dataflow
bug, but OTOH it's also hard to imagine that trigger specs like
AfterWatermark
Fix in this PR:
[BEAM-9113] Fix serialization proto logical types
https://github.com/apache/beam/pull/10569
or we all agree to *promote* the logical types to top-level logical types
(as described in the design document, see ticket):
[BEAM-9037] Instant and duration as logical type
Can anyone confirm?
This is intermittent. Some (it seems, sparse) windows don't get an ON_TIME
firing after watermark. Is this a bug or is there a reason to not expect
ON_TIME firings for every window?
On Mon, Jan 13, 2020 at 3:47 PM Rui Wang wrote:
> If it indeed happened as you have
Regarding cross-language and Beam rows (and SQL!) - I have a PR up [1] that
adds an example script for using Beam's SqlTransform in Python by
leveraging the portable row coder. Unfortunately I got stalled figuring out
how to build/stage the Java artifacts for the SQL extensions so it hasn't
been
Thank you, Mark and Ismaël.
On Mon, Jan 13, 2020 at 2:34 PM Mark Liu wrote:
>
> done
>
> On Mon, Jan 13, 2020 at 8:03 AM Tomo Suzuki wrote:
>>
>> Thanks Yifan (but Java Precommit is still missing).
>> Can somebody run "Run Java PreCommit" on
>> https://github.com/apache/beam/pull/10554?
>>
>>
done
On Mon, Jan 13, 2020 at 8:03 AM Tomo Suzuki wrote:
> Thanks Yifan (but Java Precommit is still missing).
> Can somebody run "Run Java PreCommit" on
> https://github.com/apache/beam/pull/10554?
>
>
> On Mon, Jan 13, 2020 at 2:59 AM Yifan Zou wrote:
> >
> > done.
> >
> > On Sun, Jan 12,
This problem is addressed by PR10564. Now all affected tests are back to
green.
On Mon, Jan 13, 2020 at 1:11 PM Luke Cwik wrote:
> This is being tracked in BEAM-9083
>
> On Mon, Jan 13, 2020 at 11:23 AM Boyuan Zhang wrote:
>
>> Thanks Kirill! I'm going to look into it.
>>
>> On Mon, Jan 13,
Thanks for the update and I agree with the points that you have made.
On Fri, Jan 10, 2020 at 5:58 PM Robert Burke wrote:
> Thank you for sharing Daniel!
>
> Resolving SplittableDoFns for the Go SDK even just as far as initial
> splitting will take the SDK that much closer to exiting its
It's indeed the first Logical identifier with a Row base type. The UUID is
generated from the name of the class; to do it in code (from a string)
you need to create bytes from the string, then a UUID.
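
The "string to bytes to UUID" step described above could look roughly like
the sketch below. This is only an illustration: the helper class
NameBasedUuid and the class name passed to it are hypothetical, and using
java.util.UUID.nameUUIDFromBytes is an assumption, not necessarily what the
fix in Beam does.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

/** Hypothetical helper, not part of Beam: derives a stable UUID from a name. */
final class NameBasedUuid {
  private NameBasedUuid() {}

  /** Turns the name into UTF-8 bytes, then into a name-based (version 3) UUID. */
  static UUID fromName(String name) {
    byte[] bytes = name.getBytes(StandardCharsets.UTF_8);
    return UUID.nameUUIDFromBytes(bytes);
  }

  public static void main(String[] args) {
    // Placeholder class name; the real logical type's class is not named in this thread.
    UUID id = fromName("com.example.TimestampLogicalType");
    System.out.println(id);
  }
}
```

Because the UUID depends only on the input string, every worker that
computes it from the same name gets the same identifier, which is what a
static schema would need.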
_/
_/ Alex Van Boxel
On Mon, Jan 13, 2020 at 10:40 PM Brian Hulette wrote:
> I guess
Hi everyone,
Please review and vote on the release candidate #3 for the version 1.2.3,
as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)
The complete staging area is available for your review, which includes:
* JIRA release notes [1],
*
The most important gain would be compatibility with Google internal code.
TLDR: I don't expect non-Googlers to fix pytype issues in Beam, nor would
they have access to internal code that is validated against pytype with
Beam.
Pytype seems to detect attribute errors that mypy has not, so it acts
> The most important gain would be compatibility with Google internal code.
I would like to clarify this. This refers to users of Beam who by default
are using pytype as part of the toolchain. Even though they are internal to
a single company and not vocal on Beam, they still represent a large
On my phone, so I can't grab the jira so easily, but quickly: EARLY panes
are "race condition equivalent" to ON_TIME panes. The early panes consume
all the pending elements then the on time pane is "empty". This is WAI if
it is what is causing it. You need to explicitly set
>
> Pytype seems to detect attribute errors that mypy has not, so it acts as a
> kind-of linter in this case.
> Examples:
>
> https://github.com/apache/beam/pull/10528/files#diff-0cb34b4622b0b7d7256d28b1ee1d52fc
>
>
>
> > I agree with focusing on mypy for now, but I would propose soon after,
> or in parallel if it will be different folks, to work on pytype and enable
> it as a first class citizen similar to mypy. If there will be a large delta
> between the two then we can decide on what to do next.
>
> If
There are some issues in this message: part of it is still a
template (1.2.3, TODO, MAVEN_VERSION).
Before I noticed these issues, I ran a few Batch and Streaming Python 3.7
pipelines using Direct and Dataflow runners, and they all succeeded.
On Mon, Jan 13, 2020 at 4:09 PM Udi Meiri
I would rather we focus on doing well with one type checker, and it seems
that mypy is significantly more popular than pytype, so it's more natural for
users. I would support pytype if it covered more PEPs and was the newer,
up-and-coming thing, but that doesn't seem to be the case.
On Sun, Jan 12,
I'm for going back to the status quo where anyone's PR ran the tests
automatically or to the suggestion where users marked as contributors had
their tests run automatically (with the documentation update about how to link
your github/jira accounts).
On Mon, Jan 13, 2020 at 2:45 AM Michał Walenia
Kenn, thank you! There is OnTimeBehavior (default FIRE_ALWAYS) and
ClosingBehavior (default FIRE_IF_NON_EMPTY). Given that OnTimeBehavior is
always-fire, shouldn't I see empty ON_TIME panes?
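
To make the question concrete, here is a toy model of the emit decision as
it is discussed in this thread. It is not Beam code: the class
PaneEmissionSketch and the method shouldEmit are hypothetical, and the rule
itself (non-empty panes always fire, empty ON_TIME panes fire under
FIRE_ALWAYS, empty final panes fire under ClosingBehavior FIRE_ALWAYS) is an
assumption about the intended behaviour, not a copy of the runner logic.

```java
/** Toy model (not Beam code) of when a pane would be emitted under the assumed rule. */
final class PaneEmissionSketch {
  enum Timing { EARLY, ON_TIME, LATE }
  enum OnTimeBehavior { FIRE_ALWAYS, FIRE_IF_NON_EMPTY }
  enum ClosingBehavior { FIRE_ALWAYS, FIRE_IF_NON_EMPTY }

  private PaneEmissionSketch() {}

  /** Returns true if a pane should be emitted under the assumed rule. */
  static boolean shouldEmit(
      boolean paneIsEmpty,
      boolean isFinalPane,
      Timing timing,
      OnTimeBehavior onTime,
      ClosingBehavior closing) {
    if (!paneIsEmpty) {
      return true; // a pane with elements is always emitted
    }
    if (timing == Timing.ON_TIME && onTime == OnTimeBehavior.FIRE_ALWAYS) {
      return true; // empty ON_TIME pane, requested by the on-time behavior
    }
    // Otherwise an empty pane is emitted only if it is final and closing says so.
    return isFinalPane && closing == ClosingBehavior.FIRE_ALWAYS;
  }

  public static void main(String[] args) {
    // The defaults named above: early firings already consumed every element.
    boolean emitted = shouldEmit(
        true, true, Timing.ON_TIME,
        OnTimeBehavior.FIRE_ALWAYS, ClosingBehavior.FIRE_IF_NON_EMPTY);
    System.out.println("empty ON_TIME pane emitted: " + emitted);
  }
}
```

Under this model an empty ON_TIME pane is expected with the defaults, so a
window that never reports ON_TIME would not match it.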
Since my lateness config is 0, I'm going to try ClosingBehavior =
FIRE_ALWAYS and see if I can rely on
Looking at this from the outside, it seems like mypy is the obvious choice.
Also running pytype could potentially be informative in some cases but only
if there is a specific gap. What about maintenance/governance of the two
projects?
Kenn
On Sun, Jan 12, 2020 at 7:48 PM Chad Dombrova wrote:
>
Udi, what would we gain by using pytype?
Also, has anyone tried running pytype against Beam? If it's not too much
trouble, it might be helpful to diff the pytype and mypy results to get a
feel for exactly how big the discrepancy is.
On Mon, Jan 13, 2020 at 3:26 PM Kenneth Knowles wrote:
>
Thanks Brian. Added some comments.
On Mon, Jan 13, 2020 at 2:25 PM Brian Hulette wrote:
> Hi everyone,
> I have a proposal that I think can unify two problem sets:
> 1) adding more IOs for Beam SQL, and
> 2) making more (Row-based) Java IOs available in Python as
> cross-language transforms
On Mon, Jan 13, 2020 at 5:34 PM Chad Dombrova wrote:
>>
>> Pytype seems to detect attribute errors that mypy has not, so it acts as a
>> kind-of linter in this case.
>> Examples:
>> https://github.com/apache/beam/pull/10528/files#diff-0cb34b4622b0b7d7256d28b1ee1d52fc
>>
This sounds like a bug, as described.
Here's the logic, shared by all runners:
https://github.com/apache/beam/blob/master/runners/core-java/src/main/java/org/apache/beam/runners/core/ReduceFnRunner.java#L958
Regarding "race condition equivalent" I mean that when you have an early
trigger set up
I think AfterWatermark in particular should *always* produce an ON_TIME
pane, regardless of whether there were early panes. (It's less clear
with non-watermark triggers like after count or processing time.) This
makes it feel like the on time behavior is a property of the trigger,
not the windowing