I was suggesting GCP support mainly because I don't think you want to share
the 2.36 and 2.40 versions of your job file publicly, as someone familiar
with the layout and format may spot a meaningful difference.
Also, if it turns out that there is no meaningful difference between the
two then the
This sounds great.
Since every language has a benchmarking tool, we can start with JMH and
expand from there.
A key point is that we will want to dedicate a Jenkins machine exclusively
to this while the microbenchmarks are running; otherwise we will have other
competing Jenkins jobs using up CPU
Does doing a pipeline update in 2.36 work or do you want to do an update to
get the latest version?
Feel free to share the job files with GCP support. It could be something
internal but the coders for ephemeral steps that Dataflow adds are based
upon existing coders within the graph.
On Tue, Jul
Hi all,
Please join me and the rest of the Beam PMC in welcoming a new committer:
Steven Niemitz (sniemitz@)
Steven started contributing to Beam in 2017, fixing bugs and improving
logging and usability. Steven's most recent focus has been on performance
optimizations within the Java SDK.
to
>> try out the design options. I think we can simplify the problem by
>> insisting that they are pure functions that do not access state or side
>> inputs.
>>
>> On Wed, Jul 13, 2022 at 7:52 PM Luke Cwik via dev
>> wrote:
>>
>>> I think a
First we'll want to choose whether we want to target Wasm, WASI or Wagi.
WASI adds a lot of simple things like access to a clock, random number
generator, ... that would expand the scope of what transpiled code can do.
It is debatable whether we'll want the power to run the transpiled code as
a
I think an easier target would be to support things like
DynamicDestinations for Java IO connectors that are exposed as XLang for
Go/Python.
This is because Go/Python have good transpiling support to WebAssembly and
we have already exposed several Java IO XLang connectors, so it's about
plumbing
I have a better understanding of the problem after reviewing the doc and we
need to decide on what lifecycle scope we want the `Connection`, `Session`,
and `MessageConsumer` to have.
It looks like for the `Connection` we should try to have at most one
instance for the entire process per
We should send this out to us...@beam.apache.org so that they are aware of
this change once commenting in the doc has settled.
On Tue, Sep 6, 2022 at 1:59 PM Robert Burke wrote:
> Thank you for already planning to *NOT* have this merged until after this
> week's 2.42.0 cut. This Release Manager
n is active in “advance” in order to receive
> message.
>
> Are we sure that all checkpoints are finalized when the reader is closed?
>
>
>
>1. Session scoped to the reader start/close
>
> It seems to be more or less the case currently.
>
>
>
> Regards
Thanks, I missed that when I was reviewing the issue.
On Tue, Oct 11, 2022 at 5:01 PM Robert Burke wrote:
> That merge commit doesn't appear in the 2.42.0 release branch, so I've
> moved that issue to the 2.43.0 release milestone.
>
> On Tue, Oct 11, 2022, 4:07 PM Luke Cwik via
I would like to point out that I found another regression due to the
bigdataoss library upgrade from 2.2.6 to 2.2.8 (
https://github.com/apache/beam/pull/23300), filed
https://github.com/apache/beam/issues/23588.
On Mon, Oct 10, 2022 at 1:17 PM Robert Burke wrote:
> Due to a process error on my
I was looking to update gRPC that we use to the latest (1.48.1) version to
move off of a vulnerable version of Netty that a user pointed out in
BEAM-14118. This would supersede the work done in
https://github.com/apache/beam/pull/17206 as that PR has stalled.
If there aren't any concerns I'll
+1
I verified the signatures of the artifacts, that the jar doesn't contain
classes outside of the org/apache/beam/vendor/grpc/v1p48p1 package and I
tested the artifact against our precommits using
https://github.com/apache/beam/pull/22595
On Fri, Aug 5, 2022 at 1:42 PM Luke Cwik wrote:
>
Please review the release of the following artifacts that we vendor:
* beam-vendor-grpc-1_48_1
Hi everyone,
Please review and vote on the release candidate #1 for the version 0.1, as
follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)
The
I think you missed Kenn's earlier reply:
https://lists.apache.org/thread/v0nr6mv0rqhd76ox1bwt6qwo4q3g7w58
The vendored gRPC is built by transforming the released gRPC jar. Here is
where in the Beam git history you can find the source for the
transformation:
Thanks.
On Mon, Aug 8, 2022 at 8:12 AM Peter Simon wrote:
> Awesome web UI
>
> Peter Simon
>
> *Data Scientist*
>
>
>
> e peter.si...@fanatical.com
>
> w fanatical.com
>
> Focus Multimedia Limited.
>
> The Studios, Lea Hall Enterprise Park,
>
> Wheelhouse Road, Brereton, Rugeley,
>
>
Thanks!
> -P.
>
> On Mon, Aug 8, 2022 at 9:24 AM Chamikara Jayalath via dev <
> dev@beam.apache.org> wrote:
>
>> +1
>>
>> Thanks,
>> Cham
>>
>> On Fri, Aug 5, 2022 at 1:49 PM Luke Cwik via dev
>> wrote:
>>
>>> +1
>>
By default Beam Java only uploads artifacts that have changed but it looks
like this is not the case for Beam Python and you need to explicitly opt in
with the --enable_artifact_caching flag[1].
It looks like this feature was added 1 year ago[2]. Should we turn it on
by default?
1:
The proto (java) -> bytes -> proto (python) sounds good.
Have you tried moving your DoFn outside of your main module into a new
module as per [1]? Another suggestion is to do the import in the function.
Can you do the import once in the setup()[2] function? Have you considered
using the cloud
I would suggest using BigtableIO which also returns a
protobuf com.google.bigtable.v2.Row. This should allow you to replicate
what SpannerIO is doing.
Alternatively you could provide a way to convert the HBase result into a
Beam row by specifying a converter and a schema for it and then you could
It looks like there is an existing issue[1]. I updated our correspondence
there and we should continue our communication there.
1: https://github.com/apache/beam/issues/24801
On Tue, Jan 3, 2023 at 1:22 PM Reuven Lax wrote:
> Ah, that is fair. However right now that doesn't happen either.
>
>
I think in general ReadableState.read() should not be @Nullable, but we
should allow overrides like ValueState to specify that T can
be @Nullable, while for others like ListState we should have List<@Nullable T>.
On Tue, Jan 3, 2023 at 12:37 PM Reuven Lax via dev
wrote:
> It should be
I would have expected
a META-INF/services/org.apache.beam.sdk.expansion.ExternalTransformRegistrar
file in the jar containing the fully qualified class name
of BigtableRegistrar in it. See
AutoService relies on Java's compiler annotation processor.
https://github.com/google/auto/tree/main/service#getting-started shows that
you need to configure Java's compiler to use the annotation processors
within AutoService.
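AutoService's annotation processor writes a `META-INF/services/<interface name>` file into the build output, so one way to tell whether the processor ran is to look for that file in the built jar. The sketch below assumes the jar is inspected with Python's standard `zipfile` module. The function name and the demo registrar class name are hypothetical; only the service file name comes from this thread.

```python
import io
import zipfile

# Service file name taken from the thread; the registrar class name used in
# the demo below is a hypothetical placeholder.
SERVICE_FILE = (
    "META-INF/services/"
    "org.apache.beam.sdk.expansion.ExternalTransformRegistrar"
)


def registered_providers(jar, service_file=SERVICE_FILE):
    """List provider class names from a jar's service file, or [] if absent."""
    with zipfile.ZipFile(jar) as zf:
        if service_file not in zf.namelist():
            return []
        text = zf.read(service_file).decode("utf-8")
    providers = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # '#' starts a comment
        if line:
            providers.append(line)
    return providers


if __name__ == "__main__":
    # Build a tiny in-memory jar so the example runs without a real artifact.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(SERVICE_FILE, "com.example.BigtableRegistrar\n")
    buf.seek(0)
    print(registered_providers(buf))
```

An empty result for a real jar suggests the service file was never generated, which points back at the annotation processor configuration.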
I saw this public gist that seemed to enable using the AutoService
I have found the Gradle build reports very useful for enumerating deprecations
and easier to read than the command line output.
On Thu, Dec 8, 2022 at 8:26 AM Damon Douglas via dev
wrote:
> Thank you, Kerry, for your kind and encouraging words!
>
> Kenn, I wondered as well whether
checks.
>
> Best,
>
> Damon
>
> On Thu, Dec 8, 2022 at 8:59 AM Daniel Collins
> wrote:
>
>> We could probably add a lint that rejects the spelling `task("` pretty
>> easily that would catch most of these.
>>
>> On Thu, Dec 8, 2022 at 11:34 A
This is definitely not working for portable pipelines since the
GreedyPipelineFuser doesn't create a fusion boundary, which, as you pointed
out, results in a single stage that has a non-deterministic function followed by
one that requires stable input. It seems as though we should have runners
check the
We do support JDK8, JDK11 and JDK17. Our story around newer features within
JDKs 9+ like modules is mostly non-existent though.
We rarely run into JDK-specific issues; the latest were the TLS1 and TLS1.1
deprecation in newer patch versions of the JDK and also the docker cpu
share issues with
I would suggest adding it to the existing package(s) (either
sdks/java/extensions or sdks/java/zetasketch or both depending on if you're
replacing existing sketches or adding new ones) since we shouldn't expose
the sketching library's API surface. We should make the API take all the
relevant
Thanks, I took a look and left some comments.
On Mon, Oct 31, 2022 at 12:47 PM Ahmet Altay wrote:
> Thank you for the message Buqian. Adding @Reuven Lax
> @Lukasz
> Cwik explicitly (who are mentioned on the doc).
>
> On Mon, Oct 31, 2022 at 12:17 PM 郑卜千 wrote:
>
>> Gentle ping. Thanks!
>>
>>>> > On Mon, Feb 13, 2023 at 5:17 AM Bruno Volpato via dev <
>>>> dev@beam.apache.org> wrote:
>>>> >>
>>>> >> +1 (non-binding)
>>>> >>
>>>> >> Tested with https://github.com/GoogleC
> >
>>>>> > On 13 Feb 2023, at 17:54, Ahmet Altay via dev
>>>>> wrote:
>>>>> >
>>>>> > +1 (binding) - I validated python quick starts on direct runner and
>>>>> python streaming quickstart o
I upgraded the docker version on Jenkins workers and the tests passed.
(also installed Python 3.11 so we are ready for that)
On Tue, Feb 14, 2023 at 3:21 PM Kenneth Knowles wrote:
> SGTM. I asked on the PR if this could impact users, but having read the
> docker release calendar I am not
Congrats, well deserved.
On Thu, Feb 16, 2023 at 10:32 AM Anand Inguva via dev
wrote:
> Congratulations!!
>
> On Thu, Feb 16, 2023 at 12:42 PM Chamikara Jayalath via dev <
> dev@beam.apache.org> wrote:
>
>> Congrats Jan!
>>
>> On Thu, Feb 16, 2023 at 8:35 AM John Casey via dev
>> wrote:
>>
>>>
Seems like a useful thing to me and will make it easier for Beam users
overall.
On Fri, Feb 10, 2023 at 3:56 PM Robert Bradshaw via dev
wrote:
> Thanks. I added some comments to the doc.
>
> On Mon, Feb 6, 2023 at 1:33 PM Chamikara Jayalath via dev
> wrote:
> >
> > Hi All,
> >
> > Beam
+1
Validated release artifact signatures and verified the Java Flink and Spark
quickstarts.
On Fri, Feb 10, 2023 at 9:27 AM John Casey via dev
wrote:
> Addendum to above email.
>
> Java artifacts were built with Gradle 7.5.1 and OpenJDK 1.8.0_362
>
> On Fri, Feb 10, 2023 at 11:14 AM John Casey
Our current Java 8 container is 262 MiB and layers on top of
openjdk:8-bullseye, which is 226 MiB compressed, while eclipse-temurin:8 is
92 MiB compressed and eclipse-temurin:8-alpine is 65 MiB compressed.
I would rather not get into issues with C library differences caused by the
The PCollection value comes from the key on the pipeline proto[1]. That key
is populated during pipeline construction time[2] and is based upon the
unique name of the PTransform + the name of the output being used (aka tag
with .output being a default).
It looks like the counter PTRANSFORM is
As per [1], the JDK8 and JDK11 containers that Apache Beam uses have
stopped being built and supported since July 2022. I have filed [2] to
track the resolution of this issue.
Based upon [1], almost everyone is swapping to the eclipse-temurin
container[3] as their base based upon the linked
I made some progress in testing the container and did hit an issue where
Ubuntu 22.04 "Jammy" is dependent on the version of Docker installed. It
turns out that our boot.go crashes with "runtime/cgo: pthread_create
failed: Operation not permitted" because Ubuntu 22.04 is using new
syscalls