Re: Embeddings generation in MLTransform

2023-11-06 Thread Anand Inguva via dev
Hi all, After the initial email, I went ahead and added a few more things as per the comments on the doc. Please take a look and let me know what you think. Thanks, Anand On

Embeddings generation in MLTransform

2023-10-30 Thread Anand Inguva via dev
Hi all, In Apache Beam 2.50.0 Python SDK, we added MLTransform, which is used to pre/post process data using common ML operations. Now, we are planning to

Re: [PYTHON] partitioner utilities?

2023-10-19 Thread Anand Inguva via dev
FYI, there is a Top transform[1] that will fetch the greatest n elements in the Python SDK. It is not a partitioner, but it may be useful for your reference. [1] https://github.com/apache/beam/blob/68e9c997a9085b0cb045238ae406d534011e7c21/sdks/python/apache_beam/transforms/combiners.py#L191 On Thu,
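To make "fetch the greatest n elements" concrete, here is a minimal plain-Python sketch of the same idea using the standard library's `heapq.nlargest`. It is only an analogue for illustration: `top_n` is a hypothetical helper name, and this is not the Beam Top transform or its API.

```python
import heapq


def top_n(values, n, key=None):
    """Return the n greatest items of `values`, largest first.

    Hypothetical helper for illustration only; it mimics the idea of a
    top-n selection and is not part of Apache Beam.
    """
    # heapq.nlargest keeps at most n candidates in a heap while scanning,
    # so the whole input never has to be sorted.
    return heapq.nlargest(n, values, key=key)


scores = [5, 1, 9, 3, 7]
largest_three = top_n(scores, 3)
longest_word = top_n(["a", "ccc", "bb"], 1, key=len)
```

In a pipeline the selection would be done by the transform linked above rather than by a local helper like this one.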

Re: Enable state cache in Python SDK

2023-10-16 Thread Anand Inguva via dev
My bad! It was on viewer access when I shared. I updated the doc access to commenters now. On Mon, Oct 16, 2023 at 12:26 PM Anand Inguva wrote: > Hello, > > In Python SDK, the user state and side input caching is disabled by > default for all the runners except FnAPI direct runner (intended for

Enable state cache in Python SDK

2023-10-16 Thread Anand Inguva via dev
Hello, In Python SDK, the user state and side input caching is disabled by default for all the runners except FnAPI direct runner (intended for testing purposes). I would like to propose that we enable the state cache for the Python SDK similar to other SDKs. I created a doc[1] on why we need to

Re: Proposal for pyproject.toml Support in Apache Beam Python

2023-10-12 Thread Anand Inguva via dev
>> On Thu, Oct 12, 2023 at 4:01 PM Robert Bradshaw >> wrote: >> >>> Does this change any development practices? E.g. if I clone the repo, >>> I'm assuming I couldn't run "setup.py test" anymore. What about the >>> generated files (like proto

Re: Proposal for pyproject.toml Support in Apache Beam Python

2023-10-12 Thread Anand Inguva via dev
tup.py test" anymore. What about the generated > files (like protos, or the yaml definitions copied from other parts of the > repo)? > > On Thu, Oct 12, 2023 at 12:27 PM Anand Inguva via dev > wrote: > >> The PR https://github.com/apache/beam/pull/28385 is merged toda

Re: Proposal for pyproject.toml Support in Apache Beam Python

2023-10-12 Thread Anand Inguva via dev
>>> +1 >>> Hi Anand, >>> I appreciate this effort. Managing python dependencies has been a major >>> pain point for me, and I think this approach would help. >>> Kerry >>> >>> On Mon, Aug 28, 2023 at 10:14 AM Anand Inguva via

Re: [VOTE] Release 2.51.0, release candidate #1

2023-10-09 Thread Anand Inguva via dev
There was a regression[1] in the latest fastavro release, 1.8.4. The fix was merged at https://github.com/apache/beam/pull/28896. RC1 includes that version in the range for fastavro[2]. I think we need to cherry-pick https://github.com/apache/beam/pull/28896 to solve the fastavro regression. [1]

Re: [ANNOUNCE] New PMC Member: Alex Van Boxel

2023-10-03 Thread Anand Inguva via dev
Congratulations. On Tue, Oct 3, 2023 at 2:50 PM Jack McCluskey via dev wrote: > Congrats, Alex! > > On Tue, Oct 3, 2023 at 2:49 PM XQ Hu via dev wrote: > >> Configurations, Alex! >> >> On Tue, Oct 3, 2023 at 2:40 PM Kenneth Knowles wrote: >> >>> Hi all, >>> >>> Please join me and the rest of

Re: [ANNOUNCE] New PMC Member: Robert Burke

2023-10-03 Thread Anand Inguva via dev
Congratulations!! On Tue, Oct 3, 2023 at 2:49 PM XQ Hu via dev wrote: > Congratulations, Robert! > > On Tue, Oct 3, 2023 at 2:40 PM Kenneth Knowles wrote: > >> Hi all, >> >> Please join me and the rest of the Beam PMC in welcoming Robert Burke < >> lostl...@apache.org> as our newest PMC

Re: [ANNOUNCE] New PMC Member: Valentyn Tymofieiev

2023-10-03 Thread Anand Inguva via dev
Congratulations!! On Tue, Oct 3, 2023 at 2:49 PM XQ Hu via dev wrote: > Congratulations, Valentyn! > > On Tue, Oct 3, 2023 at 2:40 PM Kenneth Knowles wrote: > >> Hi all, >> >> Please join me and the rest of the Beam PMC in welcoming Valentyn >> Tymofieiev as our newest PMC member. >> >>

Re: Proposal for pyproject.toml Support in Apache Beam Python

2023-09-01 Thread Anand Inguva via dev
hon dependencies has been a major >> pain point for me, and I think this approach would help. >> Kerry >> >> On Mon, Aug 28, 2023 at 10:14 AM Anand Inguva via dev < >> dev@beam.apache.org> wrote: >> >>> Hello Beam Dev Team, >>> >>> I've co

Proposal for pyproject.toml Support in Apache Beam Python

2023-08-28 Thread Anand Inguva via dev
Hello Beam Dev Team, I've compiled a design document [1] proposing the integration of pyproject.toml into Apache Beam's Python build process. Your insights and feedback would be

Re: [ANNOUNCE] New committer: Ahmed Abualsaud

2023-08-24 Thread Anand Inguva via dev
Congratulations Ahmed :) On Fri, Aug 25, 2023 at 1:17 AM Damon Douglas wrote: > Well deserved! Congratulations, Ahmed! I'm so happy for you. > > On Thu, Aug 24, 2023, 5:46 PM Byron Ellis via dev > wrote: > >> Congratulations! >> >> On Thu, Aug 24, 2023 at 5:34 PM Robert Burke wrote: >> >>>

Re: [DISCUSS] Enable Github Discussions?

2023-07-01 Thread Anand Inguva via dev
+1 for GitHub discussions as well. But I am also a little concerned about having multiple places for discussions. As Danny said, if we have a good plan for how and when to archive the current mailing list, that would be great. Thanks, Anand On Sat, Jul 1, 2023, 3:21 AM Damon Douglas

Re: [Notice] Jenkins seed job comment trigger no longer working, and possible solutions

2023-05-11 Thread Anand Inguva via dev
+1 to add committers to the list manually. Thanks Yi for doing this. On Thu, May 11, 2023 at 11:48 AM Danny McCormick via dev < dev@beam.apache.org> wrote: > I'm +1 on just adding committers to a list manually. Having the ability to > run seed jobs from a PR is nice, but adding a new committer

Re: [PROPOSAL] Preparing for 2.48.0 Release

2023-05-09 Thread Anand Inguva via dev
greed upon last time we discussed the version > support policy on dev@. > > On Fri, May 5, 2023 at 6:18 PM Robert Bradshaw via dev < > dev@beam.apache.org> wrote: > >> On Fri, May 5, 2023 at 6:27 AM Anand Inguva via dev >> wrote: >> > >> > >

Introducing beam.MLTransform

2023-05-09 Thread Anand Inguva via dev
Hi all, In Apache Beam, we plan to introduce a *beam.MLTransform* for carrying out common ML centric processing tasks. Using the tensorflow_transform as the backend, we will introduce several data processing transforms in Beam. These can be easily utilized by simply wrapping them with the

Re: [VOTE] Release 2.47.0, release candidate #3

2023-05-05 Thread Anand Inguva via dev
+1 (non-binding) Tested python quick start guide on Dataflow runner with Python 3.11. Thanks, Anand On Thu, May 4, 2023 at 10:53 PM Jack McCluskey via dev wrote: > Hi everyone, > > Please review and vote on the release candidate #3 for the version 2.47.0, > as follows: > [ ] +1, Approve the

Re: [PROPOSAL] Preparing for 2.48.0 Release

2023-05-05 Thread Anand Inguva via dev
via dev < > dev@beam.apache.org> wrote: > >> I'd suggest shooting for 2.48.0 so we're ahead of the end-of-support >> date. We're also supporting 5 different Python versions in 2.47.0, it's >> probably for the best to try and pare that down. >> >> On Thu, May 4, 2023

Re: [PROPOSAL] Preparing for 2.48.0 Release

2023-05-04 Thread Anand Inguva via dev
Thanks Ritesh!! Python 3.7 support is going to end on June 27th 2023. Beam 2.48.0 may get released ~1-2 weeks before that date. My question here is: should we target 2.48.0 or 2.49.0 to stop supporting Python 3.7 in Beam? Thanks, Anand On Wed, May 3, 2023 at 10:25 PM Jeff Zhang wrote: >

Re: [ANNOUNCE] New committer: Anand Inguva

2023-04-21 Thread Anand Inguva via dev
Thanks everyone. Really excited to be a part of Beam Committers. On Fri, Apr 21, 2023 at 3:07 PM XQ Hu via dev wrote: > Congratulations, Anand!!! > > On Fri, Apr 21, 2023 at 2:31 PM Jack McCluskey via dev < > dev@beam.apache.org> wrote: > >> Congratulations, Anand! >> >> On Fri, Apr 21, 2023 at

Re: [Python SDK] Use pre-released dependencies for Beam python unit testing

2023-04-20 Thread Anand Inguva via dev
Wed, Apr 12, 2023 at 10:46 AM Danny McCormick via dev < >>>> dev@beam.apache.org> wrote: >>>> >>>>> Thanks for doing this Anand, I'm +1 on option 1 as well - I think >>>>> having the clear signal of the normal suite succeeding and the prerelea

Re: Python 3.11 support in Apache Beam

2023-04-13 Thread Anand Inguva via dev
thon 3.11 is an average of 25% faster than CPython 3.10 as measured >>> with the pyperformance benchmark suite, when compiled with GCC on Ubuntu >>> Linux. Depending on your workload, the overall speedup could be 10-60%." >>> >>> Have we measured this in Beam? A

Re: Python 3.11 support in Apache Beam

2023-04-13 Thread Anand Inguva via dev
the overall speedup could be 10-60%." > > Have we measured this in Beam? Are we seeing any benefits? If not, why? If > yes, this would be a cool blog post as well. > > Ahmet > > > On Wed, Apr 5, 2023 at 1:12 PM Anand Inguva via dev > wrote: > >> Python 3.1

Re: [Python SDK] Use pre-released dependencies for Beam python unit testing

2023-04-12 Thread Anand Inguva via dev
y. That makes it really easy to treat the prerelease suite as a (at >>> least temporary) signal on needing upper bounds on our dependencies. >>> >>> Thanks, >>> Danny >>> >>> On Wed, Apr 12, 2023 at 12:36 AM Anand Inguva via dev < >>

[Python SDK] Use pre-released dependencies for Beam python unit testing

2023-04-11 Thread Anand Inguva via dev
Hi all, For Apache Beam Python we are considering using pre-released dependencies for unit testing by using the --pre flag to install pre-released dependencies of packages. We believe that using pre-released dependencies may help us to identify and resolve bugs more quickly, and to take

Re: Python 3.11 support in Apache Beam

2023-04-05 Thread Anand Inguva via dev
>>> https://github.com/apache/beam/issues/24569. >>>> >>>> If we use older versions of these packages, then we have to depend on >>>>> installing those packages on Python 3.11 from source distributions which >>>>> is >>>>

Re: Beam Website Feedback

2023-04-04 Thread Anand Inguva via dev
Hi, You can follow a similar pattern to this https://github.com/apache/beam/blob/c6f8e0be1a51a32730754934c3f40eaae39d0f98/sdks/python/apache_beam/metrics/metric_test.py#L130 . ``` with self.assertRaises(ValueError): ``` On Tue, Apr 4, 2023 at 7:24 PM Grace Young via dev wrote: > How
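The `assertRaises` pattern mentioned above can be sketched as a complete, self-contained test. This is a minimal illustration, not the linked Beam test: `parse_positive` and `ParsePositiveTest` are hypothetical names made up for the example.

```python
import unittest


def parse_positive(value):
    """Hypothetical helper: convert to int and reject non-positive numbers."""
    number = int(value)
    if number <= 0:
        raise ValueError(f"expected a positive integer, got {number}")
    return number


class ParsePositiveTest(unittest.TestCase):
    def test_rejects_non_positive(self):
        # The test passes only if the body of the `with` block raises ValueError.
        with self.assertRaises(ValueError):
            parse_positive("-3")

    def test_accepts_positive(self):
        self.assertEqual(parse_positive("7"), 7)
```

Saved in a file, this can be run with `python -m unittest` or with pytest; if the code inside the `with` block does not raise, the test fails.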

Re: Regarding Project proposal review and feedback

2023-04-02 Thread Anand Inguva via dev
I left some comments on the sentiment analysis proposal. Thanks, Anand On Thu, Mar 30, 2023 at 9:59 AM Danny McCormick via dev wrote: > Thanks Siddharth! I left some comments on the sentiment analysis proposal, > I am probably not the best person to comment on the flink datastream api > one

Re: Project Proposal

2023-03-23 Thread Anand Inguva via dev
Hi, Thanks for the proposal. Can you share the google doc link for your proposal? It would be easier to go back and forth on reviews. I am happy to review it and provide feedback on it. Thanks, Anand On Sun, Mar 19, 2023 at 5:03 PM Siddharth Aryan (via Google Docs) <

Update on Protobuf and GCP packages for Apache Beam Python SDK

2023-03-15 Thread Anand Inguva via dev
Hi, For Apache Beam Python SDK, we updated the protobuf version to 'protobuf>=4.21.1,<4.23.0' from 'protobuf>3.12.2,<4' as Protobuf had a major upgrade in May 2022 https://protobuf.dev/news/2022-05-06/. This will take effect on the Beam 2.47.0 release. A tighter bound was placed on protobuf to
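As a rough illustration of what the bound 'protobuf>=4.21.1,<4.23.0' admits, the sketch below checks plain dotted release numbers against that range. It is an assumption-laden simplification: `parse_release` and `in_protobuf_range` are hypothetical helpers, and pre-release or post-release version suffixes are not handled the way pip handles them.

```python
def parse_release(version):
    """Turn a plain dotted release such as '4.21.1' into a 3-tuple of ints.

    Hypothetical helper; it only understands numeric components.
    """
    parts = [int(part) for part in version.split(".")]
    # Pad short versions ('4.23' -> 4.23.0) so tuple comparison is fair.
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def in_protobuf_range(version, lower="4.21.1", upper="4.23.0"):
    """Return True when lower <= version < upper (lower inclusive, upper exclusive)."""
    return parse_release(lower) <= parse_release(version) < parse_release(upper)


for candidate in ["3.20.3", "4.21.1", "4.22.5", "4.23.0"]:
    print(candidate, in_protobuf_range(candidate))
```

The lower bound is inclusive and the upper bound is exclusive, so a 4.22.x release satisfies the range while 4.23.0 and any 3.x release do not.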

Re: [RESULT] [VOTE] Release 2.46.0, release candidate #1

2023-03-08 Thread Anand Inguva via dev
Thanks Danny!! On Wed, Mar 8, 2023 at 12:14 PM Danny McCormick via dev wrote: > I'm happy to announce that we have unanimously approved release 2.46.0 > There are 8 approving votes, 5 of which are binding: * Robert Bradshaw > (binding) * Chamikara Jayalath (binding) * Ahmet Altay (binding) *

Re: [VOTE] Release 2.46.0, release candidate #1

2023-03-03 Thread Anand Inguva via dev
+1 (non-binding) Tested python wordcount quick start https://beam.apache.org/get-started/quickstart-py/ on Direct Runner and Dataflow Runner. Thanks! On Fri, Mar 3, 2023 at 11:21 AM Bruno Volpato via dev wrote: > +1 (non-binding) > > Tested with

Re: Python 3.11 support in Apache Beam

2023-02-21 Thread Anand Inguva via dev
e/beam/pull/24599 but I think this issue should >>> be a blocker for Python 3.11 update. >>> >>> On Tue, Feb 7, 2023 at 5:25 PM Valentyn Tymofieiev >>> wrote: >>> >>>> Hi Anand, >>>> >>>> On Tue, Feb 7, 2023 at 1:35 PM Ana

Re: [ANNOUNCE] New PMC Member: Jan Lukavský

2023-02-16 Thread Anand Inguva via dev
Congratulations!! On Thu, Feb 16, 2023 at 12:42 PM Chamikara Jayalath via dev < dev@beam.apache.org> wrote: > Congrats Jan! > > On Thu, Feb 16, 2023 at 8:35 AM John Casey via dev > wrote: > >> Thanks Jan! >> >> On Thu, Feb 16, 2023 at 11:11 AM Danny McCormick via dev < >> dev@beam.apache.org>

Re: Beam Website Feedback

2023-02-10 Thread Anand Inguva via dev
Some of the imports are missing. ``` # try importing import unittest import apache_beam as beam # rest of the code. ``` Try running with pytest as *pytest test.py* and it should work. Thanks, Anand On Fri, Feb 10, 2023 at 10:09 AM Julian Ogando via dev wrote: > Hi, > I'm reading the

Re: Python 3.11 support in Apache Beam

2023-02-09 Thread Anand Inguva via dev
ions which is >> not desired. >> >> I am working in parallel on that issue in a different PR >> https://github.com/apache/beam/pull/24599 but I think this issue should >> be a blocker for Python 3.11 update. >> >> On Tue, Feb 7, 2023 at 5:25 PM Valentyn Tymofiei

Re: Python 3.11 support in Apache Beam

2023-02-07 Thread Anand Inguva via dev
2023 at 1:35 PM Anand Inguva via dev > wrote: > >> Hi all, >> >> We are planning to work on adding support for Python 3.11[1] to Apache >> Beam Python SDK. >> >> As part of this effort, we are going to update the python build >> dependencies defined

Python 3.11 support in Apache Beam

2023-02-07 Thread Anand Inguva via dev
Hi all, We are planning to work on adding support for Python 3.11[1] to Apache Beam Python SDK. As part of this effort, we are going to update the Python build dependencies defined at [2]. Right now, there is an error with the newer version of protobuf (4.21.11). It is not generating _urn files.

Re:

2023-02-01 Thread Anand Inguva via dev
Hi, You can send an email to dev-subscr...@beam.apache.org instead to subscribe to the dev list. Thanks, Anand On Wed, Feb 1, 2023 at 6:52 PM Anand Inguva wrote: > Hi, > > You can send an email to dev-subscr...@beam.apache.org instead to > subscribe the dev list. > > Thanks, > Anand > > On

Re:

2023-02-01 Thread Anand Inguva via dev
Hi, You can send an email to dev-subscr...@beam.apache.org instead to subscribe the dev list. Thanks, Anand On Wed, Feb 1, 2023 at 6:41 PM Martin Chi wrote: > Hi, > > I want to subscribe it. >

Re: [RFC] Tensorflow model handler in Beam Repository

2023-01-26 Thread Anand Inguva via dev
Thanks Ritesh for the proposal. Left a couple of comments. On Thu, Jan 26, 2023 at 11:04 AM Danny McCormick via dev < dev@beam.apache.org> wrote: > Thanks Ritesh! I left a couple comments, but overall this looks like a > great proposal! > > On Thu, Jan 26, 2023 at 10:43 AM Ritesh Ghorse via dev <

Streaming model updates for the RunInference transform.

2022-11-21 Thread Anand Inguva via dev
Hi, I created a doc [1] on a feature that I am working on for the RunInference

Re: [VOTE] Release 2.43.0, release candidate #2

2022-11-14 Thread Anand Inguva via dev
+1 (non-binding) Validated Python wordcount example on Direct and Dataflow runner. Staging of the Python dependencies works as expected now. Thanks, Anand On Sun, Nov 13, 2022 at 9:52 AM Chamikara Jayalath via dev < dev@beam.apache.org> wrote: > Hi everyone, > Please review and vote on the

Re: [VOTE] Release 2.43.0, release candidate #1

2022-11-11 Thread Anand Inguva via dev
Thanks Valentyn for catching this. There is a PR[1] in flight, almost ready to be merged with the fix. Thanks, Anand PR: https://github.com/apache/beam/pull/24114 On Thu, Nov 10, 2022 at 8:46 PM Chamikara Jayalath wrote: > Ack. Thanks for finding this. > > - Cham > > On Thu, Nov 10, 2022

Re: [VOTE] Release 2.43.0, release candidate #1

2022-11-10 Thread Anand Inguva via dev
+1 (non-binding) validated Python SDK QuickStart, Beam RunInference examples on Direct and Dataflow Runner. Also, verified the Python 3.10 artifacts. On Wed, Nov 9, 2022 at 1:40 PM Chamikara Jayalath via dev < dev@beam.apache.org> wrote: > Ack. There's another potential cherry-pick here: >

Re: [ANNOUNCE] New committer: Yi Hu

2022-11-09 Thread Anand Inguva via dev
Congratulations Yi! On Wed, Nov 9, 2022 at 1:35 PM Ritesh Ghorse via dev wrote: > Congratulations Yi! > > On Wed, Nov 9, 2022 at 1:34 PM Ahmed Abualsaud via dev < > dev@beam.apache.org> wrote: > >> Congrats Yi! >> >> On Wed, Nov 9, 2022 at 1:33 PM Sachin Agarwal via dev < >>

Re: [ANNOUNCE] New committer: Ritesh Ghorse

2022-11-03 Thread Anand Inguva via dev
Congratulations Ritesh. On Thu, Nov 3, 2022 at 7:51 PM Yi Hu via dev wrote: > Congratulations Ritesh! > > On Thu, Nov 3, 2022 at 7:23 PM Byron Ellis via dev > wrote: > >> Congratulations! >> >> On Thu, Nov 3, 2022 at 4:21 PM Austin Bennett < >> whatwouldausti...@gmail.com> wrote: >> >>>

Regression Alerts for Python Performance tests

2022-11-02 Thread Anand Inguva via dev
Hi, Python load tests/perf tests compute metrics and publish those metrics to Grafana for visualization. As of now, there is no automated way of creating alerts (GitHub issues) when a performance regression is detected in the perf tests. I created a doc[1] which outlines an API to perform Change

Avoid breaking change when adding an additional parameter to the RunInference PTransform

2022-09-29 Thread Anand Inguva via dev
Hi, Recently I encountered a breaking change when adding an additional parameter to the RunInference transform[1] and ModelHandler[2] in pull/23266. I explained what happened in a doc[3] and provided a few suggestions on how to avoid this kind of

Re: Pass Custom namespaces to RunInference metrics

2022-09-13 Thread Anand Inguva via dev
s should be on the user to add an >> appropriate grouping prefix such as 'RunInference_' in front of all the >> custom namespaces. >> >> Best, >> Andy >> >> On Mon, Sep 12, 2022 at 1:31 PM Anand Inguva via dev >> wrote: >> >>> Hi al

Pass Custom namespaces to RunInference metrics

2022-09-12 Thread Anand Inguva via dev
Hi all, I created a doc [1] to outline a few solutions on how to pass a custom namespace to the RunInference[2] transform for better tracking and usage of metrics. Please go through it and let me know

Benchmark tests for the Beam RunInference API

2022-08-16 Thread Anand Inguva via dev
Hi, I created a doc [1] which outlines the plan for the RunInference API[2] benchmark/performance tests. I would appreciate feedback on the following, - Models used for the benchmark tests. - Metrics