Re: ByteBuddy DoFnInvokers Write Up

2024-01-11 Thread Ismaël Mejía
Neat! I remember spending a long time trying to decipher the DoFnInvoker behavior, so this will definitely be helpful. It may be a good idea to add the link to the Design Documents list for future reference: https://cwiki.apache.org/confluence/display/BEAM/Design+Documents On Wed, Jan 10, 2024 at 9:15 PM

Re: Lakehouse Formats with IO/Integration --> Hudi? Iceberg?

2023-11-07 Thread Ismaël Mejía
For Iceberg there is a long-open issue and some WIP for a sink: https://github.com/apache/beam/issues/20327 On Tue, Nov 7, 2023 at 2:08 AM Austin Bennett wrote: > Beam Devs, > > I was looking through GH Issue and online more generally and hadn't seen > much... Has anyone written

Re: [ANNOUNCE] New PMC Member: Alex Van Boxel

2023-10-05 Thread Ismaël Mejía
Congratulations Alex, well deserved! On Wed, Oct 4, 2023 at 11:59 PM Chamikara Jayalath wrote: > Congrats Alex! > > On Wed, Oct 4, 2023 at 1:43 AM Jan Lukavský wrote: > >> Congrats Alex! >> On 10/4/23 10:29, Alexey Romanenko wrote: >> >> Congrats Alex, very well deserved! >> >> — >> Alexey >>

Re: [ANNOUNCE] New PMC Member: Robert Burke

2023-10-05 Thread Ismaël Mejía
Congratulations Robert, well deserved ! long live go ! On Wed, Oct 4, 2023 at 11:58 PM Chamikara Jayalath wrote: > Congrats Rebo! > > On Wed, Oct 4, 2023 at 1:42 AM Jan Lukavský wrote: > >> Congrats Robert! >> On 10/4/23 10:29, Alexey Romanenko wrote: >> >> Congrats Robert, very well deserved!

Re: [ANNOUNCE] New PMC Member: Valentyn Tymofieiev

2023-10-05 Thread Ismaël Mejía
Congratulations Valentyn, well deserved ! On Wed, Oct 4, 2023 at 11:58 PM Chamikara Jayalath wrote: > Congrats Valentyn! > > On Wed, Oct 4, 2023 at 1:42 AM Jan Lukavský wrote: > >> Congrats Valentyn! >> On 10/4/23 10:26, Alexey Romanenko wrote: >> >> Congrats Valentyn, very well deserved! >>

Re: Automatic signing of releases

2023-08-24 Thread Ismaël Mejía
published some more formal documentation/process around this. Previously I > had to ask the VP of Security for special permission to do this  > > Thanks, > Danny > > On Thu, Aug 24, 2023 at 10:48 AM Ismaël Mejía wrote: > >> Hi, >> >> I just saw an interesting cha

Automatic signing of releases

2023-08-24 Thread Ismaël Mejía
Hi, I just saw an interesting change on the ASF side that could be of interest for Beam releases. The ASF now allows releases to be signed by automated infrastructure. https://issues.apache.org/jira/browse/LEGAL-647 This is a good step for automation that I remember we discussed at the

Re: FOSDEM 2023 is back as in person event

2022-10-21 Thread Ismaël Mejía
Hi Aizhamal, You might be interested in this thread where the ASF people are also discussing FOSDEM participation. https://lists.apache.org/thread/kv4fhldmc9mo6v5lwtkwqtwg97l64lx1 It seems the call for devrooms is closed so maybe it is too late for Beam, but we have had talks in the past

Re: [DISCUSS] Jenkins -> GitHub Actions ?

2022-10-21 Thread Ismaël Mejía
+1 Github Actions are more intuitive and easy to modify and test for everyone. Also Beam wins because that makes one less system to maintain. Regards, Ismaël On Wed, Oct 19, 2022 at 5:50 PM Danny McCormick via dev wrote: > > Thanks for kicking this conversation off. I'm +1 on migrating, but

Re: unvendoring bytebuddy

2022-03-17 Thread Ismaël Mejía
+1 It is probably worth checking whether we have dependencies that rely on Byte Buddy and could produce conflicts, but I doubt it. My only worry was ASM leaking into the classpath, but it seems that Byte Buddy already shades ASM so that should not be an issue. Ismaël On Thu, Mar 17, 2022 at 5:09 PM Liam

Re: Beam job details not available on Spark History Server

2022-02-24 Thread Ismaël Mejía
Hello Jozef, this change was not introduced in the PR you referenced; that PR was just a refactor. The conflicting change was added in [1] via [2] starting with Beam 2.29.0. It is not clear to me why this was done, but maybe Kyle Weaver or someone else has better context. Let's continue the

Re: Are runners-spark2 and beam-site branches obsolete?

2022-02-16 Thread Ismaël Mejía
**beam-site** is there for legacy reasons I suppose we can remove them without any consequence. Most of the history is in the other repo and the actual site in the master branch. **runners-spark2** I think we can go ahead and remove it, this was the in-progress work of Amit Sela who has not been

Re: Re: [Question][Contribution] Python SDK ByteKeyRange

2022-02-15 Thread Ismaël Mejía
Oh, forgot to add also the link to the tests that cover most of those unexpected cases: [2] https://github.com/apache/beam/blob/master/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/ByteKeyRangeTrackerTest.java On Tue, Feb 15, 2022 at 10:17 AM Ismaël Mejía wrote

Re: Re: [Question][Contribution] Python SDK ByteKeyRange

2022-02-15 Thread Ismaël Mejía
Great idea, please take a look at the Java ByteKeyRestrictionTracker implementation for consistency [1] I remember we had to deal with lots of corner cases so probably worth a look. [1]

Re: Developing on an M1 Mac

2022-02-09 Thread Ismaël Mejía
g the images >> and running our tests (using self-hosted runners). >> As soon as I get it I will be able to share the code/experiences with you. >> >> J >> >> On Tue, Feb 8, 2022 at 2:50 PM Ismaël Mejía wrote: >>> >>> For awareness with the

Re: Developing on an M1 Mac

2022-02-08 Thread Ismaël Mejía
For awareness with the just released Beam 2.36.0 Beam works out of the box to develop on a Mac M1. I tried Java and Python pipelines with success running locally on both Flink/Spark runner. I found one issue using zstd and created [1] that was merged today, with this the sdks:core tests and Spark

Re: [ANNOUNCE] Apache Beam 2.36.0 Release

2022-02-08 Thread Ismaël Mejía
Great work Emily and everyone! I am glad to see that with the dependency updates this is the first Beam release that works correctly out of the box on ARM64. I tried some hello world examples on a Mac M1 with both Java and Python and they work fine. Ismaël On Tue, Feb 8, 2022 at 9:49 AM Jarek

Re: [DISCUSS] propdeps removal and what to do going forward

2022-01-13 Thread Ismaël Mejía
Optional dependencies should not be a major issue. What matters to validate that we are not breaking users is to compare the generated POM files with the previous (pre gradle 7 / 2.35.0) version and see that what was provided is still provided. In particular the Hadoop/Spark and Kafka

Re: Contributor permission for Beam Jira tickets

2021-11-27 Thread Ismaël Mejía
Done, I assigned the issue to you too. Welcome to Beam! On Sat, Nov 27, 2021 at 12:53 AM Alexander Dahl wrote: > > Hi, > > My name is Alex, working as a data engineer at ICA, a large Swedish retailer. > At work I'm writing beam code for Google Cloud Dataflow jobs. I want to > contribute to the

Re: Performance tests dashboard not working

2021-07-29 Thread Ismaël Mejía
at 9:41 PM Ismaël Mejía wrote: > > Seems to be a networking issue in my side, they fail on Firefox for > some weird timeout but they work perfectly on Chrome. > Thanks for confirming Andrew > > On Wed, Apr 21, 2021 at 6:45 PM Andrew Pilloud wrote: > > > > Looks like i

Re: [PROPOSAL] Vendored gRPC 1.36.0 0.2 Release

2021-06-30 Thread Ismaël Mejía
+1 I just merged Tomo's security fix so this should be ready to go On Wed, Jun 30, 2021 at 12:26 AM Luke Cwik wrote: > > Sounds good, will wait for you PR. > > On Tue, Jun 29, 2021 at 2:35 PM Tomo Suzuki wrote: >> >> I have this https://issues.apache.org/jira/browse/BEAM-12422 "Vendored gRPC

Re: Beam Contributor Request

2021-06-29 Thread Ismaël Mejía
Hello Jack, You were added to the contributors group so you can self assign tickets. Welcome to Beam! Ismaël On Mon, Jun 28, 2021 at 9:35 PM Jack McCluskey wrote: > Hey everyone, > > My name is Jack McCluskey and I'd like to be added to the contributor > list. My Jira username is

Re: Apache Beam Contributor List Request

2021-06-29 Thread Ismaël Mejía
Hello Marco, You were added to the contributor group so you can now self-assign JIRA tickets. Welcome to Beam! Ismaël On Mon, Jun 28, 2021 at 6:58 PM Marco Robles Pulido wrote: > > Hi team, > > This is Marco and I would like to be added to the Apache Beam contributor > list, my username is

Re: [Proposal] Go SDK Exits Experimental

2021-06-17 Thread Ismaël Mejía
Oops, I forgot to write one question. Will this come with revamped website instructions/doc for golang too? On Thu, Jun 17, 2021 at 3:21 PM Ismaël Mejía wrote: > > Huge +1 > > This is definitely something many people have asked about, so it is > great to see it finally happening. >

Re: [Proposal] Go SDK Exits Experimental

2021-06-17 Thread Ismaël Mejía
Huge +1 This is definitely something many people have asked about, so it is great to see it finally happening. On Wed, Jun 16, 2021 at 7:56 PM Kenneth Knowles wrote: > > +1 awesome > > On Wed, Jun 16, 2021 at 10:33 AM Robert Burke wrote: >> >> Sounds reasonable to me. I agree. We'll aim to get

Re: Multiple architectures support on Beam (ARM)

2021-06-10 Thread Ismaël Mejía
cross-compile. That would suss out >> whether we're inadvertently taking on any incompatible dependencies.) >> >>>> >> >>>> Theoretically, if one does that and manually specifies the >> container, it could just work for Python (assuming no wheel

Re: contributor permission for Beam Jira tickets

2021-06-10 Thread Ismaël Mejía
Hello Pascal, I added you as a contributor so you can now self assign issues if you want. I assigned BEAM-12471 to you since I saw you opened a PR to fix it. Best, Ismaël On Wed, Jun 9, 2021 at 11:05 PM Pascal Gillet wrote: > Hi, > > This is Pascal. I identified some little but nonetheless

Re: Beam SNAPSHOTS not working since friday

2021-06-08 Thread Ismaël Mejía
Just to finish this thread I double checked and the SNAPSHOTs are published correctly now. On Tue, Jun 8, 2021 at 5:31 PM Ismaël Mejía wrote: > Thanks for clarifying Brian. So we shall wait for Infra then > > On Tue, Jun 8, 2021, 5:15 PM Brian Hulette wrote: > >> You may

Re: Beam SNAPSHOTS not working since friday

2021-06-08 Thread Ismaël Mejía
org/thread.html/r658cdfa643c44a3fa18c226238e537ad221c8f65337f0eab3ad6dad9%40%3Cdev.beam.apache.org%3E > [2] https://issues.apache.org/jira/browse/INFRA-21976 > > On Tue, Jun 8, 2021 at 12:53 AM Ismaël Mejía wrote: > >> While trying to check on the new 2.32.0-SNAPSHOTs this morning I noticed >> that the daily

Beam SNAPSHOTS not working since friday

2021-06-08 Thread Ismaël Mejía
While trying to check on the new 2.32.0-SNAPSHOTs this morning I noticed that the daily SNAPSHOTs have not been updating since last friday: https://ci-beam.apache.org/job/beam_Release_NightlySnapshot/ https://repository.apache.org/content/groups/snapshots/org/apache/beam/beam-sdks-java-core/ Can

[DISCUSS] Drop support for Flink 1.10

2021-05-28 Thread Ismaël Mejía
Hello, With Beam support for Flink 1.13 just merged it is the time to discuss the end of support for Flink 1.10 following the agreed policy on supporting only the latest three Flink releases [1]. I would like to propose that for Beam 2.31.0 we stop supporting Flink 1.10 [2]. I prepared a PR for

Re: [VOTE] Vendored Dependencies Release Byte Buddy 1.11.0

2021-05-20 Thread Ismaël Mejía
I'm happy to announce that we have unanimously approved this release. There are 7 approving votes, 4 of which are binding: * Pablo Estrada * Etienne Chauchot * Jean-Baptiste Onofre * Ismaël Mejía There are no disapproving votes. Thanks everyone! On Thu, May 20, 2021 at 9:17 PM Ismaël Mejía

Re: [VOTE] Vendored Dependencies Release Byte Buddy 1.11.0

2021-05-20 Thread Ismaël Mejía
+1 (binding)

Re: [VOTE] Vendored Dependencies Release Byte Buddy 1.11.0

2021-05-19 Thread Ismaël Mejía
ine whether this upgrade > is going to cause problems or not. Are there tests I should look at, or > some validation I should perform? > > On Wed, May 19, 2021 at 11:29 AM Ismaël Mejía wrote: > >> Kind reminder, the vote is ongoing >> >> On Mon, May 17, 2021 at 5:32 P

Re: [VOTE] Vendored Dependencies Release Byte Buddy 1.11.0

2021-05-19 Thread Ismaël Mejía
Kind reminder, the vote is ongoing On Mon, May 17, 2021 at 5:32 PM Ismaël Mejía wrote: > Please review the release of the following artifacts that we vendor: > * beam-vendor-bytebuddy-1_11_0 > > Hi everyone, > Please review and vote on the release candidate #1 for the version 0.

[VOTE] Vendored Dependencies Release Byte Buddy 1.11.0

2021-05-17 Thread Ismaël Mejía
Please review the release of the following artifacts that we vendor: * beam-vendor-bytebuddy-1_11_0 Hi everyone, Please review and vote on the release candidate #1 for the version 0.1, as follows: [ ] +1, Approve the release [ ] -1, Do not approve the release (please provide specific comments)

Re: [PROPOSAL] Vendored bytebuddy dependency release

2021-05-17 Thread Ismaël Mejía
e time it took to generate bytecode. While this has > minimal impact on real pipelines (since bytecode is generated on worker > startup), it has an outsized impact on microbenchmark run time. > > On Wed, May 12, 2021 at 5:55 AM Ismaël Mejía wrote: > >> Testing this particula

Github Actions requires to approve the CI jobs for new contributors

2021-05-17 Thread Ismaël Mejía
For awareness: GitHub Actions now requires approval of the CI runs for new contributors, so this is a call for committers to pay attention to this and help when they see PRs from new users, so those PRs don't sit waiting without running.

Is implementing DisplayData on Beam Transforms worth?

2021-05-12 Thread Ismaël Mejía
Running a pipeline on Dataflow I noticed it was not showing the 'display data' of ParquetIO on the Dataflow UI, after digging deeper I found that composite transforms are not shown on Dataflow. BEAM-366 Support Display Data on Composite Transforms https://issues.apache.org/jira/browse/BEAM-366 I

Re: [PROPOSAL] Vendored bytebuddy dependency release

2021-05-12 Thread Ismaël Mejía
es wrote: > >> If nothing breaks, and we check perf, then absolutely this seems good. >> >> Kenn >> >> On Mon, May 10, 2021 at 12:38 AM Ismaël Mejía wrote: >> >>> Most issues on the previous migration were related to changes on >>> beha

Re: [DISCUSS] Enable automatic dependency updates with Github's dependabot

2021-05-12 Thread Ismaël Mejía
> https://github.com/apache/beam/blob/master/sdks/python/container/base_image_requirements.txt > > [2] > https://github.com/apache/beam/blob/985e2f095d150261e998f58cf048e48a909d5b2b/sdks/python/tox.ini#L231 > > On Fri, Apr 16, 2021 at 7:16 AM Ismaël Mejía wrote: > >> Oh forgo

Re: LGPL-2.1 in beam-vendor-grpc

2021-05-10 Thread Ismaël Mejía
, 2021 at 2:46 PM Ismaël Mejía wrote: > We have been discussing about updating the vendored dependency in > BEAM-11227 <https://issues.apache.org/jira/browse/BEAM-11227>, if I > remember correctly the newer version of gRPC does not require the jboss > dependency, so probably is the

Re: LGPL-2.1 in beam-vendor-grpc

2021-05-10 Thread Ismaël Mejía
We have been discussing about updating the vendored dependency in BEAM-11227 , if I remember correctly the newer version of gRPC does not require the jboss dependency, so probably is the best upgrade path, can you confirm Tomo Suzuki

Re: Upgrading vendored gRPC from 1.26.0 to 1.36.0

2021-05-10 Thread Ismaël Mejía
29.0 branch. >>>> >>>> The counter argument is that we will be pulling in all the bugs >>>> introduced to `master` since the branch cut. >>>> >>>> As far as effort goes, I have been mostly focused on burning down the >>>> b

Re: [PROPOSAL] Vendored bytebuddy dependency release

2021-05-10 Thread Ismaël Mejía
ges in ByteBuddy (I > think related to new Java versions) that required rewriting code in Beam. > > On Sat, May 8, 2021 at 10:46 PM Ismaël Mejía wrote: > >> What were the issues last time Reuven? I remember that the release and >> upgrade PR were pretty smooth, were there u

Re: [PROPOSAL] Vendored bytebuddy dependency release

2021-05-08 Thread Ismaël Mejía
ght be a > difficult upgrade to do. > > On Sat, May 8, 2021 at 12:57 AM Ismaël Mejía wrote: > >> The version of bytebuddy Beam is vendoring (1.10.8) is already 16 months >> old and >> it is not compatible with more recent versions of Java. I would like to >> propos

Re: Lots of branches

2021-05-08 Thread Ismaël Mejía
Big +1 If you want to know if you have accidentally let some branches in the 'apache/beam' origin this command may help: git for-each-ref --format='%(authorname) %09 %(refname)' --sort=authorname | grep "origin" | grep -v "release" On Sat, May 8, 2021 at 5:01 AM Daniel Oliveira wrote: >

Re: Extremely Slow DirectRunner

2021-05-08 Thread Ismaël Mejía
Can you try running the direct runner with the option `--experiments=use_deprecated_read`? This seems like an instance of https://issues.apache.org/jira/browse/BEAM-10670?focusedCommentId=17316858&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17316858 also reported in

[PROPOSAL] Vendored bytebuddy dependency release

2021-05-08 Thread Ismaël Mejía
The version of bytebuddy Beam is vendoring (1.10.8) is already 16 months old and it is not compatible with more recent versions of Java. I would like to propose that we upgrade it [1] to the most recent version (1.11.0) [2] so we can benefit from the latest improvements for Java 16/17 and upgraded

Re: [PROPOSAL] Upgrade Cassandra driver from 3.x to 4.x in CassandraIO

2021-04-30 Thread Ismaël Mejía
Hello, My excuses for not having commented on this thread before. Thanks for bringing this new IO connector! After reading the proposal I think we need to create this as a new independent CassandraIO (v4) IO connector different from the existing one based on v3 for the following reasons: 1. The

Re: [DISCUSSION] TPC-DS benchmark via Beam SQL, issues

2021-04-28 Thread Ismaël Mejía
> Not every query can be supported by BeamSQL easily. I have one related question. Would we be able to apply SQL-specific optimizations that apply only to batch pipelines? I ask because I imagine that covering the full Beam model constrains the optimization possibilities, no?

Re: Contributor permissions for Beam Jira tickets

2021-04-26 Thread Ismaël Mejía
Done, you are now a contributor and I assigned BEAM-12225 to you. Welcome to Beam and don't feel bad about your accents I have to deal with the same issues regularly :) Regards, Ismaël Mejía On Mon, Apr 26, 2021 at 6:22 PM Rafal Ochyra wrote: > Hi, > > I have created the account on

Re: Issues and PR names and descriptions (or should we change the contribution guide)

2021-04-22 Thread Ismaël Mejía
c005 4e3decbb4e > <-- a merge commit that merges 2 commit, 4e3decbb4e and > it's parent. Author history is preserved on 4e3decbb4e > Author: Ismaël Mejía ><-- this is the author of merge commit > Date: Thu Apr 22 12:46:38 2

Re: Issues and PR names and descriptions (or should we change the contribution guide)

2021-04-22 Thread Ismaël Mejía
>>>> So at this point, I think I am OK with a 1 commit per PR policy. I think >>>>>> the net benefits to our commit history would be good. I have grown tired >>>>>> of repeating the conversation. Rebase-and-squash edits commit ids in >>>>>> w

Re: Performance tests dashboard not working

2021-04-21 Thread Ismaël Mejía
Seems to be a networking issue on my side: the dashboards fail on Firefox with some weird timeout but they work perfectly on Chrome. Thanks for confirming Andrew On Wed, Apr 21, 2021 at 6:45 PM Andrew Pilloud wrote: > > Looks like it is working now? > > On Wed, Apr 21, 2021 at 7:34 AM Ismaël

Issues and PR names and descriptions (or should we change the contribution guide)

2021-04-21 Thread Ismaël Mejía
Hello, I have noticed an ongoing pattern of carelessness around issues/PR titles and descriptions. It is really painful to see more and more examples like: BEAM-12160 Add TODO for fixing the warning BEAM-12165 Fix ParquetIO BEAM-12173 avoid intermediate conversion (PR) and BEAM-12173 use

Performance tests dashboard not working

2021-04-21 Thread Ismaël Mejía
Following the conversation on the performance regression on Flink runner I wanted to take a look at the performance dashboards (Nexmark + Load Tests) but when I open the dashboards it says there is a connectivity error "NetworkError when attempting to fetch resource.". Can someone with more

Re: [DISCUSS] Enable automatic dependency updates with Github's dependabot

2021-04-16 Thread Ismaël Mejía
Oh forgot to mention one alternative that we do in the Avro project, it is that we don't create issues for the dependabot PRs and then we search all the commits authored by dependabot and include them in the release notes to track dependency upgrades. On Fri, Apr 16, 2021 at 4:02 PM Ismaël Mejía

Re: [DISCUSS] Enable automatic dependency updates with Github's dependabot

2021-04-16 Thread Ismaël Mejía
it's quite easy to forget to do. I’d prefer to > have a dedicated Jira issue for every upgrade and it will be included into > releases notes almost automatically. > > > On 16 Apr 2021, at 14:15, Ismaël Mejía wrote: > > > > Hello, > > > > Github has a bot that

[DISCUSS] Enable automatic dependency updates with Github's dependabot

2021-04-16 Thread Ismaël Mejía
Hello, GitHub has a bot called dependabot that automatically creates dependency update PRs and reports security issues. I was wondering if we should enable it for Beam. I tested it in my personal Beam fork and it seems to be working well; it created dependency updates for both Python and JS

Re: Long term support versions of Beam Java

2021-04-16 Thread Ismaël Mejía
As Kenn clearly points out, anyone can do an Apache release of an earlier version, so this should cover most maintenance fixes for old versions. So any person (or company) can decide to work on supporting one version. The real deal of having a LTS "backed by the community" is that ALL the community

Re: [Question] Amazon Neptune I/O connector

2021-04-16 Thread Ismaël Mejía
, Apr 16, 2021 at 9:58 AM Ismaël Mejía wrote: > > Hello Gabriel, > > Other interesting reference because of the Batch loads API like use + > Amazon is the unfinished Amazon Redshift connector PR from this ticket > https://issues.apache.org/jira/browse/BEAM-3032 > >

Re: [Question] Amazon Neptune I/O connector

2021-04-16 Thread Ismaël Mejía
Hello Gabriel, Another interesting reference, because of its use of a batch-loads-like API and Amazon, is the unfinished Amazon Redshift connector PR from this ticket: https://issues.apache.org/jira/browse/BEAM-3032 The reason that one was not merged into Beam is that it lacked tests. You should

Re: [ANNOUNCE] New committer: Tomo Suzuki

2021-04-02 Thread Ismaël Mejía
Congrats Tomo, so well deserved. It has been a pleasure to work with you! On Fri, Apr 2, 2021 at 8:29 PM Tyson Hamilton wrote: > Congrats! > > On Fri, Apr 2, 2021 at 11:02 AM Pablo Estrada wrote: > >> Thank you Tomo! And congrats : ) >> >> On Fri, Apr 2, 2021 at 10:24 AM Robert Bradshaw >>

Re: Upgrading vendored gRPC from 1.26.0 to 1.36.0

2021-03-25 Thread Ismaël Mejía
Precommit has been quite unstable in recent days, so it is worth checking whether something is wrong in the CI. I have a question, Kenn. Given that cherry picking this might be a bit big as a change, can we just reconsider cutting the 2.29.0 branch again after the updated gRPC version use gets merged and mark the

Re: Write to multiple IOs in linear fashion

2021-03-24 Thread Ismaël Mejía
+dev Since we all agree that we should return something different than PDone the real question is what should we return. As a reminder we had a pretty interesting discussion about this already in the past but uniformization of our return values has not happened. This thread is worth reading for

Re: BEAM-11023: tests failing on Spark Structured Streaming runner

2021-03-17 Thread Ismaël Mejía
Actually there are many reasons that could have produced this regression even if the code of the runner has not changed at all: (1) those tests weren't enabled before and now are and they weren't passing or (2) the tests were changed or (3) my principal guess: the translation strategy of a

Re: Contributor permission for Beam Jira tickets

2021-03-16 Thread Ismaël Mejía
Hello Vitaly, What is your jira id? On Tue, Mar 16, 2021 at 5:54 PM Vitaly Terentyev wrote: > > This is Vitaly from Akvelon. > Could you please add me as a contributor to Beam's Jira issue tracker? > I would like to assign some tickets for my work. > > Best regards, > > Vitaly

Re: Null checking in Beam

2021-03-15 Thread Ismaël Mejía
+1 Even if I like the strictness for Null checking, I also think that this is adding too much extra time for builds (that I noticed locally when enabled) and also I agree with Jan that the annotations are really an undesired side effect. For reference when you try to auto complete some method

Re: [DISCUSS] Drop support for Flink 1.8 and 1.9

2021-03-12 Thread Ismaël Mejía
te: >> >> +1 >> >> D. >> >> On Thu, Mar 11, 2021 at 8:33 PM Ismaël Mejía wrote: >>> >>> +user >>> >>> > Should we add a warning or something to 2.29.0? >>> >>> Sounds like a good idea. >>> >>

Re: [DISCUSS] Drop support for Flink 1.8 and 1.9

2021-03-11 Thread Ismaël Mejía
+user > Should we add a warning or something to 2.29.0? Sounds like a good idea. On Thu, Mar 11, 2021 at 7:24 PM Kenneth Knowles wrote: > > Should we add a warning or something to 2.29.0? > > On Thu, Mar 11, 2021 at 10:19 AM Ismaël Mejía wrote: >> >> Hello, >

[DISCUSS] Drop support for Flink 1.8 and 1.9

2021-03-11 Thread Ismaël Mejía
Hello, We have been supporting older versions of Flink that we had agreed in previous discussions where we said we will be supporting only the latest three releases [1]. I would like to propose that for Beam 2.30.0 we stop supporting Flink 1.8 and 1.9 [2]. I prepared a PR for this [3] but of

Re: [VOTE] Release vendor-calcite-1_26_0 version 0.1, release candidate #1

2021-03-10 Thread Ismaël Mejía
jiras blocked by a Calcite upgrade. See > https://issues.apache.org/jira/browse/BEAM-9379 > > On Tue, Mar 9, 2021 at 5:17 AM Ismaël Mejía wrote: >> >> Just out of curiosity is there some feature we are expecting from >> Calcite that pushes this upgrade or is this just catchi

Re: Debezium integration

2021-03-09 Thread Ismaël Mejía
Hello Gunnar, Thanks for the message and willingness to collaborate. Most connectors on Beam are called based on the target system name + the IO suffix, e.g. KafkaIO, PubsubIO, KinesisIO, etc. so naming it DebeziumIO makes sense from the Beam side. So far nobody has requested us to rename a Beam

Re: [VOTE] Release vendor-calcite-1_26_0 version 0.1, release candidate #1

2021-03-09 Thread Ismaël Mejía
Just out of curiosity is there some feature we are expecting from Calcite that pushes this upgrade or is this just catching up for the sake of security improvements + not having old dependencies? On Tue, Mar 9, 2021 at 12:23 AM Ahmet Altay wrote: > > +1 (binding) > > On Mon, Mar 8, 2021 at 3:21

Re: Contributor permission for Beam Jira tickets

2021-03-07 Thread Ismaël Mejía
Done, Welcome to Beam! On Sun, Mar 7, 2021 at 8:09 AM Manav Garg wrote: > > Hi, > > This is Manav from Google. I plan on taking up BEAM-4152 for adding session > windowing support to Go sdk. Can someone add me as a contributor for Beam's > Jira issue > tracker? My ASF Jira username would be

Re: Migrate S3FileSystem

2021-02-10 Thread Ismaël Mejía
o adapt the beam > classes to the new AWS API, then I have no questions and I will start the > task and send out a PR for the review soon. > > Thank you, > Raphael. > > -- > *From:* Ismaël Mejía > *Sent:* January 28, 2021, 15:37:04 > *To:*

Re: Builds Meeting this Thursday

2021-02-08 Thread Ismaël Mejía
on, Jan 18, 2021 at 1:28 PM Elliotte Rusty Harold wrote: > > On Mon, Jan 18, 2021 at 10:49 AM Ismaël Mejía wrote: > > > > Thanks for sharing this Pablo, This looks super interesting. We should > > see if it could make sense to migrate our Jenkins infra to GitHub > > A

Re: Migrate S3FileSystem

2021-01-28 Thread Ismaël Mejía
Hello Raphael, You don't need to change the version of the SDK because at the moment we do support AWS SDK for Java 2, you just have to put the classes in the correct module. https://github.com/apache/beam/tree/master/sdks/java/io/amazon-web-services2 The expected outcome is just to reproduce

Re: Multiple architectures support on Beam (ARM)

2021-01-27 Thread Ismaël Mejía
plored by many of the other Java big data systems in use; it'd be >>>> interesting to know what solutions are out there. >>>> >>>> For go, the executable is uploaded directly into the container. We'd >>>> probably have to do something fancier like

Multiple architectures support on Beam (ARM)

2021-01-26 Thread Ismaël Mejía
I stumbled today on this user request: BEAM-10982 Wheel support for linux aarch64 It made me wonder if with the advent of ARM64 processors not only in the client but server side (Graviton and others) if it is worth that we start to think about having support for this architecture on the python

Re: [ANNOUNCE] New committer: Piotr Szuberski

2021-01-22 Thread Ismaël Mejía
Congratulations Piotr ! Thanks for all your work ! On Fri, Jan 22, 2021 at 5:33 PM Alexey Romanenko wrote: > > Hi everyone, > > Please join me and the rest of the Beam PMC in welcoming a new committer: > Piotr Szuberski . > > Piotr started to contribute to Beam about one year ago and he did it

Re: [ANNOUNCE] New PMC Member: Chamikara Jayalath

2021-01-22 Thread Ismaël Mejía
Congrats Cham, well deserved! On Fri, Jan 22, 2021 at 9:02 AM Michał Walenia wrote: > Congratulations, Cham! Thanks for your work! > > > On Fri, Jan 22, 2021 at 3:13 AM Charles Chen wrote: > >> Congrats Cham! >> >> On Thu, Jan 21, 2021, 5:39 PM Chamikara Jayalath >> wrote: >> >>> Thanks

Re: Making preview (sample) time consistent on Direct runner

2021-01-21 Thread Ismaël Mejía
not matter much, probably >>> 2. you want to checkpoint and emit values so that the end output of the >>> pipeline can receive them to cancel it; you don't want to read a whole >>> restriction like in a batch case >>> >>> I don't know the status

Re: JIRA access (and hello!)

2021-01-19 Thread Ismaël Mejía
You were added to the Contributors group so you can now self assign issues. Welcome to the project! On Tue, Jan 19, 2021 at 3:50 PM David Huntsperger wrote: > > Hey Beam devs, > > I'm a maintainer of the Dataflow documentation, and I'd like to do some work > on Beam doc issues as well. May I

Re: Builds Meeting this Thursday

2021-01-18 Thread Ismaël Mejía
Thanks for sharing this Pablo, This looks super interesting. We should see if it could make sense to migrate our Jenkins infra to GitHub Actions given that it is free and quickly becoming the new 'standard', Good points it is 'free' because we will bring our machines and Google pays :) bad points

Re: [VOTE] Release 2.27.0, release candidate #4

2021-01-07 Thread Ismaël Mejía
> Also I wonder if we now need to clarify both Java 8 and Java 11 versions > separately? You mean for the docker images? Otherwise we should not be using Java 11 at all to produce the artifacts. On Thu, Jan 7, 2021 at 4:51 PM Valentyn Tymofieiev wrote: > > Noting that announcement does not

Re: Making preview (sample) time consistent on Direct runner

2021-01-06 Thread Ismaël Mejía
a performance benchmarking reason. We > have seen others send in known elements which are tracked throughout the > pipeline to generate timings for each transform/stage. > > -Sam > > On Fri, Dec 18, 2020 at 8:24 AM Ismaël Mejía wrote: >> >> Hello, >>

Re: [VOTE] Release 2.27.0, release candidate #1

2020-12-28 Thread Ismaël Mejía
alidations. I'm cancelling this RC, and >> I'll perform cherry picks to prepare the next one. >> >> Please update this thread with any other cherry pick requests! >> -P. >> >> On Thu, Dec 24, 2020, 3:17 AM Ismaël Mejía wrote: >>> >>> It might be

Re: [VOTE] Release 2.26.0, release candidate #1

2020-12-28 Thread Ismaël Mejía
It seems the tag of the docker image for java8 was not updated after the release went out, can somebody please fix this? https://hub.docker.com/r/apache/beam_java8_sdk/tags?page=1&ordering=last_updated On Sat, Dec 12, 2020 at 7:19 AM Jean-Baptiste Onofre wrote: > +1 (binding) > > Sorry for the delay.

Re: [VOTE] Release 2.27.0, release candidate #1

2020-12-24 Thread Ismaël Mejía
It might be a good idea to include also: [BEAM-11403] Cache UnboundedReader per UnboundedSourceRestriction in SDF Wrapper DoFn https://github.com/apache/beam/pull/13592 So Java development experience is less affected (as with 2.26.0) (There is a flag to exclude but defaults matter). On Thu, Dec

Re: Combine with multiple outputs case Sample and the rest

2020-12-23 Thread Ismaël Mejía
fact should work well (unless there's duplicate elements, in > which case you'd have to uniquify them somehow to filter out only the "right" > copies). > > - Robert > > > > On Fri, Dec 18, 2020 at 8:20 AM Ismaël Mejía wrote: >> >> I had a question tod

Making preview (sample) time consistent on Direct runner

2020-12-18 Thread Ismaël Mejía
Hello, The use of direct runner for interactive local use cases has increased with the years on Beam due to projects like Scio, Kettle/Hop and our own SQL CLI. All these tools have in common one thing, they show a sample of some source input to the user and interactively apply transforms to it to

Combine with multiple outputs case Sample and the rest

2020-12-18 Thread Ismaël Mejía
I had a question today from one of our users about Beam’s Sample transform (a Combine with an internal top-like function to produce a uniform sample of size n of a PCollection). They wanted to obtain also the rest of the PCollection as an output (the non sampled elements). My suggestion was to
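The "sample plus the rest" question above can be illustrated outside of Beam. The following is only a minimal single-machine sketch in plain Python, assuming reservoir sampling as the way to draw a uniform sample of size n; it is not Beam's `Sample` transform, and it is not the suggestion made in the thread (that text is cut off in this archive). The function name `sample_and_rest` and its parameters are hypothetical.

```python
import random
from typing import Iterable, List, Optional, Tuple, TypeVar

T = TypeVar("T")


def sample_and_rest(
    elements: Iterable[T], n: int, seed: Optional[int] = None
) -> Tuple[List[T], List[T]]:
    """Return (sample, rest): a uniform sample of up to n elements plus all others.

    Hypothetical helper for illustration only. It uses reservoir sampling
    (Algorithm R) and tracks elements by position, so duplicates are handled
    without having to make them unique first.
    """
    rng = random.Random(seed)
    sample: List[T] = []  # the reservoir
    rest: List[T] = []    # everything that is not (or no longer) in the reservoir
    for index, element in enumerate(elements):
        if len(sample) < n:
            # Fill the reservoir with the first n elements.
            sample.append(element)
            continue
        # Pick a slot uniformly in [0, index]; slots below n hit the reservoir.
        slot = rng.randint(0, index)
        if slot < n:
            rest.append(sample[slot])  # the evicted element joins the rest
            sample[slot] = element
        else:
            rest.append(element)
    return sample, rest


if __name__ == "__main__":
    picked, others = sample_and_rest(range(10), 3, seed=7)
    print(picked, others)
```

Every input element ends up in exactly one of the two outputs, which is the property the user asked for. In a distributed pipeline the same split needs more care, as the reply in this thread notes for duplicate elements.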

Possible issue with bounded Read translation using SDF

2020-12-18 Thread Ismaël Mejía
Hello, I was trying to profile some pipeline using Java's direct runner. It reads ~30 60MB text files (CSV). When I started the profiler it reported more than 40K instances of TextSource being built which really surprised me given the small size of the data being processed. I wonder if I found

Re: Usability regression using SDF Unbounded Source wrapper + DirectRunner

2020-12-17 Thread Ismaël Mejía
ote: >>>> >>>> Making it as the PipelineOptions was my another proposal but it might take >>>> some time to do so. On the other hand, tuning the number into something >>>> acceptable is low-hanging fruit. >>>> >>>> On Wed

Re: Usability regression using SDF Unbounded Source wrapper + DirectRunner

2020-12-16 Thread Ismaël Mejía
java > [2] > https://github.com/apache/beam/blob/3bb232fb098700de408f574585dfe74bbaff7230/runners/direct-java/src/main/java/org/apache/beam/runners/direct/SplittableProcessElementsEvaluatorFactory.java#L178-L181 > > On Wed, Dec 16, 2020 at 9:02 AM Ismaël Mejía wrote: >> >

Re: Farewell mail

2020-12-16 Thread Ismaël Mejía
Thanks Piotr, You made an impact on Beam! Best wishes in the future projects and feel welcome whenever you want to contribute again. Ismaël On Wed, Dec 16, 2020 at 9:02 PM Brian Hulette wrote: > > Thank you for all your contributions! Good luck in your future endeavors :) > > Brian > > On

Re: Usability regression using SDF Unbounded Source wrapper + DirectRunner

2020-12-16 Thread Ismaël Mejía
I can guess that the same issues mentioned here probably will affect the usability for people trying Beam's interactive SQL on Unbounded IO too. We should really take into account that the performance of the SDF based path should be as good or better than the previous version before considering

Re: Tests for compatibility with Avro 1.8 and 1.9

2020-12-04 Thread Ismaël Mejía
After some offline discussion with Piotr we discovered two issues: 1. The gradle avro plugin we use needs a specific version of Avro in each of its versions, so we would need to use different versions of the plugin to generate the Avro objects for our tests with each version, because the
