Re: Growing Beam -- A call for ideas? What is missing? What would be good to see?

2018-10-29 Thread Gleb Kanterov
I'm a scio contributor, and I have a lot of experience with Scala. However, I would advise against using Scala. There are several problems with maintaining Scala libraries: - you have to build different artifacts for each Scala version - artifacts have dependencies on the Scala standard library - it

Fixing equality of Rows

2018-10-29 Thread Gleb Kanterov
By adding the BYTES type, we broke equality. `RowCoder#consistentWithEquals` is always true, but this property doesn't hold for exotic types such as `Map`, `List`. The root cause is `byte[]`, where `equals` is implemented as reference equality instead of structural equality. Before we jump into solution
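The `byte[]` behaviour described in the message above can be reproduced with plain Java, independent of Beam. The following is a minimal sketch; the class and method names are hypothetical and are not part of `RowCoder` or any Beam API. It shows that `equals` on arrays compares references, that `Arrays.equals` compares contents, and that a collection holding `byte[]` values inherits the reference comparison.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration class; not part of Beam.
class ByteArrayEquality {

    /** Uses Object#equals, which for arrays is reference equality. */
    static boolean equalsByReference(byte[] a, byte[] b) {
        return a.equals(b);
    }

    /** Compares the arrays element by element (structural equality). */
    static boolean equalsByContent(byte[] a, byte[] b) {
        return Arrays.equals(a, b);
    }

    /** List#equals delegates to the elements' equals, so byte[] elements compare by reference. */
    static boolean listsEqual(byte[] a, byte[] b) {
        List<byte[]> left = Collections.singletonList(a);
        List<byte[]> right = Collections.singletonList(b);
        return left.equals(right);
    }

    public static void main(String[] args) {
        byte[] x = {1, 2, 3};
        byte[] y = {1, 2, 3};
        System.out.println(equalsByReference(x, y)); // false: different array objects
        System.out.println(equalsByContent(x, y));   // true: same contents
        System.out.println(listsEqual(x, y));        // false: nested byte[] compared by reference
    }
}
```

Any container whose `equals` delegates to its elements behaves the same way once it holds `byte[]` values.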

Re: Fixing equality of Rows

2018-10-29 Thread Gleb Kanterov
va) must have a schema >> type-driven equality that matches this spec >> - also each type (hence Row type) should have portable encoding(s) that >> respect this equality so shuffling is consistent >> - Row in Java should be able to decode these encodings to different >

Re: Fixing equality of Rows

2018-10-29 Thread Gleb Kanterov
nt for portability, which seems a >> hard searching (and not possible?) process. >> >> -Rui >> >> On Mon, Oct 29, 2018 at 10:15 AM Gleb Kanterov wrote: >> >>> There is an indirect connection to RowCoder because `MapCoder` isn't >>> determ

Re: A new contributor

2018-10-05 Thread Gleb Kanterov
Hi all, My name is Gleb and I work on Data Infrastructure at Spotify. We use Apache Beam and develop spotify/scio. From time to time I create JIRA issues and submit pull requests. Can I get contributor access to JIRA (username: kanterov) and Slack? Thanks, Gleb On

Re: Beam snapshots broken

2018-12-27 Thread Gleb Kanterov
I can reproduce this on my machine, and reverting https://github.com/apache/beam/pull/7324 fixed the problem. There is a separate thread in dev@ about releasing vendored gRPC v0.2, I'm wondering if it will fix this issue. On Thu, Dec 27, 2018 at 5:20 PM Ismaël Mejía wrote: > Looks like snapshots

Re: [spark runner based on dataset POC] your opinion

2019-01-18 Thread Gleb Kanterov
Agree with Kenn. It should be possible: Spark has a similar concept called ExpressionEncoder, and I did a similar derivation using a Scala macro in typelevel/frameless. Most of the code in Beam is a blackbox function in ParDo, and the only way to translate

Re: Merge of vendored Guava (Some PRs need a rebase)

2019-01-20 Thread Gleb Kanterov
I didn't look deeply into it, but it seems we can put .idea/codeInsightSettings.xml into our repository where we blacklist packages from auto-import. See an example in JetBrains/kotlin/.idea/codeInsightSettings.xml. On

Re: Vendoring Calcite

2019-01-14 Thread Gleb Kanterov
Great initiative. I was thinking about making a similar proposal. I tried using Beam SQL in a project that has a Calcite dependency, and it doesn't work because Calcite opens an internal JDBC connection on a "jdbc:calcite:" URL, and you can't register two drivers for the same scheme. Not sure how it's

Re: ContainerLaunchException in precommit [BEAM-6497]

2019-01-23 Thread Gleb Kanterov
I'm looking into it. This image exists in docker hub [1], but for some reason, it wasn't picked up. [1] https://hub.docker.com/r/yandex/clickhouse-server/tags On Wed, Jan 23, 2019 at 10:01 PM Alex Amato wrote: > >1. > See: BEAM-6497 >

Re: [ANNOUNCEMENT] [SQL] [BEAM-6133] Support for user defined table functions (UDTF)

2018-12-14 Thread Gleb Kanterov
rry for the slow reply & review. Having UDTF support in Beam SQL is > extremely useful. Are both table functions and table macros part of > "standard" SQL or is this a distinction between different Calcite concepts? > > Kenn > > On Wed, Nov 28, 2018 at 10:36 AM Gleb Kanterov w

[ANNOUNCEMENT] [SQL] [BEAM-6133] Support for user defined table functions (UDTF)

2018-11-28 Thread Gleb Kanterov
At the moment we support only ScalarFunction UDFs, that is, functions that operate on row fields. In Calcite, there are 3 kinds of UDFs: aggregate functions (that we already support), table macros and table functions. The difference between table functions and macros is that macros expand to relations,

Re: AvroIO read from unknown schema to generic record.

2019-01-14 Thread Gleb Kanterov
One approach could be creating a PTransform with an expand method that wraps AvroIO and reads the Avro writer schema from one of the files matching the read pattern. It will work if the set of sources with different schemas is fixed at the pipeline construction step. ``` public abstract class GenericAvroIORead

Re: Adding ":beam-runners-direct-java:needsRunnerTests" to "Run Java PreCommit"

2018-12-28 Thread Gleb Kanterov
2018 at 3:24 PM Reuven Lax wrote: >> > >> > Kenn and I both noticed that some needsRunner tests time out, and we >> were both wondering why our PreCommit was still green. This tests are meant >> to be quick, and IMO should definitely be part of Java PreCommit. >>

Adding ":beam-runners-direct-java:needsRunnerTests" to "Run Java PreCommit"

2018-12-28 Thread Gleb Kanterov
After reading Beam Testing I had the impression that NeedsRunner tests are executed as a part of Java PreCommit using the Direct runner. However, it doesn't seem to be the case. I've tried running these tests locally, and a few of them fail or time out.

Re: [Go SDK] User Defined Coders

2019-01-03 Thread Gleb Kanterov
Reuven, it sounds great. I see there is a similar thing to Row coders happening in Apache Arrow, and there is a similarity between Apache Arrow Flight and data exchange service in

Re: [ANNOUNCE] New committer announcement: Mark Liu

2019-03-25 Thread Gleb Kanterov
Congratulations! On Mon, Mar 25, 2019 at 10:23 AM Łukasz Gajowy wrote: > Congrats! :) > > > > pon., 25 mar 2019 o 08:11 Aizhamal Nurmamat kyzy > napisał(a): > >> Congratulations, Mark! >> >> On Sun, Mar 24, 2019 at 23:18 Pablo Estrada wrote: >> >>> Yeaah Mark! : ) Congrats : D >>> >>> On

Re: Merge of vendored Guava (Some PRs need a rebase)

2019-02-28 Thread Gleb Kanterov
m/pull/7957 [2]: https://github.com/apache/beam/blob/61de62ecbe8658de866280a8976030a0cb877041/sdks/java/extensions/sql/build.gradle#L30-L39 Gleb On Sun, Jan 20, 2019 at 11:43 AM Gleb Kanterov wrote: > I didn't look deep into it, but it seems we can put > .idea/codeInsightSettings.

Re: [ANNOUNCE] New committer announcement: Michael Luckey

2019-02-27 Thread Gleb Kanterov
Congratulations and welcome! On Wed, Feb 27, 2019 at 8:57 PM Connell O'Callaghan wrote: > Excellent thank you for sharing Kenn!!! > > Michael congratulations for this recognition of your contributions to > advancing BEAM > > On Wed, Feb 27, 2019 at 11:52 AM Kenneth Knowles wrote: > >> Hi

Re: Merge of vendored Guava (Some PRs need a rebase)

2019-03-05 Thread Gleb Kanterov
o that's gross. > > > > Kenn > > > > [1] > https://github.com/apache/beam/blob/master/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L353 > > > > > > On Thu, Feb 28, 2019 at 2:29 AM Gleb Kanterov wrote: > >> > &g

Re: Merge of vendored Guava (Some PRs need a rebase)

2019-03-05 Thread Gleb Kanterov
Ismaël, I was looking into BEAM-5723, is it possible to relocate both guava and Cassandra client instead of not relocating Guava in BEAM-6620? On Tue, Mar 5, 2019 at 11:23 PM Gleb Kanterov wrote: > I agree with the points that Kenneth has raised, mainly: > > > In both of the abov

Re: [ANNOUNCE] New committer announcement: Raghu Angadi

2019-03-08 Thread Gleb Kanterov
Congratulations! On Thu, Mar 7, 2019 at 11:52 PM Michael Luckey wrote: > Congrats Raghu! > > On Thu, Mar 7, 2019 at 8:06 PM Mark Liu wrote: > >> Congrats! >> >> On Thu, Mar 7, 2019 at 10:45 AM Rui Wang wrote: >> >>> Congrats Raghu! >>> >>> >>> -Rui >>> >>> On Thu, Mar 7, 2019 at 10:22 AM

Always get to LGTM in Committer Guide

2019-03-12 Thread Gleb Kanterov
Before pressing merge button I was familiarizing myself with committer guide [1]. It's saying: > A committer (who is not the author of the code) should signal this either by GitHub “approval” or by a comment such as “Looks good to me!” (LGTM). Any committer can then merge the pull request. It is

Re: BEAM-6639. ClickHouseIOTest flakey failure failing in precomiits

2019-02-09 Thread Gleb Kanterov
I'm looking into it; it seems that the previous mitigation didn't help. I added extra logging and am going to try to reproduce the flaky failure again. Sorry for the inconvenience, I've never experienced such problems with testcontainers before. On Sat, Feb 9, 2019 at 12:36 AM Alex Amato wrote: >

Re: ContainerLaunchException in precommit [BEAM-6497]

2019-02-05 Thread Gleb Kanterov
It seems ClickHouse tests aren't flaky anymore, please reopen JIRA issue if you find it flaky again. On Thu, Jan 31, 2019 at 4:08 PM Gleb Kanterov wrote: > There are two tests using testcontainers. I've noticed that in one of the > failed builds > <https://builds.ap

Re: ContainerLaunchException in precommit [BEAM-6497]

2019-01-24 Thread Gleb Kanterov
testcontainers to use newer docker-java. https://github.com/apache/beam/pull/7610 On Thu, Jan 24, 2019 at 12:27 AM Alex Amato wrote: > Thank you Gleb, appreciate it. > > On Wed, Jan 23, 2019 at 2:40 PM Gleb Kanterov wrote: > >> I'm looking into it. This image exists in docker hub

Re: Another new contributor!

2019-01-31 Thread Gleb Kanterov
Welcome! Would be interesting to hear your thoughts on Arrow, Arrow Flight, and Beam Portability relation, this topic was recently discussed in dev@. On Thu, Jan 31, 2019 at 2:00 PM Ismaël Mejía wrote: > Welcome Brian! > Great to have someone with Apache experience already and also with > Arrow

Re: Findbugs -> Spotbugs ?

2019-01-31 Thread Gleb Kanterov
Agree, spotbugs brings static checks that aren't covered in error-prone, it's a good addition. There are a few conflicts between error-prone and spotbugs, for instance, the approach to enum switch exhaustiveness, but it can be configured. On Thu, Jan 31, 2019 at 10:53 AM Ismaël Mejía wrote: > Not

Re: ContainerLaunchException in precommit [BEAM-6497]

2019-01-31 Thread Gleb Kanterov
. >>>> It may be a specific environmental issue to the beam1 machine the tests >>>> ran on? >>>> https://builds.apache.org/job/beam_PreCommit_Java_Commit/3722/ >>>> >>>> >>>> On Thu, Jan 24, 2019 at 8:16 AM Gleb Kanterov wrote: >>

Re: [ANNOUNCE] New committer announcement: Gleb Kanterov

2019-01-28 Thread Gleb Kanterov
at 8:54 AM Ismaël Mejía >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Well deserved, congratulations Gleb! >>>>>>>>>>> >>>>>>>>>>> On Fri, Jan

Re: [ANNOUNCE] New PMC member: Etienne Chauchot

2019-01-28 Thread Gleb Kanterov
Congratulations Etienne! On Mon, Jan 28, 2019 at 11:36 AM Maximilian Michels wrote: > Congrats Etienne! It's been great to work with you. > > On 26.01.19 07:16, Ismaël Mejía wrote: > > Congratulations Etienne! > > > > Le sam. 26 janv. 2019 à 06:42, Reuven Lax > > a

Re: JDK11 support?

2019-04-10 Thread Gleb Kanterov
Is there a way to try the JDK11 harness for Dataflow without building your own Docker image? On Wed, Apr 10, 2019 at 2:10 AM Yi Pan wrote: > Hi, Pablo, > > Thanks for the clarification. Does that mean that there needs to be a > separate effort to ensure KafkaIO to be Java 8 source compat and Java 11 >

Re: [ANNOUNCE] New committer announcement: Boyuan Zhang

2019-04-16 Thread Gleb Kanterov
Congratulations! On Sat, Apr 13, 2019 at 12:53 AM Thomas Weise wrote: > Congrats! > > > On Thu, Apr 11, 2019 at 6:03 PM Reuven Lax wrote: > >> Congratulations Boyuan! >> >> On Thu, Apr 11, 2019 at 4:53 PM Ankur Goenka wrote: >> >>> Congrats Boyuan! >>> >>> On Thu, Apr 11, 2019 at 4:52 PM Mark

Re: [ANNOUNCE] New committer: Mikhail Gryzykhin

2019-06-25 Thread Gleb Kanterov
Congratulations! On Tue, Jun 25, 2019 at 2:03 AM Connell O'Callaghan wrote: > Thomas thank you for sharing this > > Congratulations on this Mikhail!!! > > On Mon, Jun 24, 2019 at 3:19 PM Kai Jiang wrote: > >> Congrats! >> >> On Mon, Jun 24, 2019 at 1:46 PM Chamikara Jayalath >> wrote: >>

Re: Contributor Registration

2019-06-20 Thread Gleb Kanterov
Welcome Matt! On Thu, Jun 20, 2019 at 11:09 AM Aizhamal Nurmamat kyzy wrote: > Welcome Matt! > > On Thu, Jun 20, 2019 at 11:06 AM Robert Bradshaw > wrote: > >> Welcome! I added you to the contributors group. >> >> On Thu, Jun 20, 2019 at 11:03 AM Matt Helm wrote: >> > >> > Hi Beam community,

Re: [ANNOUNCE] New PMC Member: Pablo Estrada

2019-06-10 Thread Gleb Kanterov
Congratulations! On Fri, May 24, 2019 at 9:50 PM Joana Filipa Bernardo Carrasqueira < joanafil...@google.com> wrote: > Congratulations Pablo! Well deserved :D > > On Fri, May 17, 2019 at 3:14 PM Hannah Jiang > wrote: > >> Congratulations, Pablo, you deserve it! >> >> *From: *Mark Liu >> *Date:

Re: Congrats to Beam's first 6 Google Open Source Peer Bonus recipients!

2019-05-02 Thread Gleb Kanterov
Congratulations! Well deserved! On Thu, May 2, 2019 at 10:00 AM Ismaël Mejía wrote: > Congrats everyone ! > > On Thu, May 2, 2019 at 9:14 AM Robert Bradshaw > wrote: > >> Congratulation, and thanks for all the great contributions each one of >> you has made to Beam! >> >> On Thu, May 2, 2019

Re: [ANNOUNCE] New committer announcement: Udi Meiri

2019-05-06 Thread Gleb Kanterov
Congratulations! On Mon, May 6, 2019 at 2:34 PM Valentyn Tymofieiev wrote: > Congrats, Udi! > > *From: *Thomas Weise > *Date: *Mon, May 6, 2019 at 7:50 AM > *To: * > > Congrats! >> >> >> On Mon, May 6, 2019 at 2:25 AM Łukasz Gajowy wrote: >> >>> Congrats! :) >>> >>> pon., 6 maj 2019 o 10:45

Java serialization for coders and compatibility

2019-08-13 Thread Gleb Kanterov
I'm looking into the code of AvroCoder, and I was wondering what happens when users upgrade Beam for streaming pipelines? As I understand it, we should be able to deserialize a coder from a previous Beam version. Looking into guava vendoring, it's going to break serialization when we are going to

Re: [VOTE] Support ZetaSQL as another SQL dialect for BeamSQL in Beam repo

2019-08-13 Thread Gleb Kanterov
+1 On Tue, Aug 13, 2019 at 10:47 AM Ismaël Mejía wrote: > +1 > Wishing that this goes to calcite too someday (hoping that it makes > Beam side maintenance simpler) > > On Tue, Aug 13, 2019 at 6:18 AM Manu Zhang > wrote: > > > > +1 > > > > On Tue, Aug 13, 2019 at 11:55 AM Mingmin Xu wrote: >

Re: [ANNOUNCE] New committer: Valentyn Tymofieiev

2019-08-27 Thread Gleb Kanterov
Congratulations Valentyn! On Tue, Aug 27, 2019 at 7:22 AM jincheng sun wrote: > Congrats Valentyn! > > Best, > Jincheng > > Ankur Goenka 于2019年8月27日周二 上午10:37写道: > >> Congratulations Valentyn! >> >> On Mon, Aug 26, 2019, 5:02 PM Yifan Zou wrote: >> >>> Congratulations, Valentyn! Well

Re: [DISCUSS] Portability representation of schemas

2019-09-03 Thread Gleb Kanterov
Recently there was a pull request (that was reverted) for adding a portable representation of schemas. It's great to see things moving forward, but I'm worried that it doesn't support any logical types, especially fixed bytes. That makes runners using portable schemas unusable, for instance, when

Re: [DISCUSS] Portability representation of schemas

2019-09-03 Thread Gleb Kanterov
-7855 > [2] https://issues.apache.org/jira/browse/BEAM-8111 > > On Tue, Sep 3, 2019 at 7:27 AM Gleb Kanterov wrote: > >> Recently there was a pull request (that was reverted) for adding portable >> representation of schemas. It's great to see things moving forward, I'

Re: Improve container support

2019-08-28 Thread Gleb Kanterov
Google Doc doesn't seem to be shared with dev@. Can anybody double-check? On Wed, Aug 28, 2019 at 7:36 AM Hannah Jiang wrote: > add dev@ > > On Tue, Aug 27, 2019 at 9:29 PM Hannah Jiang > wrote: > >> Thanks for commenting and discussions. >> I created a Google Docs >>

Re: clickhouse tests failing

2019-09-12 Thread Gleb Kanterov
These tests are using testcontainers and assume that you have a Docker environment locally. On Sun, Sep 8, 2019 at 5:14 PM Lukasz Cwik wrote: > Is passing at head on Jenkins: > https://builds.apache.org/job/beam_PreCommit_Java_Cron/1771/testReport/org.apache.beam.sdk.io.clickhouse/ > > What are

Re: [ANNOUNCE] New committer: Rui Wang

2019-08-07 Thread Gleb Kanterov
Congratulations Rui! Well done! On Wed, Aug 7, 2019 at 7:01 AM Connell O'Callaghan wrote: > Well done Rui!!! > > On Tue, Aug 6, 2019 at 15:41 Chamikara Jayalath > wrote: > >> Congrats Rui. >> >> On Tue, Aug 6, 2019 at 2:00 PM Melissa Pashniak >> wrote: >> >>> Congrats Rui! >>> >>> On Tue, Aug

Re: [ANNOUNCE] New committer: Kyle Weaver

2019-08-07 Thread Gleb Kanterov
Congratulations! On Wed, Aug 7, 2019 at 7:01 AM Connell O'Callaghan wrote: > Well done congratulations Kyle!!! > > On Tue, Aug 6, 2019 at 21:58 Thomas Weise wrote: > >> Congrats! >> >> On Tue, Aug 6, 2019, 7:24 PM Reza Rokni wrote: >> >>> Congratz! >>> >>> On Wed, 7 Aug 2019 at 06:40,

Re: [ANNOUNCE] New committer: Jan Lukavský

2019-08-01 Thread Gleb Kanterov
Congratulations! On Thu, Aug 1, 2019 at 3:11 PM Reza Rokni wrote: > Congratulations , awesome stuff ! > > On Thu, 1 Aug 2019, 12:11 Maximilian Michels, wrote: > >> Congrats, Jan! Good to see you become a committer :) >> >> On 01.08.19 12:37, Łukasz Gajowy wrote: >> > Congratulations! >> > >> >

Re: Discussion/Proposal: support Sort Merge Bucket joins in Beam

2019-07-17 Thread Gleb Kanterov
document, but please do not >>>> build on either FileBasedSink or FileBasedReader. They are both remnants of >>>> the old, non-composable IO world; and in fact much of the composable IO >>>> work emerged from frustration with their limitations and recognizing t

Re: Discussion/Proposal: support Sort Merge Bucket joins in Beam

2019-07-17 Thread Gleb Kanterov
n this case, we would split ranges of possible values. On Wed, Jul 17, 2019 at 6:37 PM Robert Bradshaw wrote: > On Wed, Jul 17, 2019 at 4:26 PM Gleb Kanterov wrote: > > > > I find there is an interesting point in the comments brought by Ahmed > Eleryan. Similar to WindowFn,

Re: [ANNOUNCE] New committer: Robert Burke

2019-07-17 Thread Gleb Kanterov
Congratulations, Robert! On Wed, Jul 17, 2019 at 1:50 PM Robert Bradshaw wrote: > Congratulations! > > On Wed, Jul 17, 2019, 12:56 PM Katarzyna Kucharczyk < > ka.kucharc...@gmail.com> wrote: > >> Congratulations! :) >> >> On Wed, Jul 17, 2019 at 12:46 PM Michał Walenia < >>

Re: Discussion/Proposal: support Sort Merge Bucket joins in Beam

2019-07-15 Thread Gleb Kanterov
I share the same concern with Robert regarding re-implementing parts of IO. At the same time, in the past, I worked on internal libraries that try to re-use code from existing IO, and it's hardly possible because it feels like it wasn't designed for re-use. There are a lot of classes that are

Re: Sort Merge Bucket - Action Items

2019-07-25 Thread Gleb Kanterov
What is the long-term plan for org.apache.beam.sdk.io.Read? Is it going away in favor of SDF, or are we always going to have both? I was looking into AvroIO.read and AvroIO.readAll, both of them use AvroSource. AvroIO.readAll is using SDF, and it's implemented with ReadAllViaFileBasedSource that

Re: [ANNOUNCE] New committer: Alan Myrvold

2019-09-30 Thread Gleb Kanterov
Congratulations! On Sat, Sep 28, 2019 at 12:07 AM Valentyn Tymofieiev wrote: > Congratulations, Alan. Well deserved. > > On Fri, Sep 27, 2019 at 2:09 PM Chamikara Jayalath > wrote: > >> Congrats Alan!! >> >> On Fri, Sep 27, 2019 at 1:49 PM Jan Lukavský wrote: >> >>> Congrats Alan! >>> On

Re: [ANNOUNCE] New committer: Brian Hulette

2019-11-14 Thread Gleb Kanterov
Congratulations! On Fri, Nov 15, 2019 at 5:44 AM Valentyn Tymofieiev wrote: > Congratulations, Brian! > > On Thu, Nov 14, 2019 at 6:25 PM jincheng sun > wrote: > >> Congratulation Brian! >> >> Best, >> Jincheng >> >> Kyle Weaver 于2019年11月15日周五 上午7:19写道: >> >>> Thanks for your contributions

Re: [PROPOSAL] Add support for writing flattened schemas to pubsub

2019-11-17 Thread Gleb Kanterov
Expanding on what Kenn said regarding having fewer dependencies on SQL: can the whole thing be seen as extending PubSubIO, which would implement most of the logic from the proposal, given column annotations, and then having a thin layer that connects it with Beam SQL tables? On Sun, Nov 17, 2019

Re: [ANNOUNCE] New committer: Daniel Oliveira

2019-11-21 Thread Gleb Kanterov
Congratulations! On Thu, Nov 21, 2019 at 6:24 AM Thomas Weise wrote: > Congratulations! > > > On Wed, Nov 20, 2019, 7:56 PM Chamikara Jayalath > wrote: > >> Congrats!! >> >> On Wed, Nov 20, 2019 at 5:21 PM Daniel Oliveira >> wrote: >> >>> Thank you everyone! I won't let you down. o7 >>> >>>

Re: goVet and clickHouse tests failing

2019-11-21 Thread Gleb Kanterov
:sdks:java:io:clickhouse:test is using testcontainers. Testcontainers is a Java library that supports JUnit tests, providing lightweight, throwaway instances of common databases, Selenium web browsers, or anything else that can run in a Docker container. Therefore, ClickHouse tests require a local

Re: [VOTE] Beam Mascot animal choice: vote for as many as you want

2019-11-22 Thread Gleb Kanterov
[ ] Beaver [ ] Hedgehog [ ] Lemur [X] Owl [ ] Salmon [ ] Trout [ ] Robot dinosaur [ ] Firefly [ ] Cuttlefish [ ] Dumbo Octopus [ ] Angler fish On Fri, Nov 22, 2019 at 11:33 PM Andrew Pilloud wrote: > > [ ] Beaver > [ ] Hedgehog > [ ] Lemur > [ ] Owl > [X] Salmon > [X] Trout > [ ] Robot

Re: [VOTE] Beam's Mascot will be the Firefly (Lampyridae)

2019-12-13 Thread Gleb Kanterov
+1 (non-binding) On Fri, Dec 13, 2019 at 12:47 PM jincheng sun wrote: > +1 (non-binding) > > Alex Van Boxel 于2019年12月13日 周五16:21写道: > >> +1 >> >> On Fri, Dec 13, 2019, 05:58 Kenneth Knowles wrote: >> >>> Please vote on the proposal for Beam's mascot to be the Firefly. This >>> encompasses the

Re: [UPDATE] Preparing for Beam 2.17.0 release

2019-10-28 Thread Gleb Kanterov
It looks like BigQueryIO DIRECT_READ has been broken since 2.16.0. I've added a ticket describing the problem and a possible fix, see BEAM-8504 [1]. [1]: https://issues.apache.org/jira/browse/BEAM-8504 On Wed, Oct 23, 2019 at 9:19 PM Kenneth Knowles

Re: Detecting resources to stage

2019-11-27 Thread Gleb Kanterov
ations if needed (SPI > pattern that was mentioned at some point in a jira ticket). I think I'm > pretty close to finishing that. > > Thanks! > > śr., 27 lis 2019 o 15:24 Gleb Kanterov napisał(a): > >> Today I tried using classgraph [1] library to scan classpath in Java

Re: Detecting resources to stage

2019-11-27 Thread Gleb Kanterov
Today I tried using the classgraph [1] library to scan the classpath in Java 11 instead of using URLClassLoader, and after that, the job worked on Dataflow. The logic of scanning the classpath is pretty sophisticated [2], and classgraph doesn't have any dependencies. I'm wondering if we can relocate it to
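For context on why the URLClassLoader approach stops working: since Java 9 the system class loader is no longer a `URLClassLoader`, so casting it to read its URLs fails. The sketch below is not what classgraph does and is not Beam's staging code; it is a deliberately simplified, stdlib-only alternative that splits the `java.class.path` system property. The class and method names are hypothetical.

```java
import java.io.File;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration class; a simplified stand-in, not classgraph's algorithm.
class ClasspathEntries {

    /** Splits a classpath string on the platform path separator, skipping empty parts. */
    static List<File> fromClassPath(String classPath) {
        List<File> entries = new ArrayList<>();
        for (String part : classPath.split(Pattern.quote(File.pathSeparator))) {
            if (!part.isEmpty()) {
                entries.add(new File(part));
            }
        }
        return entries;
    }

    /** Entries of the classpath the current JVM was started with. */
    static List<File> current() {
        return fromClassPath(System.getProperty("java.class.path", ""));
    }

    public static void main(String[] args) {
        // On Java 8 this prints true; on Java 9 and later it prints false.
        System.out.println(ClassLoader.getSystemClassLoader() instanceof URLClassLoader);
        for (File entry : current()) {
            System.out.println(entry);
        }
    }
}
```

This sketch ignores cases such as `Class-Path` entries in jar manifests and the module path, which is part of why a dedicated scanning library can be attractive.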

Re: Detecting resources to stage

2019-11-27 Thread Gleb Kanterov
ions.java#L144 > 3: > https://github.com/apache/beam/blob/3e7865ee6c6a56e51199515ec5b4b16de1ddd166/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/options/GcpOptions.java#L159 > > On Wed, Nov 27, 2019 at 8:19 AM Gleb Kanterov wrote: >

Re: [VOTE] Upgrade gradle to 6.2

2020-02-25 Thread Gleb Kanterov
+1 (non-binding) On Tue, Feb 25, 2020 at 9:38 AM Ismaël Mejía wrote: > +1 great to have our build updated, please share if there are new > interesting features/plugin advantages we can benefit from too. > > On Tue, Feb 25, 2020 at 8:24 AM Jean-Baptiste Onofré > wrote: > >> Hi Alex >> >> I also

Re: [ANNOUNCE] New committer: Chad Dombrova

2020-02-25 Thread Gleb Kanterov
Congratulations! On Tue, Feb 25, 2020 at 9:44 AM Ismaël Mejía wrote: > Congratulations, so well deserved for the lots of amazing work and new > perspectives you have brought into the project !!! > > On Tue, Feb 25, 2020 at 8:24 AM Austin Bennett < > whatwouldausti...@gmail.com> wrote: > >>

Re: [ANNOUNCE] New committer: Jincheng Sun

2020-02-24 Thread Gleb Kanterov
Congratulations! On Mon, Feb 24, 2020 at 1:18 PM Hequn Cheng wrote: > Congratulations Jincheng, well deserved! > > Best, > Hequn > > On Mon, Feb 24, 2020 at 7:21 PM Reza Rokni wrote: > >> Congrats! >> >> On Mon, Feb 24, 2020 at 7:15 PM Jan Lukavský wrote: >> >>> Congrats Jincheng! >>> >>>

Re: Deterministic field ordering in derived schemas

2020-02-06 Thread Gleb Kanterov
allow the runner to inspect >>>> the previous graph on an update, to ensure that we maintain the previous >>>> order. >>>> >>>> If you know a way to ensure deterministic ordering, I would love to >>>> know. I even went so far as to try and ope

Deterministic field ordering in derived schemas

2020-02-05 Thread Gleb Kanterov
There are Beam schema providers that use Java reflection to get fields for classes with fields and auto-value classes. It isn't relevant for POJOs with "creators", because function arguments are ordered. We cache instances of schema coders, but there is no guarantee that it's deterministic between
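The underlying Java behaviour can be shown without Beam: the Javadoc for `Class#getDeclaredFields` states that the returned elements are not sorted and are not in any particular order. The sketch below shows one possible way to obtain a stable order, sorting instance fields by name. This is an assumption made for illustration, not what Beam's schema providers do, and the class names are hypothetical.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration class; not part of Beam.
class SortedFields {

    /** Example POJO; declaration order is deliberately not alphabetical. */
    static class Example {
        int b;
        String a;
        long c;
        static int ignored; // static fields are skipped below
    }

    /** Returns instance field names sorted by name, independent of reflection order. */
    static List<String> fieldNames(Class<?> clazz) {
        List<String> names = new ArrayList<>();
        for (Field field : clazz.getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue;
            }
            names.add(field.getName());
        }
        names.sort(Comparator.naturalOrder());
        return names;
    }

    public static void main(String[] args) {
        System.out.println(fieldNames(Example.class)); // [a, b, c]
    }
}
```

Sorting by name trades the declaration order for determinism, and renaming a field would still change its position.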

Re: [ANNOUNCE] New committer: Michał Walenia

2020-01-28 Thread Gleb Kanterov
Congratulations! On Tue, Jan 28, 2020 at 6:03 PM Łukasz Gajowy wrote: > Congratulations Michał!  > > wt., 28 sty 2020 o 16:33 Ryan Skraba napisał(a): > >> Congratulations! >> >> On Tue, Jan 28, 2020 at 11:26 AM Jan Lukavský wrote: >> >>> Congrats Michał! >>> On 1/28/20 11:16 AM, Katarzyna

Re: Custom 2.20 failing on Dataflow: what am I doing wrong?

2020-02-17 Thread Gleb Kanterov
You need to pass a custom Dataflow worker jar. One way of doing that is adding it as a dependency and using the following code snippet:

```
opts.setDataflowWorkerJar(
    BatchDataflowWorker.class
        .getProtectionDomain()
        .getCodeSource()
        .getLocation()
        .toString());
```

Re: [ANNOUNCE] New committer: Alex Van Boxel

2020-02-18 Thread Gleb Kanterov
Congratulations! On Tue, Feb 18, 2020 at 5:02 PM Brian Hulette wrote: > Congratulations Alex! Well deserved! > > On Tue, Feb 18, 2020 at 7:49 AM Pablo Estrada wrote: > >> Hi everyone, >> >> Please join me and the rest of the Beam PMC in welcoming a new committer: >> Alex Van Boxel >> >> Alex

Re: Beam's Avro 1.8.x dependency

2020-01-16 Thread Gleb Kanterov
There are significant changes between Avro 1.8 and Avro 1.9. I'm not sure it's possible for beam-sdks-java-core to support both at the same time. The fact that AvroIO is a part of the beam-sdks-java-core doesn't make it simpler. However, I can see how we can build two binary artifacts with the

Re: Beam's Avro 1.8.x dependency

2020-01-16 Thread Gleb Kanterov
a situation where the only solution to the issue is to do (1), move > Avro out > of core as an extension but then the question is would we sacrifice > breaking > backwards compatibility for this issue. I am in the 'we should do it' camp. > What do others think? > > > On Thu, J

Re: /zetasql/local_service/liblocal_service_jni.jnilib was not found inside JAR

2020-03-26 Thread Gleb Kanterov
, I could use help testing. > > Check out this PR: https://github.com/apache/beam/pull/11223 > Run: ./gradlew :sdks:java:extensions:sql:zetasql:check > > Thanks! > > Andrew > > On Tue, Mar 17, 2020 at 2:50 AM Gleb Kanterov wrote: > >> There is a branch that bu

Re: [ANNOUNCE] New committer: Robin Qiu

2020-05-19 Thread Gleb Kanterov
Congratulations! On Tue, May 19, 2020 at 7:31 AM Aizhamal Nurmamat kyzy wrote: > Congratulations, Robin! Thank you for your contributions! > > On Mon, May 18, 2020, 7:18 PM Boyuan Zhang wrote: > >> Congrats~~ >> >> On Mon, May 18, 2020 at 7:17 PM Reza Rokni wrote: >> >>> Congratulations! >>>

Re: /zetasql/local_service/liblocal_service_jni.jnilib was not found inside JAR

2020-03-17 Thread Gleb Kanterov
There is a branch that builds ZetaSQL on Mac; it only works with Bazel 0.25.3. You would need Xcode to build it locally. After you build it and put the jnilib on the classpath, it just works. One of my colleagues has updated this branch to the latest release [1]. /Gleb [1]:

Re: Percentile metrics in Beam

2020-08-18 Thread Gleb Kanterov
unt/max/min would stay the same but we would want a single object >> that abstracts this complexity away for users as well. >> >> On Mon, Aug 17, 2020 at 3:42 AM Gleb Kanterov wrote: >> >>> Didn't see proposal by Alex before today. I want to add a few more cen

Re: Percentile metrics in Beam

2020-08-17 Thread Gleb Kanterov
I didn't see the proposal by Alex before today. I want to add a few more cents from my side. There is a paper, Moment-based quantile sketches for efficient high cardinality aggregation queries [1]; the TL;DR is that for some N (around 10-20 depending on accuracy) we need to collect SUM(log^N(X)) ... log(X),
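As a rough illustration of the collection step mentioned in the message above, the sketch below keeps running sums of powers of log(x) up to a chosen order and supports merging two accumulators. It assumes that log^N(X) means (log X)^N and that inputs are strictly positive. It covers only the accumulation of log power sums, not any further statistics the paper may require and not the quantile estimation itself. The class name and API are hypothetical and are not part of Beam's metrics.

```java
// Hypothetical illustration class; accumulates SUM(log(x)^k) for k = 1..maxOrder.
class LogMomentAccumulator {
    private final double[] logPowerSums; // index k-1 holds SUM(log(x)^k)
    private long count;

    LogMomentAccumulator(int maxOrder) {
        if (maxOrder < 1) {
            throw new IllegalArgumentException("maxOrder must be >= 1");
        }
        this.logPowerSums = new double[maxOrder];
    }

    /** Adds one strictly positive observation. */
    void add(double x) {
        if (!(x > 0)) {
            throw new IllegalArgumentException("x must be positive");
        }
        double logX = Math.log(x);
        double power = 1.0;
        for (int k = 0; k < logPowerSums.length; k++) {
            power *= logX; // power is now log(x)^(k+1)
            logPowerSums[k] += power;
        }
        count++;
    }

    /** Merges another accumulator of the same order; sums are simply added. */
    void merge(LogMomentAccumulator other) {
        if (other.logPowerSums.length != logPowerSums.length) {
            throw new IllegalArgumentException("orders differ");
        }
        for (int k = 0; k < logPowerSums.length; k++) {
            logPowerSums[k] += other.logPowerSums[k];
        }
        count += other.count;
    }

    /** Returns SUM(log(x)^order) for order in 1..maxOrder. */
    double logPowerSum(int order) {
        return logPowerSums[order - 1];
    }

    long count() {
        return count;
    }

    public static void main(String[] args) {
        LogMomentAccumulator left = new LogMomentAccumulator(3);
        LogMomentAccumulator right = new LogMomentAccumulator(3);
        left.add(2.0);
        left.add(3.0);
        right.add(5.0);
        left.merge(right);
        System.out.println(left.count());
        System.out.println(left.logPowerSum(1));
    }
}
```

Because merging only adds fixed-size arrays, accumulators built on different workers can be combined in any grouping.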

Re: [ANNOUNCE] New committer: Reza Ardeshir Rokni

2020-09-17 Thread Gleb Kanterov
Congratulations! On Tue, Sep 15, 2020 at 5:44 PM Ismaël Mejía wrote: > Congratulations Reza, well done ! > > On Mon, Sep 14, 2020 at 10:10 AM Katarzyna Kucharczyk > wrote: > > > > Congratulations Reza! :) > > > > On Mon, Sep 14, 2020 at 10:05 AM Alexey Romanenko < > aromanenko@gmail.com>

Re: [ANNOUNCE] New PMC Member: Alexey Romanenko

2020-06-17 Thread Gleb Kanterov
Congratulations! Thanks for your hard work On Wed, Jun 17, 2020 at 1:11 PM Alexey Romanenko wrote: > Thank you Ismaël and everybody! > Happy to be a part of Beam community! > > On 17 Jun 2020, at 09:31, Jan Lukavský wrote: > > Congrats Alexey! > On 6/17/20 9:22 AM, Reza Rokni wrote: > >

Re: Chronically flaky tests

2020-07-16 Thread Gleb Kanterov
I agree with what Ahmet is saying. I can share my perspective: recently I had to retrigger a build 6 times due to flaky tests, and each retrigger took one hour of waiting time. I've seen examples of automatic tracking of flaky tests, where a test is considered flaky if it both fails and succeeds for

Re: Chronically flaky tests

2020-07-16 Thread Gleb Kanterov
There is something called test-retry-gradle-plugin [1]. It retries tests if they fail, and has different modes to handle flaky tests. Did we ever try or consider using it? [1]: https://github.com/gradle/test-retry-gradle-plugin On Thu, Jul 16, 2020 at 1:15 PM Gleb Kanterov wrote: > I ag

Re: [ANNOUNCE] New PMC Member: Chamikara Jayalath

2021-01-22 Thread Gleb Kanterov
Congratulations! On Fri, Jan 22, 2021 at 9:29 AM Ismaël Mejía wrote: > Congrats Cham, well deserved! > > > On Fri, Jan 22, 2021 at 9:02 AM Michał Walenia > wrote: > >> Congratulations, Cham! Thanks for your work! >> >> >> On Fri, Jan 22, 2021 at 3:13 AM Charles Chen wrote: >> >>> Congrats

Re: [DISCUSSION] Docker based development environment issue

2021-05-21 Thread Gleb Kanterov
Is it possible to mount the Docker socket inside the build-env Docker container? We run a lot of similar tests in CI, and it always worked: --mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock On Fri, May 21, 2021 at 12:26 PM Alexey Romanenko wrote: > Hello, > > Beam