Re: [DISCUSS] Backwards compatibility of @Experimental features

2019-08-12 Thread Anton Kedin
Concrete user feedback: https://stackoverflow.com/questions/57453473/was-the-beamrecord-type-removed-from-apache-beam/57463708#57463708 Short version: we moved BeamRecord from Beam SQL to core Beam and renamed it to Row (still @Experimental, BTW). But we never mentioned it anywhere where it would

Re: [Update] Beam 2.15 Release Progress

2019-08-07 Thread Anton Kedin
Perf regression is seemingly gone now. If this is caused by a PR we might want to find out which one and cherry-pick it into the release. Regards, Anton On Tue, Aug 6, 2019 at 4:52 PM Yifan Zou wrote: > Hi, > > There is a perf regression on SQL Query3 on dataflow runner. This was > treated as

Re: [ANNOUNCE] New committer: Kyle Weaver

2019-08-06 Thread Anton Kedin
Congrats! On Tue, Aug 6, 2019, 9:37 AM Ankur Goenka wrote: > Congratulations Kyle! > > On Tue, Aug 6, 2019 at 9:35 AM Ahmet Altay wrote: > >> Hi, >> >> Please join me and the rest of the Beam PMC in welcoming a new committer: >> Kyle >> Weaver. >> >> Kyle has been contributing to Beam for a

Re: [ANNOUNCE] New committer: Rui Wang

2019-08-06 Thread Anton Kedin
Congrats! On Tue, Aug 6, 2019, 9:36 AM Ankur Goenka wrote: > Congratulations Rui! > Well deserved  > > On Tue, Aug 6, 2019 at 9:35 AM Ahmet Altay wrote: > >> Hi, >> >> Please join me and the rest of the Beam PMC in welcoming a new committer: Rui >> Wang. >> >> Rui has been an active

Perf regression

2019-08-06 Thread Anton Kedin
I noticed there is a perf regression that appeared on Nexmark dashboard on July 30. It seems to be limited to SQL Query3 and most obvious in Dataflow runner. Direct runner shows a slight increase as well but Spark runner doesn't seem to be affected. I looked at the history of changes of Beam SQL

[RESULT] [VOTE] Release 2.14.0, release candidate #1

2019-07-31 Thread Anton Kedin
I'm happy to announce that we have unanimously approved this release. There are 7 approving votes, 4 of which are binding (in order): * Ahmet (al...@google.com); * Robert (rober...@google.com); * Pablo (pabl...@google.com); * Ismaël (ieme...@gmail.com); There are no disapproving votes. Thanks

Re: [VOTE] Release 2.14.0, release candidate #1

2019-07-30 Thread Anton Kedin
be any special prerequisites for this. >>>>>>>> Things the script does including: >>>>>>>> 1. download the python rc in zip >>>>>>>> 2. start virtualenv and install the sdk. >>>>>>>>

Re: [VOTE] Release 2.14.0, release candidate #1

2019-07-26 Thread Anton Kedin
Cool, will make the post and will update the release guide as well then On Fri, Jul 26, 2019 at 10:20 AM Chad Dombrova wrote: > I think the release guide needs to be updated to remove the optionality of >> blog creation and avoid confusion. Thanks for pointing that out. >> > > +1 > >

Re: [VOTE] Release 2.14.0, release candidate #1

2019-07-26 Thread Anton Kedin
Rui Wang wrote: > >> Tried to verify RC1 by running Nexmark on Dataflow but found it's broken >> (at least based on commands from Running+Nexmark >> <https://cwiki.apache.org/confluence/display/BEAM/Running+Nexmark>). >> Will try to debug it and rerun the process. >>

[VOTE] Release 2.14.0, release candidate #1

2019-07-25 Thread Anton Kedin
Hi everyone, Please review and vote on the release candidate #1 for the version 2.14.0, as follows: [ ] +1, Approve the release [ ] -1, Do not approve the release (please provide specific comments) The complete staging area is available for your review, which includes: * JIRA release notes [1], *

Re: [2.14.0] Release Progress Update

2019-07-25 Thread Anton Kedin
Planning to send out the RC1 within the next couple of hours. Regards, Anton On Thu, Jul 25, 2019 at 1:21 PM Pablo Estrada wrote: > Hi Anton, > are there updates on the release? > Thanks! > -P. > > On Fri, Jul 19, 2019 at 12:33 PM Anton Kedin wrote: > >> Verific

Re: How to run DynamoDBIOTest?

2019-07-19 Thread Anton Kedin
ava version? > Adding Cam to the discussion since he contributed this feature to see > if he may have any extra context. > > On Fri, Jul 19, 2019 at 7:15 PM Anton Kedin wrote: > > > > Hi dev@, > > > > Does anyone know if there's anything extra needed to run > `Dy

Re: [2.14.0] Release Progress Update

2019-07-19 Thread Anton Kedin
-validate it when the issue is resolved. Regards, Anton On Thu, Jul 18, 2019 at 8:54 AM Anton Kedin wrote: > All cherry-picks are merged, blocker jiras closed, running the > verification build. > > On Mon, Jul 15, 2019 at 4:53 PM Ahmet Altay wrote: > >> Anton, any updates on

How to run DynamoDBIOTest?

2019-07-19 Thread Anton Kedin
Hi dev@, Does anyone know if there's anything extra needed to run `DynamoDBIOTest`? If I do `./gradlew :sdks:java:io:amazon-web-services:build --debug` it passes a few tests during `:test` but then seems to sit on `DynamoDBIOTest` forever. No errors, last meaningful log is `INFO: Container

Re: [2.14.0] Release Progress Update

2019-07-18 Thread Anton Kedin
All cherry-picks are merged, blocker jiras closed, running the verification build. On Mon, Jul 15, 2019 at 4:53 PM Ahmet Altay wrote: > Anton, any updates on this release? Do you need help? > > On Fri, Jun 28, 2019 at 11:42 AM Anton Kedin wrote: > >> I have been running vali

Re: [ANNOUNCE] New committer: Robert Burke

2019-07-16 Thread Anton Kedin
Congrats! On Tue, Jul 16, 2019 at 10:36 AM Ankur Goenka wrote: > Congratulations Robert! > > Go GO! > > On Tue, Jul 16, 2019 at 10:34 AM Rui Wang wrote: > >> Congrats! >> >> >> -Rui >> >> On Tue, Jul 16, 2019 at 10:32 AM Udi Meiri wrote: >> >>> Congrats Robert B.! >>> >>> On Tue, Jul 16, 2019

Re: [2.14.0] Release Progress Update

2019-06-28 Thread Anton Kedin
Kedin wrote: > Not much progress today. Debugging build issues when running global > `./gradlew build -PisRelease --scan` > > Regards, > Anton > > On Thu, Jun 20, 2019 at 4:12 PM Anton Kedin wrote: > >> Published the snapshots, working through the verify_release_

Re: Change of Behavior - JDBC Set Command

2019-06-27 Thread Anton Kedin
I think we thought about this approach but decided to get rid of the map representation wherever we can while still supporting setting of the options by name. One of the less important downsides of keeping the map around is that we will need to do `fromArgs` at least twice. Another downside is

Spotless exclusions

2019-06-26 Thread Anton Kedin
Currently our spotless is configured globally [1] (for java at least) to include all source files by '**/*.java'. And then we exclude things explicitly. Don't know why, but these exclusions are ignored for me sometimes, for example `./gradlew :sdks:java:core:spotlessJavaCheck` always fails when

Golang dependencies in .test-infra/tools

2019-06-25 Thread Anton Kedin
Hi, I am trying to verify the release and seeing failures when running `./gradlew :beam-test-tools:build` (it is run as part of the global build). The problem seems to be that it fails to cache one of the dependencies: ``` .gogradle/project_gopath/src/

Re: [2.14.0] Release Progress Update

2019-06-21 Thread Anton Kedin
Not much progress today. Debugging build issues when running global `./gradlew build -PisRelease --scan` Regards, Anton On Thu, Jun 20, 2019 at 4:12 PM Anton Kedin wrote: > Published the snapshots, working through the verify_release_validation > script > > Got another blocker

Re: [ANNOUNCE] New committer: Mikhail Gryzykhin

2019-06-21 Thread Anton Kedin
Congrats! On Fri, Jun 21, 2019 at 3:55 AM Reza Rokni wrote: > Congratulations! > > On Fri, 21 Jun 2019, 12:37 Robert Burke, wrote: > >> Congrats >> >> On Fri, Jun 21, 2019, 12:29 PM Thomas Weise wrote: >> >>> Hi, >>> >>> Please join me and the rest of the Beam PMC in welcoming a new >>>

Re: [2.14.0] Release Progress Update

2019-06-20 Thread Anton Kedin
Published the snapshots, working through the verify_release_validation script Got another blocker to be cherry-picked when merged: https://issues.apache.org/jira/browse/BEAM-7603 Regards, Anton On Wed, Jun 19, 2019 at 4:17 PM Anton Kedin wrote: > I have cut the release branch for 2.1

[2.14.0] Release Progress Update

2019-06-19 Thread Anton Kedin
I have cut the release branch for 2.14.0 and am working through the release process. Next step is building the snapshot and release branch verification. There are two issues [1] that are still not resolved that are marked as blockers at the moment: * [2] BEAM-7478 - remote cluster submission from

Re: [Final Reminder] Beam 2.14 release branch will be cut tomorrow at 6pm UTC

2019-06-19 Thread Anton Kedin
M%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20fixVersion%20%3D%202.14.0 > > > On Wed, Jun 19, 2019 at 3:19 AM Chamikara Jayalath > wrote: > > > > > > > > On Tue, Jun 18, 2019 at 6:00 PM Anton Kedin wrote: > >> > >

Re: [Final Reminder] Beam 2.14 release branch will be cut tomorrow at 6pm UTC

2019-06-18 Thread Anton Kedin
wse/BEAM-7424 was > marked as a blocker and we'd like to get the fix to Python SDK into the > 2.14 release. > > Thanks, > Cham > > On Tue, Jun 18, 2019 at 4:16 PM Anton Kedin wrote: > >> It's a reminder, I am planning to cut the release branch tomorrow, on >> Wednesd

[Final Reminder] Beam 2.14 release branch will be cut tomorrow at 6pm UTC

2019-06-18 Thread Anton Kedin
It's a reminder, I am planning to cut the release branch tomorrow, on Wednesday, June 19, at 11am PDT (Seattle local time, corresponds to [19:00@GMT+1] and [18:00@UTC]). Please make sure all the code you want in the release is submitted by that time, and that all blocking Jiras have the release
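The time zone arithmetic in the reminder above can be checked with a small sketch. It assumes the `America/Los_Angeles` zone ID for "Seattle local time" and the year 2019 from the thread date; the class and method names are made up for illustration.

```java
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

class BranchCutTime {

  /** Wednesday, June 19, 2019, 11:00 in Seattle (assumed zone ID: America/Los_Angeles). */
  static ZonedDateTime cutTimeInSeattle() {
    return ZonedDateTime.of(2019, 6, 19, 11, 0, 0, 0, ZoneId.of("America/Los_Angeles"));
  }

  /** Same instant expressed in UTC. */
  static ZonedDateTime toUtc(ZonedDateTime time) {
    return time.withZoneSameInstant(ZoneOffset.UTC);
  }

  /** Same instant expressed at a fixed offset of GMT+1. */
  static ZonedDateTime toGmtPlusOne(ZonedDateTime time) {
    return time.withZoneSameInstant(ZoneOffset.ofHours(1));
  }

  public static void main(String[] args) {
    ZonedDateTime cut = cutTimeInSeattle();
    System.out.println(cut.getDayOfWeek());          // day of week of the cut date
    System.out.println(toUtc(cut).toLocalTime());    // cut time in UTC
    System.out.println(toGmtPlusOne(cut).toLocalTime()); // cut time at GMT+1
  }
}
```

PDT is UTC-7 in June, so 11am PDT is 18:00 UTC and 19:00 at GMT+1, matching the times quoted in the reminder.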

Re: GitHub checks not running

2019-06-17 Thread Anton Kedin
They are getting triggered now. On Mon, Jun 17, 2019 at 9:10 AM Anton Kedin wrote: > Hi dev@, > > Does anyone have context on why the checks might not get triggered on pull > requests today? E.g. https://github.com/apache/beam/pull/8822 > > Regards, > Anton >

GitHub checks not running

2019-06-17 Thread Anton Kedin
Hi dev@, Does anyone have context on why the checks might not get triggered on pull requests today? E.g. https://github.com/apache/beam/pull/8822 Regards, Anton

[Reminder] Beam 2.14 Release to be cut on Wed, June 19 at 6pm UTC

2019-06-17 Thread Anton Kedin
It's a reminder, I am planning to cut the release branch on Wednesday, June 19, at 11am PDT (Seattle local time, corresponds to [19:00@GMT+1] and [18:00@UTC]). Please make sure all the code you want in the release is submitted by that time, and that all blocking Jiras have the release version

[SQL] Let's split the TableProvider

2019-06-14 Thread Anton Kedin
Hi dev@, and especially anyone interested in SQL, We have an interface called TableProvider (and some other related classes) in Beam SQL that manages how we resolve the table schemas, construct IOs and do other related and unrelated things when parsing the queries. At the moment it feels very

[Reminder] Beam 2.14.0 Release Soon

2019-06-12 Thread Anton Kedin
Reminder, the plan is to cut the branch a week from now, on June 19th. Please mark all release blocking issues with fix version 2.14. Thank you, Anton [1] https://calendar.google.com/calendar/embed?src=0p73sl034k80oob7seouanigd0%40group.calendar.google.com [2]

Re: [DISCUSS] Portability representation of schemas

2019-06-07 Thread Anton Kedin
The topic of schema registries probably does not block the design and implementation of logical types and portable schemas by themselves, however I think we should spend some time discussing it (probably in a separate thread) so that all SDKs have similar mechanisms for schema registration and

Re: [PROPOSAL] Preparing for Beam 2.14.0 release

2019-06-06 Thread Anton Kedin
>> >>> +1 >>> >>> On Thu, Jun 6, 2019, 9:13 AM Ahmet Altay wrote: >>> >>>> +1, thank you for keeping the cadence. >>>> >>>> On Thu, Jun 6, 2019 at 9:04 AM Anton Kedin wrote: >>>> >>>>> Hello

[PROPOSAL] Preparing for Beam 2.14.0 release

2019-06-06 Thread Anton Kedin
Hello Beam community! Beam 2.14 release branch cut date is June 19 according to the release calendar [1]. I would like to volunteer myself to do this release. The plan is to cut the branch on that date, and cherrypick fixes if needed. If you have release blocking issues for 2.14 please mark

Re: 1 Million Lines of Code (1 MLOC)

2019-05-31 Thread Anton Kedin
And to reduce the effort of future rewrites we should start doing it on a schedule. I propose we start over once a week :) On Fri, May 31, 2019 at 4:02 PM Lukasz Cwik wrote: > 1 million lines is too much, time to delete the entire project and start > over again, :-) > > On Fri, May 31, 2019 at

Re: SqlTransform Metadata

2019-05-14 Thread Anton Kedin
Reza, can you share more thoughts on how you think this can work end-to-end? Currently the approach is that populating the rows with the data happens before the SqlTransform, and within the query you can only use the things that are already in the rows or in the catalog/schema (or built-in

Re: Unexpected behavior of StateSpecs

2019-05-09 Thread Anton Kedin
Does it look similar to https://issues.apache.org/jira/browse/BEAM-6813 ? I also stumbled on a problem with a state in DirectRunner but wasn't able to figure it out yet: https://lists.apache.org/thread.html/dae8b605a218532c085a0eea4e71338eae51922c26820f37b24875c0@%3Cdev.beam.apache.org%3E

Re: Pipeline options validation

2019-04-30 Thread Anton Kedin
Java8 Optional is not serializable. I think this may be a blocker. Or not? Regards, Anton On Tue, Apr 30, 2019 at 12:18 PM Lukasz Cwik wrote: > The migration to requiring @Nullable on methods that could take/return > null didn't update PipelineOptions contract and its validation to respect >
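The point about `java.util.Optional` can be checked directly. This is a minimal sketch using only the JDK; the class and method names are made up for illustration and nothing here reflects how `PipelineOptions` itself handles serialization.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Optional;

class OptionalSerializationCheck {

  /** Returns true if the given type is declared to implement Serializable. */
  static boolean isSerializableType(Class<?> type) {
    return Serializable.class.isAssignableFrom(type);
  }

  /** Attempts Java serialization; returns false if the value is rejected as not serializable. */
  static boolean canJavaSerialize(Object value) throws IOException {
    try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
      out.writeObject(value);
      return true;
    } catch (NotSerializableException e) {
      return false;
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println(isSerializableType(Optional.class));     // Optional does not implement Serializable
    System.out.println(canJavaSerialize(Optional.of("value"))); // writeObject rejects the Optional wrapper
    System.out.println(canJavaSerialize("value"));              // the wrapped String alone serializes fine
  }
}
```

Any class that keeps an `Optional` in a non-transient field and relies on Java serialization would hit the same `NotSerializableException`, which is presumably why this was raised as a possible blocker.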

Re: Sharing plan to support complex equi-join condition in BeamSQL

2019-04-26 Thread Anton Kedin
Thank you for sharing this. This is a great overview. Left a few comments in the doc. Regards, Anton On Fri, Apr 26, 2019 at 10:12 AM Rui Wang wrote: > Hi Community, > > TL;DR: > > BeamSQL only supports equi-join, and its join condition can only be forms > of `col_a = col_b` or `col_a = col_b

Re: [PROPOSAL] Preparing for Beam 2.13.0 release

2019-04-26 Thread Anton Kedin
Following Ankur's link I see a "[+]GoogleCalendar" button in the bottom right corner of the page. Clicking it opens the google calendar and prompts to add the Beam Calendar (at least in Chrome). Ismael, do you have a similar button in your case? [image: image.png] Regards, Anton On Fri, Apr

Re: Removing Java Reference Runner code

2019-04-26 Thread Anton Kedin
If there are no plans to invest in ULR then it makes sense to remove it. Going forward, however, I think we should try to document the higher level approach we're taking with runners (and portability) now that we have something working and can reflect on it. For example, a couple of things that are

Re: Dependency management for multiple IOs

2019-02-19 Thread Anton Kedin
>>> >>>>> On Fri, Feb 15, 2019 at 6:06 PM Chamikara Jayalath < >>>>> chamik...@google.com> wrote: >>>>> >>>>>> I think the underlying problem is two modules of Beam transitively >>>>>> depending on conf

Dependency management for multiple IOs

2019-02-15 Thread Anton Kedin
Hi dev@, I have a problem, I don't know a good way to approach the dependency management between Beam SQL and Beam IOs, and want to collect thoughts about it. Beam SQL depends on specific IOs so that users can query them. The IOs need their dependencies to work. Sometimes the IOs also leak their

[SQL] External schema providers

2019-02-14 Thread Anton Kedin
Hi dev@, A quick update about a new Beam SQL feature. In short, we have wired up the support for plugging table providers through Beam SQL API to allow obtaining table schemas from external sources. *What does it even mean?* Previously, in Java pipelines, you could apply a Beam SQL query to

Re: Findbugs -> Spotbugs ?

2019-01-31 Thread Anton Kedin
It would be nice. How fast is it on Beam codebase? Regards, Anton On Thu, Jan 31, 2019 at 10:38 AM Udi Meiri wrote: > +1 for spotbugs > > On Thu, Jan 31, 2019 at 5:09 AM Gleb Kanterov wrote: > >> Agree, spotbugs brings static checks that aren't covered in error-prone, >> it's a good addition.

Re: [ANNOUNCE] New committer announcement: Gleb Kanterov

2019-01-25 Thread Anton Kedin
Congrats! On Fri, Jan 25, 2019 at 8:54 AM Ismaël Mejía wrote: > Well deserved, congratulations Gleb! > > On Fri, Jan 25, 2019 at 10:47 AM Etienne Chauchot > wrote: > > > > Congrats Gleb and welcome onboard ! > > > > Etienne > > > > Le vendredi 25 janvier 2019 à 10:39 +0100, Alexey Romanenko a

Re: compileJava broken on master see: BEAM-6495

2019-01-23 Thread Anton Kedin
On Wed, Jan 23, 2019 at 11:13 AM Alex Amato wrote: > Okay, make sense perhaps we can somehow make it fail when it fails to > generate the dep, rather than when compiling the java code later on > > On Wed, Jan 23, 2019 at 11:12 AM Anton Kedin wrote: > >> ParserImpl is autogenerate

Re: compileJava broken on master see: BEAM-6495

2019-01-23 Thread Anton Kedin
ParserImpl is autogenerated by Calcite at build time. It seems that there's a race condition there and it sometimes fails. Rerunning the build works for me. Regards, Anton On Wed, Jan 23, 2019, 11:06 AM Alex Amato wrote: > https://jira.apache.org/jira/browse/BEAM-6495?filter=-2 > > Any ideas,

Re: Why does Beam not use the google-api-client libraries?

2019-01-02 Thread Anton Kedin
I don't have enough context to answer all of the questions, but looking at PubsubIO it seems to use the official libraries, e.g. see Pubsub doc [1] vs Pubsub IO GRPC client [2]. Correct me if I misunderstood your question. [1]

Re: [RFC] I made a new tabbed Beam view in Jenkins

2018-12-18 Thread Anton Kedin
This is really helpful, didn't realize it was possible. Categories and contents look reasonable. I think something like this definitely should be the top-level Beam view. Regards, Anton On Tue, Dec 18, 2018 at 12:05 PM Kenneth Knowles wrote: > Hi all, > > I made a new view to split Beam builds

Re: [DISCUSS] Structuring Java based DSLs

2018-11-30 Thread Anton Kedin
I think this approach makes sense in general, Euphoria can be the implementation detail of SQL, similar to Join Library or core SDK Schemas. I wonder though whether it would be better to bring Euphoria closer to core SDK first, maybe even merge them together. If you look at Reuven's recent work

Re: Design review for supporting AutoValue Coders and conversions to Row

2018-11-15 Thread Anton Kedin
One reason is that @AutoValue is not guaranteed to be retained at runtime: https://github.com/google/auto/blob/master/value/src/main/java/com/google/auto/value/AutoValue.java#L44 On Thu, Nov 15, 2018 at 11:36 AM Kenneth Knowles wrote: > Just some low-level detail: If there is no @DefaultSchema
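To illustrate why retention matters here, the sketch below uses two hypothetical annotations (not AutoValue's own) and shows that only the one declared with `RetentionPolicy.RUNTIME` is visible to reflection. It does not assert which retention policy `@AutoValue` actually declares; the linked source line is the authority on that.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

class RetentionCheck {

  // Hypothetical annotation: recorded in the class file but not exposed to reflection.
  @Retention(RetentionPolicy.CLASS)
  @interface BuildTimeOnly {}

  // Hypothetical annotation: available to reflection at runtime.
  @Retention(RetentionPolicy.RUNTIME)
  @interface KeptAtRuntime {}

  @BuildTimeOnly
  @KeptAtRuntime
  static class Example {}

  /** Returns true if reflection can see the annotation on the given type. */
  static boolean visibleAtRuntime(Class<?> type, Class<? extends Annotation> annotation) {
    return type.isAnnotationPresent(annotation);
  }

  public static void main(String[] args) {
    System.out.println(visibleAtRuntime(Example.class, BuildTimeOnly.class)); // not visible to reflection
    System.out.println(visibleAtRuntime(Example.class, KeptAtRuntime.class)); // visible to reflection
  }
}
```

Code that discovers schemas by reflection can therefore only key off annotations with runtime retention, which is one argument for requiring an explicit runtime-retained marker rather than inferring behaviour from a build-time annotation.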

Re: Design review for supporting AutoValue Coders and conversions to Row

2018-11-09 Thread Anton Kedin
Hi Jeff, I think this is a great idea! Thank you for working on the proposal. I left a couple of comments in the doc. Have you tried prototyping this? Regards, Anton On Fri, Nov 9, 2018 at 1:50 PM Jeff Klukas wrote: > Hi all - I'm looking for some review and commentary on a proposed design >

Stackoverflow Questions

2018-11-05 Thread Anton Kedin
Hi dev@, I was looking at stackoverflow questions tagged with `apache-beam` [1] and wanted to ask your opinion. It feels like it's easier for some users to ask questions on stackoverflow than on user@. Overall frequency between the two channels seems comparable but a lot of stackoverflow

Re: Fixing equality of Rows

2018-10-29 Thread Anton Kedin
About these specific use cases, how useful is it to support Map and List? These seem pretty exotic (maybe they aren't) and I wonder whether it would make sense to just reject them until we have a solid design. And wouldn't the same problems arise even without RowCoder? Is the path in that case to

Re: Java postcommits duration almost hit 4 hours

2018-10-12 Thread Anton Kedin
Not sure where other perf issues are coming from, but this specific BQ test suite was disabled yesterday: https://github.com/apache/beam/pull/6658 On Fri, Oct 12, 2018 at 3:20 PM Kenneth Knowles wrote: > Nice catch. Here is a build that went from 2.5 to 3 hours: >

Re: Java compilation errors

2018-10-11 Thread Anton Kedin
It's being discussed on Slack at the moment, the issue seems to be the new errorprone version which has new checks. On Thu, Oct 11, 2018 at 10:23 AM Mikhail Gryzykhin wrote: > Hi everyone, > > Just a heads up: > > I see that Java builds have compilation failures. Can someone help look > into

Re: [Proposal] Euphoria DSL - looking for reviewers

2018-10-10 Thread Anton Kedin
I think the code looks good and we should probably just merge it (unless there are other blockers, e.g. formal approvals), considering: - it has been reviewed; - it is tested and used in production; - it was discussed on the list and there were no objections to having it as part of Beam; - it

Re: Jira Integration with Github

2018-10-09 Thread Anton Kedin
Assuming this is a github-only plugin, why does it have to go through ASF? On Tue, Oct 9, 2018 at 3:20 AM Maximilian Michels wrote: > Hi Kai, > > This needs to be supported by the ASF first. So the best idea would be > to propose this to the INFRA team. Or post it on the ASF community > mailing

Java SDK Extensions

2018-10-03 Thread Anton Kedin
Hi dev@, *TL;DR:* `sdks/java/extensions` is hard to discover, navigate and understand. *Current State:* I was looking at `sdks/java/extensions`[1] and realized that I don't know what half of those things are. Only `join library` and `sorter` seem to be documented and discoverable on Beam

Re: [DISCUSS] Committer Guidelines / Hygene before merging PRs

2018-09-28 Thread Anton Kedin
Is there an actual problem caused by squashing or not squashing the commits that we face in the project? I personally have never needed to revert something complicated that would be problematic either way (and don't have a strong opinion about which way we should do it). From what I see so far in

Re: [ANNOUNCEMENT] New Beam chair: Kenneth Knowles

2018-09-19 Thread Anton Kedin
Congrats! On Wed, Sep 19, 2018 at 1:36 PM Ankur Goenka wrote: > Congrats Kenn! > > On Wed, Sep 19, 2018 at 1:35 PM Amit Sela wrote: > >> Well deserved! Congrats Kenn. >> >> On Wed, Sep 19, 2018 at 4:25 PM Kai Jiang wrote: >> >>> Congrats, Kenn! >>> >>> On Wed, Sep 19, 2018 at 1:23 PM

Re: Migrating Beam SQL to Calcite's code generation

2018-09-17 Thread Anton Kedin
This is pretty amazing! Thank you for doing this! Regards, Anton On Mon, Sep 17, 2018 at 2:27 PM Andrew Pilloud wrote: > I've adapted Calcite's EnumerableCalc code generation to generate the > BeamCalc DoFn. The primary purpose behind this change is so we can take > advantage of Calcite's

Re: [VOTE] Donating the Dataflow Worker code to Apache Beam

2018-09-14 Thread Anton Kedin
+1 On Fri, Sep 14, 2018 at 3:22 PM Alan Myrvold wrote: > +1 > > On Fri, Sep 14, 2018 at 3:16 PM Boyuan Zhang wrote: > >> +1 >> >> On Fri, Sep 14, 2018 at 3:15 PM Henning Rohde wrote: >> >>> +1 >>> >>> On Fri, Sep 14, 2018 at 2:40 PM Ahmet Altay wrote: >>> +1 (binding) On Fri,

Re: [Discuss] Add EXTERNAL keyword to CREATE TABLE statement

2018-09-14 Thread Anton Kedin
> I agree on this. Enforcing users to look up documentation for the correct >> way is better than letting them use an ambiguous way that could fail their >> expectation. >> >> >> -Rui >> >> On Wed, Aug 15, 2018 at 1:46 PM Anton Kedin wrote: >&g

Re: Nexmark pseudo code in the wiki

2018-08-17 Thread Anton Kedin
Thank you! On Fri, Aug 17, 2018 at 9:44 AM Thomas Weise wrote: > Anton, you should be all set. > > On Fri, Aug 17, 2018 at 9:11 AM Anton Kedin wrote: > >> Sure, I can do that. >> Can someone give me permissions? >> >> Thank you, >> Anton >> >

Re: Nexmark pseudo code in the wiki

2018-08-17 Thread Anton Kedin
> Thanks > Etienne > > On Thursday, August 16, 2018 at 09:10 -0700, Anton Kedin wrote: > > This is nice! Thank you for publishing this! > > The only thing I would add is the pseudo-SQL versions of the queries, > similar to how they're described in the original Nexmark paper. >

Re: Nexmark pseudo code in the wiki

2018-08-16 Thread Anton Kedin
This is nice! Thank you for publishing this! The only thing I would add is the pseudo-SQL versions of the queries, similar to how they're described in the original Nexmark paper. Regards, Anton On Thu, Aug 16, 2018 at 5:57 AM Etienne Chauchot wrote: > Hi guys, > > I've also created a page on

Re: [Discuss] Add EXTERNAL keyword to CREATE TABLE statement

2018-08-15 Thread Anton Kedin
"operation >> CREATE TABLE not found".) >> >> If the goal is clarity of the operation, how about 'REGISTER EXTERNAL DATA >> SOURCE' and 'REGISTER EXTERNAL DATA SOURCE PROVIDER'? Those names remove >> the ambiguity around the operation creating and the data source be

Re: How do we run pipeline using gradle?

2018-08-15 Thread Anton Kedin
Huygaa, Not sure about existing options for WordCount specifically, but nothing stops us from having it. In SQL we have a couple of tasks to simplify launching the examples: https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/build.gradle#L149

Re: Policy for Python ValidatesRunner vs IT tests?

2018-08-14 Thread Anton Kedin
IT tests exist in java, similar to unit tests and not marked in a special way, except they're called *IT.java instead of *Test.java. They're run from corresponding tasks: -

Re: [SQL] Create External Schema

2018-08-13 Thread Anton Kedin
> On Mon, Aug 13, 2018 at 4:06 PM Anton Kedin wrote: > >> Hi, >> >> I am planning to work on implementing support for external schema >> providers for Beam SQL and wanted to share a high-level idea of how I think >> this can work. >> >> *Short

[SQL] Create External Schema

2018-08-13 Thread Anton Kedin
Hi, I am planning to work on implementing support for external schema providers for Beam SQL and wanted to share a high-level idea of how I think this can work. *Short Version* Implement CREATE FOREIGN SCHEMA statement: CREATE FOREIGN SCHEMA TYPE 'bigquery' LOCATION 'dataset_example' AS

Re: Schema Aware PCollections

2018-08-08 Thread Anton Kedin
Yes, this should be possible eventually. In fact, a limited version of this functionality is already supported for Beans (e.g. see this test

Re: [Vote] Dev wiki engine

2018-07-19 Thread Anton Kedin
+1 for Confluence On Thu, Jul 19, 2018 at 2:56 PM Andrew Pilloud wrote: > +1 Apache Confluence > > Because .md files in code repo require code review and commit. > > On Thu, Jul 19, 2018, 2:22 PM Mikhail Gryzykhin wrote: > >> Hi everyone, >> >> There is a long lasting discussion on starting

Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-07-18 Thread Anton Kedin
These dashboards look great! Can we publish the links to the dashboards somewhere, for better visibility? E.g. in the jenkins website / emails, or the wiki. Regards, Anton On Wed, Jul 18, 2018 at 10:08 AM Andrew Pilloud wrote: > Hi Etienne, > > I've been asking around and it sounds like we

Re: Permissions for confluence

2018-07-13 Thread Anton Kedin
I may be mistaken, but there was no final conclusion reached, so probably guidance from PMC will be needed where specifically to put things. I personally think that this kind of documentation is the right thing to put under cwiki/contributors. From the last thread I think Kenn, Jean-Baptiste

Re: Automatically create JIRA tickets for failing post-commit tests

2018-07-11 Thread Anton Kedin
I think this looks good, we should enable the plugin and try it out. Concrete details of the follow-up tasks (auto-assignment, triage, and dashboarding) will probably depend on how functional the plugin is and what the test failures data looks like. Regards, Anton On Wed, Jul 11, 2018 at 5:00 PM

Re: Building and visualizing the Beam SQL graph

2018-06-13 Thread Anton Kedin
s wrote: >>>> >>>>> Agree with that. It will be kind of tricky to generalize. I think >>>>> there are some criteria in this case that might apply in other cases: >>>>> >>>>> 1. Each rel node (or construct of a DSL) should have

Re: Building and visualizing the Beam SQL graph

2018-06-11 Thread Anton Kedin
Not answering the original question, but doesn't "explain" satisfy the SQL use case? Going forward we probably want to solve this in a more general way. We have at least 3 ways to represent the pipeline: - how runner executes it; - what it looks like when constructed; - what the user was

Re: [DISCUSS] Use Confluence wiki for non-user-facing stuff

2018-06-08 Thread Anton Kedin
+1 (a) we should; (b) I think it will be a good place for all of the things you list; (c) introductory things, like "getting started", or "programming guide" that people not deeply involved in the project would expect to find on beam.apache.org should stay there, not in the wiki; On Fri, Jun 8,

Re: [SQL] Unsupported features

2018-06-01 Thread Anton Kedin
This looks very helpful, thank you. Can you file Jiras for the major problems? Or maybe a single jira for the whole thing with sub-tasks for specific problems. Regards, Anton On Wed, May 30, 2018 at 9:12 AM Kenneth Knowles wrote: > This is extremely useful. Thanks for putting so much

Re: [ANNOUNCEMENT] New committers, May 2018 edition!

2018-05-31 Thread Anton Kedin
Congrats! On Thu, May 31, 2018 at 7:29 PM Kenneth Knowles wrote: > Huzzah! > > On Thu, May 31, 2018 at 7:27 PM Ahmet Altay wrote: > >> Congratulations to all of you! >> >> On Thu, May 31, 2018 at 7:26 PM, Chamikara Jayalath > > wrote: >> >>> Congrats to all three!! >>> >>> On Thu, May 31, 2018

Re: Java code under main depends on junit?

2018-05-17 Thread Anton Kedin
Opened PR to fix the current build issue, opened BEAM-4358 to extract test dependencies. Should we keep maven precommits running for now if we have to fix the issues like these? In the PR I had to fix

Re: Java code under main depends on junit?

2018-05-17 Thread Anton Kedin
My fault, I'll fix the maven issue. I added this file and it is not in test intentionally. The purpose of this class is similar to TestPipeline, in that other packages which depend on GCP IO can use this class in tests, including integration tests. For example, right now Beam SQL project depends

Re: JDBC support for Beam SQL

2018-05-16 Thread Anton Kedin
Among these options I would lean towards option 1. We already support a lot of infrastructure to call into Calcite for non-JDBC path, so adding some code to generate config does not seem like a big deal, especially if it will be a supported way at some point in Calcite. Pulling

Eventual PAssert

2018-05-14 Thread Anton Kedin
Hi, While working on an integration test for Pubsub-related functionality I couldn't find a good solution to test the pipelines that don't reliably stop. I propose we extend PAssert to support eventual verification. In this case some success/failure

Re: Pubsub to Beam SQL

2018-05-10 Thread Anton Kedin
can't hide it without changing the definition of GROUP BY. > I like Anton's proposal of adding it as an annotation in the column > definition. That seems even simpler and more user friendly. We might even > be able to get away with using the PRIMARY KEY keyword. > > > Andrew >

Re: Pubsub to Beam SQL

2018-05-04 Thread Anton Kedin
e possible. Thank you, Anton On Fri, May 4, 2018 at 9:48 AM Raghu Angadi <rang...@google.com> wrote: > On Thu, May 3, 2018 at 12:47 PM Anton Kedin <ke...@google.com> wrote: > >> I think it makes sense for the case when timestamp is provided in the >> payload (including

Complex Types Support for Beam SQL DDL

2018-05-04 Thread Anton Kedin
Hi, I am working on adding support for non-primitive types in Beam SQL DDL. *Goal* Allow users to define tables with Rows, Arrays, Maps as field types in DDL. This enables defining schemas for complex sources, e.g. describing JSON sources or other sources which support complex field types (BQ,

Re: Pubsub to Beam SQL

2018-05-03 Thread Anton Kedin
itself. On Thu, May 3, 2018 at 11:44 AM Reuven Lax <re...@google.com> wrote: > Are you planning on integrating this directly into PubSubIO, or add a > follow-on transform? > > On Wed, May 2, 2018 at 10:30 AM Anton Kedin <ke...@google.com> wrote: > >> Hi >>

Re: Pubsub to Beam SQL

2018-05-03 Thread Anton Kedin
> Andrew > > On Wed, May 2, 2018 at 10:30 AM Anton Kedin <ke...@google.com> wrote: > >> Hi >> >> I am working on adding functionality to support querying Pubsub messages >> directly from Beam SQL. >> >> *Goal* >> Provide Beam users a pure SQ

Pubsub to Beam SQL

2018-05-02 Thread Anton Kedin
Hi I am working on adding functionality to support querying Pubsub messages directly from Beam SQL. *Goal* Provide Beam users a pure SQL solution to create the pipelines with Pubsub as a data source, without the need to set up the pipelines in Java before applying the query. *High level

Re: Beam SQL Improvements

2018-04-27 Thread Anton Kedin
<rmannibu...@gmail.com> wrote: > > > Le 26 avr. 2018 23:13, "Anton Kedin" <ke...@google.com> a écrit : > > BeamRecord (Row) has very little in common with JsonObject (I assume > you're talking about javax.json), except maybe some similarities of the > A

Re: Beam SQL Improvements

2018-04-26 Thread Anton Kedin
eam SQL ? >>>>> >>>>> That will be part of schema support: generic record could be one of >>>>> the payload with across schema. >>>>> >>>>> Regards >>>>> JB >>>>> Le 26 avr. 2018, à 11:39, "Is

Re: Beam SQL Improvements

2018-04-26 Thread Anton Kedin
ël Mejía" < ieme...@gmail.com> a écrit: >>>> >>>> Hello Anton, >>>> >>>> Thanks for the descriptive email and the really useful work. Any plans >>>> to tackle PCollections of GenericRecord/IndexedRecords? it seems Av

Beam SQL Improvements

2018-04-25 Thread Anton Kedin
Hi, I want to highlight a couple of improvements to Beam SQL we have been working on recently which are targeted to make Beam SQL API easier to use. Specifically these features simplify conversion of Java Beans and JSON strings to Rows. Feel free to try this and send any bugs/comments/PRs my

Re: New beam contributor experience?

2018-03-14 Thread Anton Kedin
Not sure if it was mentioned in other threads, but it probably makes sense to add gradle instructions there. On Wed, Mar 14, 2018 at 11:48 AM Alan Myrvold wrote: > There is a contribution guide at > https://beam.apache.org/contribute/contribution-guide/ > Has anyone had

Re: slack @the-asf?

2018-03-14 Thread Anton Kedin
What's the plan for users without `@apache.org` email? The page says to contact a workspace administrator for an invitation. Will all existing users be automatically invited to the new workspace? On Wed, Mar 14, 2018 at 9:58 AM Thomas Weise wrote: > After you enter the ASF ID
