JDBC support for Beam SQL

2018-05-16 Thread Andrew Pilloud
I'm currently adding JDBC support to Beam SQL! Unfortunately Calcite has two distinct entry points, one for JDBC and one for everything else (see CALCITE-1525). Eventually that will change, but I'd like to avoid having two versions of Beam SQL until Calcite converges on a single path for parsing

Re: [SQL] Cross Join Operation

2018-05-15 Thread Andrew Pilloud
Calcite does not have the concept of a "CROSS JOIN". It shows up in the plan as a LogicalJoin with condition=[true]. We could try rejecting the cross join at the planning stage by returning null for them in BeamJoinRule.convert(), which might result in a different plan. But looking at your query,
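To illustrate the idea in the snippet above, here is a toy Python model, not Calcite or Beam code. The names `LogicalJoin`, `PhysicalJoin`, `is_cross_join` and `convert` are hypothetical stand-ins. It assumes only what the message states: a cross join appears as a join whose condition is the literal `true`, and a conversion rule can decline a node by returning nothing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LogicalJoin:
    """Toy stand-in for a logical join node (hypothetical, not Calcite's class)."""
    left: str
    right: str
    condition: str  # the string "true" models condition=[true]


@dataclass(frozen=True)
class PhysicalJoin:
    """Toy stand-in for the physical join a conversion rule would produce."""
    left: str
    right: str
    condition: str


def is_cross_join(join: LogicalJoin) -> bool:
    # A cross join has no real predicate: its condition is the literal true.
    return join.condition.strip().lower() == "true"


def convert(join: LogicalJoin) -> Optional[PhysicalJoin]:
    # Returning None models "this rule declines the node", which leaves
    # the planner free to look for a different plan.
    if is_cross_join(join):
        return None
    return PhysicalJoin(join.left, join.right, join.condition)
```

In this model a join on `a.id = b.id` converts normally, while a join with condition `true` is declined.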

Re: Fwd: Closing (automatically?) inactive pull requests

2018-05-14 Thread Andrew Pilloud
Warnings are really helpful; I've forgotten about PRs before on projects I rarely contribute to. Also, authors can reopen their closed pull requests if they decide they want to work on them again. This seems to be already covered in the Stale pull requests section of the contributor guide. Seems

Re: Graal instead of docker?

2018-05-11 Thread Andrew Pilloud
JSON and Protobuf aren't the same thing. JSON is for exchanging unstructured data; Protobuf is for exchanging structured data. The point of Portability is to define a protocol for exchanging structured messages across languages. What do you propose using on top of JSON to define message structure?
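As a rough sketch of the point above: JSON parsing alone gives back untyped values, so the message structure has to be defined and checked by an extra layer. The `WorkRequest` message, its fields, and `parse_work_request` below are hypothetical and are not part of any Beam protocol. With Protobuf, a schema definition would play the role of the hand-written checks.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkRequest:
    """Hypothetical structured message, used only for illustration."""
    id: int
    payload: str


# The "schema" that plain JSON does not carry: field names and exact types.
_EXPECTED_FIELDS = {"id": int, "payload": str}


def parse_work_request(raw: str) -> WorkRequest:
    data = json.loads(raw)  # yields untyped dicts, lists, strings, numbers
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if set(data) != set(_EXPECTED_FIELDS):
        raise ValueError("unexpected or missing fields: %s" % sorted(data))
    for name, expected_type in _EXPECTED_FIELDS.items():
        # Exact type check, so that JSON true/false is not accepted as an int.
        if type(data[name]) is not expected_type:
            raise ValueError("field %r has the wrong type" % name)
    return WorkRequest(id=data["id"], payload=data["payload"])
```

Every language that exchanges this message would need an equivalent set of checks, which is the gap a shared schema language fills.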

Re: Documenting Github PR jenkins trigger phrases

2018-05-10 Thread Andrew Pilloud
It would be great to have the set of "Run {Java,Python,Go} PreCommit" documented in the contributors guide as well. Those match up to the jobs that run automatically on every PR and are the ones I use most. There is no security; anyone can run them, including 'Run Seed Job'. That one seems like a good one to

Re: Merging our two SQL parser configs

2018-05-09 Thread Andrew Pilloud
Haven't heard anything, so I wrote up the change: https://github.com/apache/beam/pull/5325 Andrew On Mon, May 7, 2018 at 3:16 PM Andrew Pilloud <apill...@google.com> wrote: > So we have two incompatible SQL parser configs in Beam. One is in > BeamQueryPlanner > <https://gith

Re: Jenkins Post Commit Status to Github

2018-05-09 Thread Andrew Pilloud
: > https://builds.apache.org/job/beam_PostCommit_Python_Verify/4909/console > > On Wed, May 9, 2018 at 12:07 PM Andrew Pilloud <apill...@google.com> > wrote: > >> Post commits are no longer failing on status pushes. Now that I know >> about the seed job, I'll figur

Re: Jenkins Post Commit Status to Github

2018-05-09 Thread Andrew Pilloud
. > > On Wed, May 9, 2018 at 11:10 AM Andrew Pilloud <apill...@google.com> > wrote: > >> I've not heard of seed jobs before, but from what I've been told I need >> to create a PR with an empty '.test-infra/jenkins' folder then type 'Run >> Seed Job' in a comment t

Re: Jenkins Post Commit Status to Github

2018-05-09 Thread Andrew Pilloud
Kenn tells me there is a button he can push to run it. He clicked it. Hopefully that fixes the postcommits. I don't know why Jenkins itself is having high latency but I've seen the same thing over the last few days. On Wed, May 9, 2018 at 11:09 AM Andrew Pilloud <apill...@google.com>

Re: Jenkins Post Commit Status to Github

2018-05-09 Thread Andrew Pilloud
>>>> It seems as though precommits are no longer triggering and trigger >>>> requests like 'Run Java PreCommit' are no longer honored. >>>> >>>> On Wed, May 9, 2018 at 10:22 AM Andrew Pilloud <apill...@google.com> >>>> wrote:

Re: Jenkins Post Commit Status to Github

2018-05-09 Thread Andrew Pilloud
I broke all the post commits with this. Sorry! It has been reverted. I'm going to follow up with Apache Infra about getting the right credentials configured on the Jenkins plugin. Andrew On Tue, May 8, 2018 at 1:38 PM Andrew Pilloud <apill...@google.com> wrote: > Yep, mess with t

Re: Jenkins Post Commit Status to Github

2018-05-08 Thread Andrew Pilloud
com> wrote: > I think you want to mess with the groovy scripts in .test-infra/jenkins > > Kenn > > On Mon, May 7, 2018 at 11:12 AM Andrew Pilloud <apill...@google.com> > wrote: > >> The Github branches page shows the status of the latest commit on each >> b

Merging our two SQL parser configs

2018-05-07 Thread Andrew Pilloud
So we have two incompatible SQL parser configs in Beam. One is in BeamQueryPlanner which is used by

Jenkins Post Commit Status to Github

2018-05-07 Thread Andrew Pilloud
The Github branches page shows the status of the latest commit on each branch and provides a set of links to the jobs run on that commit. But it doesn't appear Jenkins is publishing status from post commit jobs. This seems like a simple oversight that should be easy to fix. Could someone point me

Re: Graal instead of docker?

2018-05-05 Thread Andrew Pilloud
ai 2018 08:43, "Reuven Lax" <re...@google.com> wrote: >> >> >> I don't believe we enforce docker anywhere. In fact if someone wanted >> to >> run an all-windows beam cluster, they would probably not use docker for >> their runner (docker runs on W

Re: Pubsub to Beam SQL

2018-05-04 Thread Andrew Pilloud
ed is specific to each source. That way custom >> timestamp option seem like they belong in TBLPROPERTIES. E.g. for KafkaIO, >> it could specify "logAppendTime", "createTime", or "processingTime" etc >> (though I am not sure how user can provide

Re: [PROPOSAL] Preparing 2.5.0 release next week

2018-05-04 Thread Andrew Pilloud
Spanner is also broken, and post commits are failing. I've added the issue as a blocker. https://issues.apache.org/jira/browse/BEAM-4229 Andrew On Fri, May 4, 2018 at 1:24 PM Charles Chen wrote: > I have added https://issues.apache.org/jira/browse/BEAM-4236 as a blocker. > >

Re: [SQL] Reconciling Beam SQL Environments with Calcite Schema

2018-05-04 Thread Andrew Pilloud
Reviews are wrapping up, this will probably merge Monday if I don't hear from anyone else. One more TableProvider API change after review feedback: getTables now returns Map<String, Table> instead of Set. Andrew On Thu, May 3, 2018 at 10:41 AM Andrew Pilloud <apill...@google.com>
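One way to picture the API change mentioned above, where getTables returns a name-to-table map instead of a set, is the toy Python model below. The real TableProvider is a Java interface in Beam. The `Table` fields and the `InMemoryTableProvider` class here are hypothetical and only show why keying by name is convenient for lookups.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Table:
    """Hypothetical table description, used only for illustration."""
    name: str
    type: str


class InMemoryTableProvider:
    """Toy provider that keeps its tables keyed by name."""

    def __init__(self) -> None:
        self._tables: Dict[str, Table] = {}

    def create_table(self, table: Table) -> None:
        self._tables[table.name] = table

    def get_tables(self) -> Dict[str, Table]:
        # Return a copy so that callers cannot mutate the provider's state.
        return dict(self._tables)


provider = InMemoryTableProvider()
provider.create_table(Table(name="orders", type="text"))

# With a map, resolving a table by name is a direct lookup;
# with a set, the caller would have to scan every element.
orders = provider.get_tables()["orders"]
```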

Re: Pubsub to Beam SQL

2018-05-03 Thread Andrew Pilloud
ic mechanism to extract and map the event timestamp to > the schema. This is, of course, if we don't automatically add a magic > timestamp field which Beam SQL can populate behind the scenes and add to > the schema. I want to avoid this magic path for now. > > On Thu, May 3, 2018 a

Re: Google Summer of Code Project Intro

2018-05-03 Thread Andrew Pilloud
Hi Kai, Glad to hear someone is putting more work into benchmarking Beam SQL! It would be really cool if we had some of these running as nightly performance test jobs so we would know when there is a performance regression. This might be out of scope of your project, but keep it in mind. I am

Re: Pubsub to Beam SQL

2018-05-03 Thread Andrew Pilloud
This sounds awesome! Is event timestamp something that we need to specify for every source? If so, I would suggest we add this as a first-class option on CREATE TABLE rather than something hidden in TBLPROPERTIES. Andrew On Wed, May 2, 2018 at 10:30 AM Anton Kedin wrote: >

Re: [SQL] Reconciling Beam SQL Environments with Calcite Schema

2018-05-03 Thread Andrew Pilloud
Ok, I've finished with this change. Didn't get reviews on the early cleanup PRs, so I've pushed all these changes into the first cleanup PR: https://github.com/apache/beam/pull/5224 Andrew On Tue, May 1, 2018 at 10:35 AM Andrew Pilloud <apill...@google.com> wrote: > I'm just startin

Re: Jenkins: can a job execute concurrently on multiple nodes?

2018-05-02 Thread Andrew Pilloud
These jobs also require Dataflow which has various quotas on resource usage. I hit these while working on the Dataflow Nexmark tests for SQL. I'm not sure what the quota is on the account that Jenkins uses, but the default quota will max out at around 2 concurrent jobs. Andrew On Wed, May 2,

Re: [SQL] Reconciling Beam SQL Environments with Calcite Schema

2018-05-01 Thread Andrew Pilloud
to support things like >> Apache Atlas and HCatalog, IIUC for the "create if needed" logic when using >> Beam SQL to create a derived data set. But I don't think we should build >> out those code paths until we have at least one non-in-memory >> implementation.

Re: Merge options in Github UI are confusing

2018-04-24 Thread Andrew Pilloud
>> On Monday, April 16, 2018 at 22:19 +, Robert Bradshaw wrote: >>> >>> >>> +1, though I'll admit I've been an occasional user of the "squash and >>> merge" button when a small PR has a huge number of small, fixup changes >>> piled on i

[SQL] Reconciling Beam SQL Environments with Calcite Schema

2018-04-23 Thread Andrew Pilloud
I'm working on updating our Beam DDL code to use the DDL execution functionality that recently merged into core Calcite. This enables us to take advantage of Calcite JDBC as a way to use Beam SQL. As part of that I need to reconcile the Beam SQL Environments with the Calcite Schema (which is

Merge options in Github UI are confusing

2018-04-16 Thread Andrew Pilloud
*The Github UI provides several options for merging a PR hidden behind the “Merge pull request” button. Only the “Create a merge commit” option does what most users expect, which is to merge by creating a new merge commit. This is the option recommended in the Beam committer’s guide, but it is not

Re: SQL in Python SDK

2018-04-13 Thread Andrew Pilloud
Hi Gabor, Are Python UDFs (User-defined functions) something that might work for you? If all you really need to write in Python is your DoFn this is probably your best option. It is still a bit of work but we support Java UDFs today, so all you would need to do is write a Java wrapper to call

Re: Beam7 Outage

2018-04-12 Thread Andrew Pilloud
They all seem flaky over the past few days. I just hit one on beam1: java.io.IOException: Backing channel 'beam1' is disconnected. https://builds.apache.org/job/beam_PreCommit_Java_GradleBuild/4068/console Could there be some load issue from the Gradle changes? Andrew On Thu, Apr 12, 2018 at

Re: New beam contributor experience?

2018-03-14 Thread Andrew Pilloud
To add more to what Anton said, the 'mvn clean verify' step takes hours and fails frequently due to bad tests. I spent the first few days working with Beam trying to figure out what was wrong with my system when I was just hitting test flaps. If we're going to Gradle that would be a great place to
