[SQL] Reconciling Beam SQL Environments with Calcite Schema
I'm working on updating our Beam DDL code to use the DDL execution functionality that recently merged into core Calcite. This enables us to take advantage of Calcite JDBC as a way to use Beam SQL. As part of that I need to reconcile the Beam SQL Environments with the Calcite Schema (which is Calcite's environment). We currently have copies of our tables in the Beam meta/store, Calcite Schema, BeamSqlEnv, and BeamQueryPlanner. I have a pending PR which merges the latter two to just use the Calcite Schema copy. Merging the Beam MetaStore and Calcite Schema isn't as simple. I have two options I'm looking for feedback on:

1. Make Calcite Schema authoritative and demote MetaStore to be something more like a Calcite TableFactory. Calcite Schema already implements the semantics of our InMemoryMetaStore. If the Store interface is just overbuilt, this approach would result in a significant reduction in code. It would, however, eliminate the CRUD part of the interface, leaving just the buildBeamSqlTable function.

2. Pass the Beam MetaStore into Calcite wrapped with a class translating to Calcite Schema (like we do already with tables). Instead of copying tables into the Calcite Schema, we would pass in Beam meta/store as the source of truth and Calcite would manipulate tables directly in the Beam meta/store. This is a bit more complicated but retains the ability for DDL operations to be processed by a custom MetaStore.

Thoughts?

Andrew
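
To make option 2 concrete, here is a minimal sketch of a schema view that delegates every lookup and DDL operation to a backing store, so no table copies are kept. All types below (TableDefSketch, MetaStoreSketch, SchemaViewSketch, MetaStoreBackedSchema) are hypothetical stand-ins written for illustration. They are not the real Beam MetaStore or Calcite Schema interfaces, whose method names and signatures differ.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a table definition; not the real Beam class.
final class TableDefSketch {
  final String name;
  final String type;

  TableDefSketch(String name, String type) {
    this.name = name;
    this.type = type;
  }
}

// Hypothetical stand-in for the CRUD part of a MetaStore.
interface MetaStoreSketch {
  void createTable(TableDefSketch table);
  void dropTable(String name);
  TableDefSketch getTable(String name);
  Set<String> tableNames();
}

// Simple in-memory store, playing the role of an InMemoryMetaStore.
final class InMemoryMetaStoreSketch implements MetaStoreSketch {
  private final Map<String, TableDefSketch> tables = new LinkedHashMap<>();

  public void createTable(TableDefSketch table) {
    if (tables.containsKey(table.name)) {
      throw new IllegalArgumentException("Table already exists: " + table.name);
    }
    tables.put(table.name, table);
  }

  public void dropTable(String name) {
    if (tables.remove(name) == null) {
      throw new IllegalArgumentException("No such table: " + name);
    }
  }

  public TableDefSketch getTable(String name) {
    return tables.get(name);
  }

  public Set<String> tableNames() {
    // Return a copy so callers cannot mutate the store through the set.
    return new LinkedHashSet<>(tables.keySet());
  }
}

// Hypothetical stand-in for the planner-facing schema (the role Calcite Schema plays).
interface SchemaViewSketch {
  TableDefSketch lookupTable(String name);
  Set<String> listTableNames();
  void executeCreate(TableDefSketch table);
  void executeDrop(String name);
}

// Option 2: the schema holds no tables of its own and forwards everything to the store.
final class MetaStoreBackedSchema implements SchemaViewSketch {
  private final MetaStoreSketch store;

  MetaStoreBackedSchema(MetaStoreSketch store) {
    this.store = store;
  }

  public TableDefSketch lookupTable(String name) {
    return store.getTable(name);
  }

  public Set<String> listTableNames() {
    return store.tableNames();
  }

  public void executeCreate(TableDefSketch table) {
    store.createTable(table); // DDL lands in the store, the single source of truth
  }

  public void executeDrop(String name) {
    store.dropTable(name);
  }
}

final class MetaStoreSchemaDemo {
  public static void main(String[] args) {
    MetaStoreSketch store = new InMemoryMetaStoreSketch();
    SchemaViewSketch schema = new MetaStoreBackedSchema(store);
    schema.executeCreate(new TableDefSketch("orders", "text"));
    System.out.println(store.tableNames());
  }
}
```

Because the wrapper keeps no state, a custom store implementation would see every CREATE and DROP issued through the schema. Under option 1 the wrapper and the CRUD methods would go away, leaving only a table-building function.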
Review by Apr 26 [Proposal] Fn API : Defining and adding SDK Metrics
Hello, I have created a new thread for this updated document. I will finalize this document on Apr 26 and begin writing PRs next week. I have taken feedback and incorporated these ideas into the proposal. I have listed some areas for discussion above. https://s.apache.org/beam-fn-api-metrics Today, I will be updating one part of the document. I will be researching and proposing a new DistributionMetric format, to be compatible with metrics collection systems such as Stackdriver and Dropwizard. The existing proto we had is not compatible with these systems, as it specified nothing about bucketing the distribution into a histogram.
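
As an illustration of what "bucketing the distribution into a histogram" means, here is a minimal sketch of a distribution that keeps per-bucket counts next to the usual count, sum, min and max. The class name BucketedDistributionSketch and the explicit-bounds layout are assumptions made for this example. This is not the format proposed in the document, and it is not the Stackdriver or Dropwizard representation.

```java
import java.util.Arrays;

// Hypothetical sketch: a distribution with explicit bucket boundaries.
// Bucket i counts values v with bounds[i-1] <= v < bounds[i];
// the last bucket counts everything >= the largest bound.
final class BucketedDistributionSketch {
  private final double[] bounds; // strictly increasing upper bounds
  private final long[] counts;   // one more bucket than bounds (overflow)
  private long count;
  private double sum;
  private double min = Double.POSITIVE_INFINITY;
  private double max = Double.NEGATIVE_INFINITY;

  BucketedDistributionSketch(double... bounds) {
    for (int i = 1; i < bounds.length; i++) {
      if (bounds[i] <= bounds[i - 1]) {
        throw new IllegalArgumentException("Bounds must be strictly increasing");
      }
    }
    this.bounds = bounds.clone();
    this.counts = new long[bounds.length + 1];
  }

  // Records one value; NaN handling is out of scope for this sketch.
  void update(double value) {
    int bucket = 0;
    while (bucket < bounds.length && value >= bounds[bucket]) {
      bucket++;
    }
    counts[bucket]++;
    count++;
    sum += value;
    min = Math.min(min, value);
    max = Math.max(max, value);
  }

  long[] bucketCounts() { return counts.clone(); }
  long count() { return count; }
  double sum() { return sum; }
  double min() { return min; }
  double max() { return max; }
}

final class BucketedDistributionDemo {
  public static void main(String[] args) {
    BucketedDistributionSketch d = new BucketedDistributionSketch(10.0, 100.0);
    for (double v : new double[] {1, 5, 10, 50, 500}) {
      d.update(v);
    }
    System.out.println(Arrays.toString(d.bucketCounts()));
  }
}
```

A format carrying only count, sum, min and max cannot be converted into this shape after the fact, so the bucket layout has to be part of what the SDK reports.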
Re: Request to become contributor and have BEAM-4096 assigned
Thank you so much! This one seems architecturally not possible but I'll try to help with others.

On Mon, 23 Apr 2018, 20:03 Ismaël Mejía wrote:
> Done, assigned the ticket. You now have the contributor role so you can
> self-assign other tickets in the future.
> Welcome!
>
> On Mon, Apr 23, 2018 at 5:34 AM, Jean-Baptiste Onofré wrote:
>> Welcome aboard!
>>
>> Pretty sure a PMC member will do the change in Jira quickly
>> (unfortunately I don't have a laptop with me, still on vacation for a couple
>> of days)
>>
>> Regards
>> JB
>>
>> On 23 Apr 2018, at 07:03, Jan Peuker wrote:
>>> Hi Beam Developer community,
>>>
>>> this is Jan, I am a Strategic Cloud Engineer for Google in Singapore and
>>> would love to contribute to Apache Beam if possible.
>>>
>>> Could someone assign me (janpeuker) the Contributor role in JIRA [1] so
>>> that I can take BEAM-4096, some minor enhancements in the API to support
>>> ValueProviders?
>>>
>>> The majority of the code work is done and I'm preparing a properly
>>> tested PR now.
>>>
>>> Thank you very much, looking forward to work with you,
>>>
>>> jan
>>>
>>> [1] If I understand
>>> https://beam.apache.org/contribute/contribution-guide/#jira-issue-tracker
>>> correctly
>>>
>>> --
>>> Jan Peuker | Google Cloud SCE | janpeu...@google.com | Singapore
Re: Request to become contributor and have BEAM-4096 assigned
Done, assigned the ticket. You now have the contributor role so you can self-assign other tickets in the future. Welcome!

On Mon, Apr 23, 2018 at 5:34 AM, Jean-Baptiste Onofré wrote:
> Welcome aboard!
>
> Pretty sure a PMC member will do the change in Jira quickly (unfortunately
> I don't have a laptop with me, still on vacation for a couple of days)
>
> Regards
> JB
>
> On 23 Apr 2018, at 07:03, Jan Peuker wrote:
>> Hi Beam Developer community,
>>
>> this is Jan, I am a Strategic Cloud Engineer for Google in Singapore and
>> would love to contribute to Apache Beam if possible.
>>
>> Could someone assign me (janpeuker) the Contributor role in JIRA [1] so
>> that I can take BEAM-4096, some minor enhancements in the API to support
>> ValueProviders?
>>
>> The majority of the code work is done and I'm preparing a properly tested
>> PR now.
>>
>> Thank you very much, looking forward to work with you,
>>
>> jan
>>
>> [1] If I understand
>> https://beam.apache.org/contribute/contribution-guide/#jira-issue-tracker
>> correctly
>>
>> --
>> Jan Peuker | Google Cloud SCE | janpeu...@google.com | Singapore
performance tests of spark fail
Hi guys,

I noticed a failure in the performance tests job for Spark (I did not take a look at the others). It seems to be related to a schema update in the BigQuery output:

BigQuery error in load operation: Error processing job 'apache-beam-testing:bqjob_r2527a0e444514f2b_0162f128db2b_1': Invalid schema update. Field timestamp has changed type from TIMESTAMP to FLOAT

I opened a ticket to track the issue: https://issues.apache.org/jira/browse/BEAM-4153

Best,
Etienne