[SQL] Reconciling Beam SQL Environments with Calcite Schema

2018-04-23 Thread Andrew Pilloud
I'm working on updating our Beam DDL code to use the DDL execution
functionality that recently merged into Calcite core. This enables us to
take advantage of Calcite JDBC as a way to use Beam SQL. As part of that, I
need to reconcile the Beam SQL Environments with the Calcite Schema (which
is Calcite's environment). We currently have copies of our tables in the
Beam meta/store, Calcite Schema, BeamSqlEnv, and BeamQueryPlanner. I have a
pending PR which merges the latter two to just use the Calcite Schema copy.
Merging the Beam MetaStore and Calcite Schema isn't as simple. I have two
options I'm looking for feedback on:

1. Make Calcite Schema authoritative and demote MetaStore to be something
more like a Calcite TableFactory. Calcite Schema already implements the
semantics of our InMemoryMetaStore. If the Store interface is just
overbuilt, this approach would result in a significant reduction in code.
This would, however, eliminate the CRUD part of the interface, leaving just
the buildBeamSqlTable function.

2. Pass the Beam MetaStore into Calcite wrapped with a class translating to
Calcite Schema (like we do already with tables). Instead of copying tables
into the Calcite Schema, we would pass in the Beam meta/store as the source
of truth, and Calcite would manipulate tables directly in the Beam
meta/store. This is a bit more complicated but retains the ability for DDL
operations to be processed by a custom MetaStore.
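
To make the trade-off concrete, here is a minimal Java sketch of the two
options. Every type in it (TableDef, TableFactory, AuthoritativeSchema,
MetaStore, InMemoryMetaStore, MetaStoreSchemaView) is a simplified,
hypothetical stand-in written for illustration. None of them is the real
Beam or Calcite interface, and buildTable only plays the role that
buildBeamSqlTable has in the description above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins only: these are NOT the real Beam or Calcite types.
class SchemaSketch {
  /** Simplified table definition: a name plus a type string such as "text". */
  static final class TableDef {
    final String name;
    final String type;
    TableDef(String name, String type) {
      this.name = name;
      this.type = type;
    }
  }

  /** Option 1: the store is reduced to a factory (role of buildBeamSqlTable). */
  interface TableFactory {
    String buildTable(TableDef def);
  }

  /** Option 1: the schema owns the only table map and handles DDL itself. */
  static final class AuthoritativeSchema {
    private final Map<String, TableDef> tables = new LinkedHashMap<>();
    private final TableFactory factory;
    AuthoritativeSchema(TableFactory factory) { this.factory = factory; }
    void createTable(TableDef def) { tables.put(def.name, def); }
    void dropTable(String name) { tables.remove(name); }
    String getTable(String name) {
      TableDef def = tables.get(name);
      return def == null ? null : factory.buildTable(def);
    }
  }

  /** Option 2: the store keeps its CRUD methods and stays the source of truth. */
  interface MetaStore extends TableFactory {
    void createTable(TableDef def);
    void dropTable(String name);
    TableDef lookup(String name);
  }

  static final class InMemoryMetaStore implements MetaStore {
    private final Map<String, TableDef> tables = new LinkedHashMap<>();
    @Override public void createTable(TableDef def) { tables.put(def.name, def); }
    @Override public void dropTable(String name) { tables.remove(name); }
    @Override public TableDef lookup(String name) { return tables.get(name); }
    @Override public String buildTable(TableDef def) { return def.type + ":" + def.name; }
  }

  /** Option 2: schema-shaped wrapper that forwards every call and copies nothing. */
  static final class MetaStoreSchemaView {
    private final MetaStore store;
    MetaStoreSchemaView(MetaStore store) { this.store = store; }
    void createTable(TableDef def) { store.createTable(def); }
    void dropTable(String name) { store.dropTable(name); }
    String getTable(String name) {
      TableDef def = store.lookup(name);
      return def == null ? null : store.buildTable(def);
    }
  }

  public static void main(String[] args) {
    // Option 1: DDL only touches the schema; the factory just builds tables.
    AuthoritativeSchema schema = new AuthoritativeSchema(def -> def.type + ":" + def.name);
    schema.createTable(new TableDef("orders", "text"));
    System.out.println(schema.getTable("orders"));

    // Option 2: DDL issued against the view lands in the custom store.
    InMemoryMetaStore store = new InMemoryMetaStore();
    MetaStoreSchemaView view = new MetaStoreSchemaView(store);
    view.createTable(new TableDef("orders", "text"));
    System.out.println(store.lookup("orders") != null);
  }
}
```

In this sketch, option 1 leaves the store with a single method, while option
2 keeps create/drop on the store so a custom implementation still sees every
DDL operation.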

Thoughts?

Andrew


Review by Apr 26 [Proposal] Fn API : Defining and adding SDK Metrics

2018-04-23 Thread Alex Amato
Hello,

I have created a new thread for this updated document. I will finalize the
document on Apr 26 and begin writing PRs next week. I have taken feedback
and incorporated those ideas into the proposal.

I have listed some areas for discussion in the document:
https://s.apache.org/beam-fn-api-metrics

Today, I will be updating one part of the document. I will be researching
and proposing a new DistributionMetric format that is compatible with
metrics collection systems such as Stackdriver and Dropwizard. The existing
proto we had is not compatible with these systems, as it specified nothing
about bucketing the distribution into a histogram.
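
As a rough illustration of what bucketing adds on top of summary statistics,
here is a minimal Java sketch. The class name BucketedDistribution, the
explicit-boundary scheme, and the field names are assumptions made for this
example only. This is not the format the document proposes, and it is not an
existing Beam class.

```java
import java.util.Arrays;

// Hypothetical sketch: NOT the proto from the proposal and not a Beam class.
class BucketedDistribution {
  private final double[] bounds;      // sorted bucket boundaries chosen by the caller
  private final long[] bucketCounts;  // bounds.length + 1 buckets (last one is overflow)
  private long count;
  private double sum;
  private double min = Double.POSITIVE_INFINITY;
  private double max = Double.NEGATIVE_INFINITY;

  BucketedDistribution(double... bounds) {
    this.bounds = bounds.clone();
    Arrays.sort(this.bounds);
    this.bucketCounts = new long[this.bounds.length + 1];
  }

  /** Records one value in the summary fields and in exactly one bucket. */
  void update(double value) {
    count++;
    sum += value;
    min = Math.min(min, value);
    max = Math.max(max, value);
    int i = 0;
    // Bucket i holds values v with bounds[i-1] <= v < bounds[i].
    while (i < bounds.length && value >= bounds[i]) {
      i++;
    }
    bucketCounts[i]++;
  }

  long count() { return count; }
  double sum() { return sum; }
  double min() { return min; }
  double max() { return max; }
  long[] bucketCounts() { return bucketCounts.clone(); }

  public static void main(String[] args) {
    BucketedDistribution latencyMs = new BucketedDistribution(10, 100, 1000);
    for (double v : new double[] {3, 42, 57, 250, 4000}) {
      latencyMs.update(v);
    }
    System.out.println(Arrays.toString(latencyMs.bucketCounts()));
  }
}
```

The per-bucket counts are the part that count, sum, min, and max alone
cannot provide, since a histogram cannot be rebuilt from those four values.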


Re: Request to become contributor and have BEAM-4096 assigned

2018-04-23 Thread Jan Peuker
Thank you so much!

This one seems architecturally not possible but I'll try to help with
others.

On Mon, 23 Apr 2018, 20:03 Ismaël Mejía wrote:

> Done, assigned the ticket. You now have the contributor role so you can
> self-assign other tickets in the future.
> Welcome!
>
>
> On Mon, Apr 23, 2018 at 5:34 AM, Jean-Baptiste Onofré wrote:
>
>> Welcome aboard !
>>
>> Pretty sure a PMC member will do the change in Jira quickly
>> (unfortunately I don't have a laptop with me, still on vacation for a
>> couple of days)
>>
>> Regards
>> JB
>> On 23 Apr 2018, at 07:03, Jan Peuker wrote:
>>>
>>> Hi Beam Developer community,
>>>
>>> this is Jan, I am a Strategic Cloud Engineer for Google in Singapore and
>>> would love to contribute to Apache Beam if possible.
>>>
>>> Could someone assign me (janpeuker) the Contributor role in JIRA [1] so
>>> that I can take BEAM-4096, some minor enhancements in the API to support
>>> ValueProviders?
>>>
>>> The majority of the code work is done and I'm preparing a properly
>>> tested PR now.
>>>
>>> Thank you very much, looking forward to working with you,
>>>
>>> jan
>>>
>>> [1] If I understand
>>> https://beam.apache.org/contribute/contribution-guide/#jira-issue-tracker
>>> correctly
>>>
>>> --
>>> Jan Peuker | Google Cloud SCE | janpeu...@google.com | Singapore
>>>
>>>
>



Re: Request to become contributor and have BEAM-4096 assigned

2018-04-23 Thread Ismaël Mejía
Done, assigned the ticket. You now have the contributor role so you can
self-assign other tickets in the future.
Welcome!


On Mon, Apr 23, 2018 at 5:34 AM, Jean-Baptiste Onofré wrote:

> Welcome aboard !
>
> Pretty sure a PMC member will do the change in Jira quickly (unfortunately
> I don't have a laptop with me, still on vacation for a couple of days)
>
> Regards
> JB
> On 23 Apr 2018, at 07:03, Jan Peuker wrote:
>>
>> Hi Beam Developer community,
>>
>> this is Jan, I am a Strategic Cloud Engineer for Google in Singapore and
>> would love to contribute to Apache Beam if possible.
>>
>> Could someone assign me (janpeuker) the Contributor role in JIRA [1] so
>> that I can take BEAM-4096, some minor enhancements in the API to support
>> ValueProviders?
>>
>> The majority of the code work is done and I'm preparing a properly tested
>> PR now.
>>
>> Thank you very much, looking forward to working with you,
>>
>> jan
>>
>> [1] If I understand
>> https://beam.apache.org/contribute/contribution-guide/#jira-issue-tracker
>> correctly
>>
>> --
>> Jan Peuker | Google Cloud SCE | janpeu...@google.com | Singapore
>>
>>


performance tests of spark fail

2018-04-23 Thread Etienne Chauchot
Hi guys,

I noticed a failure in the performance tests job for Spark (I did not take a
look at the others). It seems to be related to a schema update in the
BigQuery output.

BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r2527a0e444514f2b_0162f128db2b_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

I opened a ticket to track the issue:
https://issues.apache.org/jira/browse/BEAM-4153

Best

Etienne