Re: [PROPOSAL] Add support for writing flattened schemas to pubsub

2020-01-28 Thread Brian Hulette
I filed a few Jira issues to track the follow-up work we discussed here: BEAM-9208 [1] - Add support for mapping columns to pubsub message attributes in flat schemas DDL; BEAM-9209 [2] - Add support for mapping columns to pubsub message event_timestamp when using flat schemas DDL; BEAM-9210 [3] - Deprecat

Re: [PROPOSAL] Add support for writing flattened schemas to pubsub

2019-11-21 Thread Brian Hulette
A PR is up here [1]. Gleb: If I understand what you're saying, I think it's already implemented the way you're describing - PubsubIOJsonTable [2] is just a thin wrapper that connects PubsubIO with Beam SQL tables. Alex/Kenn: I agree with everything you've said :) The hard-coded event_timestamp is

Re: [PROPOSAL] Add support for writing flattened schemas to pubsub

2019-11-18 Thread Kenneth Knowles
I like Alex's syntax suggestion. Very readable. In addition to tables defined via DDL, we also have a metastore abstraction that currently supports Hive Metastore and Google's Data Catalog. We should think about how something like what Alex describes could be served by these systems. Kenn On Sun,

Re: [PROPOSAL] Add support for writing flattened schemas to pubsub

2019-11-17 Thread Reza Rokni
+1 to reduced boilerplate for basic things folks want to do with SQL. I like Alex's use of Option for more advanced use cases. On Sun, 17 Nov 2019 at 20:17, Gleb Kanterov wrote: > Expanding on what Kenn said regarding having fewer dependencies on SQL. > Can the whole thing be seen as extending P

Re: [PROPOSAL] Add support for writing flattened schemas to pubsub

2019-11-17 Thread Gleb Kanterov
Expanding on what Kenn said regarding having fewer dependencies on SQL. Can the whole thing be seen as extending PubsubIO, which would implement most of the logic from the proposal given column annotations, with a thin layer that connects it to Beam SQL tables? On Sun, Nov 17, 2019 at

Re: [PROPOSAL] Add support for writing flattened schemas to pubsub

2019-11-17 Thread Alex Van Boxel
I like it, but I'm worried about the magic event_timestamp injection. Wouldn't explicit injection via an option be a better approach: CREATE TABLE people ( my_timestamp TIMESTAMP *OPTION(ref="pubsub:event_timestamp")*, my_id VARCHAR *OPTION(ref="pubsub:attributes['id_name']")*, name VA

Re: [PROPOSAL] Add support for writing flattened schemas to pubsub

2019-11-16 Thread Kenneth Knowles
Big +1 from me. Nice explanation. This makes a lot of sense. Much simpler to understand with fewer magic strings. It also makes the Beam SQL connector less dependent on newer SQL features that are simply less widespread. I'm not too surprised that Calcite's nested row support lags behind the rest

[PROPOSAL] Add support for writing flattened schemas to pubsub

2019-11-13 Thread Brian Hulette
I've been looking into adding support for writing (i.e. INSERT INTO statements) for the pubsub DDL, which currently only supports reading. This DDL requires the defined schema to have exactly three fields: event_timestamp, attributes, and payload, corresponding to the fields in PubsubMessage (event