[
https://issues.apache.org/jira/browse/BEAM-8743?focusedWorklogId=346134&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-346134
]
ASF GitHub Bot logged work on BEAM-8743:
----------------------------------------
Author: ASF GitHub Bot
Created on: 19/Nov/19 17:50
Start Date: 19/Nov/19 17:50
Worklog Time Spent: 10m
Work Description: TheNeuralBit commented on pull request #10158:
[BEAM-8743] Add support for flat schemas in pubsub
URL: https://github.com/apache/beam/pull/10158
This PR adds support for flat schemas in pubsub topics as discussed in this
[mailing list
thread](https://lists.apache.org/thread.html/bf4c37f21bda194d7f8c40f6e7b9a776262415755cc1658412af3c76@%3Cdev.beam.apache.org%3E).
With this change the following SQL query, which filters messages from one
pubsub topic and writes the result to another, is possible:
```sql
CREATE TABLE people (
event_timestamp TIMESTAMP,
name VARCHAR,
age INTEGER
)
TYPE 'pubsub'
LOCATION 'projects/my-project/topics/my-topic'
CREATE TABLE eligible_voters ...
INSERT INTO eligible_voters (
SELECT
name
FROM people
WHERE age >= 18
)
```
The change maintains support for the original `event_timestamp, payload,
attributes` style of table definition, but if a definition has a schema that
doesn't follow that pattern exactly, the new flattened approach is used.
A summary of the changes:
- Separated out the AutoValue-based configuration out of `PubsubIOJsonTable`
into `PubsubJsonTableProvider.PubsubIOTableConfiguration`. This configuration
is now referenced throughout the affected classes.
- Modified `PubsubMessageToRow` so it can read a flat schema.
- Add `RowToPubsubMessage` which (currently) only supports flat schemas.
Used in `PubsubIOJsonTable#buildIOWriter`.
The changes rely on the `JsonMatcher` added in #10094
Post-Commit Tests Status (on master branch)
------------------------------------------------------------------------------------------------
Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
--- | --- | --- | --- | --- | --- | --- | ---
Go | [](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
| --- | --- | [](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
| --- | --- | [](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
Java | [](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
| [](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
| [](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
| [](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)<br>[](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)<br>[](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
| [](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
| [](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
| [](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)<br>[](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)
Python | [](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)<br>[](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)<br>[](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)<br>[](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/)
| --- | [](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)<br>[](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
| [](https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Cron/lastCompletedBuild/)<br>[](https://builds.apache.org/job/beam_PostCommit_Python35_VR_Flink/lastCompletedBuild/)
| --- | --- | [](https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/)
XLang | --- | --- | --- | [](https://builds.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/)
| --- | --- | ---
Pre-Commit Tests Status (on master branch)
------------------------------------------------------------------------------------------------
--- |Java | Python | Go | Website
--- | --- | --- | --- | ---
Non-portable | [](https://builds.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/)
| [](https://builds.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/)<br>[](https://builds.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/)
| [](https://builds.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/)
| [](https://builds.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/)
Portable | --- | [](https://builds.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/)
| --- | ---
See
[.test-infra/jenkins/README](https://github.com/apache/beam/blob/master/.test-infra/jenkins/README.md)
for trigger phrase, status and link of all Jenkins jobs.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 346134)
Remaining Estimate: 0h
Time Spent: 10m
> Add support for flat schemas in pubsub
> --------------------------------------
>
> Key: BEAM-8743
> URL: https://issues.apache.org/jira/browse/BEAM-8743
> Project: Beam
> Issue Type: Improvement
> Components: dsl-sql
> Reporter: Brian Hulette
> Assignee: Brian Hulette
> Priority: Major
> Fix For: 2.18.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> See
> https://lists.apache.org/thread.html/bf4c37f21bda194d7f8c40f6e7b9a776262415755cc1658412af3c76@%3Cdev.beam.apache.org%3E
--
This message was sent by Atlassian Jira
(v8.3.4#803005)