This is an automated email from the ASF dual-hosted git repository.
amaliujia pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git
The following commit(s) were added to refs/heads/master by this push:
new 92e92bc Update SQL BigQuery doc
new 11c60b8 Merge pull request #10260 from 11moon11/UpdateBigQueryDoc
92e92bc is described below
commit 92e92bc0b8fb01b9395e6480480a81832a86111f
Author: kirillkozlov <[email protected]>
AuthorDate: Mon Dec 2 16:11:16 2019 -0800
Update SQL BigQuery doc
---
.../dsls/sql/extensions/create-external-table.md | 23 ++++++++++++++++++----
1 file changed, 19 insertions(+), 4 deletions(-)
diff --git a/website/src/documentation/dsls/sql/extensions/create-external-table.md b/website/src/documentation/dsls/sql/extensions/create-external-table.md
index 81d7dae..2489bb3 100644
--- a/website/src/documentation/dsls/sql/extensions/create-external-table.md
+++ b/website/src/documentation/dsls/sql/extensions/create-external-table.md
@@ -89,18 +89,33 @@ tableElement: columnName fieldType [ NOT NULL ]
CREATE EXTERNAL TABLE [ IF NOT EXISTS ] tableName (tableElement [,
tableElement ]*)
TYPE bigquery
LOCATION '[PROJECT_ID]:[DATASET].[TABLE]'
+TBLPROPERTIES '{"method": "DEFAULT"}'
```
-* `LOCATION:`Location of the table in the BigQuery CLI format.
- * `PROJECT_ID`: ID of the Google Cloud Project
- * `DATASET`: BigQuery Dataset ID
- * `TABLE`: BigQuery Table ID within the Dataset
+* `LOCATION`: Location of the table in the BigQuery CLI format.
+ * `PROJECT_ID`: ID of the Google Cloud Project.
+ * `DATASET`: BigQuery Dataset ID.
+ * `TABLE`: BigQuery Table ID within the Dataset.
+* `TBLPROPERTIES`:
+  * `method`: Optional. Read method to use. The following options are available:
+    * `DEFAULT`: Used when no method is specified. Currently uses `EXPORT`.
+    * `DIRECT_READ`: Use the BigQuery Storage API.
+    * `EXPORT`: Export data to Google Cloud Storage in Avro format and read data files from that location.
### Read Mode
Beam SQL supports reading columns with simple types (`simpleType`) and arrays
of simple
types (`ARRAY<simpleType>`).
+When reading using the `EXPORT` method, the following pipeline options should be set:
+* `project`: ID of the Google Cloud Project.
+* `tempLocation`: Bucket to store intermediate data in. Ex: `gs://temp-storage/temp`.
+
+When reading using the `DIRECT_READ` method, the optimizer will attempt to perform
+project and predicate push-down, potentially reducing the time required to
+read the data from BigQuery.
+
+More information about the BigQuery Storage API can be found [here](https://beam.apache.org/documentation/io/built-in/google-bigquery/#storage-api).
+
### Write Mode
If the table does not exist, Beam creates the table specified in location when
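Taken together, the syntax documented in this change could be exercised with a sketch like the following; the project, dataset, table, and column names are hypothetical placeholders, not part of the commit:

```
CREATE EXTERNAL TABLE users (id INTEGER, name VARCHAR)
TYPE bigquery
LOCATION 'my-project:my_dataset.users'
TBLPROPERTIES '{"method": "DIRECT_READ"}'
```

With `"method": "DIRECT_READ"`, reads go through the BigQuery Storage API and the optimizer may push projections and predicates down to BigQuery. If `"method": "EXPORT"` (or the `DEFAULT` setting) is used instead, the pipeline must also be run with the `project` and `tempLocation` options set, per the documentation above, e.g. a bucket such as `gs://temp-storage/temp`.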