alxp1982 commented on code in PR #25301:
URL: https://github.com/apache/beam/pull/25301#discussion_r1124077006


##########
learning/tour-of-beam/learning-content/IO/big-query-io/beam-schema/java-example/Task.java:
##########
@@ -0,0 +1,133 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+// beam-playground:
+//   name: beam-schema
+//   description: BigQueryIO beam-schema example.
+//   multifile: false
+//   context_line: 56
+//   categories:
+//     - Quickstart
+//   complexity: ADVANCED
+//   tags:
+//     - hellobeam
+
+import com.fasterxml.jackson.databind.ObjectMapper;
+import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
+import org.apache.beam.sdk.io.gcp.bigquery.BigQueryOptions;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.PipelineOptionsFactory;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.MapElements;
+import org.apache.beam.sdk.util.StreamUtils;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.TypeDescriptor;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.util.Collections;
+import java.util.List;
+
+public class Task {
+
+    private static final Logger LOG = LoggerFactory.getLogger(Task.class);
+
+    public static void main(String[] args) {
+        LOG.info("Running Task");
+        System.setProperty("GOOGLE_APPLICATION_CREDENTIALS", "to\\path\\credential.json");
+        PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
+        options.setTempLocation("gs://bucket");
+        options.as(BigQueryOptions.class).setProject("project-id");
+
+        Pipeline pipeline = Pipeline.create(options);
+
+        Schema inputSchema = Schema.builder()
+                .addField("id", Schema.FieldType.INT32)
+                .addField("name", Schema.FieldType.STRING)
+                .addField("age", Schema.FieldType.INT32)
+                .build();
+
+        /*

Review Comment:
   Why is it commented?



##########
learning/tour-of-beam/learning-content/IO/big-query-io/read-query/description.md:
##########
@@ -0,0 +1,41 @@
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+### BigQuery reading with query
+
+`BigQueryIO` also lets you read the rows returned by a SQL query rather than an entire table: supply the query with `fromQuery` and Beam runs it as a `BigQuery` job, then reads the exported results. `readTableRows().fromQuery(...)` returns a `PCollection` of `BigQuery` `TableRow` objects, where each element represents a single row of the query result. `Integer` values in the `TableRow` objects are encoded as strings to match `BigQuery`'s exported JSON format. This method is convenient, but can be 2-3 times slower than `read(SerializableFunction)`.
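
As an illustration, a minimal Java sketch of reading the results of a query; the project, dataset, table, and column names below are placeholders:

```
PCollection<TableRow> rows =
    pipeline.apply(
        "ReadFromBigQueryQuery",
        BigQueryIO.readTableRows()
            .fromQuery("SELECT name, age FROM `project-id.dataset.table`")
            .usingStandardSql());
```

Each resulting `TableRow` carries only the columns selected by the query.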

Review Comment:
   Please make content specific to reading through query mechanism



##########
learning/tour-of-beam/learning-content/IO/big-query-io/beam-schema/description.md:
##########
@@ -0,0 +1,27 @@
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+### BigQuery with beam-schema
+
+The `useBeamSchema` method on `BigQueryIO.Write` tells Beam to derive the destination table schema from the Beam schema attached to the input `PCollection`, rather than requiring an explicit BigQuery `TableSchema`.
+
+When you call `useBeamSchema()`, Beam converts each schema'd element (for example, a `Row`) to a BigQuery row automatically and, if the pipeline creates the table, infers its column definitions from the Beam schema. This gives you more flexibility, since Beam's schema representation supports a rich set of field types and can be inspected and manipulated programmatically.
+
+If you do not call `useBeamSchema()`, you must describe the destination table yourself, for example with `withSchema(TableSchema)`. That can be preferable when the table layout is managed outside the pipeline and must stay compatible with other tools reading the same table.
+
+Here is an example of how you might use the `useBeamSchema` method when writing a schema'd `PCollection<Row>` to a BigQuery table:
+
+```
+rows.apply("WriteToBigQuery",
+    BigQueryIO.<Row>write().to("mydataset.outputtable").useBeamSchema());
+```
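
A more complete sketch of such a write, assuming a `pipeline` has already been created and that the table spec, field names, and sample values are placeholders:

```
// Define a Beam schema and build a schema'd PCollection<Row>.
Schema schema = Schema.builder()
    .addStringField("name")
    .addInt32Field("age")
    .build();

PCollection<Row> rows =
    pipeline.apply(
        Create.of(Row.withSchema(schema).addValues("Alice", 30).build())
            .withRowSchema(schema));

// Let BigQueryIO derive the destination table schema from the Beam schema.
rows.apply(
    "WriteToBigQuery",
    BigQueryIO.<Row>write()
        .to("project-id:dataset.output_table")
        .useBeamSchema()
        .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
        .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND));
```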

Review Comment:
   Please provide runnable example description



##########
learning/tour-of-beam/learning-content/IO/big-query-io/read-table/description.md:
##########
@@ -0,0 +1,40 @@
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+### BigQuery reading table
+
+`BigQueryIO` allows you to read from a `BigQuery` table and process the resulting rows. By default, Beam invokes a `BigQuery` export request when you apply a `BigQueryIO` read transform. In the Java SDK, `readTableRows` returns a `PCollection` of `BigQuery` `TableRow` objects, where each element represents a single row of the table. `Integer` values in the `TableRow` objects are encoded as strings to match `BigQuery`'s exported JSON format. This method is convenient, but can be 2-3 times slower than `read(SerializableFunction)`.
+
+{{if (eq .Sdk "go")}}
+
+```
+rows := bigqueryio.Read(s, bigquery.TableReference{ProjectID: projectID, DatasetID: datasetID, TableID: tableID})
+beam.ParDo0(s, &logOutput{}, rows)
+```
+{{end}}
+{{if (eq .Sdk "java")}}
+```
+PCollection<TableRow> rows =
+    pipeline
+        .apply(
+            "Read from BigQuery table",
+            BigQueryIO.readTableRows().from("tess-372508.fir.xasw"));
+```
+{{end}}
+{{if (eq .Sdk "python")}}
+```
+pipeline
+  | 'ReadTable' >> beam.io.ReadFromBigQuery(table=table_spec) \
+  | beam.Map(lambda elem: elem['max_temperature'])
+```
+{{end}}
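
For Java, a minimal end-to-end sketch that reads the table and logs each row; `gs://your-bucket/temp` and `project-id:dataset.table` are placeholders for a real temp location and table:

```
PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
// BigQuery export-based reads require a GCS temp location.
options.setTempLocation("gs://your-bucket/temp");

Pipeline pipeline = Pipeline.create(options);

PCollection<TableRow> rows =
    pipeline.apply(
        "ReadFromBigQueryTable",
        BigQueryIO.readTableRows().from("project-id:dataset.table"));

rows.apply(
    "LogRows",
    MapElements.into(TypeDescriptors.strings())
        .via((TableRow row) -> {
          // TableRow#toString renders the row's JSON representation.
          System.out.println(row);
          return row.toString();
        }));

pipeline.run().waitUntilFinish();
```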

Review Comment:
   Please add runnable example description



##########
learning/tour-of-beam/learning-content/IO/big-query-io/read-query/description.md:
##########
@@ -0,0 +1,41 @@
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+### BigQuery reading with query

Review Comment:
   Reading BigQuery query results



##########
learning/tour-of-beam/learning-content/IO/big-query-io/read-table/description.md:
##########
@@ -0,0 +1,40 @@
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+### BigQuery reading table

Review Comment:
   Reading BigQuery table



##########
learning/tour-of-beam/learning-content/IO/big-query-io/read-table/description.md:
##########
@@ -0,0 +1,40 @@
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+### BigQuery reading table
+
+`BigQueryIO` allows you to read from a `BigQuery` table and process the resulting rows. By default, Beam invokes a `BigQuery` export request when you apply a `BigQueryIO` read transform. In the Java SDK, `readTableRows` returns a `PCollection` of `BigQuery` `TableRow` objects, where each element represents a single row of the table. `Integer` values in the `TableRow` objects are encoded as strings to match `BigQuery`'s exported JSON format. This method is convenient, but can be 2-3 times slower than `read(SerializableFunction)`.

Review Comment:
   `BigQueryIO` allows you to read from a `BigQuery` table and process the results. By default, Beam invokes a `BigQuery` export request when you apply a BigQueryIO read transform. In the Beam Java SDK, readTableRows returns a PCollection of BigQuery TableRow objects. Each element in the `PCollection` represents a single row in the table.
   
   > `Integer` values in the `TableRow` objects are encoded as strings to match `BigQuery`’s exported JSON format. This method is convenient but has a performance impact. Alternatively, you can use the `read(SerializableFunction)` method to avoid this.
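
   For reference, a minimal sketch of the `read(SerializableFunction)` alternative mentioned above; the query, column name, and result type are placeholders:

```
PCollection<Double> maxTemperatures =
    pipeline.apply(
        "ReadViaParseFn",
        BigQueryIO.read(
                (SchemaAndRecord elem) -> (Double) elem.getRecord().get("max_temperature"))
            .fromQuery("SELECT max_temperature FROM `project-id.dataset.weather`")
            .usingStandardSql()
            .withCoder(DoubleCoder.of()));
```

   The parse function receives the Avro `GenericRecord` for each row, so the string round-trip of `readTableRows` is avoided.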



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
