TheNeuralBit commented on code in PR #17159:
URL: https://github.com/apache/beam/pull/17159#discussion_r874104866


##########
sdks/python/apache_beam/io/gcp/bigquery_read_it_test.py:
##########
@@ -178,6 +178,31 @@ def test_iobase_source(self):
               query=query, use_standard_sql=True, project=self.project))
       assert_that(result, equal_to(self.TABLE_DATA))
 
+  @pytest.mark.it_postcommit
+  def test_table_schema_retrieve(self):
+    the_table = beam.io.gcp.bigquery.bigquery_tools.BigQueryWrapper().get_table(
+        project_id="apache-beam-testing",
+        dataset_id="beam_bigquery_io_test",
+        table_id="dfsqltable_3c7d6fd5_16e0460dfd0")
+    table = the_table.schema
+    utype = beam.io.gcp.bigquery_schema_tools.produce_pcoll_with_schema(table)
+    args = self.args + ["--experiments=save_main_session"]

Review Comment:
   Glad this got the test passing!!
   
   This isn't ideal though, since users of this feature would always have to
   set save_main_session, or else their pipeline will fail. I'm a little
   surprised this worked at all: isn't the argument `--save_main_session`,
   not `--experiments=save_main_session`?
   
   Regardless, we need to find a solution that will work without 
save_main_session set. It looks like the solution in the DataFrame schema code 
was to create a DoFn with a custom `__reduce__` implementation that avoids 
pickling the user type: 
https://github.com/apache/beam/blob/03c3c3657ea51a60e301a25eef70d006fe8cc0e2/sdks/python/apache_beam/dataframe/schemas.py#L254-L258
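
To illustrate the idea, here is a minimal sketch in plain Python rather than actual Beam code. `ConvertToRow` stands in for a DoFn, and `make_row_type`, `_TYPES`, and the field-description format are hypothetical names invented for this example. The assumption is that the generated type can be rebuilt from a small, picklable description, so `__reduce__` ships that description instead of the type itself:

```python
import pickle
from typing import NamedTuple

# Hypothetical mapping from a portable type name to a Python type.
_TYPES = {"INTEGER": int, "STRING": str, "FLOAT": float}


def make_row_type(fields):
    """Build a NamedTuple class from [(field_name, type_name), ...]."""
    return NamedTuple("GeneratedRow", [(n, _TYPES[t]) for n, t in fields])


class ConvertToRow:
    """Stand-in for a DoFn that converts dicts into a generated row type."""

    def __init__(self, fields):
        self._fields = list(fields)
        # Generated at construction time; never pickled directly.
        self._row_type = make_row_type(self._fields)

    def process(self, element):
        yield self._row_type(**element)

    def __reduce__(self):
        # Pickle only the field description. The receiving side calls
        # ConvertToRow(fields) and regenerates the row type locally.
        return (ConvertToRow, (self._fields,))
```

With this in place, `pickle.loads(pickle.dumps(fn))` goes through `__reduce__`, so the pickled payload contains only the list of field names and type names, not a reference to the dynamically generated class. The real change would need to use whatever serializable schema representation the BigQuery schema code already has.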
   


