[ 
https://issues.apache.org/jira/browse/BEAM-7382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-7382:
--------------------------------------
    Status: Open  (was: Triage Needed)

> Bigquery IO: schema autodetection failing
> -----------------------------------------
>
>                 Key: BEAM-7382
>                 URL: https://issues.apache.org/jira/browse/BEAM-7382
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>            Reporter: Juta Staes
>            Assignee: Pablo Estrada
>            Priority: Major
>
> I am working on writing integration tests for BigQuery IO on the DataflowRunner.
> When testing schema autodetection, I get:
> {code:java}
> ERROR: test_big_query_write_schema_autodetect (apache_beam.io.gcp.bigquery_write_it_test.BigQueryWriteIntegrationTests)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/io/gcp/bigquery_write_it_test.py", line 156, in test_big_query_write_schema_autodetect
>     write_disposition=beam.io.BigQueryDisposition.WRITE_EMPTY))
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/pipeline.py", line 426, in __exit__
>     self.run().wait_until_finish()
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/pipeline.py", line 419, in run
>     return self.runner.run_pipeline(self, self._options)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py", line 64, in run_pipeline
>     self.result.wait_until_finish(duration=wait_duration)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py", line 1322, in wait_until_finish
>     (self.state, getattr(self._runner, 'last_error_msg', None)), self)
> apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: Dataflow pipeline failed. State: FAILED, Error:
> Workflow failed. Causes: S01:create/Read+write/WriteToBigQuery/NativeWrite failed., BigQuery import job "dataflow_job_18059625072014532771-B" failed., BigQuery job "dataflow_job_18059625072014532771-B" in project "apache-beam-testing" finished with error(s): errorResult: No schema specified on job or table., error: No schema specified on job or table.
> {code}
> Test code:
> {code:java}
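> import apache_beam as beam
>
> # `args` and `output_table` are defined by the surrounding test (not shown).
> input_data = [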
>     {'number': 1, 'str': 'abc'},
>     {'number': 2, 'str': 'def'},
> ]
> with beam.Pipeline(argv=args) as p:
>   (p | 'create' >> beam.Create(input_data)
>    | 'write' >> beam.io.WriteToBigQuery(
>        output_table,
>        schema=beam.io.gcp.bigquery.SCHEMA_AUTODETECT,
>        create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
>        write_disposition=beam.io.BigQueryDisposition.WRITE_EMPTY))
> {code}
> Is there something wrong with my test or is this a bug?
> Link to PR: [https://github.com/apache/beam/pull/8621]
> cc: [~tvalentyn] 
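
The error above says that no schema reached the BigQuery load job. Until autodetection works on this path, one possible workaround is to derive an explicit schema from the input rows. The following is only a minimal sketch: `schema_from_rows` is a hypothetical helper, not part of Beam, and it assumes flat rows whose values are limited to bool, int, float and str. It produces the comma-separated `name:TYPE` string that the `schema` argument of `WriteToBigQuery` accepts.

```python
def schema_from_rows(rows):
    """Build a 'name:TYPE,...' schema string from sample row dicts.

    Hypothetical helper for illustration; handles flat rows only.
    """
    # bool is checked before int because bool is a subclass of int.
    type_names = [
        (bool, 'BOOLEAN'),
        (int, 'INTEGER'),
        (float, 'FLOAT'),
        (str, 'STRING'),
    ]
    fields = {}  # field name -> BigQuery type, in first-seen order
    for row in rows:
        for name, value in row.items():
            for py_type, bq_type in type_names:
                if isinstance(value, py_type):
                    break
            else:
                raise TypeError(
                    'unsupported value for field %r: %r' % (name, value))
            # Every row must agree on the type of a given field.
            if fields.setdefault(name, bq_type) != bq_type:
                raise ValueError('conflicting types for field %r' % name)
    return ','.join('%s:%s' % item for item in fields.items())


input_data = [
    {'number': 1, 'str': 'abc'},
    {'number': 2, 'str': 'def'},
]
explicit_schema = schema_from_rows(input_data)
```

The resulting string could then be passed as `schema=explicit_schema` in place of `beam.io.gcp.bigquery.SCHEMA_AUTODETECT` in the test code above. This sidesteps the autodetection path rather than fixing it, so it does not answer whether the reported behavior is a bug.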



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
