Joar Wandborg created BEAM-5457:
-----------------------------------
Summary: BigQuerySource(query=...) in DirectRunner creates temp
dataset in the wrong location
Key: BEAM-5457
URL: https://issues.apache.org/jira/browse/BEAM-5457
Project: Beam
Issue Type: Bug
Components: sdk-py-core
Affects Versions: 2.6.0
Reporter: Joar Wandborg
Assignee: Ahmet Altay
I'm in the EU. If I have a
{code:python}
BigQuerySource(
    query="SELECT x, y FROM `my-other-project.mydataset.my_european_table`",
    project="myproject",
    use_standard_sql=True,
)
{code}
and then run the pipeline through the DirectRunner, I get the following warning
and error:
{noformat}
2018-09-21 11:39:52,620 WARNING root create_temporary_dataset
Dataset myproject:temp_dataset_0bbb28f014a24225b668a67341f4f71e does not exist
so we will create it as temporary with location=None
{noformat}
{noformat}
HttpBadRequestError: HttpError accessing
<https://www.googleapis.com/bigquery/v2/projects/myproject/queries/xyz123?alt=json&maxResults=10000>:
response: <{'status': '400', 'content-length': '354', 'x-xss-protection': '1;
mode=block', 'x-content-type-options': 'nosniff', 'transfer-encoding':
'chunked', 'vary': 'Origin, X-Origin, Referer', 'server': 'ESF',
'-content-encoding': 'gzip', 'cache-control': 'private', 'date': 'Fri, 21 Sep
2018 09:39:55 GMT', 'x-frame-options': 'SAMEORIGIN', 'alt-svc': 'quic=":443";
ma=2592000; v="44,43,39,35"', 'content-type': 'application/json;
charset=UTF-8'}>, content <{
"error": {
"code": 400,
"message": "Cannot read and write in different locations: source: EU,
destination: US",
"errors": [
{
"message": "Cannot read and write in different locations: source: EU,
destination: US",
"domain": "global",
"reason": "invalid"
}
],
"status": "INVALID_ARGUMENT"
}
{noformat}
There's a TODO in the code that looks very related:
[https://github.com/apache/beam/blob/d691a86b8fd082efd0fd71c3cb58b7d61442717d/sdks/python/apache_beam/io/gcp/bigquery.py#L665|https://github.com/apache/beam/blob/d691a86b8fd082efd0fd71c3cb58b7d61442717d/sdks/python/apache_beam/io/gcp/bigquery.py#L665]
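One possible direction for a fix, as a rough sketch only: work out which datasets the query reads from, look up their location, and pass that location when creating the temporary dataset instead of None. The names below (referenced_datasets, temp_dataset_location, and the lookup_location callable) are hypothetical and not part of Beam, and the regex only recognises fully qualified, backtick-quoted standard SQL table names.

```python
import re

# Matches backtick-quoted standard SQL table references of the form
# `project.dataset.table`. Other reference styles are not handled.
_TABLE_REF = re.compile(r"`([^`.]+)\.([^`.]+)\.([^`.]+)`")


def referenced_datasets(query):
    """Return the sorted (project, dataset) pairs referenced in the query."""
    return sorted(
        {(m.group(1), m.group(2)) for m in _TABLE_REF.finditer(query)})


def temp_dataset_location(query, lookup_location):
    """Pick the location in which a temporary dataset should be created.

    lookup_location is a hypothetical callable (project, dataset) -> str,
    for example one backed by a BigQuery datasets.get request.
    """
    locations = {
        lookup_location(project, dataset)
        for project, dataset in referenced_datasets(query)
    }
    if len(locations) > 1:
        raise ValueError(
            "Query reads from multiple locations: %s" % sorted(locations))
    # Returning None keeps the current behaviour (service default location)
    # when no table reference could be extracted from the query.
    return locations.pop() if locations else None
```

With the query from this report and a lookup that reports EU for my-other-project.mydataset, temp_dataset_location would return "EU", which could then be used where the warning currently shows location=None.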