[ 
https://issues.apache.org/jira/browse/BEAM-1909?focusedWorklogId=139423&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-139423
 ]

ASF GitHub Bot logged work on BEAM-1909:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 29/Aug/18 20:43
            Start Date: 29/Aug/18 20:43
    Worklog Time Spent: 10m 
      Work Description: udim commented on a change in pull request #5435: 
[BEAM-1909] Fix BigQuery read transform fails for DirectRunner when querying non-US regions
URL: https://github.com/apache/beam/pull/5435#discussion_r213769013
 
 

 ##########
 File path: sdks/python/apache_beam/io/gcp/bigquery.py
 ##########
 @@ -641,19 +642,76 @@ def __init__(self, source, test_bigquery_client=None, use_legacy_sql=True,
     else:
       self.query = self.source.query
 
+  def _parse_results(self, mg, project=False):
+    """
+    Extract matched groups from regex match.
+    If project is provided, retrieve 3 matched groups, else retrieve 2 groups.
+    :param mg: matched group
+    :param project: project passed in if not matched in regex
+    """
+    if project:
+      return project, mg.group(1), mg.group(2)
+    else:
+      try:
+        return mg.group(1), mg.group(2), mg.group(3)
+      except IndexError:
+        return (None for _ in range(3))  # No location, not a breaking change
+
+  def _parse_results(self, project_regex, not_project_regex):
+    """
+    Extract matched groups from query given regexes passed into method.
+    Given two regexes, return three items:
+      projectID, datasetID and tableID. If projectID is not provided in the
+      query, try and get the project from the Class. Else, return None.
+    :param project_regex: Regex to match the full name of the dataset.
+      i.e. project.dataset.table
+    :param not_project_regex: Regex to match table and dataset when project is
+      not provided.
+      i.e. dataset.table
+    :return: project, dataset, table
+    """
+    pm = re.search(project_regex, self.source.query)
+    if pm:
+      return pm.group(1), pm.group(2), pm.group(3)
+    else:
+      npm = re.search(not_project_regex, self.source.query)
+      if npm:
+        if self.source.project:
+          return self.source.project, npm.group(1), npm.group(2)
+    return (None for _ in range(3))  # No matches
+
+  def _parse_query(self):
+    """
+    Parse the query provided to determine the dataset ID and table ID.
+
+    The query will have text of the form "FROM `(x).y.z`" or "FROM [(x):y.z]"
+     based on whether legacy or standard SQL was provided.
+    """
+    if not self.source.use_legacy_sql:
+      return self._parse_results(
+          r'.*[Ff][Rr][Oo][Mm]\s*`([-\w]+)\.([-\w]+)\.([-\w]+)`',
 
 Review comment:
   Note that `_parse_table_reference` (in this module) can split `(x).y.z` style table names.
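
   For illustration only, here is a minimal standalone sketch of the parsing
   step under discussion: pulling an optional project, a dataset, and a table
   out of a "FROM `(x).y.z`" or "FROM [(x):y.z]" clause. It does not call
   `_parse_table_reference`; the function name `extract_table_parts`, the two
   pattern constants, and the `default_project` parameter are hypothetical
   and are not part of the PR or of Beam.

```python
import re

# Hypothetical patterns, modeled on the regex shown in the diff above.
# The project prefix is optional in both dialects.
STANDARD_SQL_TABLE = re.compile(
    r'FROM\s*`(?:([-\w]+)\.)?([-\w]+)\.([-\w]+)`', re.IGNORECASE)
LEGACY_SQL_TABLE = re.compile(
    r'FROM\s*\[(?:([-\w]+):)?([-\w]+)\.([-\w]+)\]', re.IGNORECASE)


def extract_table_parts(query, use_legacy_sql, default_project=None):
    """Return (project, dataset, table) for the first table in the query.

    Falls back to default_project when the query omits the project.
    Returns (None, None, None) when no table reference is found.
    """
    pattern = LEGACY_SQL_TABLE if use_legacy_sql else STANDARD_SQL_TABLE
    match = pattern.search(query)
    if match is None:
        return None, None, None
    project, dataset, table = match.groups()
    return project or default_project, dataset, table


if __name__ == '__main__':
    print(extract_table_parts('SELECT * FROM `my-proj.ds.tbl`', False))
    print(extract_table_parts('SELECT * FROM [ds.tbl]', True, 'fallback'))
```

   Making the project group optional lets one pattern per dialect cover both
   the fully qualified and the dataset-only forms, instead of the two regexes
   per dialect that the diff passes to `_parse_results`. Unlike the diff, this
   sketch still returns the dataset and table when no project is available.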

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 139423)
    Time Spent: 2.5h  (was: 2h 20m)

> BigQuery read transform fails for DirectRunner when querying non-US regions
> ---------------------------------------------------------------------------
>
>                 Key: BEAM-1909
>                 URL: https://issues.apache.org/jira/browse/BEAM-1909
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>            Reporter: Chamikara Jayalath
>            Priority: Major
>          Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> See: 
> http://stackoverflow.com/questions/42135002/google-dataflow-cannot-read-and-write-in-different-locations-python-sdk-v0-5-5/42144748?noredirect=1#comment73621983_42144748
> This should be fixed by creating the temp dataset and table in the correct 
> region.
> cc: [~sb2nov]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
