[ https://issues.apache.org/jira/browse/BEAM-6553?focusedWorklogId=196040&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-196040 ]
ASF GitHub Bot logged work on BEAM-6553:
----------------------------------------
Author: ASF GitHub Bot
Created on: 08/Feb/19 00:49
Start Date: 08/Feb/19 00:49
Worklog Time Spent: 10m
Work Description: pabloem commented on pull request #7655: [BEAM-6553] A
Python SDK sink that supports File Loads into BQ
URL: https://github.com/apache/beam/pull/7655#discussion_r254903024
##########
File path: sdks/python/apache_beam/io/gcp/bigquery_file_loads_test.py
##########
@@ -0,0 +1,498 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Unit tests for BigQuery file loads utilities."""
+
+from __future__ import absolute_import
+
+import json
+import logging
+import os
+import random
+import time
+import unittest
+
+import mock
+from hamcrest.core import assert_that as hamcrest_assert
+from hamcrest.core.core.allof import all_of
+from hamcrest.core.core.is_ import is_
+from nose.plugins.attrib import attr
+
+import apache_beam as beam
+from apache_beam.io.filebasedsink_test import _TestCaseWithTempDirCleanUp
+from apache_beam.io.gcp import bigquery_file_loads as bqfl
+from apache_beam.io.gcp import bigquery
+from apache_beam.io.gcp import bigquery_tools
+from apache_beam.io.gcp.internal.clients import bigquery as bigquery_api
+from apache_beam.io.gcp.tests.bigquery_matcher import BigqueryFullResultMatcher
+from apache_beam.testing.test_pipeline import TestPipeline
+from apache_beam.testing.util import assert_that
+from apache_beam.testing.util import equal_to
+
+try:
+  from apitools.base.py.exceptions import HttpError
+except ImportError:
+  HttpError = None
+
+
+_DESTINATION_ELEMENT_PAIRS = [
+    # DESTINATION 1
+    ('project1:dataset1.table1', '{"name":"beam", "language":"py"}'),
+    ('project1:dataset1.table1', '{"name":"beam", "language":"java"}'),
+    ('project1:dataset1.table1', '{"name":"beam", "language":"go"}'),
+    ('project1:dataset1.table1', '{"name":"flink", "language":"java"}'),
+    ('project1:dataset1.table1', '{"name":"flink", "language":"scala"}'),
+
+    # DESTINATION 3
+    ('project1:dataset1.table3', '{"name":"spark", "language":"scala"}'),
+
+    # DESTINATION 1
+    ('project1:dataset1.table1', '{"name":"spark", "language":"py"}'),
+    ('project1:dataset1.table1', '{"name":"spark", "language":"scala"}'),
+
+    # DESTINATION 2
+    ('project1:dataset1.table2', '{"name":"beam", "foundation":"apache"}'),
+    ('project1:dataset1.table2', '{"name":"flink", "foundation":"apache"}'),
+    ('project1:dataset1.table2', '{"name":"spark", "foundation":"apache"}'),
+]
+
+_NAME_LANGUAGE_ELEMENTS = [
+    json.loads(elm[1])
+    for elm in _DESTINATION_ELEMENT_PAIRS if "language" in elm[1]
+]
+
+
+_DISTINCT_DESTINATIONS = list(
+    set([elm[0] for elm in _DESTINATION_ELEMENT_PAIRS]))
+
+
+_ELEMENTS = list([json.loads(elm[1]) for elm in _DESTINATION_ELEMENT_PAIRS])
+
+
+@unittest.skipIf(HttpError is None, 'GCP dependencies are not installed')
Review comment:
Not in this class, but we still depend on some of the `gcp` dependencies.
This is a simple, quick unit test.
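
For context, the guard being discussed follows a common pattern: try the optional import, fall back to `None`, and skip the test class when the dependency is missing. Below is a minimal sketch of that pattern, not code from the PR. The class name `ExampleGcpDependentTest` and its test method are hypothetical and only illustrate how the decorator behaves.

```python
import unittest

try:
  # Optional GCP dependency; it may be absent in a minimal install.
  from apitools.base.py.exceptions import HttpError
except ImportError:
  HttpError = None


# Hypothetical test class, used only to illustrate the skip guard.
@unittest.skipIf(HttpError is None, 'GCP dependencies are not installed')
class ExampleGcpDependentTest(unittest.TestCase):

  def test_runs_only_with_gcp(self):
    # This body executes only when the import above succeeded.
    self.assertIsNotNone(HttpError)
```

With this guard the class is reported as skipped, rather than failing, on machines that lack the GCP extras.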
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 196040)
Time Spent: 4.5h (was: 4h 20m)
> A BigQuery sink that is SDK-implemented and supports file loads in Python
> -------------------------------------------------------------------------
>
> Key: BEAM-6553
> URL: https://issues.apache.org/jira/browse/BEAM-6553
> Project: Beam
> Issue Type: Improvement
> Components: sdk-py-core
> Reporter: Pablo Estrada
> Assignee: Pablo Estrada
> Priority: Major
> Labels: triaged
> Time Spent: 4.5h
> Remaining Estimate: 0h
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)