[
https://issues.apache.org/jira/browse/BEAM-5959?focusedWorklogId=174357&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-174357
]
ASF GitHub Bot logged work on BEAM-5959:
----------------------------------------
Author: ASF GitHub Bot
Created on: 12/Dec/18 00:17
Start Date: 12/Dec/18 00:17
Worklog Time Spent: 10m
Work Description: chamikaramj commented on a change in pull request
#7050: [BEAM-5959] Reimplement GCS copies with rewrites.
URL: https://github.com/apache/beam/pull/7050#discussion_r240840990
##########
File path: sdks/python/apache_beam/io/gcp/gcsio_integration_test.py
##########
@@ -0,0 +1,178 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+"""Integration tests for gcsio module.
+
+Runs tests against Google Cloud Storage service.
+Instantiates a TestPipeline to get options such as GCP project name, but
+doesn't actually start a Beam pipeline or test any specific runner.
+
+Options:
+ --kms_key_name=projects/<project-name>/locations/<region>/keyRings/\
+ <key-ring-name>/cryptoKeys/<key-name>/cryptoKeyVersions/<version>
+ Pass a Cloud KMS key name to test GCS operations using customer managed
+ encryption keys (CMEK).
+
+Cloud KMS permissions:
+The project's Cloud Storage service account requires Encrypter/Decrypter
+permissions for the key specified in --kms_key_name.
+"""
+
+from __future__ import absolute_import
+
+import logging
+import unittest
+import uuid
+
+from nose.plugins.attrib import attr
+
+from apache_beam.io.filesystems import FileSystems
+from apache_beam.testing.test_pipeline import TestPipeline
+
+try:
+  from apache_beam.io.gcp import gcsio
+except ImportError:
+  gcsio = None
+
+
+@unittest.skipIf(gcsio is None, 'GCP dependencies are not installed')
+class GcsIOIntegrationTest(unittest.TestCase):
+
+  INPUT_FILE = 'gs://dataflow-samples/shakespeare/kinglear.txt'
+  # Larger than 1MB to test maxBytesRewrittenPerCall.
+  INPUT_FILE_LARGE = (
+      'gs://dataflow-samples/wikipedia_edits/wiki_data-000000000000.json')
+
+  def setUp(self):
+    self.test_pipeline = TestPipeline(is_integration_test=True)
+    self.runner_name = type(self.test_pipeline.runner).__name__
+    if self.runner_name != 'TestDataflowRunner':
+      # This test doesn't run a pipeline, so it doesn't make sense to try it on
+      # different runners. Running with TestDataflowRunner makes sense since
+      # it will
Review comment:
Incomplete comment.
----------------------------------------------------------------
Issue Time Tracking
-------------------
Worklog Id: (was: 174357)
> Add Cloud KMS support to GCS copies
> -----------------------------------
>
> Key: BEAM-5959
> URL: https://issues.apache.org/jira/browse/BEAM-5959
> Project: Beam
> Issue Type: Bug
> Components: io-java-gcp, sdk-py-core
> Reporter: Udi Meiri
> Assignee: Udi Meiri
> Priority: Major
> Time Spent: 3h 40m
> Remaining Estimate: 0h
>
> Beam SDK currently uses the CopyTo GCS API call, which doesn't support
> copying objects that use Customer Managed Encryption Keys (CMEK).
> CMEKs are managed in Cloud KMS.
> Items (for Java and Python SDKs):
> - Update clients to versions that support KMS keys.
> - Change copyTo API calls to use rewriteTo (Python - directly, Java -
> possibly convert copyTo API call to use client library)
> - Add unit tests.
> - Add basic tests (DirectRunner and GCS buckets with CMEK).
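
The switch from copy to rewrite in the items above also changes the call
pattern: a GCS rewrite response carries a done flag, and when the flag is
false the caller repeats the request with the returned rewriteToken. The
sketch below illustrates only that loop. It is not the code from the pull
request: rewrite_until_done, make_fake_rewrite and the token strings are
hypothetical names invented for this example, and the fake stands in for
the real service so the loop can be run locally.

```python
def rewrite_until_done(rewrite_once):
  """Calls rewrite_once(token) until the response reports done=True.

  rewrite_once is a hypothetical callable standing in for one rewrite
  request. It takes the previous rewriteToken (or None on the first call)
  and returns a dict with 'done', 'totalBytesRewritten' and, while the
  rewrite is unfinished, 'rewriteToken'.
  """
  token = None
  calls = 0
  while True:
    response = rewrite_once(token)
    calls += 1
    if response['done']:
      return calls, response['totalBytesRewritten']
    # Not finished: the next request must carry the continuation token.
    token = response['rewriteToken']


def make_fake_rewrite(object_size, max_bytes_per_call):
  """Builds an in-memory stand-in for the service (illustration only)."""
  state = {'copied': 0}

  def rewrite_once(token):
    # A continuation call must present the token from the last response.
    expected = None if state['copied'] == 0 else 'token-%d' % state['copied']
    if token != expected:
      raise ValueError('unexpected rewrite token: %r' % (token,))
    state['copied'] = min(object_size, state['copied'] + max_bytes_per_call)
    response = {
        'done': state['copied'] >= object_size,
        'totalBytesRewritten': state['copied'],
    }
    if not response['done']:
      response['rewriteToken'] = 'token-%d' % state['copied']
    return response

  return rewrite_once
```

With a fake object slightly larger than 3MB and a 1MB per-call limit,
rewrite_until_done(make_fake_rewrite(3 * 1024 * 1024 + 1, 1024 * 1024))
needs four calls. This is presumably why the integration test above picks
an input larger than 1MB to exercise maxBytesRewrittenPerCall.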