SamWheating commented on a change in pull request #17446:
URL: https://github.com/apache/airflow/pull/17446#discussion_r718913319
##########
File path:
airflow/providers/google/cloud/operators/cloud_storage_transfer_service.py
##########
@@ -1053,8 +1075,8 @@ def _create_body(self) -> dict:
DESCRIPTION: self.description,
STATUS: GcpTransferJobsStatus.ENABLED,
TRANSFER_SPEC: {
- GCS_DATA_SOURCE: {BUCKET_NAME: self.source_bucket},
- GCS_DATA_SINK: {BUCKET_NAME: self.destination_bucket},
+ GCS_DATA_SOURCE: {BUCKET_NAME: self.source_bucket, PATH:
self.source_path},
Review comment:
Do you think its ok to pass `None` in here? From the docs:
_"Must be an empty string or full path name that ends with a '/'. This field
is treated as an object prefix. As such, it should generally not begin with a
'/'."_
https://cloud.google.com/storage-transfer/docs/reference/rest/v1/TransferSpec#awss3data
If this is the case then maybe the default value for paths should be `""`,
or we should be only adding `PATH` to this dictionary if it is not None.
##########
File path:
airflow/providers/google/cloud/operators/cloud_storage_transfer_service.py
##########
@@ -1053,8 +1075,8 @@ def _create_body(self) -> dict:
DESCRIPTION: self.description,
STATUS: GcpTransferJobsStatus.ENABLED,
TRANSFER_SPEC: {
- GCS_DATA_SOURCE: {BUCKET_NAME: self.source_bucket},
- GCS_DATA_SINK: {BUCKET_NAME: self.destination_bucket},
+ GCS_DATA_SOURCE: {BUCKET_NAME: self.source_bucket, PATH:
self.source_path},
Review comment:
Do you think its ok to pass `None` in here? From the docs:
_"Must be an empty string or full path name that ends with a '/'. This field
is treated as an object prefix. As such, it should generally not begin with a
'/'."_
https://cloud.google.com/storage-transfer/docs/reference/rest/v1/TransferSpec#awss3data
If this is the case then maybe the default value for paths should be `""`,
or we should be only adding `PATH` to this dictionary if it is not `None`.
##########
File path:
airflow/providers/google/cloud/operators/cloud_storage_transfer_service.py
##########
@@ -1053,8 +1075,8 @@ def _create_body(self) -> dict:
DESCRIPTION: self.description,
STATUS: GcpTransferJobsStatus.ENABLED,
TRANSFER_SPEC: {
- GCS_DATA_SOURCE: {BUCKET_NAME: self.source_bucket},
- GCS_DATA_SINK: {BUCKET_NAME: self.destination_bucket},
+ GCS_DATA_SOURCE: {BUCKET_NAME: self.source_bucket, PATH:
self.source_path},
Review comment:
Do you think its ok to pass `None` in here? From the docs:
_"Must be an empty string or full path name that ends with a '/'. This field
is treated as an object prefix. As such, it should generally not begin with a
'/'."_
https://cloud.google.com/storage-transfer/docs/reference/rest/v1/TransferSpec#awss3data
If this is the case then maybe the default value for paths should be `""`,
or we should be only adding `PATH` to this dictionary if it is not `None`,
something like:
```python
body = {
DESCRIPTION: self.description,
STATUS: GcpTransferJobsStatus.ENABLED,
TRANSFER_SPEC: {
GCS_DATA_SOURCE: {BUCKET_NAME: self.source_bucket, PATH:
self.source_path},
GCS_DATA_SINK: {BUCKET_NAME: self.destination_bucket, PATH:
self.destination_path},
},
}
if self.source_path is not None:
body[TRANSFER_SPEC][GCS_DATA_SOURCE][PATH] = self.source_path
```
etc
##########
File path:
airflow/providers/google/cloud/operators/cloud_storage_transfer_service.py
##########
@@ -1053,8 +1075,8 @@ def _create_body(self) -> dict:
DESCRIPTION: self.description,
STATUS: GcpTransferJobsStatus.ENABLED,
TRANSFER_SPEC: {
- GCS_DATA_SOURCE: {BUCKET_NAME: self.source_bucket},
- GCS_DATA_SINK: {BUCKET_NAME: self.destination_bucket},
+ GCS_DATA_SOURCE: {BUCKET_NAME: self.source_bucket, PATH:
self.source_path},
Review comment:
Do you think its ok to pass `None` in here? From the docs:
_"Must be an empty string or full path name that ends with a '/'. This field
is treated as an object prefix. As such, it should generally not begin with a
'/'."_
https://cloud.google.com/storage-transfer/docs/reference/rest/v1/TransferSpec#awss3data
If this is the case then maybe the default value for paths should be `""`,
or we should be only adding `PATH` to this dictionary if it is not `None`,
something like:
```python
body = {
DESCRIPTION: self.description,
STATUS: GcpTransferJobsStatus.ENABLED,
TRANSFER_SPEC: {
GCS_DATA_SOURCE: {BUCKET_NAME: self.source_bucket},
GCS_DATA_SINK: {BUCKET_NAME: self.destination_bucket},
},
}
if self.source_path is not None:
body[TRANSFER_SPEC][GCS_DATA_SOURCE][PATH] = self.source_path
if self.destination_path is not None:
body[TRANSFER_SPEC][GCS_DATA_SINK][PATH] = self.destination_path
```
etc
##########
File path:
airflow/providers/google/cloud/operators/cloud_storage_transfer_service.py
##########
@@ -1053,8 +1075,8 @@ def _create_body(self) -> dict:
DESCRIPTION: self.description,
STATUS: GcpTransferJobsStatus.ENABLED,
TRANSFER_SPEC: {
- GCS_DATA_SOURCE: {BUCKET_NAME: self.source_bucket},
- GCS_DATA_SINK: {BUCKET_NAME: self.destination_bucket},
+ GCS_DATA_SOURCE: {BUCKET_NAME: self.source_bucket, PATH:
self.source_path},
Review comment:
Do you think its ok to pass `None` in here if the source /destination
paths are unspecified? From the docs:
_"Must be an empty string or full path name that ends with a '/'. This field
is treated as an object prefix. As such, it should generally not begin with a
'/'."_
https://cloud.google.com/storage-transfer/docs/reference/rest/v1/TransferSpec#awss3data
If this is the case then maybe the default value for paths should be `""`,
or we should be only adding `PATH` to this dictionary if it is not `None`,
something like:
```python
body = {
DESCRIPTION: self.description,
STATUS: GcpTransferJobsStatus.ENABLED,
TRANSFER_SPEC: {
GCS_DATA_SOURCE: {BUCKET_NAME: self.source_bucket},
GCS_DATA_SINK: {BUCKET_NAME: self.destination_bucket},
},
}
if self.source_path is not None:
body[TRANSFER_SPEC][GCS_DATA_SOURCE][PATH] = self.source_path
if self.destination_path is not None:
body[TRANSFER_SPEC][GCS_DATA_SINK][PATH] = self.destination_path
```
etc
##########
File path:
airflow/providers/google/cloud/operators/cloud_storage_transfer_service.py
##########
@@ -1053,8 +1075,8 @@ def _create_body(self) -> dict:
DESCRIPTION: self.description,
STATUS: GcpTransferJobsStatus.ENABLED,
TRANSFER_SPEC: {
- GCS_DATA_SOURCE: {BUCKET_NAME: self.source_bucket},
- GCS_DATA_SINK: {BUCKET_NAME: self.destination_bucket},
+ GCS_DATA_SOURCE: {BUCKET_NAME: self.source_bucket, PATH:
self.source_path},
Review comment:
Do you think its ok to pass `None` in here if the source / destination
paths are unspecified? From the docs:
_"Must be an empty string or full path name that ends with a '/'. This field
is treated as an object prefix. As such, it should generally not begin with a
'/'."_
https://cloud.google.com/storage-transfer/docs/reference/rest/v1/TransferSpec#awss3data
If this is the case then maybe the default value for paths should be `""`,
or we should be only adding `PATH` to this dictionary if it is not `None`,
something like:
```python
body = {
DESCRIPTION: self.description,
STATUS: GcpTransferJobsStatus.ENABLED,
TRANSFER_SPEC: {
GCS_DATA_SOURCE: {BUCKET_NAME: self.source_bucket},
GCS_DATA_SINK: {BUCKET_NAME: self.destination_bucket},
},
}
if self.source_path is not None:
body[TRANSFER_SPEC][GCS_DATA_SOURCE][PATH] = self.source_path
if self.destination_path is not None:
body[TRANSFER_SPEC][GCS_DATA_SINK][PATH] = self.destination_path
```
etc
##########
File path:
airflow/providers/google/cloud/operators/cloud_storage_transfer_service.py
##########
@@ -1053,8 +1075,8 @@ def _create_body(self) -> dict:
DESCRIPTION: self.description,
STATUS: GcpTransferJobsStatus.ENABLED,
TRANSFER_SPEC: {
- GCS_DATA_SOURCE: {BUCKET_NAME: self.source_bucket},
- GCS_DATA_SINK: {BUCKET_NAME: self.destination_bucket},
+ GCS_DATA_SOURCE: {BUCKET_NAME: self.source_bucket, PATH:
self.source_path},
Review comment:
Do you think its ok to pass `None` in here if the source / destination
paths are unspecified? From the docs:
_"path: Must be an empty string or full path name that ends with a '/'. This
field is treated as an object prefix. As such, it should generally not begin
with a '/'."_
https://cloud.google.com/storage-transfer/docs/reference/rest/v1/TransferSpec#awss3data
If this is the case then maybe the default value for paths should be `""`,
or we should be only adding `PATH` to this dictionary if it is not `None`,
something like:
```python
body = {
DESCRIPTION: self.description,
STATUS: GcpTransferJobsStatus.ENABLED,
TRANSFER_SPEC: {
GCS_DATA_SOURCE: {BUCKET_NAME: self.source_bucket},
GCS_DATA_SINK: {BUCKET_NAME: self.destination_bucket},
},
}
if self.source_path is not None:
body[TRANSFER_SPEC][GCS_DATA_SOURCE][PATH] = self.source_path
if self.destination_path is not None:
body[TRANSFER_SPEC][GCS_DATA_SINK][PATH] = self.destination_path
```
etc
##########
File path:
airflow/providers/google/cloud/operators/cloud_storage_transfer_service.py
##########
@@ -1053,8 +1075,8 @@ def _create_body(self) -> dict:
DESCRIPTION: self.description,
STATUS: GcpTransferJobsStatus.ENABLED,
TRANSFER_SPEC: {
- GCS_DATA_SOURCE: {BUCKET_NAME: self.source_bucket},
- GCS_DATA_SINK: {BUCKET_NAME: self.destination_bucket},
+ GCS_DATA_SOURCE: {BUCKET_NAME: self.source_bucket, PATH:
self.source_path},
Review comment:
Do you think its ok to pass `None` in here if the source / destination
paths are unspecified? From the docs:
_"path: Must be an empty string or full path name that ends with a '/'. This
field is treated as an object prefix. As such, it should generally not begin
with a '/'."_
https://cloud.google.com/storage-transfer/docs/reference/rest/v1/TransferSpec#awss3data
If this is the case then maybe the default value for paths should be `""`,
or we should be only adding `PATH` to this dictionary if it is not `None`,
something like:
```python
body = {
DESCRIPTION: self.description,
STATUS: GcpTransferJobsStatus.ENABLED,
TRANSFER_SPEC: {
GCS_DATA_SOURCE: {BUCKET_NAME: self.source_bucket},
GCS_DATA_SINK: {BUCKET_NAME: self.destination_bucket},
},
}
if self.source_path is not None:
body[TRANSFER_SPEC][GCS_DATA_SOURCE][PATH] = self.source_path
if self.destination_path is not None:
body[TRANSFER_SPEC][GCS_DATA_SINK][PATH] = self.destination_path
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]