[GitHub] yohei1126 commented on a change in pull request #4371: [AIRFLOW-2939][AIRFLOW-3568] fix TypeError on GoogleCloudStorageToS3Operator / S3ToGoogleCloudStorageOperator
yohei1126 commented on a change in pull request #4371: [AIRFLOW-2939][AIRFLOW-3568] fix TypeError on GoogleCloudStorageToS3Operator / S3ToGoogleCloudStorageOperator
URL: https://github.com/apache/incubator-airflow/pull/4371#discussion_r244072388

## File path: airflow/contrib/operators/gcs_to_s3.py

```
@@ -101,7 +101,7 @@ def execute(self, context):
     # Google Cloud Storage and not in S3
     bucket_name, _ = S3Hook.parse_s3_url(self.dest_s3_key)
     existing_files = s3_hook.list_keys(bucket_name)
-    files = set(files) - set(existing_files)
+    files = list(set(files) - set(existing_files))
```

Review comment:
If you think `xcom` should support `set`, this is a bug in `xcom`, since the current implementation uses `json.dumps`, which only supports JSON-serializable values (`set` is not JSON serializable):

```
>>> import json
>>> values = [1, 2, 3, 1]
>>> json.dumps(values).encode('UTF-8')
b'[1, 2, 3, 1]'
>>> s = set(values)
>>> json.dumps(s).encode('UTF-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/01087872/.pyenv/versions/3.7.1/lib/python3.7/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/Users/01087872/.pyenv/versions/3.7.1/lib/python3.7/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/Users/01087872/.pyenv/versions/3.7.1/lib/python3.7/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/Users/01087872/.pyenv/versions/3.7.1/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type set is not JSON serializable
```

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services
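The one-line diff above can be sketched as a self-contained function. The `files`/`existing_files` names follow the operator's code, but `files_to_upload` itself is an illustrative helper, not Airflow's actual implementation:

```python
import json

def files_to_upload(files, existing_files):
    """Return the files present in GCS but not yet in S3, as a list.

    A plain set difference would break XCom, which serializes return
    values with json.dumps(); converting to a list keeps the value
    JSON-serializable.
    """
    return list(set(files) - set(existing_files))

result = files_to_upload(["a.csv", "b.csv"], ["b.csv"])
json.dumps(result).encode("UTF-8")  # serializes cleanly, unlike a raw set
```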
[GitHub] yohei1126 commented on a change in pull request #4371: [AIRFLOW-2939][AIRFLOW-3568] fix TypeError on GoogleCloudStorageToS3Operator / S3ToGoogleCloudStorageOperator
yohei1126 commented on a change in pull request #4371: [AIRFLOW-2939][AIRFLOW-3568] fix TypeError on GoogleCloudStorageToS3Operator / S3ToGoogleCloudStorageOperator
URL: https://github.com/apache/incubator-airflow/pull/4371#discussion_r244071926

## File path: airflow/contrib/operators/gcs_to_s3.py

```
@@ -101,7 +101,7 @@ def execute(self, context):
     # Google Cloud Storage and not in S3
     bucket_name, _ = S3Hook.parse_s3_url(self.dest_s3_key)
     existing_files = s3_hook.list_keys(bucket_name)
-    files = set(files) - set(existing_files)
+    files = list(set(files) - set(existing_files))
```

Review comment:
Added the traceback to the description. According to the traceback, `models.py` internally uses `json.dumps(value).encode`, but that does not support `set`, since `set` is not JSON serializable.
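The serialization path described in the comment can be reproduced in isolation. This is a simplified sketch of the `json.dumps(value).encode` step, not Airflow's actual XCom code:

```python
import json

def xcom_serialize(value):
    # Simplified stand-in for the json.dumps(value).encode call that
    # the traceback points at inside models.py.
    return json.dumps(value).encode("UTF-8")

xcom_serialize([1, 2, 3])     # a list serializes fine: b'[1, 2, 3]'
try:
    xcom_serialize({1, 2, 3})  # a set raises TypeError
except TypeError as exc:
    print(exc)
```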
[GitHub] yohei1126 commented on a change in pull request #4371: [AIRFLOW-2939][AIRFLOW-3568] fix TypeError on GoogleCloudStorageToS3Operator / S3ToGoogleCloudStorageOperator
yohei1126 commented on a change in pull request #4371: [AIRFLOW-2939][AIRFLOW-3568] fix TypeError on GoogleCloudStorageToS3Operator / S3ToGoogleCloudStorageOperator
URL: https://github.com/apache/incubator-airflow/pull/4371#discussion_r244070449

## File path: airflow/contrib/operators/gcs_to_s3.py

```
@@ -101,7 +101,7 @@ def execute(self, context):
     # Google Cloud Storage and not in S3
     bucket_name, _ = S3Hook.parse_s3_url(self.dest_s3_key)
     existing_files = s3_hook.list_keys(bucket_name)
-    files = set(files) - set(existing_files)
+    files = list(set(files) - set(existing_files))
```

Review comment:
Actually, the error log says it does not support `set` (`TypeError: 'NoneType' object is not iterable.`), but it works with a list.
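The `'NoneType' object is not iterable` message hints at a second failure mode: a `list_keys`-style call may return `None` when nothing matches, and `set(None)` raises exactly that TypeError. A hedged guard, using a hypothetical helper rather than the operator's actual code:

```python
def diff_keys(files, existing_files):
    # list_keys-style APIs may return None when no keys match;
    # set(None) raises TypeError: 'NoneType' object is not iterable.
    existing_files = existing_files or []
    # Return a list, not a set, so the value stays JSON-serializable for XCom.
    return list(set(files) - set(existing_files))

diff_keys(["a.csv", "b.csv"], None)  # no TypeError despite the None
```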