Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-1878: Support INSERT and LOAD DATA on S3 and between filesystems
......................................................................

Patch Set 20:

(4 comments)

Thanks Michael.

http://gerrit.cloudera.org:8080/#/c/2574/20/tests/query_test/test_insert_parquet.py
File tests/query_test/test_insert_parquet.py:

Line 138: assert size < BLOCK_SIZE
> Not your change, but can you add a meaningful error message for this assert

Done


Line 139: print size
> Not your change, but random prints should be removed. We should use the log

Done


http://gerrit.cloudera.org:8080/#/c/2574/20/tests/util/filesystem_base.py
File tests/util/filesystem_base.py:

Line 58: def get_all_file_sizes(self, path):
> As we discussed please make sure both FS-specific implementations return li

Both return lists of integers. I added a comment to state that.


http://gerrit.cloudera.org:8080/#/c/2574/20/tests/util/s3_util.py
File tests/util/s3_util.py:

Line 78: if 'Contents' in response:
       :   return [t['Size'] for t in response['Contents']]
> It seems like we should handle the case where 'Contents' is not in response

I think it's best to have the same behavior as hdfs_util here, which is to return an empty list if no files are found. If list_objects() itself fails, it will throw an exception.


--
To view, visit http://gerrit.cloudera.org:8080/2574
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I94e15ad67752dce21c9b7c1dced6e114905a942d
Gerrit-PatchSet: 20
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Sailesh Mukil <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Henry Robinson <[email protected]>
Gerrit-Reviewer: Michael Brown <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
Gerrit-Reviewer: Taras Bobrovytsky <[email protected]>
Gerrit-HasComments: Yes
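
To illustrate the reply on tests/util/s3_util.py (Line 78), here is a minimal sketch of the behavior described there: return a list of integer sizes, and an empty list when nothing matches. It is not the code from the patch. The function name `sizes_from_listing` and the sample dictionaries are hypothetical, and the sketch assumes a response dict shaped like the result of boto3's list_objects(), which has no 'Contents' key when no objects match.

```python
def sizes_from_listing(response):
    """Return the sizes in bytes of all objects in an S3 listing response.

    Assumption: `response` is a dict shaped like the result of boto3's
    list_objects(), where 'Contents' is absent if no objects matched.
    """
    if 'Contents' in response:
        return [obj['Size'] for obj in response['Contents']]
    # No matching objects: return an empty list rather than None, so callers
    # can treat the S3 and HDFS implementations the same way.
    return []


# Hypothetical, hand-written responses for illustration only.
empty_listing = {'Name': 'example-bucket', 'Prefix': 'some/prefix/'}
populated_listing = {
    'Name': 'example-bucket',
    'Contents': [
        {'Key': 'some/prefix/a.parq', 'Size': 1024},
        {'Key': 'some/prefix/b.parq', 'Size': 2048},
    ],
}

print(sizes_from_listing(empty_listing))
print(sizes_from_listing(populated_listing))
```

A failure in the listing call itself is deliberately not handled in this sketch, which matches the reply: the exception is allowed to propagate to the caller.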
