Michael Brown has posted comments on this change. Change subject: IMPALA-1878: Support INSERT and LOAD DATA on S3 and between filesystems ......................................................................
Patch Set 20: (7 comments) For the recent tagged Python changes you asked me to look at, I replied with 'Done' to your helpful markers and left comments in relevant places. http://gerrit.cloudera.org:8080/#/c/2574/20/tests/query_test/test_insert_parquet.py File tests/query_test/test_insert_parquet.py: Line 136: sizes = self.filesystem_client.get_all_file_sizes(DIR) > Change made as part of fix for IMPALA-3245. Done Line 138: assert size < BLOCK_SIZE Not your change, but can you add a meaningful error message for this assert if it were to fail? You can do it like: assert size < BLOCK_SIZE, "wrong size {0} [etc]".format(size) Line 139: print size Not your change, but random prints should be removed. We should use the logging facility if we need to print this. http://gerrit.cloudera.org:8080/#/c/2574/20/tests/util/filesystem_base.py File tests/util/filesystem_base.py: Line 58: def get_all_file_sizes(self, path): > Added for IMPALA-3245. As we discussed please make sure both FS-specific implementations return lists of numbers, not lists of strings, since I don't see you specifically changing the type. http://gerrit.cloudera.org:8080/#/c/2574/20/tests/util/hdfs_util.py File tests/util/hdfs_util.py: Line 79: def get_all_file_sizes(self, path): > Added for IMPALA-3245. Done http://gerrit.cloudera.org:8080/#/c/2574/20/tests/util/s3_util.py File tests/util/s3_util.py: Line 72: def get_all_file_sizes(self, path): > Added for IMPALA-3245. Done Line 78: if 'Contents' in response: : return [t['Size'] for t in response['Contents']] It seems like we should handle the case where 'Contents' is not in response. If it's only missing due to some bad 'path' value, or some S3 error, it will still help with test failures to know that. As it is the method will just return None. If it returns None, something higher in the stack trying to use the value will produce a more obscure exception. My suggestion is to raise an exception here and include path and response (if not large) in the exception string. -- To view, visit http://gerrit.cloudera.org:8080/2574 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I94e15ad67752dce21c9b7c1dced6e114905a942d Gerrit-PatchSet: 20 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Sailesh Mukil <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Henry Robinson <[email protected]> Gerrit-Reviewer: Michael Brown <[email protected]> Gerrit-Reviewer: Mostafa Mokhtar <[email protected]> Gerrit-Reviewer: Sailesh Mukil <[email protected]> Gerrit-Reviewer: Taras Bobrovytsky <[email protected]> Gerrit-HasComments: Yes
