Taras Bobrovytsky has posted comments on this change. Change subject: IMPALA-2904: Support INSERT and LOAD DATA on S3 and between filesystems ......................................................................
Patch Set 5: (13 comments) http://gerrit.cloudera.org:8080/#/c/2574/5/tests/common/impala_test_suite.py File tests/common/impala_test_suite.py: Line 120: generic_client Maybe call this cls.filesystem_client? Do we need to keep cls.hdfs_client and cls.s3_client? Are they used anywhere? http://gerrit.cloudera.org:8080/#/c/2574/5/tests/util/filesystem_base.py File tests/util/filesystem_base.py: Line 21: = There should be spaces around = http://gerrit.cloudera.org:8080/#/c/2574/5/tests/util/hdfs_util.py File tests/util/hdfs_util.py: Line 60: """Checks if a particular path exists"" Is this comment correct? Line 65: return super(PyWebHdfsClientWithChmod, self).make_dir(path, permission=permission) I don't think it's necessary to override make_dir (and other similar methods) because all it does is call the parent's make_dir method. Line 91: super(PyWebHdfsClientWithChmod, self) Why not just self.list_dir()? Line 91: file_infos instead of declaring a variable here, why not just return directly? Line 112: def list_dir(self, path): It looks like you are changing the behavior of list_dir compared to PyWebHdfsClient. What is the reason for this (maybe add a comment). Is it appropriate to keep the same method name? Line 119: def exists(self, path): I think this can be omitted because all you're doing is calling the parent's method which happens automatically anyway. http://gerrit.cloudera.org:8080/#/c/2574/5/tests/util/s3_util.py File tests/util/s3_util.py: Line 34: 4 space indent here Line 51: keys = You can put a return here Line 63: response.get('CommonPrefixes'): I think it's better to write: if 'CommonPrefixes' in response: Line 64: response.get('CommonPrefixes') It's more Pythonic to write response['CommonPrefixes']. In general, it's not very Pythonic to use dict.get all the time. Another solution might be to use defaultdict. Line 71: len(tmp) - 2 This can be replaced with tmp[-2]. In python tmp[len(tmp) - a] is equivalent to tmp[-a]. For example tmp[-1] returns the last element. -- To view, visit http://gerrit.cloudera.org:8080/2574 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I94e15ad67752dce21c9b7c1dced6e114905a942d Gerrit-PatchSet: 5 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Sailesh Mukil <[email protected]> Gerrit-Reviewer: Henry Robinson <[email protected]> Gerrit-Reviewer: Michael Brown <[email protected]> Gerrit-Reviewer: Sailesh Mukil <[email protected]> Gerrit-Reviewer: Taras Bobrovytsky <[email protected]> Gerrit-HasComments: Yes
