Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-2904: Support INSERT and LOAD DATA on S3 and between 
filesystems
......................................................................


Patch Set 6:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/2574/6/tests/common/impala_test_suite.py
File tests/common/impala_test_suite.py:

Line 118:     if IS_S3:
> Much shorter and clearer:
Done


http://gerrit.cloudera.org:8080/#/c/2574/6/tests/query_test/test_insert_behaviour.py
File tests/query_test/test_insert_behaviour.py:

Line 110:     ls = self.filesystem_client.ls(part_dir)
> I don't think we need to define the ls variable here, just assert directly
Done


Line 358:       does_exist = self.filesystem_client.exists(path)
> how about:
Done


Line 398:     assert new_ls == ls
> assert ls == new_ls == 1
But we assert the len(new_ls) above. So, assert ls == new_ls == 1 would fail.


http://gerrit.cloudera.org:8080/#/c/2574/6/tests/util/filesystem_base.py
File tests/util/filesystem_base.py:

Line 25: file_data
> is file_data a string?
Yes, I've updated to explicitly call it a string.


http://gerrit.cloudera.org:8080/#/c/2574/6/tests/util/hdfs_util.py
File tests/util/hdfs_util.py:

Line 59:   def make_dir(self, path, permission=755):
> Do we need to override this? Is the only purpose to set the default permiss
I meant to take this out. 755 is default for the parent too. So, this doesn't 
change anything. Done.


Line 102:   def ls(self, path):
> Can you add a more detailed comment? What is the difference between this me
We want ls() to return only the file names instead of a dictionary of file 
statuses. list_dir() returns the latter, which we don't need anywhere in our 
tests save for one or 2 cases. Done.


http://gerrit.cloudera.org:8080/#/c/2574/6/tests/util/s3_util.py
File tests/util/s3_util.py:

Line 15: # S3 access utilities
> This comment does not really add anything.
Done


Line 22:   @classmethod
> Add an empty line above this line.
Done


Line 30:     if not overwrite:
> if not override and self.exists(filename): return False
Done


Line 33: response
> What's the point of this variable, it just disappears? I think we should re
Done


Line 37: othav
> add space
Done


Line 57: =
> path += '/'
Done


Line 69: both
> This should be called files_and_dirs
Done


Line 70:     for d in dirs:
> How about something like:
Done


Line 75:       key = tmp[-1]
> how about simply:
Done


Line 83:     return contents is not None
> how about return response.get('Contents') is not None
Done


Line 90:       keys = self.list_keys(path)
> objects = [{'Key': k} for k in self.list_keys(path)] if recursive else path
Done


Line 92:         objects.append({'Key':k})
> Do we have to add path to the list as well in the recursive case?
No, the 'path' itself will be one of the keys returned by list_keys().


-- 
To view, visit http://gerrit.cloudera.org:8080/2574
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I94e15ad67752dce21c9b7c1dced6e114905a942d
Gerrit-PatchSet: 6
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Sailesh Mukil <[email protected]>
Gerrit-Reviewer: Henry Robinson <[email protected]>
Gerrit-Reviewer: Michael Brown <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
Gerrit-Reviewer: Taras Bobrovytsky <[email protected]>
Gerrit-HasComments: Yes

Reply via email to