Michael Brown has posted comments on this change.

Change subject: IMPALA-1878: Support INSERT and LOAD DATA on S3 and between 
filesystems
......................................................................


Patch Set 20:

(7 comments)

For the recent tagged Python changes you asked me to look at, I replied with 
'Done' to your helpful markers and left comments in relevant places.

http://gerrit.cloudera.org:8080/#/c/2574/20/tests/query_test/test_insert_parquet.py
File tests/query_test/test_insert_parquet.py:

Line 136:     sizes = self.filesystem_client.get_all_file_sizes(DIR)
> Change made as part of fix for IMPALA-3245.
Done


Line 138:       assert size < BLOCK_SIZE
Not your change, but can you add a meaningful error message for this assert if 
it were to fail?

You can do it like:
  assert size < BLOCK_SIZE, "wrong size {0} [etc]".format(size)


Line 139:       print size
Not your change, but random prints should be removed. We should use the logging 
facility if we need to print this.


http://gerrit.cloudera.org:8080/#/c/2574/20/tests/util/filesystem_base.py
File tests/util/filesystem_base.py:

Line 58:   def get_all_file_sizes(self, path):
> Added for IMPALA-3245.
As we discussed please make sure both FS-specific implementations return lists 
of numbers, not lists of strings, since I don't see you specifically changing 
the type.


http://gerrit.cloudera.org:8080/#/c/2574/20/tests/util/hdfs_util.py
File tests/util/hdfs_util.py:

Line 79:   def get_all_file_sizes(self, path):
> Added for IMPALA-3245.
Done


http://gerrit.cloudera.org:8080/#/c/2574/20/tests/util/s3_util.py
File tests/util/s3_util.py:

Line 72:   def get_all_file_sizes(self, path):
> Added for IMPALA-3245.
Done


Line 78:     if 'Contents' in response:
       :       return [t['Size'] for t in response['Contents']]
It seems like we should handle the case where 'Contents' is not in response. If 
it's only missing due to some bad 'path' value, or some S3 error, it will still 
help with test failures to know that. As it is the method will just return 
None. If it returns None, something higher in the stack trying to use the value 
will produce a more obscure exception.

My suggestion is to raise an exception here and include path and response (if 
not large) in the exception string.


-- 
To view, visit http://gerrit.cloudera.org:8080/2574
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I94e15ad67752dce21c9b7c1dced6e114905a942d
Gerrit-PatchSet: 20
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Sailesh Mukil <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Henry Robinson <[email protected]>
Gerrit-Reviewer: Michael Brown <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
Gerrit-Reviewer: Taras Bobrovytsky <[email protected]>
Gerrit-HasComments: Yes

Reply via email to