[Impala-ASF-CR] IMPALA-6225: Part 2: Query profile date-time strings should have ns precision.

Philip Zeyliger (Code Review) Thu, 07 Dec 2017 13:06:06 -0800

Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8784 )


Change subject: IMPALA-6225: Part 2: Query profile date-time strings should 
have ns precision.
......................................................................


Patch Set 2:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/8784/2/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/8784/2/tests/query_test/test_observability.py@23
PS2, Line 23: MAX_RETRIES = 60
This is only used once, and only for one of the tests. Please move it right 
where it's used.


http://gerrit.cloudera.org:8080/#/c/8784/2/tests/query_test/test_observability.py@116
PS2, Line 116:   def __test_query_profile_thrift_timestamps(self):
I wouldn't have a function named __x as well as a function named x. It's just 
bound to confuse.

Perhaps "assert_query_profile_thrift_has_correct_timestamps()" if that's what 
you're doing here. In reality, I think this is the test, and I don't understand 
lines 160-163.


http://gerrit.cloudera.org:8080/#/c/8784/2/tests/query_test/test_observability.py@126
PS2, Line 126:     service = ImpalaCluster().get_any_impalad().service
Why is this "any_impalad"? It really has to be the one that you executed the 
query against in the first place, no?


http://gerrit.cloudera.org:8080/#/c/8784/2/tests/query_test/test_observability.py@132
PS2, Line 132:       tree = service.get_thrift_profile(query_id)
Where is get_thrift_profile() defined?


http://gerrit.cloudera.org:8080/#/c/8784/2/tests/query_test/test_observability.py@134
PS2, Line 134:       start_time = tree.nodes[1].info_strings["Start Time"]
Add a comment about why it's nodes[1] here.


http://gerrit.cloudera.org:8080/#/c/8784/2/tests/query_test/test_observability.py@136
PS2, Line 136:       start_time_sub_sec_str = start_time.split('.')[-1]
Add a comment here about what start_time looks like, so it's clear what the 
split is doing.


http://gerrit.cloudera.org:8080/#/c/8784/2/tests/query_test/test_observability.py@142
PS2, Line 142:       assert len(end_time_sub_sec_str) == 9
Add ", end_time" here, so that we can see the failed string if it ever fails.


http://gerrit.cloudera.org:8080/#/c/8784/2/tests/query_test/test_observability.py@147
PS2, Line 147:     # This could happen due to heavy system load. The test is 
then inconclusive.
This is a controlled environment--our tests. Let's be assertive and do the work 
to make the test not flaky.


http://gerrit.cloudera.org:8080/#/c/8784/2/tests/query_test/test_observability.py@152
PS2, Line 152:     print dbg_str
Prefer logging to printing. We do both, but "import logging; logging.info(...)" 
will work ok, I think.

$git grep 'print '
test_aggregation.py:      print key, result_lut[key], 
actual_string,abs(result_lut[key] - int(actual_string))
test_compressed_formats.py:      print error_msg
test_compressed_formats.py:        print "Unexpected error:\n%s", error_msg
test_hdfs_caching.py:      # Test failed, print the metrics
test_hdfs_caching.py:        print "%d %d" % (cached_bytes_before[i], 
cached_bytes_after[i])
test_kudu.py:    print cursor.fetchall() == [(i, )]
test_kudu.py:        # Pytest shows truncated output on failure, so print the 
details just in case.
test_kudu.py:        # Pytest shows truncated output on failure, so print the 
details just in case.
test_kudu.py:      # 'describe' should print the column name in lower case.
test_limit.py:        print str(e)

$git grep '\.info'
test_kudu.py:        LOG.info(table_desc)
test_kudu.py:        LOG.info(table_desc)
test_scanners.py:      LOG.info("test_annotate_utf8_option local file name: " + 
local_file)
test_scanners_fuzz.py:    LOG.info("Using random seed %d", random_seed)
test_scanners_fuzz.py:    LOG.info("Generating corrupted version of %s in %s. 
Local working directory is %s",
test_scanners_fuzz.py:        LOG.info('\n'.join(result.log))
test_scanners_fuzz.py:      LOG.info("corrupt file: Flip byte in {0} at {1} 
from {2} to {3}".format(
test_scanners_fuzz.py:      LOG.info("corrupt file: Truncate {0} to 
{1}".format(path, truncation))


http://gerrit.cloudera.org:8080/#/c/8784/2/tests/query_test/test_observability.py@155
PS2, Line 155:   def test_query_profile_thrift_timestamps(self):
             :     if not self.__test_query_profile_thrift_timestamps():
             :       dbg_str = 'Debug thrift profile not available in ' + 
str(MAX_RETRIES) + ' seconds?'
             :       dbg_str += ' Retrying the test'
             :       assert True, self.__test_query_profile_thrift_timestamps()
Do we really need to run it twice? Why isn't it passing? I'll bet my hat on 
"any_impalad".



--
To view, visit http://gerrit.cloudera.org:8080/8784
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id3421a34cc029ebca551730084c7cbd402d5c109
Gerrit-Change-Number: 8784
Gerrit-PatchSet: 2
Gerrit-Owner: Zoram Thanga <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Philip Zeyliger <[email protected]>
Gerrit-Reviewer: Zoram Thanga <[email protected]>
Gerrit-Comment-Date: Thu, 07 Dec 2017 21:04:24 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-6225: Part 2: Query profile date-time strings should have ns precision.

Reply via email to