Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/8784 )
Change subject: IMPALA-6225: Part 2: Query profile date-time strings should have ns precision. ...................................................................... Patch Set 2: (10 comments) http://gerrit.cloudera.org:8080/#/c/8784/2/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/8784/2/tests/query_test/test_observability.py@23 PS2, Line 23: MAX_RETRIES = 60 This is only used once, and only for one of the tests. Please move it right where it's used. http://gerrit.cloudera.org:8080/#/c/8784/2/tests/query_test/test_observability.py@116 PS2, Line 116: def __test_query_profile_thrift_timestamps(self): I wouldn't have a function named __x as well as a function named x. It's just bound to confuse. Perhaps "assert_query_profile_thrift_has_correct_timestamps()" if that's what you're doing here. In reality, I think this is the test, and I don't understand lines 160-163. http://gerrit.cloudera.org:8080/#/c/8784/2/tests/query_test/test_observability.py@126 PS2, Line 126: service = ImpalaCluster().get_any_impalad().service Why is this "any_impalad"? It really has to be the one that you executed the query against in the first place, no? http://gerrit.cloudera.org:8080/#/c/8784/2/tests/query_test/test_observability.py@132 PS2, Line 132: tree = service.get_thrift_profile(query_id) Where is get_thrift_profile() defined? http://gerrit.cloudera.org:8080/#/c/8784/2/tests/query_test/test_observability.py@134 PS2, Line 134: start_time = tree.nodes[1].info_strings["Start Time"] Add a comment about why it's nodes[1] here. http://gerrit.cloudera.org:8080/#/c/8784/2/tests/query_test/test_observability.py@136 PS2, Line 136: start_time_sub_sec_str = start_time.split('.')[-1] Add a comment here about what start_time looks like, so it's clear what the split is doing. http://gerrit.cloudera.org:8080/#/c/8784/2/tests/query_test/test_observability.py@142 PS2, Line 142: assert len(end_time_sub_sec_str) == 9 Add ", end_time" here, so that we can see the failed string if it ever fails. http://gerrit.cloudera.org:8080/#/c/8784/2/tests/query_test/test_observability.py@147 PS2, Line 147: # This could happen due to heavy system load. The test is then inconclusive. This is a controlled environment--our tests. Let's be assertive and do the work to make the test not flaky. http://gerrit.cloudera.org:8080/#/c/8784/2/tests/query_test/test_observability.py@152 PS2, Line 152: print dbg_str Prefer logging to printing. We do both, but "import logging; logging.info(...)" will work ok, I think. $git grep 'print ' test_aggregation.py: print key, result_lut[key], actual_string,abs(result_lut[key] - int(actual_string)) test_compressed_formats.py: print error_msg test_compressed_formats.py: print "Unexpected error:\n%s", error_msg test_hdfs_caching.py: # Test failed, print the metrics test_hdfs_caching.py: print "%d %d" % (cached_bytes_before[i], cached_bytes_after[i]) test_kudu.py: print cursor.fetchall() == [(i, )] test_kudu.py: # Pytest shows truncated output on failure, so print the details just in case. test_kudu.py: # Pytest shows truncated output on failure, so print the details just in case. test_kudu.py: # 'describe' should print the column name in lower case. test_limit.py: print str(e) $git grep '\.info' test_kudu.py: LOG.info(table_desc) test_kudu.py: LOG.info(table_desc) test_scanners.py: LOG.info("test_annotate_utf8_option local file name: " + local_file) test_scanners_fuzz.py: LOG.info("Using random seed %d", random_seed) test_scanners_fuzz.py: LOG.info("Generating corrupted version of %s in %s. Local working directory is %s", test_scanners_fuzz.py: LOG.info('\n'.join(result.log)) test_scanners_fuzz.py: LOG.info("corrupt file: Flip byte in {0} at {1} from {2} to {3}".format( test_scanners_fuzz.py: LOG.info("corrupt file: Truncate {0} to {1}".format(path, truncation)) http://gerrit.cloudera.org:8080/#/c/8784/2/tests/query_test/test_observability.py@155 PS2, Line 155: def test_query_profile_thrift_timestamps(self): : if not self.__test_query_profile_thrift_timestamps(): : dbg_str = 'Debug thrift profile not available in ' + str(MAX_RETRIES) + ' seconds?' : dbg_str += ' Retrying the test' : assert True, self.__test_query_profile_thrift_timestamps() Do we really need to run it twice? Why isn't it passing? I'll bet my hat on "any_impalad". -- To view, visit http://gerrit.cloudera.org:8080/8784 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id3421a34cc029ebca551730084c7cbd402d5c109 Gerrit-Change-Number: 8784 Gerrit-PatchSet: 2 Gerrit-Owner: Zoram Thanga <[email protected]> Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Philip Zeyliger <[email protected]> Gerrit-Reviewer: Zoram Thanga <[email protected]> Gerrit-Comment-Date: Thu, 07 Dec 2017 21:04:24 +0000 Gerrit-HasComments: Yes
