Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/17054 )
Change subject: IMPALA-10497: Fix flakiness in test_no_fd_caching_on_cached_data. ...................................................................... Patch Set 1: (5 comments) Thanks for looking into this. My main thought here is that I think the change may be bigger than it needs to be. My theory is that the "Repeat the warm up query 5 times" is the part of the change that makes it work and it may not need the other parts. I would be interested in whether a stripped down version of this with only that part of the change would fix the issue. http://gerrit.cloudera.org:8080/#/c/17054/1/tests/custom_cluster/test_hdfs_fd_caching.py File tests/custom_cluster/test_hdfs_fd_caching.py: http://gerrit.cloudera.org:8080/#/c/17054/1/tests/custom_cluster/test_hdfs_fd_caching.py@179 PS1, Line 179: TestHdfsFdCaching One thing to watch out for here is if you inherit from TestHdfsFdCaching, you may inherit all of its test cases (e.g. test_caching_with_eviction()). I'm not sure how pytest deals with that. http://gerrit.cloudera.org:8080/#/c/17054/1/tests/custom_cluster/test_hdfs_fd_caching.py@184 PS1, Line 184: NUM_ROWS = 10 I believe that the number of rows should not impact the behavior. The differences between 10 and 100 is small enough that it shouldn't impact the IO pattern. They are all in a single parquet files, and we would do an IO for the footer and one IO for each column. http://gerrit.cloudera.org:8080/#/c/17054/1/tests/custom_cluster/test_hdfs_fd_caching.py@190 PS1, Line 190: cluster_size=1 I think this should not impact the caching. Our scheduling algorithm maps files to nodes consistently, so each node should be reading the same file each time. http://gerrit.cloudera.org:8080/#/c/17054/1/tests/custom_cluster/test_hdfs_fd_caching.py@207 PS1, Line 207: Read 5 times to make : # sure the data cache is fully warmed up. : for x in range(5): This makes a lot of sense to me as the explanation. There are times when data won't be written to the cache. In particular, there is a limit on concurrency in writing to the cache. I'm guessing we could hit that and fail to cache something the first time around. http://gerrit.cloudera.org:8080/#/c/17054/1/tests/custom_cluster/test_hdfs_fd_caching.py@224 PS1, Line 224: for x in range(5): I'm fine with either value here, but I also don't see this changing the behavior of the test. -- To view, visit http://gerrit.cloudera.org:8080/17054 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I774f9dfea7dcc107c3c7f2b76db3aaf4b2dd7952 Gerrit-Change-Number: 17054 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto <[email protected]> Gerrit-Reviewer: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Comment-Date: Wed, 10 Feb 2021 22:59:41 +0000 Gerrit-HasComments: Yes
