[Impala-ASF-CR] IMPALA-10497: Fix flakiness in test no fd caching on cached data.

Joe McDonnell (Code Review) Wed, 10 Feb 2021 14:59:58 -0800

Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17054 )

Change subject: IMPALA-10497: Fix flakiness in
test_no_fd_caching_on_cached_data.
......................................................................

Patch Set 1:

(5 comments)

Thanks for looking into this.

My main thought here is that I think the change may be bigger than it needs to
be. My theory is that the "Repeat the warm up query 5 times" is the part of the
change that makes it work and it may not need the other parts. I would be
interested in whether a stripped down version of this with only that part of
the change would fix the issue.

http://gerrit.cloudera.org:8080/#/c/17054/1/tests/custom_cluster/test_hdfs_fd_caching.py
File tests/custom_cluster/test_hdfs_fd_caching.py:

http://gerrit.cloudera.org:8080/#/c/17054/1/tests/custom_cluster/test_hdfs_fd_caching.py@179
PS1, Line 179: TestHdfsFdCaching
One thing to watch out for here is if you inherit from TestHdfsFdCaching, you
may inherit all of its test cases (e.g. test_caching_with_eviction()). I'm not
sure how pytest deals with that.

http://gerrit.cloudera.org:8080/#/c/17054/1/tests/custom_cluster/test_hdfs_fd_caching.py@184
PS1, Line 184: NUM_ROWS = 10
I believe that the number of rows should not impact the behavior. The
differences between 10 and 100 is small enough that it shouldn't impact the IO
pattern. They are all in a single parquet files, and we would do an IO for the
footer and one IO for each column.

http://gerrit.cloudera.org:8080/#/c/17054/1/tests/custom_cluster/test_hdfs_fd_caching.py@190
PS1, Line 190: cluster_size=1
I think this should not impact the caching. Our scheduling algorithm maps files
to nodes consistently, so each node should be reading the same file each time.

http://gerrit.cloudera.org:8080/#/c/17054/1/tests/custom_cluster/test_hdfs_fd_caching.py@207
PS1, Line 207: Read 5 times to make
: # sure the data cache is fully warmed up.
: for x in range(5):
This makes a lot of sense to me as the explanation. There are times when data
won't be written to the cache. In particular, there is a limit on concurrency
in writing to the cache. I'm guessing we could hit that and fail to cache
something the first time around.

http://gerrit.cloudera.org:8080/#/c/17054/1/tests/custom_cluster/test_hdfs_fd_caching.py@224
PS1, Line 224: for x in range(5):
I'm fine with either value here, but I also don't see this changing the
behavior of the test.

--
To view, visit http://gerrit.cloudera.org:8080/17054
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I774f9dfea7dcc107c3c7f2b76db3aaf4b2dd7952
Gerrit-Change-Number: 17054
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Comment-Date: Wed, 10 Feb 2021 22:59:41 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10497: Fix flakiness in test no fd caching on cached data.

Reply via email to