[ https://issues.apache.org/jira/browse/SPARK-35263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17334987#comment-17334987 ]
Apache Spark commented on SPARK-35263: -------------------------------------- User 'xkrogen' has created a pull request for this issue: https://github.com/apache/spark/pull/32389 > Refactor ShuffleBlockFetcherIteratorSuite to reduce duplicated code > ------------------------------------------------------------------- > > Key: SPARK-35263 > URL: https://issues.apache.org/jira/browse/SPARK-35263 > Project: Spark > Issue Type: Improvement > Components: Shuffle, Tests > Affects Versions: 3.1.1 > Reporter: Erik Krogen > Priority: Major > > {{ShuffleFetcherBlockIteratorSuite}} has tons of duplicate code, like: > {code} > val iterator = new ShuffleBlockFetcherIterator( > taskContext, > transfer, > blockManager, > blocksByAddress, > (_, in) => in, > 48 * 1024 * 1024, > Int.MaxValue, > Int.MaxValue, > Int.MaxValue, > true, > false, > metrics, > false) > {code} > It's challenging to tell what the interesting parts are vs. what is just > being set to some default/unused value. > Similarly but not as bad, there are 10 calls like: > {code} > verify(transfer, times(1)).fetchBlocks(any(), any(), any(), any(), any(), > any()) > {code} > and 7 like > {code} > when(transfer.fetchBlocks(any(), any(), any(), any(), any(), > any())).thenAnswer ... > {code} > This can result in about 10% reduction in both lines and characters in the > file: > {code} > # Before > > wc > > core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala > 1063 3950 43201 > core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala > # After > > wc > > core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala > 928 3609 39053 > core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala > {code} > It also helps readability: > {code} > val iterator = createShuffleBlockIteratorWithDefaults( > transfer, > blocksByAddress, > maxBytesInFlight = 1000L > ) > {code} > Now I can clearly tell that {{maxBytesInFlight}} is the main parameter we're > interested in here. -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org