Github user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/22641#discussion_r223756260
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/CompressionCodecSuite.scala ---
@@ -262,7 +261,10 @@ class CompressionCodecSuite extends TestHiveSingleton with ParquetTest with Befo
     }
   }
-  def checkForTableWithCompressProp(format: String, compressCodecs: List[String]): Unit = {
+  def checkForTableWithCompressProp(
+      format: String,
+      tableCompressCodecs: List[String],
+      sessionCompressCodecs: List[String]): Unit = {
     Seq(true, false).foreach { isPartitioned =>
--- End diff --
Let's see, the two tests before were taking 2:20 and 0:47. Now they take
about 0:43 each. That's clearly a win in the first case, not much in the
second, as expected. I'm OK with this as an improvement, myself.
I wonder if we need all 8 combinations in this triply nested loop though.
Incidentally, it could be written more concisely like this, but I know this was existing code:
`for (isPartitioned <- Seq(true, false); convertMetastore <- Seq(true, false); usingCTAS <- Seq(true, false)) {`
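For illustration, here is a minimal, self-contained sketch of the two equivalent loop shapes. The `println` body is just a placeholder standing in for the actual test body, and `LoopStyles` is only a name for this sketch, not anything in the suite:

```scala
object LoopStyles {
  def main(args: Array[String]): Unit = {
    // Existing style: three nested foreach calls over the same two-element Seq,
    // producing all 8 combinations of the three flags.
    Seq(true, false).foreach { isPartitioned =>
      Seq(true, false).foreach { convertMetastore =>
        Seq(true, false).foreach { usingCTAS =>
          println(s"nested: $isPartitioned $convertMetastore $usingCTAS")
        }
      }
    }

    // Suggested style: a single for-comprehension that desugars to the same
    // nested foreach calls, so behavior is unchanged.
    for {
      isPartitioned <- Seq(true, false)
      convertMetastore <- Seq(true, false)
      usingCTAS <- Seq(true, false)
    } println(s"for:    $isPartitioned $convertMetastore $usingCTAS")
  }
}
```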
@fjh100456 what do you think? Is it important to test all combinations, or could we get away with testing only all-true and all-false without much loss here?
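If the two extremes turned out to be enough, one possible shape would be something like the following. This is purely a sketch: the tuple of flags and the `println` body are placeholders, not the suite's actual code:

```scala
// Hypothetical reduction: exercise only the all-true and all-false corners
// instead of the full 8-way cross product.
Seq((true, true, true), (false, false, false)).foreach {
  case (isPartitioned, convertMetastore, usingCTAS) =>
    println(s"testing: $isPartitioned $convertMetastore $usingCTAS")
}
```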
---