Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22641#discussion_r223756260
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/CompressionCodecSuite.scala ---
    @@ -262,7 +261,10 @@ class CompressionCodecSuite extends TestHiveSingleton with ParquetTest with Befo
         }
       }
     
    -  def checkForTableWithCompressProp(format: String, compressCodecs: List[String]): Unit = {
    +  def checkForTableWithCompressProp(
    +      format: String,
    +      tableCompressCodecs: List[String],
    +      sessionCompressCodecs: List[String]): Unit = {
         Seq(true, false).foreach { isPartitioned =>
    --- End diff ---
    
    Let's see, the two tests before were taking 2:20 and 0:47. Now they take 
about 0:43 each. That's clearly a win in the first case, not much in the 
second, as expected. I'm OK with this as an improvement, myself.
    
    I wonder if we need all 8 combinations in this triply nested loop, though. Incidentally, it could be written more concisely as follows, but I know this was existing code:
    
    `for (isPartitioned <- Seq(true, false); convertMetastore <- Seq(true, false); usingCTAS <- Seq(true, false)) {`
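
    A rough sketch of what that could look like once the nested `Seq(true, false).foreach` blocks are flattened into one comprehension; `runCheck` here is only a hypothetical stand-in for the shared test body, not a method that exists in the suite:
    
    ```scala
    // hypothetical stand-in for the shared assertions in checkForTableWithCompressProp
    def runCheck(isPartitioned: Boolean, convertMetastore: Boolean, usingCTAS: Boolean): Unit = ()
    
    // one comprehension instead of three nested foreach blocks
    for {
      isPartitioned    <- Seq(true, false)
      convertMetastore <- Seq(true, false)
      usingCTAS        <- Seq(true, false)
    } runCheck(isPartitioned, convertMetastore, usingCTAS)
    ```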
    
    @fjh100456 what do you think? Is it important to test all combinations, or could we get away with just setting all true and all false without losing much coverage here?
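
    For reference, a minimal sketch of that reduced variant, covering only the all-true and all-false corners (again using the hypothetical `runCheck` above); it would cut the 8 runs down to 2, at the cost of not exercising mixed settings:
    
    ```scala
    // only the two "diagonal" cases instead of all 8 combinations
    Seq(true, false).foreach { flag =>
      runCheck(isPartitioned = flag, convertMetastore = flag, usingCTAS = flag)
    }
    ```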

