Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10842#discussion_r54345066
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcQuerySuite.scala ---
    @@ -394,4 +394,42 @@ class OrcQuerySuite extends QueryTest with BeforeAndAfterAll with OrcTest {
           }
         }
       }
    +
    +  test("SPARK-12417. Orc bloom filter options are not propagated during file " +
    +    "generation") {
    +    withTempPath { dir =>
    +      // Create separate sub dir for bloom filter testing
    +      val path = new File(dir, "orc").getCanonicalPath
    +
    +      // Write some data
    +      val data = (0 until 10).map { i =>
    +        val maybeInt = if (i % 2 == 0) None else Some(i)
    +        val nullValue: Option[String] = None
    +        (maybeInt, nullValue)
    +      }
    +
    +      // Dump data to orc
    +      createDataFrame(data).toDF("a", "b")
    +        .write.option("orc.bloom.filter.columns", "*").orc(path)
    +
    +      // Verify if orc bloom filters are present. This can be verified via
    +      // ORC RecordReaderImpl when it is made public. Until then, verify by
    +      // dumping file statistics and checking whether bloom filter was added.
    +      new File(path).listFiles().filter(_.getName.endsWith("orc")) map { file =>
    +        withTempStream { buf =>
    +          val fileDumpArgs = Array(file.getCanonicalPath)
    +          FileDump.main(fileDumpArgs)
    --- End diff --
    
    This seems really hacky, and it depends on a non-public API.
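
    For context, the test in the diff verifies bloom filters indirectly: it captures the stdout of ORC's `FileDump.main` (via the suite's `withTempStream` helper) and presumably scans the dump for bloom filter entries. A minimal, self-contained sketch of that capture-and-scan pattern is below; `dumpStats` is a hypothetical stand-in for `FileDump.main`, and the `"BLOOM_FILTER"` marker is an assumption about what an ORC file dump would print when bloom filters are present.

    ```scala
    import java.io.ByteArrayOutputStream

    object CaptureDemo {
      // Hypothetical stand-in for FileDump.main: anything that prints file
      // statistics to stdout. The "BLOOM_FILTER" marker is an assumed
      // fragment of what a real ORC dump might contain.
      def dumpStats(): Unit =
        println("Stream: column 1 section BLOOM_FILTER start: 1234 length 5678")

      // Run `body` with stdout redirected into an in-memory buffer, then
      // check whether the captured output contains `marker`.
      def capturedContains(marker: String)(body: => Unit): Boolean = {
        val buf = new ByteArrayOutputStream()
        Console.withOut(buf)(body)
        buf.toString("UTF-8").contains(marker)
      }

      def main(args: Array[String]): Unit = {
        // prints "true": dumpStats emits a line containing the marker
        println(capturedContains("BLOOM_FILTER")(dumpStats()))
      }
    }
    ```

    The fragility the comment points out is visible here: the assertion rests on the exact text a private tool happens to print, so any change to the dump format (or to `FileDump`'s visibility) silently breaks the test.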


