Gengliang Wang created SPARK-27162:
--------------------------------------

             Summary: Add new method getOriginalMap in CaseInsensitiveStringMap
                 Key: SPARK-27162
                 URL: https://issues.apache.org/jira/browse/SPARK-27162
             Project: Spark
          Issue Type: Task
          Components: SQL
    Affects Versions: 3.0.0
            Reporter: Gengliang Wang


Currently, DataFrameReader/DataFrameWriter supports setting Hadoop
configurations via the `.option()` method.
E.g.
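```
import org.apache.hadoop.fs.{Path, PathFilter}

// Rejects every file whose parent directory is named "p=2".
class TestFileFilter extends PathFilter {
  override def accept(path: Path): Boolean = path.getParent.getName != "p=2"
}

// withTempPath is a helper from Spark's SQL test utilities.
withTempPath { dir =>
  val path = dir.getCanonicalPath

  val df = spark.range(2)
  df.write.orc(path + "/p=1")
  df.write.orc(path + "/p=2")
  // Without a filter, both partitions are read: 2 rows each.
  assert(spark.read.orc(path).count() === 4)

  val extraOptions = Map(
    "mapred.input.pathFilter.class" -> classOf[TestFileFilter].getName,
    "mapreduce.input.pathFilter.class" -> classOf[TestFileFilter].getName
  )
  // With the filter passed as options, only p=1 is read.
  assert(spark.read.options(extraOptions).orc(path).count() === 2)
}
```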
While Hadoop configurations are case sensitive, the current data source V2 APIs
use `CaseInsensitiveStringMap` in TableProvider. That map lower-cases its keys,
so a key such as `mapreduce.input.pathFilter.class` no longer matches the name
Hadoop expects.
To create Hadoop configurations correctly, I suggest adding a method
`getOriginalMap` to `CaseInsensitiveStringMap`.
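
A minimal sketch of the problem and of the suggested accessor, with no Spark
dependency. `CaseInsensitiveOptions`, `asLowerCasedMap` and `HadoopKeyCaseDemo`
are hypothetical names used only for illustration; this is not the real
`CaseInsensitiveStringMap`, and the final API may look different.

```scala
import java.util.Locale

// Hypothetical stand-in for CaseInsensitiveStringMap; NOT the real Spark class.
final class CaseInsensitiveOptions(original: Map[String, String]) {
  // Keys are lower-cased once so that lookups can ignore case.
  private val lowerCased: Map[String, String] =
    original.map { case (k, v) => k.toLowerCase(Locale.ROOT) -> v }

  // Case-insensitive lookup.
  def get(key: String): Option[String] =
    lowerCased.get(key.toLowerCase(Locale.ROOT))

  // The normalized view: the original key case is gone.
  def asLowerCasedMap: Map[String, String] = lowerCased

  // Sketch of the suggested accessor: the options exactly as the user passed them.
  def getOriginalMap: Map[String, String] = original
}

object HadoopKeyCaseDemo {
  val hadoopKey = "mapreduce.input.pathFilter.class"
  val options = new CaseInsensitiveOptions(Map(hadoopKey -> "TestFileFilter"))

  // Any spelling works through the case-insensitive lookup.
  val viaInsensitive: Option[String] = options.get("MAPREDUCE.INPUT.PATHFILTER.CLASS")

  // A case-sensitive consumer (such as a Hadoop configuration) that only sees
  // the normalized keys cannot find the camel-cased key.
  val fromNormalized: Option[String] = options.asLowerCasedMap.get(hadoopKey)

  // With the original map, the key is found under its exact name.
  val fromOriginal: Option[String] = options.getOriginalMap.get(hadoopKey)
}
```

In this sketch, a Hadoop configuration would be populated from
`getOriginalMap` rather than from the normalized view, so that keys keep the
case the user wrote.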



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

