Github user avulanov commented on a diff in the pull request:

    https://github.com/apache/spark/pull/15382#discussion_r83089960
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
    @@ -757,7 +758,10 @@ private[sql] class SQLConf extends Serializable with CatalystConf with Logging {
     
       def variableSubstituteDepth: Int = getConf(VARIABLE_SUBSTITUTE_DEPTH)
     
    -  def warehousePath: String = new Path(getConf(WAREHOUSE_PATH)).toString
    +  def warehousePath: String = {
    +    val path = new Path(getConf(WAREHOUSE_PATH))
    +    FileSystem.get(path.toUri, new Configuration()).makeQualified(path).toString
    --- End diff --
    
    `resolveURI` tries `new URI` first, hits a `URISyntaxException` because of the whitespace character, and falls back to `new File(path).getAbsoluteFile().toURI()`. The fallback does not handle the scheme. This is why the first example looks weird:
    ```
    scala> resolveURI("file:///C:/My Programs/path")
    (Spark)res28: java.net.URI = file:/c:/dis/dev/spark-2.0.0-preview-bin-hadoop2.7/bin/file:/C:/My%20Programs/path
    (Scala) res1: java.net.URI = file:/C:/Users/ulanov/file:/C:/My%20Programs/path
    -----no space----
    scala> resolveURI("file:///C:/MyPrograms/path")
    (Spark)java.net.URI = file:///C:/MyPrograms/path
    (Scala) java.net.URI = file:///C:/MyPrograms/path
    ```
    Second example is OK.
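
    For reference, the fallback described above can be reproduced with plain `java.net.URI` and `java.io.File`. This is a simplified stand-in, not the actual `resolveURI` source: the name `naiveResolve` is hypothetical and fragment handling is left out.

    ```scala
    import java.io.File
    import java.net.{URI, URISyntaxException}

    object FallbackDemo {
      // Hypothetical, simplified version of the logic discussed in this comment.
      def naiveResolve(path: String): URI =
        try {
          val uri = new URI(path) // throws URISyntaxException on a raw space
          if (uri.getScheme != null) uri
          else new File(path).getAbsoluteFile.toURI
        } catch {
          // The whole string, scheme included, is now treated as a relative file name.
          case _: URISyntaxException => new File(path).getAbsoluteFile.toURI
        }
    }
    ```

    With a space in the input, the result is a `file:` URI for the working directory with the original string appended, which has the same shape as the "(Scala)" output above. Without a space, the input is returned unchanged.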
    The third example works fine with whitespace on Windows (on Linux it adds the home directory) but breaks without it:
    ```
    scala> resolveURI("C:/My Programs/path")
    res41: java.net.URI = file:/C:/My%20Programs/path
    ----no space---
    scala> resolveURI("C:/MyPrograms/path")
    res42: java.net.URI = C:/MyPrograms/path
    ```
    The fourth example works fine both with and without whitespace. There is one more subtle thing: the drive letter becomes lowercase in Spark.
    ```
    scala>  resolveURI("/My Programs/path")
    (Spark)res31: java.net.URI = file:/c:/My%20Programs/path
    (Scala) res4: java.net.URI = file:/C:/My%20Programs/path
    ```
    A character in the string should not be the reason to execute a particular branch of code. We should rather check the scheme and work from that.
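
    A minimal sketch of what checking the scheme first could look like, using only the JDK. This is an illustration, not a proposed patch: `SchemeFirst`, `schemeOf` and `resolve` are hypothetical names, and treating a one-letter prefix such as `C:` as a Windows drive letter rather than a scheme is an assumption of this sketch.

    ```scala
    import java.io.File
    import java.net.URI

    object SchemeFirst {
      // Returns the scheme if the string starts with one (RFC 3986 syntax).
      // Assumption: a single letter before ':' is a drive letter, not a scheme.
      def schemeOf(path: String): Option[String] = {
        val idx = path.indexOf(':')
        if (idx > 1 && path.substring(0, idx).matches("[A-Za-z][A-Za-z0-9+.-]*"))
          Some(path.substring(0, idx))
        else None
      }

      def resolve(path: String): URI = schemeOf(path) match {
        case Some(scheme) =>
          // URI(scheme, schemeSpecificPart, fragment) quotes illegal characters
          // such as spaces. It also quotes '%', so an input that is already
          // percent-encoded would be encoded twice.
          new URI(scheme, path.substring(scheme.length + 1), null)
        case None =>
          // No scheme: treat the string as a local file path.
          new File(path).getAbsoluteFile.toURI
      }
    }
    ```

    Here the branch depends only on whether a scheme is present, so `file:///C:/My Programs/path` keeps its scheme and becomes `file:///C:/My%20Programs/path`, while `C:/MyPrograms/path` goes through the file branch with or without spaces.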

