[jira] [Commented] (SPARK-15899) file scheme should be used correctly
[ https://issues.apache.org/jira/browse/SPARK-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15556174#comment-15556174 ]

Daniel Barclay commented on SPARK-15899:

Regarding the reference to [RFC 1738|https://tools.ietf.org/html/rfc1738]: RFC 1738 has been "obsoleted" and updated by several other RFCs. (See the links at the top of the linked-to page.) Make sure you're considering the current specifications of URIs and schemes.

> file scheme should be used correctly
>
> Key: SPARK-15899
> URL: https://issues.apache.org/jira/browse/SPARK-15899
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Reporter: Kazuaki Ishizaki
> Assignee: Alexander Ulanov
> Fix For: 2.0.1, 2.1.0
>
> [A RFC|https://www.ietf.org/rfc/rfc1738.txt] defines file scheme as {{file://host/}} or {{file:///}}.
> [Wikipedia|https://en.wikipedia.org/wiki/File_URI_scheme]
> [Some code stuffs|https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L58] use different prefix such as {{file:}}.
> It would be good to prepare a utility method to correctly add {{file://host}} or {{file://}} prefix.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-15899) file scheme should be used correctly
[ https://issues.apache.org/jira/browse/SPARK-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15416977#comment-15416977 ]

Apache Spark commented on SPARK-15899:

User 'avulanov' has created a pull request for this issue: https://github.com/apache/spark/pull/14600
[jira] [Commented] (SPARK-15899) file scheme should be used correctly
[ https://issues.apache.org/jira/browse/SPARK-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15411190#comment-15411190 ]

xuyifei commented on SPARK-15899:

I really appreciate it; it works. (Sorry for the late reply.)
[jira] [Commented] (SPARK-15899) file scheme should be used correctly
[ https://issues.apache.org/jira/browse/SPARK-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15411052#comment-15411052 ]

Bruno C Faria commented on SPARK-15899:

I have used System.setProperty("spark.sql.warehouse.dir","file:///C:/temp") within my project (with Maven) using Scala IDE on Windows, and it also worked.
[jira] [Commented] (SPARK-15899) file scheme should be used correctly
[ https://issues.apache.org/jira/browse/SPARK-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408717#comment-15408717 ]

skaarthik commented on SPARK-15899:

Confirming that overriding the setting ("--conf spark.sql.warehouse.dir=file:///C:/temp"), as pointed out by [~arsenvlad], is a good workaround for this issue.
[jira] [Commented] (SPARK-15899) file scheme should be used correctly
[ https://issues.apache.org/jira/browse/SPARK-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408362#comment-15408362 ]

Arsen Vladimirskiy commented on SPARK-15899:

Are you using the pre-compiled binaries? The applied fix is probably not there yet. I tried with the spark-2.0.0-bin-without-hadoop.tgz distro and got the same error.

To work around the issue for now, I tried adding the "spark.sql.warehouse.dir=file:///C:/Experiment/spark-2.0.0-bin-without-hadoop/spark-warehouse" config setting to the conf/spark-defaults.conf file with the "file:///" scheme, and it seems to let me read files via the SparkSession (e.g. val data = spark.read.format("csv").option("header","false").load("C:/Temp/test.csv")).
[jira] [Commented] (SPARK-15899) file scheme should be used correctly
[ https://issues.apache.org/jira/browse/SPARK-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15407253#comment-15407253 ]

xuyifei commented on SPARK-15899:

The problem still exists on Windows. When I use Spark 2.0 to read files, it fails with this exception:
`Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: file:E:/hkl/spark-warehouse`
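The reported message can be reproduced without Spark. The following is a minimal sketch, assuming the failure comes from a multi-argument `java.net.URI` constructor being handed the scheme `file` together with a path that does not begin with `/`. The object `RelativePathRepro` and the helper `reasonFor` are hypothetical names used only for illustration; the path is copied from the report above.

```scala
import java.net.{URI, URISyntaxException}

object RelativePathRepro {
  // Hypothetical helper: build a URI from a scheme and a raw path and
  // return the parser's reason for rejecting it, if it is rejected.
  def reasonFor(scheme: String, path: String): Option[String] =
    try {
      // Five-argument constructor: scheme, authority, path, query, fragment.
      new URI(scheme, null, path, null, null)
      None
    } catch {
      case e: URISyntaxException => Some(e.getReason)
    }

  def main(args: Array[String]): Unit = {
    // Drive-letter path without a leading slash: rejected.
    println(reasonFor("file", "E:/hkl/spark-warehouse"))
    // Same path with a leading slash: accepted.
    println(reasonFor("file", "/E:/hkl/spark-warehouse"))
  }
}
```

Under that assumption, a scheme combined with a path lacking a leading slash is treated as a relative path inside an absolute URI, which would explain why the `file:///C:/...` form suggested in this thread avoids the error.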
[jira] [Commented] (SPARK-15899) file scheme should be used correctly
[ https://issues.apache.org/jira/browse/SPARK-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345630#comment-15345630 ]

Apache Spark commented on SPARK-15899:

User 'avulanov' has created a pull request for this issue: https://github.com/apache/spark/pull/13868
[jira] [Commented] (SPARK-15899) file scheme should be used correctly
[ https://issues.apache.org/jira/browse/SPARK-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345620#comment-15345620 ]

Kazuaki Ishizaki commented on SPARK-15899:

I think so. As [~sowen] proposed, we may need a utility method to produce {{file:///}}. If we do this everywhere in Spark, the annoying part is that there are many usages of the {{file:}} scheme.
[jira] [Commented] (SPARK-15899) file scheme should be used correctly
[ https://issues.apache.org/jira/browse/SPARK-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345611#comment-15345611 ]

Alexander Ulanov commented on SPARK-15899:

`user.dir` on Windows starts with a letter:

    scala> System.getProperty("user.dir")
    res0: String = C:\Program Files (x86)\scala\bin

On Linux it starts with a slash:

    scala> System.getProperty("user.dir")
    res0: String = /home/hduser

It seems that java.io.File could convert it to a proper URI:

Windows:

    scala> new File("c:/myfile").toURI
    res6: java.net.URI = file:/c:/myfile

Linux:

    scala> new File("/home/myfile").toURI
    res3: java.net.URI = file:/home/myfile

We can remove "file:" from https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L58 and add toURI conversion in https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L694
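The idea in this comment can be sketched as follows. This is only an illustration of the approach, not the code from the pull request: `WarehouseDefault`, `defaultWarehouseUri`, and the `spark-warehouse` directory name are assumptions made for the example.

```scala
import java.io.File
import java.net.URI

object WarehouseDefault {
  // Hypothetical helper, not Spark's actual code: derive a default
  // warehouse location from a base directory via File.toURI instead of
  // concatenating "file:" with the raw path string.
  def defaultWarehouseUri(baseDir: String): URI =
    new File(baseDir, "spark-warehouse").toURI

  def main(args: Array[String]): Unit = {
    val uri = defaultWarehouseUri(System.getProperty("user.dir"))
    println(uri)         // exact value depends on the platform and directory
    println(uri.getPath) // begins with "/" on both Windows and Linux
  }
}
```

Because `File.toURI` first makes the path absolute and then adds the leading slash where the platform's path lacks one, the caller no longer has to decide how many slashes to prepend.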
[jira] [Commented] (SPARK-15899) file scheme should be used correctly
[ https://issues.apache.org/jira/browse/SPARK-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345355#comment-15345355 ]

Reynold Xin commented on SPARK-15899:

Does this work if we just always do file:///?
[jira] [Commented] (SPARK-15899) file scheme should be used correctly
[ https://issues.apache.org/jira/browse/SPARK-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328056#comment-15328056 ]

Sean Owen commented on SPARK-15899:

OK, I had assumed that absolute paths on Windows would have to be specified like {{/c:/paths/to}}. I know that normal Windows paths are {{c:\paths\to}} of course, but this is a somewhat different context.

OK, so what is the value of the System property user.dir on Windows? Does it start with c: and not /c: ? Yeah, then I see that this needs some special handling.

You're right, we could have a utility method that just adds a {{/}} at the start if none is present, but that would silently turn relative paths into absolute ones. Can we use the {{File}} or {{Paths}} API to really do this right? It should give a URI object whose string representation is, I hope, exactly what's desired.
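The concern about the slash-prepending helper can be illustrated with a small sketch. Both `naivePrefix` and `viaPaths` are hypothetical names, and the sample paths are made up for the example; only `java.nio.file.Paths` and its `toAbsolutePath`/`toUri` methods are real APIs.

```scala
import java.net.URI
import java.nio.file.Paths

object SlashPrefixPitfall {
  // Hypothetical naive helper: prepend "/" whenever it is missing.
  def naivePrefix(path: String): String =
    if (path.startsWith("/")) path else "/" + path

  // Alternative using the Paths API: resolve the path against the
  // working directory first, then convert it to a file URI.
  def viaPaths(path: String): URI =
    Paths.get(path).toAbsolutePath.toUri

  def main(args: Array[String]): Unit = {
    // Works as intended for a Windows drive-letter path.
    println(naivePrefix("c:/paths/to"))
    // A relative path silently becomes a path under the root directory.
    println(naivePrefix("data/input"))
    // The Paths API keeps the relative path relative to the working directory.
    println(viaPaths("data/input"))
  }
}
```

The string-based helper cannot tell `c:/paths/to` (absolute on Windows) from `data/input` (relative everywhere), which is why delegating to the platform's own path handling is the safer choice.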
[jira] [Commented] (SPARK-15899) file scheme should be used correctly
[ https://issues.apache.org/jira/browse/SPARK-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327825#comment-15327825 ]

Kazuaki Ishizaki commented on SPARK-15899:

When I added the two extra slashes, it worked on Linux but not on Windows, where an exception was thrown.

(/) Linux: {{file://}} + {{/path/to}}
(x) Windows: {{file://}} + {{c:/paths/to}}

This is because of the difference in the original path format between platforms. I noticed that we have to add three extra slashes (e.g. {{file:///}}) on Windows, while we need to add only two extra slashes on Linux.
[jira] [Commented] (SPARK-15899) file scheme should be used correctly
[ https://issues.apache.org/jira/browse/SPARK-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15326540#comment-15326540 ]

Sean Owen commented on SPARK-15899:

I don't think that's what the reference says. A host follows the second slash, and may be empty, in all cases.
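Read this way, the rule is the same on every platform: `file://`, then a possibly empty host, then a path that begins with `/`. A minimal sketch of that reading follows. `FileUriPrefix` and `toFileUri` are hypothetical names, not part of Spark, and the helper assumes an absolute local path whose characters need no percent-encoding.

```scala
import java.net.URI

object FileUriPrefix {
  // Hypothetical helper, not part of Spark. Assumes an absolute local
  // path that contains no characters requiring percent-encoding.
  def toFileUri(absolutePath: String): String = {
    val forward = absolutePath.replace('\\', '/')
    val hasDrive = forward.matches("[A-Za-z]:/.*")
    require(forward.startsWith("/") || hasDrive,
      s"not an absolute path: $absolutePath")
    // The host is left empty; the path component must begin with "/",
    // so a drive-letter path gets one slash added in front.
    val path = if (hasDrive) "/" + forward else forward
    "file://" + path
  }

  def main(args: Array[String]): Unit = {
    val linux = toFileUri("/path/to")
    val windows = toFileUri("c:/windows")
    println(linux)
    println(windows)
    // Parsing the results shows the path component in both cases.
    println(new URI(linux).getPath)
    println(new URI(windows).getPath)
  }
}
```

The apparent difference of two versus three slashes then comes only from whether the platform's path already starts with a slash, not from a different URI format per platform.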
[jira] [Commented] (SPARK-15899) file scheme should be used correctly
[ https://issues.apache.org/jira/browse/SPARK-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15326537#comment-15326537 ]

Kazuaki Ishizaki commented on SPARK-15899:

Thank you for your comments. It is a little bit complicated, since there is a difference among platforms as described [here|https://en.wikipedia.org/wiki/File_URI_scheme#Examples].

On Linux, we have to change the prefix {{file:}} to {{file://}} (add two extra slashes), since the file path that we get is {{/path/to}}.
On Windows, we have to change the prefix {{file:}} to {{file:///}} (add three extra slashes), since the file path that we get is {{c:/windows}}.

How will we address this?