[GitHub] spark pull request #17973: [SPARK-20731][SQL] Add ability to change or omit ...

HyukjinKwon Sat, 13 May 2017 06:08:23 -0700

Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17973#discussion_r116358020
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
    @@ -622,6 +622,31 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils {
         }
       }
     
    +  test("save tsv with tsv suffix") {
    +    withTempDir { dir =>
    +      val csvDir = new File(dir, "csv").getCanonicalPath
    +      val cars = spark.read
    +        .format("csv")
    +        .option("header", "true")
    +        .load(testFile(carsFile))
    +
    +      cars.coalesce(1).write
    +        .option("header", "true")
    +        .option("fileExtension", ".tsv")
    +        .option("delimiter", "\t")
    --- End diff --
    
    I would suggest leaving this out for now unless there is a stronger reason 
for it. The downside is that it allows an arbitrary name and does not 
guarantee that the extension is, say, tsv when the delimiter is a tab. It is 
purely up to the user.
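
    One way to picture the concern is the minimal sketch below. It assumes a 
hypothetical helper, `extensionFor`, that is not part of Spark or of this pull 
request; it only contrasts a free-form `fileExtension` value with an extension 
derived from the delimiter.

```scala
// Hypothetical illustration only: neither this object nor extensionFor
// exists in Spark or in the pull request under review.
object ExtensionSketch {
  // Derive the extension from the delimiter, so the two cannot disagree.
  def extensionFor(delimiter: String): String = delimiter match {
    case "\t" => ".tsv"
    case _    => ".csv"
  }

  def main(args: Array[String]): Unit = {
    // With a free-form option, nothing ties these two values together:
    // here the user asks for a .tsv suffix on comma-delimited output.
    val userOptions = Map("delimiter" -> ",", "fileExtension" -> ".tsv")
    val requested = userOptions("fileExtension")
    val derived = extensionFor(userOptions("delimiter"))
    println(s"requested=$requested derived=$derived")
  }
}
```

    In this sketch the requested and derived extensions differ, which is the 
mismatch that a user-supplied extension cannot rule out.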
    
    I added those extensions long ago, and one of the motivations was automatic 
detection of the datasource, as Hadoop does (which we have not added yet 
because of the cost of listing files, etc.).



