[GitHub] spark pull request: [Spark-11522][SQL] input_file_name() returns "...

yhuai Sat, 14 Nov 2015 19:15:29 -0800

Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9542#discussion_r44866968
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveUDFSuite.scala ---
    @@ -356,6 +356,66 @@ class HiveUDFSuite extends QueryTest with TestHiveSingleton {
     
         sqlContext.dropTempTable("testUDF")
       }
    +
    +  test("SPARK-11522 select input_file_name from non-parquet table"){
    +
    +    // EXTERNAL OpenCSVSerde table pointing to LOCATION
    +
    +    val location1 = Utils.getSparkClassLoader.getResource("data/files/csv_table").getFile
    +    sql(s"""CREATE EXTERNAL TABLE csv_table(page_id INT, impressions INT)
    +         ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
    +         WITH SERDEPROPERTIES (
    +           \"separatorChar\" = \",\",
    +           \"quoteChar\"     = \"\\\"\",
    +           \"escapeChar\"    = \"\\\\\")
    +         LOCATION '$location1'""")
    +
    +    val answer1 = sql("select input_file_name() from csv_table").head().getString(0)
    +    assert(answer1.contains(location1))
    +    assert(sql("select input_file_name() from csv_table").distinct().collect().length == 2)
    +    sql("DROP TABLE csv_table")
    +
    +    // EXTERNAL pointing to LOCATION
    +
    +    val location2 = Utils.getSparkClassLoader.getResource("data/files/external_t5").getFile
    +    sql(s"""CREATE EXTERNAL table external_t5 (c1 int, c2 int)
    +        row format delimited fields terminated by ','
    +        location '$location2'""")
    +
    +    val answer2 = sql("SELECT input_file_name() as file FROM external_t5").head().getString(0)
    +    assert(answer2.contains("external_t5"))
    +    assert(sql("SELECT input_file_name() as file FROM external_t5")
    +      .distinct().collect().length == 1)
    +    sql("DROP TABLE external_t5")
    +
    +   // External parquet pointing to LOCATION
    +
    +    val location3 = Utils.getSparkClassLoader.getResource("data/files/external_parquet").getFile
    +    sql(s"""CREATE EXTERNAL table external_parquet(c1 int, c2 int)
    +        stored as parquet
    +        LOCATION '$location3'""")
    +
    +    val answer3 = sql("SELECT input_file_name() as file FROM external_parquet")
    +      .head().getString(0)
    +    assert(answer3.contains("external_parquet"))
    +    assert(sql("SELECT input_file_name() as file FROM external_parquet")
    +      .distinct().collect().length == 1)
    +    sql("DROP TABLE external_parquet")
    +
    +    // Non-External parquet pointing to /tmp/...
    --- End diff --
    
    It seems we do not need to say where it points, since it is a managed table.



