[GitHub] spark issue #15190: [SPARK-17620][SQL] Determine Serde by hive.default.filef...

dilipbiswal Tue, 11 Oct 2016 11:10:02 -0700

Github user dilipbiswal commented on the issue:

    https://github.com/apache/spark/pull/15190
  
    @yhuai We will use the Parquet format in your example. We look at the ```spark.sql.sources.default``` configuration to decide which format to use.
    
    Here is the output for your perusal.
    
    ``` SQL
    spark-sql> set spark.sql.hive.convertCTAS=true;
    spark.sql.hive.convertCTAS  true
    Time taken: 3.309 seconds, Fetched 1 row(s)
    spark-sql> set hive.default.fileformat=orc;
    hive.default.fileformat     orc
    Time taken: 0.053 seconds, Fetched 1 row(s)
    spark-sql> CREATE TABLE IF NOT EXISTS test select 1 from foo;
    spark-sql> describe formatted test;
    ...
    # Storage Information               
    SerDe Library:      org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
    InputFormat:        org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
    OutputFormat:       org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
    ```
    
    Now set ```spark.sql.sources.default=orc``` and repeat the test:
    ```SQL
    spark-sql> set spark.sql.sources.default=orc;
    spark.sql.sources.default   orc
    spark-sql> CREATE TABLE IF NOT EXISTS test2 select 1 from foo;
    Time taken: 0.451 seconds
    spark-sql> describe formatted test2;
    ...
    # Storage Information               
    SerDe Library:      org.apache.hadoop.hive.ql.io.orc.OrcSerde       
    InputFormat:        org.apache.hadoop.hive.ql.io.orc.OrcInputFormat 
    OutputFormat:       org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat        
    ``` 
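
The precedence these two sessions demonstrate can be summarised in a small sketch. This is not Spark's code: the function name `resolve_ctas_format` and the plain-dict configuration are hypothetical, and it models only a CTAS that specifies no storage format. The `convertCTAS=true` branch reflects the output above. The `convertCTAS=false` branch is an assumption taken from the PR title (the SerDe is determined by `hive.default.fileformat`), as are the fallback values `parquet` and `textfile`.

```python
def resolve_ctas_format(conf):
    """Return the storage format for a CTAS that names no explicit format.

    `conf` is a plain dict of session settings (a hypothetical stand-in
    for the real session configuration).
    """
    convert = conf.get("spark.sql.hive.convertCTAS", "false").lower() == "true"
    if convert:
        # Shown in the transcripts above: hive.default.fileformat is ignored
        # and spark.sql.sources.default decides (assumed default: parquet).
        return conf.get("spark.sql.sources.default", "parquet").lower()
    # Assumption based on the PR title: without conversion, the Hive SerDe
    # follows hive.default.fileformat (assumed default: textfile).
    return conf.get("hive.default.fileformat", "textfile").lower()


# First session: convertCTAS=true, hive.default.fileformat=orc
first = {
    "spark.sql.hive.convertCTAS": "true",
    "hive.default.fileformat": "orc",
}
print(resolve_ctas_format(first))

# Second session: additionally set spark.sql.sources.default=orc
second = dict(first, **{"spark.sql.sources.default": "orc"})
print(resolve_ctas_format(second))
```

The first call corresponds to table `test` (Parquet SerDe) and the second to table `test2` (ORC SerDe).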
    
    Please let me know if you have any further questions.



