Github user dilipbiswal commented on the issue:
https://github.com/apache/spark/pull/15190
@yhuai We will use the Parquet format in your example. We look at the
```spark.sql.sources.default``` configuration to decide which format to use.
Here is the output for your perusal.
```SQL
spark-sql> set spark.sql.hive.convertCTAS=true;
spark.sql.hive.convertCTAS true
Time taken: 3.309 seconds, Fetched 1 row(s)
spark-sql> set hive.default.fileformat=orc;
hive.default.fileformat orc
Time taken: 0.053 seconds, Fetched 1 row(s)
spark-sql> CREATE TABLE IF NOT EXISTS test select 1 from foo;
spark-sql> describe formatted test;
...
# Storage Information
SerDe Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
```
Now change ```spark.sql.sources.default``` to ```orc``` and repeat the CTAS:
```SQL
spark-sql> set spark.sql.sources.default=orc;
spark.sql.sources.default orc
spark-sql> CREATE TABLE IF NOT EXISTS test2 select 1 from foo;
Time taken: 0.451 seconds
spark-sql> describe formatted test2;
...
# Storage Information
SerDe Library: org.apache.hadoop.hive.ql.io.orc.OrcSerde
InputFormat: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
```
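To summarize what the two sessions above show, here is a rough Python sketch of the selection rule as I understand it. It is only a toy model, not Spark code: the function name `resolve_ctas_format` is made up for illustration, and the fallback branch (using ```hive.default.fileformat``` with a ```textfile``` default when ```spark.sql.hive.convertCTAS``` is off) is my assumption and is not exercised by the output above.

```python
def resolve_ctas_format(conf):
    """Toy model of which format a CTAS without an explicit format ends up with.

    `conf` is a plain dict of session settings; this function is hypothetical
    and only mirrors the behaviour seen in the spark-sql sessions above.
    """
    convert = conf.get("spark.sql.hive.convertCTAS", "false").lower() == "true"
    if convert:
        # With convertCTAS enabled, the data source default decides the format
        # (Parquet unless spark.sql.sources.default is overridden).
        return conf.get("spark.sql.sources.default", "parquet")
    # Assumption: otherwise the Hive default file format applies.
    return conf.get("hive.default.fileformat", "textfile")


# First session: convertCTAS=true, hive.default.fileformat=orc
session = {
    "spark.sql.hive.convertCTAS": "true",
    "hive.default.fileformat": "orc",
}
print(resolve_ctas_format(session))  # parquet

# Second session: additionally set spark.sql.sources.default=orc
session["spark.sql.sources.default"] = "orc"
print(resolve_ctas_format(session))  # orc
```

So with ```spark.sql.hive.convertCTAS=true```, setting ```hive.default.fileformat``` alone does not change the result; only ```spark.sql.sources.default``` does.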
Please let me know if you have any further questions.