[ 
https://issues.apache.org/jira/browse/SPARK-27555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui WANG updated SPARK-27555:
-----------------------------
    Description: 
I have already seen https://issues.apache.org/jira/browse/SPARK-17620 

and https://issues.apache.org/jira/browse/SPARK-18397

and I checked the Spark source code for the change where setting 
"spark.sql.hive.convertCTAS=true" makes Spark use 
"spark.sql.sources.default" (parquet by default) as the storage format in the 
"create table as select" scenario.
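
The CTAS path referenced above can be sketched as follows (a minimal, untested sketch; the table name `t_ctas` is just an example):

```sql
-- With convertCTAS enabled, Spark writes CTAS output using
-- spark.sql.sources.default (parquet) instead of Hive's TextFile.
SET spark.sql.hive.convertCTAS=true;
CREATE TABLE t_ctas AS SELECT 1 AS id;
```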

But my case is a plain "create table" without a select. When I set 
hive.default.fileformat=parquet in hive-site.xml, or set 
spark.hadoop.hive.default.fileformat=parquet in spark-defaults.conf, and then 
create a table, the Hive table still uses the TextFile fileformat.

 

It seems HiveSerDe reads the value of hive.default.fileformat from SQLConf.

The parameter values in SQLConf are copied from the SparkContext's SparkConf at 
SparkSession initialization, while the settings in hive-site.xml are loaded by 
SharedState into the SparkContext's hadoopConfiguration, and all configs with 
the "spark.hadoop." prefix are likewise applied only to the Hadoop 
configuration, so the setting never reaches SQLConf and does not take effect.
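
If this analysis is right, one way the key can reach SQLConf is a session-level SET in spark-sql, since runtime SET writes into SQLConf, which is where HiveSerDe looks the key up (a sketch, not verified against 2.3.2; `t_plain` is a hypothetical table name):

```sql
-- Setting the key in the running session stores it in SQLConf
SET hive.default.fileformat=parquet;
CREATE TABLE t_plain (id INT);
```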


  was:
I have already seen https://issues.apache.org/jira/browse/SPARK-17620 

and https://issues.apache.org/jira/browse/SPARK-18397

and I checked the Spark source code for the change where setting 
"spark.sql.hive.convertCTAS=true" makes Spark use 
"spark.sql.sources.default" (parquet by default) as the storage format in the 
"create table as select" scenario.

But my case is a plain "create table" without a select. When I set 
hive.default.fileformat=parquet in hive-site.xml, or set 
spark.hadoop.hive.default.fileformat=parquet in spark-defaults.conf, and then 
create a table, the Hive table still uses the TextFile fileformat.

 

It seems HiveSerDe reads the value of hive.default.fileformat from SQLConf.

The parameter values in SQLConf are copied from the SparkContext's SparkConf at 
SparkSession initialization, while the settings in hive-site.xml are loaded by 
SharedState into the SparkContext's hadoopConfiguration, and all configs with 
the "spark.hadoop." prefix are likewise applied only to the Hadoop 
configuration, so the setting never reaches SQLConf and does not take effect.


Try1:

vi /opt/spark/conf/spark-defaults.conf

hive.default.fileformat parquetfile

Then open spark-sql; it tells me: [screenshot not preserved in the plain-text mail]

Try2:

vi /opt/spark/conf/spark-defaults.conf

spark.hadoop.hive.default.fileformat parquetfile

Then open spark-sql: [screenshot not preserved]

And then I set hive.default.fileformat directly in the current spark-sql repl: [screenshot not preserved]

Try3:

Edit hive-site.xml directly.

vi hive-site.xml

Then open spark-sql: [screenshot not preserved]

> cannot create table by using the hive default fileformat in both 
> hive-site.xml and spark-defaults.conf
> ------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-27555
>                 URL: https://issues.apache.org/jira/browse/SPARK-27555
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.2
>            Reporter: Hui WANG
>            Priority: Major
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
