[ 
https://issues.apache.org/jira/browse/SPARK-44166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pralabh Kumar updated SPARK-44166:
----------------------------------
    Description: 
Currently in InsertIntoHiveTable.scala there is no way to set 
dynamicPartitionOverwrite to true when calling saveAsHiveFile. When 
dynamicPartitionOverwrite is true, Spark uses its built-in 
FileCommitProtocol instead of Hadoop's FileOutputCommitter, which is 
more performant. 
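
For context, non-Hive datasource tables already expose this behaviour through a session configuration; a minimal illustration (the table name is hypothetical, and this issue is about wiring the equivalent flag through the Hive write path):

{code:java}
// Enable dynamic partition overwrite for datasource tables:
// only the partitions touched by the write are replaced,
// instead of everything matching the static partition spec.
spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")

df.write
  .mode("overwrite")
  .insertInto("my_partitioned_table") // hypothetical table name
{code}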

 

Here is the solution. When performing an insert overwrite into a Hive table, the current code is:

{code:java}
val writtenParts = saveAsHiveFile(
  sparkSession = sparkSession,
  plan = child,
  hadoopConf = hadoopConf,
  fileFormat = fileFormat,
  outputLocation = tmpLocation.toString,
  partitionAttributes = partitionColumns,
  bucketSpec = bucketSpec,
  options = options)
{code}
 

 

Proposed code:
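
A possible sketch of the change (the dynamicPartitionOverwrite parameter on saveAsHiveFile is an assumption of this sketch; the current signature does not expose it):

{code:java}
// Assumed context: inside InsertIntoHiveTable, where `partition` is the
// Map[String, Option[String]] of static/dynamic partition values and
// `overwrite` is the insert-overwrite flag.
// A partition column with no static value means dynamic partitioning.
val hasDynamicPartitions = partition.exists(_._2.isEmpty)

val writtenParts = saveAsHiveFile(
  sparkSession = sparkSession,
  plan = child,
  hadoopConf = hadoopConf,
  fileFormat = fileFormat,
  outputLocation = tmpLocation.toString,
  partitionAttributes = partitionColumns,
  bucketSpec = bucketSpec,
  options = options,
  // hypothetical new parameter: route the write through Spark's
  // built-in FileCommitProtocol with dynamic partition overwrite
  dynamicPartitionOverwrite = overwrite && hasDynamicPartitions)
{code}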


  was:
Currently in InsertIntoHiveTable.scala there is no way to set 
dynamicPartitionOverwrite to true when calling saveAsHiveFile. When 
dynamicPartitionOverwrite is true, Spark uses its built-in 
FileCommitProtocol instead of Hadoop's FileOutputCommitter, which is 
more performant. 


> Enable dynamicPartitionOverwrite in SaveAsHiveFile for insert overwrite
> -----------------------------------------------------------------------
>
>                 Key: SPARK-44166
>                 URL: https://issues.apache.org/jira/browse/SPARK-44166
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.4.1
>            Reporter: Pralabh Kumar
>            Priority: Minor
>
> Currently in InsertIntoHiveTable.scala there is no way to set 
> dynamicPartitionOverwrite to true when calling saveAsHiveFile. When 
> dynamicPartitionOverwrite is true, Spark uses its built-in 
> FileCommitProtocol instead of Hadoop's FileOutputCommitter, which is 
> more performant. 
>  
> Here is the solution. When performing an insert overwrite into a Hive table, 
> the current code is:
> 
> {code:java}
> val writtenParts = saveAsHiveFile(
>   sparkSession = sparkSession,
>   plan = child,
>   hadoopConf = hadoopConf,
>   fileFormat = fileFormat,
>   outputLocation = tmpLocation.toString,
>   partitionAttributes = partitionColumns,
>   bucketSpec = bucketSpec,
>   options = options)
> {code}
>  
>  
> Proposed code:
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
