[
https://issues.apache.org/jira/browse/PHOENIX-2745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15180137#comment-15180137
]
Josh Mahonin commented on PHOENIX-2745:
---------------------------------------
Very interesting [~lichenglingl]. Do you have any examples of other data
sources exhibiting this 'drop' -> 'reload' behaviour for SaveMode.Overwrite?
When I'd first written the integration, 'Overwrite' seemed like the most
correct choice, based on this documentation of the SaveMode class [1]:
"if data/table already exists, existing data is expected to be overwritten by
the contents of the DataFrame"
However, if other data sources use 'Append' to the same effect, it might be
best to use that as the default behaviour. Follow-up work would then be to look
at doing a DROP or DELETE in the case of SaveMode.Overwrite.
[1]
https://spark.apache.org/docs/1.6.0/api/java/org/apache/spark/sql/SaveMode.html
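For context, a DataFrame write through the phoenix-spark integration looks
roughly like the sketch below. The table name, columns, and zkUrl here are
illustrative placeholders, not values taken from this issue.
{code:scala}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{SQLContext, SaveMode}

object SaveModeSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("phoenix-savemode-sketch"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Toy DataFrame keyed on ID; assumes a Phoenix table such as
    //   CREATE TABLE OUTPUT_TABLE (ID BIGINT NOT NULL PRIMARY KEY, COL1 VARCHAR)
    // already exists (hypothetical table, not from this issue).
    val df = sc.parallelize(Seq((1L, "a"), (2L, "b"))).toDF("ID", "COL1")

    // The write is executed as Phoenix UPSERTs: rows with matching primary
    // keys are replaced, and all other existing rows are left in place.
    df.write
      .format("org.apache.phoenix.spark")
      .mode(SaveMode.Overwrite)
      .option("table", "OUTPUT_TABLE")
      .option("zkUrl", "localhost:2181") // placeholder ZooKeeper quorum
      .save()
  }
}
{code}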
> The Spark SaveMode does not work correctly
> ------------------------------------------
>
> Key: PHOENIX-2745
> URL: https://issues.apache.org/jira/browse/PHOENIX-2745
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.6.0
> Reporter: lichenglin
>
> When saving a DataFrame to Phoenix with the mode SaveMode.Overwrite,
> Spark is expected to drop the table first and then load the new DataFrame,
> but Phoenix just replaces the old data with the new data according to the
> primary key.
> The old data still exists;
> Overwrite actually works as Append.
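A minimal reproduction of the reported behaviour, in spark-shell style (sc
and sqlContext as provided by the Spark 1.6 shell; the table and zkUrl are
the same placeholders as in the sketch above): two successive Overwrite
writes leave the union of both datasets in the table.
{code:scala}
import org.apache.spark.sql.SaveMode
import sqlContext.implicits._

val first  = sc.parallelize(Seq((1L, "a"), (2L, "b"))).toDF("ID", "COL1")
val second = sc.parallelize(Seq((3L, "c"))).toDF("ID", "COL1")

// Two successive "overwrites" of the same table.
Seq(first, second).foreach { df =>
  df.write
    .format("org.apache.phoenix.spark")
    .mode(SaveMode.Overwrite)
    .option("table", "OUTPUT_TABLE")
    .option("zkUrl", "localhost:2181")
    .save()
}

// Reading back returns rows 1, 2, and 3 (the union of both writes),
// whereas a true Overwrite would leave only row 3.
sqlContext.read
  .format("org.apache.phoenix.spark")
  .option("table", "OUTPUT_TABLE")
  .option("zkUrl", "localhost:2181")
  .load()
  .show()
{code}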