[
https://issues.apache.org/jira/browse/IGNITE-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16691485#comment-16691485
]
Ray commented on IGNITE-10314:
------------------------------
Currently, when a user performs an add/drop column DDL statement, the QueryEntity
does not change.
This results in Spark getting the wrong schema, because Spark relies on QueryEntity
to construct the data frame schema.
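A minimal reproduction sketch of the symptom (table name "person", config path
"config.xml" and the local SparkSession setup are my assumptions for illustration,
not taken from the ticket):
{code:scala}
import org.apache.ignite.spark.IgniteDataFrameSettings._
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local").appName("repro").getOrCreate()

// Assume the table was created and then altered via SQL:
//   CREATE TABLE person (id LONG PRIMARY KEY, name VARCHAR)
//   ALTER TABLE person ADD COLUMN age INT
val df = spark.read
  .format(FORMAT_IGNITE)
  .option(OPTION_CONFIG_FILE, "config.xml")
  .option(OPTION_TABLE, "person")
  .load()

// Prints only ID and NAME; the AGE column added by DDL is missing,
// because the schema is built from the stale QueryEntity.
df.printSchema()
{code}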
According to [~vozerov]'s reply on the dev list
([http://apache-ignite-developers.2346864.n4.nabble.com/Schema-in-CacheConfig-is-not-updated-after-DDL-commands-Add-drop-column-Create-drop-index-td38002.html]),
this behavior is by design, so I decided to fix this issue on the Spark side.
So I propose the following solution: instead of deriving the schema from QueryEntity,
derive it from a SQL SELECT command. A rough sketch is below.
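This is only a sketch of the idea, not the final implementation: it assumes an
Ignite JDBC thin driver endpoint (jdbcUrl) and uses a simplified type mapping to
show how current column metadata, which does reflect ALTER TABLE ADD/DROP COLUMN,
can be turned into a Spark schema.
{code:scala}
import java.sql.DriverManager
import org.apache.spark.sql.types._

def schemaOf(table: String, jdbcUrl: String): StructType = {
  val conn = DriverManager.getConnection(jdbcUrl)
  try {
    // "WHERE 1 = 0" returns no rows but still exposes the up-to-date column metadata.
    val rs = conn.createStatement().executeQuery(s"SELECT * FROM $table WHERE 1 = 0")
    val md = rs.getMetaData
    StructType((1 to md.getColumnCount).map { i =>
      // Simplified mapping for illustration only.
      val dataType = md.getColumnType(i) match {
        case java.sql.Types.BIGINT  => LongType
        case java.sql.Types.INTEGER => IntegerType
        case java.sql.Types.VARCHAR => StringType
        case java.sql.Types.DOUBLE  => DoubleType
        case _                      => StringType
      }
      val nullable = md.isNullable(i) != java.sql.ResultSetMetaData.columnNoNulls
      StructField(md.getColumnName(i), dataType, nullable)
    })
  } finally conn.close()
}
{code}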
[~NIzhikov], what do you think about this solution?
If you think this solution is OK, then I'll start implementing it.
> Spark dataframe will get wrong schema if user executes add/drop column DDL
> --------------------------------------------------------------------------
>
> Key: IGNITE-10314
> URL: https://issues.apache.org/jira/browse/IGNITE-10314
> Project: Ignite
> Issue Type: Bug
> Components: spark
> Affects Versions: 2.3, 2.4, 2.5, 2.6, 2.7
> Reporter: Ray
> Assignee: Ray
> Priority: Critical
> Fix For: 2.8
>
>
> When a user performs an add/drop column DDL statement, Spark will get the
> old/wrong schema.
>
> Analysis
> Currently the Spark data frame API relies on QueryEntity to construct the
> schema, but the QueryEntity in QuerySchema is a local copy of the original
> QueryEntity, so the original QueryEntity is not updated when a modification
> happens.
>
> Solution
> Get the schema using SQL and get rid of QueryEntity.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)