[
https://issues.apache.org/jira/browse/IGNITE-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16691485#comment-16691485
]
Ray commented on IGNITE-10314:
------------------------------
Currently, when a user performs an add/drop column DDL statement, the QueryEntity
does not change.
This results in Spark getting the wrong schema, because Spark relies on QueryEntity
to construct the data frame schema.
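A minimal reproduction sketch of the symptom (table name "person", config path
"config.xml" and the local SparkSession setup are my assumptions for illustration,
not taken from the ticket):
{code:scala}
import org.apache.ignite.spark.IgniteDataFrameSettings._
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local").appName("repro").getOrCreate()

// Assume the table was created and then altered via SQL:
//   CREATE TABLE person (id LONG PRIMARY KEY, name VARCHAR)
//   ALTER TABLE person ADD COLUMN age INT
val df = spark.read
  .format(FORMAT_IGNITE)
  .option(OPTION_CONFIG_FILE, "config.xml")
  .option(OPTION_TABLE, "person")
  .load()

// Prints only ID and NAME; the AGE column added by DDL is missing,
// because the schema is built from the stale QueryEntity.
df.printSchema()
{code}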
According to [~vozerov]'s reply on the dev list
([http://apache-ignite-developers.2346864.n4.nabble.com/Schema-in-CacheConfig-is-not-updated-after-DDL-commands-Add-drop-column-Create-drop-index-td38002.html]),
this behavior is by design, so I decided to fix this issue on the Spark side.
So I propose the following solution: instead of deriving the schema from QueryEntity,
derive it from a SQL SELECT command. A rough sketch is below.
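This is only a sketch of the idea, not the final implementation: it assumes an
Ignite JDBC thin driver endpoint (jdbcUrl) and uses a simplified type mapping to
show how current column metadata, which does reflect ALTER TABLE ADD/DROP COLUMN,
can be turned into a Spark schema.
{code:scala}
import java.sql.DriverManager
import org.apache.spark.sql.types._

def schemaOf(table: String, jdbcUrl: String): StructType = {
  val conn = DriverManager.getConnection(jdbcUrl)
  try {
    // "WHERE 1 = 0" returns no rows but still exposes the up-to-date column metadata.
    val rs = conn.createStatement().executeQuery(s"SELECT * FROM $table WHERE 1 = 0")
    val md = rs.getMetaData
    StructType((1 to md.getColumnCount).map { i =>
      // Simplified mapping for illustration only.
      val dataType = md.getColumnType(i) match {
        case java.sql.Types.BIGINT  => LongType
        case java.sql.Types.INTEGER => IntegerType
        case java.sql.Types.VARCHAR => StringType
        case java.sql.Types.DOUBLE  => DoubleType
        case _                      => StringType
      }
      val nullable = md.isNullable(i) != java.sql.ResultSetMetaData.columnNoNulls
      StructField(md.getColumnName(i), dataType, nullable)
    })
  } finally conn.close()
}
{code}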
[~NIzhikov], what do you think about this solution?
If you think this solution is OK, then I'll start implementing it.
> Spark dataframe will get wrong schema if user executes add/drop column DDL
> --------------------------------------------------------------------------
>
> Key: IGNITE-10314
> URL: https://issues.apache.org/jira/browse/IGNITE-10314
> Project: Ignite
> Issue Type: Bug
> Components: spark
> Affects Versions: 2.3, 2.4, 2.5, 2.6, 2.7
> Reporter: Ray
> Assignee: Ray
> Priority: Critical
> Fix For: 2.8
>
>
> When a user performs an add/drop column DDL statement, Spark will get the
> old/wrong schema.
>
> Analysis
> Currently the Spark data frame API relies on QueryEntity to construct the
> schema, but the QueryEntity in QuerySchema is a local copy of the original
> QueryEntity, so the original QueryEntity is not updated when a modification
> happens.
>
> Solution
> Get the schema using SQL and get rid of QueryEntity.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)